13  LaTeX and markdown

Module originally written by Shiro Kuriwaki.

Where are we? Where are we headed?

Up till now, you should have covered:

  • Statistical Programming in R

This is only the beginning of R – programming is like learning a language, so learn more as we use it. And yet R is of likely not the only programming language you will want to use. While we cannot introduce everything, we’ll pick out a few that we think are particularly helpful.

Here will cover

  • Markdown
  • LaTeX (and BibTeX)

as examples of a non-WYSIWYG editor

and the next chapter (you can read it without reading this LaTeX chapter) covers

  • command-line
  • git

command-line are a basic set of tools that you may have to use from time to time. It also clarifies what more complicated programs are doing. Markdown is an example of compiling a plain text file. LaTeX is a typesetting program and git is a version control program – both are useful for non-quantitative work as well.

Check your understanding

Check if you have an idea of how you might code the following tasks:

  • What does “WYSIWYG” stand for? How would a non-WYSIWYG format text?
  • How do you start a header in markdown?
  • What are some “plain text” editors?
  • How do you start a document in .tex?
  • How do you start a environment in .tex?
  • How do you insert a figure in .tex?
  • How do you reference a figure in .tex?
  • What is a .bib file?
  • Say you came across a interesting journal article. How would you want to maintain this reference so that you can refer to its citation in all your subsequent papers?

13.1 Motivation

Statistical programming is a fast-moving field. The beta version of R was released in 2000, ggplot2 was released on 2005, and RStudio started around 2010. Of course, some programming technologies are quite “old”: (C in 1969, C++ around 1989, TeX in 1978, Linux in 1991, Mac OS in 1984). But it is easy to feel you are falling behind in the recent developments of programming. Today we will do a brief and rough overview of some fundamental and new tools other than R, with the general aim of having you break out of your comfort zone so you won’t be shut out from learning these tools in the future.

13.2 Markdown

Markdown is the text we have been using throughout this course! At its core markdown is just plain text. Plain text does not have any formatting embedded in it. Instead, the formatting is coded up as text. Markdown is not a WYSIWYG (What you see is what you get) text editor like Microsoft Word or Google Docs. This will mean that you need to explicitly code for bold{text} rather than hitting Command+B and making your text look bold on your own computer.

Markdown is known as a “light-weight” editor, which means that it is relatively easy to write code that will compile. It is quick and easy and satisfies most presentation purposes; you might want to try LaTeX for more involved papers.

13.2.1 Markdown commands

For italic and bold, use either the asterisks or the underlines,

*italic*   **bold**
_italic_   __bold__

And for headers use the hash symbols,

# Main Header
## Sub-headers

13.2.2 Your own markdown

RStudio makes it easy to compile your very first markdown file by giving you templates. Got to New > R Markdown, pick a document and click Ok. This will give you a skeleton of a document you can compile – or “knit”.

Rmd is actually a slight modification of real markdown. It is a type of file that R reads and turns into a proper md file. Then, it uses a document-conversion called pandoc to compile your md into documents like PDF or HTML.

How Rmds become PDFs or HTMLs

13.2.3 Quarto

R Markdown (.Rmd) files have long been the go-to for reproducible writing workflows for R users. In 2022, Posit, PBC, who created R Markdown announced a new generation of markdown extensions, with Quarto. Quarto (.qmd) files are a variation on R Markdown which allows for including R, python, Observable, Julia, and more within a document. Quarto is largely compatible with older .Rmd files, just by changing the extension. As such, you can integrate LaTeX and markdown seamlessly.

Some benefits of using Quarto include:

13.2.4 A note on plain-text editors

Multiple software exist where you can edit plain-text (roughly speaking, text that is not WYSIWYG).

  • RStudio (especially for R-related links)
  • TeXMaker, TeXShop (especially for TeX)
  • emacs, aquamacs (general)
  • vim (general)
  • Sublime Text (general)

Each has their own keyboard shortcuts and special features. You can browse a couple and see which one(s) you like.

Since June 2021, RStudio has offered a visual editor which tries to bridge the gap between plain-text and WYSIWYG. While writing, it transforms plain markdown, RMarkdown, or Quarto documents into a “WYSISWYM” version, What You See Is What You Mean. Formatting choices, like bold or italicized text are shown as bold or italicized text, rather than as intermediate markdown. Lists, enumerations, and images are shown inline, rather than the code that includes them. This is not a final form though, as styling still occurs when rendering the final document.

13.3 LaTeX

LaTeX is a typesetting program. You’d engage with LaTeX much like you engage with your R code. You will interact with LaTeX in a text editor, and will writing code which will be interpreted by the LaTeX compiler and which will finally be parsed to form your final PDF.

13.3.1 Compile online

  1. Go to https://www.overleaf.com
  2. Scroll down and go to “CREATE A NEW PAPER” if you don’t have an account.
  3. Let’s discuss the default template.
  4. Make a new document, and set it as your main document. Then type in the Minimal Working Example (MWE):
\documentclass{article}
\begin{document}
Hello World
\end{document}

13.3.2 Compile your first LaTeX document locally

LaTeX is a very stable system, and few changes to it have been made since the 1990s. The main benefit: better control over how your papers will look; better methods for writing equations or making tables; overall pleasing aesthetic.

  1. Open a plain text editor. Then type in the MWE
\documentclass{article}
\begin{document}
Hello World
\end{document}
  1. Save this as hello_world.tex. Make sure you get the file extension right.
  2. Open this in your “LaTeX” editor. This can be TeXMaker, Aqumacs, etc..
  3. Go through the click/dropdown interface and click compile.

13.3.3 Main LaTeX commands

LaTeX can cover most of your typesetting needs, to clean equations and intricate diagrams.

Some main commands you’ll be using are below, and a very concise cheat sheet here: https://wch.github.io/latexsheet/latexsheet.pdf

Most involved features require that you begin a specific “environment” for that feature, clearly demarcating them by the notation \begin{figure} and then \end{figure}, e.g. in the case of figures.

\begin{figure}
\includegraphics{histogram.pdf}
\end{figure}

where histogram.pdf is a path to one of your files.

Notice that each line starts with a backslash \ – in LaTeX this is the symbol to run a command.

The following syntax at the endpoints are shorthand for math equations.

$$\int x^2 dx$$

these compile math symbols: \(\displaystyle \int x^2 dx.\)1

The align environment is useful to align your multi-line math, for example.

\begin{align}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}\\
&= \frac{P(B \mid A)P(A)}{P(B)}
\end{align}
\[\begin{align} P(A \mid B) &= \frac{P(A \cap B)}{P(B)}\\ &= \frac{P(B \mid A)P(A)}{P(B)} \end{align}\]

Regression tables should be outputted as .tex files with packages like xtable and stargazer, and then called into LaTeX by \input{regression_table.tex} where regression_table.tex is the path to your regression output.

Figures and equations should be labelled with the tag (e.g. label{tab:regression} so that you can refer to them later with their tag Table \ref{tab:regression}, instead of hard-coding Table 2).

For some LaTeX commands you might need to load a separate package that someone else has written. Do this in your preamble (i.e. before \begin{document}):

\usepackage[options]{package}

where package is the name of the package and options are options specific to the package.

Further Guides

For a more comprehensive listing of LaTeX commands, Mayya Komisarchik has a great tutorial set of folders: https://scholar.harvard.edu/mkomisarchik/tutorials-0

There is a version of LaTeX called Beamer, which is a popular way of making a slideshow. Slides in markdown is also a competitor. The language of Beamer is the same as LaTeX but has some special functions for slides.

13.4 BibTeX

BibTeX is a reference system for bibliographical tests. We have a .bib file separately on our computer. This is also a plain text file, but it encodes bibliographical resources with special syntax so that a program can rearrange parts accordingly for different citation systems.

13.4.1 What is a .bib file?

For example, here is the Nunn and Wantchekon article entry in .bib form.

@article{nunn2011slave,
  title={The Slave Trade and the Origins of Mistrust in Africa},
  author={Nunn, Nathan and Wantchekon, Leonard},
  journal={American Economic Review},
  volume={101},
  number={7},
  pages={3221--3252},
  year={2011}
}

The first entry, nunn2011slave, is “pick your favorite” – pick your own name for your reference system. The other slots in this @article entry are entries that refer to specific bibliographical text.

13.4.2 What does LaTeX do with .bib files?

Now, in LaTeX, if you type

  \textcite{nunn2011slave} argue that current variation in the trust among citizens of African countries has historical roots in the European slave trade in the 1600s.

as part of your text, then when the .tex file is compiled the PDF shows something like

in whatever citation style (APSA, APA, Chicago) you pre-specified!

Also at the end of your paper you will have a bibliography with entries ordered and formatted in the appropriate citation.

This is a much less frustrating way of keeping track of your references – no need to hand-edit formatting the bibliography to conform to citation rules (which biblatex already knows) and no need to update your bibliography as you add and drop references (biblatex will only show entries that are used in the main text).

13.4.3 Stocking up on your .bib files

You should keep your own .bib file that has all your bibliographical resources. Storing entries is cheap (does not take much memory), so it is fine to keep all your references in one place (but you’ll want to make a new one for collaborative projects where multiple people will compile a .tex file).

For example, Gary’s BibTeX file is here: https://github.com/iqss-research/gkbibtex/blob/master/gk.bib

Citation management software (Mendeley or Zotero) automatically generates .bib entries from your library of PDFs for you, provided you have the bibliography attributes right.

Exercise

Create a LaTeX document for a hypothetical research paper on your laptop and, once you’ve verified it compiles into a PDF, come show it to either one of the instructors.

You can also use overleaf if you have preference for a cloud-based system. But don’t swallow the built-in templates without understanding or testing them.

Each student will have slightly different substantive interests, so we won’t impose much of a standard. But at a minimum, the LaTeX document should have:

  • A title, author, date, and abstract
  • Sections
  • Italics and boldface
  • A figure with a caption and in-text reference to it.

Depending on your subfield or interests, try to implement some of the following:

  • A bibliographical reference drawing from a separate .bib file
  • A table
  • A math expression
  • A different font
  • Different page margins
  • Different line spacing

Concluding the Prefresher

Math may not be the perfect tool for every aspiring political scientist, but hopefully it was useful background to have at the least:

But we should be aware that too much slant towards math and programming can miss the point:

Keep on learning, trying new techniques to improve your work, and learn from others!

Your Feedback Matters

Please tell us how we can improve the Prefresher: The Prefresher is a work in progress, with material mainly driven by graduate students. Please tell us how we should change (or not change) each of its elements:

https://harvard.az1.qualtrics.com/jfe/form/SV_esbzN8ZFAOPTqiV


  1. Enclosing with $$ instead of $$ also has the same effect, so you may see it too. But this is now discouraged due to its inflexibility.↩︎