(the source code for this document is available here)

RMarkdown

What?

RMarkdown is a type of markup language that can be used to generate different types of documents that are common in academic research (.docx, .pdf, .html). You have probably used Microsoft Word most of your academic career to create documents for your classes. This type of program is classified as WYSIWYG (what you see is what you get), which means that the user can see exactly how the document looks at the same time that they are working on it. This differs from using a language to generate a document—in this case RMarkdown—because we have to give the program specific instructions regarding how we want to format different parameters (i.e., font size/type, bold type, cursive, etc.). Initially using a markup language like RMarkdown might seems like it adds unnecessary complexity to your workflow. You have to learn a new syntax before you will get the results you expect and you might think that typing what you want directly in word is much easier. However, as we will see below, using a markup language like RMarkdown can help us with numerous tasks that we have to do as part of conducting scientific research. For this reason we can consider RMarkdown a research tool, and you will see that it is well worth the initial effort it takes to learn to use it.

How?

RMarkdown has its own syntax, which is based on Markdown. Markdown is a language used to simplify creating content for the web. It translates simple content to HTML. RMarkdown uses the Markdown language and allows us to integrate R code.

Text

When we generate normal text (by normal I mean without adding frills like bold type) we create normal text in RMarkdown. Huh? By this I mean that by default the text we generate is pretty “standard”. We can end a line of text by adding two spaces at the end.

We wrap a single asterisk (*) around text if we want it to appear in italics. We wrap two asterisks (**) around text if we want to create bold text. We can also change the text in other ways, for examples usingsuperscripts or crossing words out. We can even include links (click that link to learn more about RMarkdown syntax).

Lists

We can create:

  • unordered lists
  • using a single dash (-)

as well as…

  1. ordered lists
  2. using numbers (1. , 2. , etc.)

This is how I did that in RMarkdown…

We can create: 

- unordered lists
- using a single dash (\-)

as well as...

1. ordered lists
2. using numbers (1. , 2. , etc.)

Tables

It is possible to create tables too…

Table Header Second Header
Table Cell Cell 2
Cell 3 Cell 4

Why?

We have seen a little big about how RMarkdown works. Now we will see a few of the features that makes it so useful for academia.

Functionality

RMarkdown is written in a .Rmd file, which is a simple text file. Writing in a text file has several advantages.

  • Text files are small
  • We don’t need expensive, proprietary software (we can open and edit a text file with any text editor)
  • It works on any operating system

Variable output

An .Rmd file can be converted to a word document, a pdf, or a standalone website. All we have to do is modify the yaml front matter. Specifically, we modify output: "html_document" to “pdf_document” or “word_document”.

Reproducible research

Normally our statistical analyses and the documents we use to present them are independent files/formats. For example, we might fit a model using SPSS and then report the results in our paper, which we later send to a journal to be published. This separation is the enemy. It is counterproductive if our goal is conduct reproducible investigations. Using RMarkdown we can include our analyses (scripts in R) in the same document we send for publication.

R

Comments

#############################
# Always comment your code! #
#############################

# This is a comment
2 + 2
## [1] 4

Basic math

#####################
# R as a calculator #
#####################

2 + 2
## [1] 4
4^2
## [1] 16
(12 * 15) / 2
## [1] 90

Assigment

# R uses objects

# Store the value '2' to the object 'x'
x <- 2
print(x)
## [1] 2
# Perform an operation on 'x'
x + 4
## [1] 6

Install packages

# install a package
install.packages("tidyverse")

Analyze data

library(tidyverse)
glimpse(cars)
## Rows: 50
## Columns: 2
## $ speed <dbl> 4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 1…
## $ dist  <dbl> 2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26, 3…
my_analysis <- lm(speed ~ dist, data = cars)

summary(my_analysis)
## 
## Call:
## lm(formula = speed ~ dist, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.5293 -2.1550  0.3615  2.4377  6.4179 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.28391    0.87438   9.474 1.44e-12 ***
## dist         0.16557    0.01749   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.156 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

Make a plot

ggplot(cars, aes(x = dist, y = speed)) + 
  geom_point() + 
  geom_smooth(method = 'lm')