rmarkdown
R commands will be presented in a gray box:
print('Hello world!')
## [1] "Hello world!"
The white box following a gray box shows R’s output. Blue boxes contain exercises to check that you have understood the notions:
EXERCISE
This box will contain exercises that you must do to make sure that you have understood all the notions.
Solutions to the exercises will be given in a green box (they will only be available after the session).
SOLUTION
This box will contain answers to the previous exercises.
Purple boxes contain additional information (for advanced users), so it is not necessary to read or understand them on first reading. You will probably understand them once you have gained some experience with R.
INFO
This box will contain additional information that is not necessary to understand on your first reading.
Orange boxes contain warnings to heed in order to use R well and follow good practices.
WARNING
This box will contain warnings or good practices that you must pay attention to.
Some computer languages are intended to generate textual documents, like HTML, which is intended to display web pages, or LaTeX, which is intended to generate printable documents.
Here we introduce Markdown, a lightweight and easily readable markup language suitable for generating documents without having to learn a lot of syntax. This language can be coupled with R to form the Rmarkdown syntax (Allaire et al. 2022).
INFO
This document is just an overview of the possibilities offered by Rmarkdown and only presents the main functionalities and options. For full documentation, refer to R Markdown: The Definitive Guide.
An Rmarkdown document is a text document containing a sequence of Markdown blocks and R code blocks, with an optional header block in YAML at the top of the document (Figure 2.1). The file extension is “.Rmd” and it is a regular text file, so you can create and edit it with any text editor. Nevertheless, RStudio is the recommended one, as it provides many options for working with this kind of file.
Figure 2.1: Overview of an R markdown text document. The block following the HEADER block can also be an R block.
INFO
There are also Sweave documents (“.Rnw”), which combine R and LaTeX instead of Markdown. The principle is identical to Rmarkdown, but the syntax of the code generating the document is different.
The header section is optional (but strongly recommended) and sets general options used when the document is compiled, such as the output format, the title/author/date of the document, the link to a bibliographic file, the insertion of a table of contents, etc. The header is in YAML format and looks like:
---
title: Title of the document
author: Matthieu Jung
date: January 23, 2022
output: pdf_document
params:
  country: France
---
As shown here, the header section must be declared between two `---` lines and can contain the following keys:
Key | Description |
---|---|
title | Indicates the title of the document. |
author | Indicates the name of the author(s). |
date | Indicates the date of the document. You can also use R code to generate it dynamically, for instance `r Sys.Date()` to get today’s date. |
output | Indicates the desired output format, for instance pdf_document to generate a PDF document or html_document to generate an HTML document. Other formats are also available, such as Word, PowerPoint, etc. |
params | Indicates a list of key/value pairs that will be available in the params R list. To access these values, use the params$country syntax. This is useful if you want to make your report generic: for instance, you can have a parameter indicating the path of the input files, the options of a function, etc. |
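As an illustration of how parameters can make a report generic, the sketch below writes a small parameterized document and overrides the default `country` value at compile time through the `params` argument of `rmarkdown::render()`. The document content, the temporary file and the value 'Germany' are assumptions made for this example; rendering is only attempted if `rmarkdown` and pandoc are installed.

```r
# Write a minimal parameterized document to a temporary file
# (hypothetical content, for illustration only).
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: Country report",
  "output: html_document",
  "params:",
  "  country: France",
  "---",
  "",
  "This report is about `r params$country`."
), rmd)

# Override the default value declared in the header at render time.
# Guarded so that the sketch does not fail when the tools are missing.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd, params = list(country = "Germany"),
                           quiet = TRUE)
}
```

Parameters that are not listed in the `params` argument keep the default value declared in the header.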
WARNING
Because `:` has a special meaning in YAML, if your title (or any other value) includes this character, wrap the value in two `"` characters:
title: "Report: second phase"
INFO
Remember that the header section is optional. You can compile a document without setting any header option. In this case all default values will be applied.
output key
The `output` key is special in that it can take more than just one value, i.e. you can define several output formats for the same document, each with its own set of key/value options. For instance:
---
title: Title of the document
author: Matthieu Jung
date: January 23, 2022
output:
  pdf_document:
    toc: yes
    toc_depth: 3
    number_sections: yes
  html_document:
    toc: no
    number_sections: yes
---
In this example, a table of contents with the first three levels of headings will be displayed if the document is compiled to PDF, but not to HTML (the default value is `FALSE`, so it is not necessary to specify this option; it is included here just for the example), and headings are automatically numbered in both formats.
To see the complete list of available options and their definitions, you can run `help(pdf_document)` or `?html_document` (with the `rmarkdown` package loaded).
WARNING
Each output format has its own set of options with its own default values. For instance, the `toc_depth` option, which indicates the depth of headings to include in the table of contents, defaults to 3 for `html_document` and 2 for `pdf_document`.
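One way to check such defaults without opening the help pages is to inspect the arguments of the format functions with base R's `formals()`. This is only a sketch: it assumes the `rmarkdown` package is installed, and the variable names are chosen for the example.

```r
# Read the default value of toc_depth from the signature of each
# output format function (only if rmarkdown is installed).
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  html_default <- formals(rmarkdown::html_document)$toc_depth
  pdf_default  <- formals(rmarkdown::pdf_document)$toc_depth
  cat("html_document toc_depth:", html_default, "\n")
  cat("pdf_document toc_depth: ", pdf_default, "\n")
}
```

The same approach works for any other option, e.g. `formals(rmarkdown::pdf_document)$toc`.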
Including a bibliography requires a supplementary file containing the bibliographic references in a specific format. Many tools such as Zotero and EndNote, as well as websites (PubMed, journal websites, HAL, etc.), let you export references in the right format.
Several formats are possible for your bibliography file:
Format | File extension |
---|---|
BibLaTeX | .bib |
BibTeX | .bibtex |
CSL JSON | .json |
CSL YAML | .yaml |
RIS | .ris |
INFO
See https://pandoc.org/MANUAL.html#citations for more information about the different available formats.
Here is an example in BibTeX format:
@Manual{rmarkdown,
title = {rmarkdown: Dynamic Documents for R},
author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
year = {2022},
note = {R package version 2.12},
url = {https://github.com/rstudio/rmarkdown},
}
If your bibliography file `bibliography.bib` is in the same directory as your document, you can add the `bibliography` key to the header with the name of the file. Otherwise, you should indicate the absolute path where your file is, or its path relative to where the document will be compiled.
In your document you should also add a section called `References` to indicate where you want to put the bibliography:
---
title: Title of the document
author: Matthieu Jung
date: January 23, 2022
bibliography: bibliography.bib
---
# References
To cite an entry, use `@key` or `[@key]` (the latter puts the citation in parentheses), e.g. `@rmarkdown` is rendered as Allaire et al. (2022), and `[@rmarkdown]` generates (Allaire et al. 2022).
WARNING
By default, only references that are cited in the document will appear in the bibliography. So you can keep a single bibliographic file with all of your favorite papers and cite only those that you need for a given document.
If you want to include all the references of your bibliographic file, you can use the option `nocite: '@*'` in the header.
INFO
In R, the `citation()` function gives you the citations to use when referencing a package in your publications. Moreover, you can get the reference in BibTeX format with the `toBibtex()` function. For instance, with the `rmarkdown` package:
citation('rmarkdown')
##
## To cite the 'rmarkdown' package in publications, please use:
##
## JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi
## and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and
## Winston Chang and Richard Iannone (2022). rmarkdown: Dynamic
## Documents for R. R package version 2.12. URL
## https://rmarkdown.rstudio.com.
##
## Yihui Xie and J.J. Allaire and Garrett Grolemund (2018). R Markdown:
## The Definitive Guide. Chapman and Hall/CRC. ISBN 9781138359338. URL
## https://bookdown.org/yihui/rmarkdown.
##
## Yihui Xie and Christophe Dervieux and Emily Riederer (2020). R
## Markdown Cookbook. Chapman and Hall/CRC. ISBN 9780367563837. URL
## https://bookdown.org/yihui/rmarkdown-cookbook.
##
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.
toBibtex(citation('rmarkdown'))
## @Manual{,
## title = {rmarkdown: Dynamic Documents for R},
## author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
## year = {2022},
## note = {R package version 2.12},
## url = {https://github.com/rstudio/rmarkdown},
## }
##
## @Book{,
## title = {R Markdown: The Definitive Guide},
## author = {Yihui Xie and J.J. Allaire and Garrett Grolemund},
## publisher = {Chapman and Hall/CRC},
## address = {Boca Raton, Florida},
## year = {2018},
## note = {ISBN 9781138359338},
## url = {https://bookdown.org/yihui/rmarkdown},
## }
##
## @Book{,
## title = {R Markdown Cookbook},
## author = {Yihui Xie and Christophe Dervieux and Emily Riederer},
## publisher = {Chapman and Hall/CRC},
## address = {Boca Raton, Florida},
## year = {2020},
## note = {ISBN 9780367563837},
## url = {https://bookdown.org/yihui/rmarkdown-cookbook},
## }
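Note that the entries printed above have an empty citation key (`@Manual{,`), so they cannot be cited with `@key` as they stand. The following sketch shows one possible way to set a key and save the entry to a bibliography file using base R only. It uses the citation of base R itself; the key `R-base` and the temporary file are arbitrary choices made for this example.

```r
# Citation of base R itself; any installed package name could be
# passed to citation() instead.
bib <- citation("base")

# Set an explicit citation key ("R-base" is an arbitrary choice).
bib$key <- "R-base"

# Write the BibTeX entry to a bibliography file (a temporary file here).
bib_file <- tempfile(fileext = ".bib")
writeLines(toBibtex(bib), bib_file)

# Display the generated entry.
cat(readLines(bib_file), sep = "\n")
```

The resulting file can then be given to the `bibliography` key of the header, and the entry cited with `@R-base`.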
It is also possible to pass several bibliographic files to the `bibliography` key, e.g. to separate citations by domain (bioinformatics, biology, mathematics, my own papers, etc.). To do that, just put the list of your files between brackets `[ ]`:
---
bibliography: [jung.bib, biology.bib, bioinfo.bib]
---
The text in an Rmarkdown document is written with the Markdown syntax.
There are many flavors of Markdown invented by different people, but `rmarkdown` uses Pandoc’s Markdown. For a full description of the available syntax, have a look at https://pandoc.org/MANUAL.html#pandocs-markdown.
We present here a summary of the most frequently used commands.
Headings are preceded by the `#` character; the number of `#` characters indicates the level of the heading, from 1 to 6.
# H1
## H2
### H3
#### H4
##### H5
###### H6
INFO
If you have activated the `number_sections` option in the header, all sections will be automatically prefixed by section numbers (like this document). To prevent that for a given heading, you can add `{-}` at the end of the heading:
# H1 {-}
INFO
Alternatively, for H1 and H2 levels, you can use the underline style:
Alt-H1
======
Alt-H2
------
Markdown uses a blank line (i.e. two consecutive line breaks) to indicate a new paragraph. A line that ends with two spaces indicates a new line inside a paragraph.
This is a paragraph.

This is another one.  
And a new line inside a paragraph.
A horizontal rule is inserted by a line containing only three (or more) underscores `_`, asterisks `*` or hyphens `-`.
This paragraph is displayed before the first horizontal rule.

___

This one between two.

***

This one too.

---

And this one is the last one.
Text between single asterisks `*` is displayed in italic, between double asterisks `**` in bold, and between triple asterisks `***` in bold and italic. Text between double tildes `~~` is struck through, between single tildes `~` is subscripted, and between single circumflexes `^` is superscripted.
This paragraph contains text in *italic*, **bold** and ***bold and italic***.
Moreover, two ~ indicate some ~~strike-through text~~.
You can also write a molecule such as H~2~O or a power such as 10^2^.
A link is written between `<` and `>`. If you want to attach a link to some text, put the text between `[ ]` followed by the link between `( )`. Prefix it with `!` to display a picture.
The best search engine is <http://www.google.com>, but I prefer this
[one](https://www.ecosia.org).
You can also include a picture in a similar manner: 
Unordered lists are introduced with `-`, ordered lists with a number followed by a dot, `1.` (a closing parenthesis, `1)`, also works in Pandoc’s Markdown), and task lists with `- [ ]`.
Ordered list:
1. First item
2. Second item
    - First sub-item
3. Third item
    1.1. First sub-item
Unordered list:
- First item
- Second item
- Third item:
    1. Sub-first item
    2. Sub-second item
Checked list:
- [ ] First item
- [x] Second item
- [ ] Third item
Inline code is wrapped between two backquotes (`), and a code block between two lines consisting of only three backquotes (```). Each line of a blockquote must be prefixed by the `>` character.
I insert commands inside a paragraph `if then else`.
```
if (r > 0) {
  print('r is positive')
} else {
  print('r is negative or zero')
}
```
> This is a blockquote.
> On two lines.
When you create a table, columns must be separated by a pipe character `|` and the header must be separated from the body by a line of `-` characters.
You can use the `:` character to indicate whether the content must be left-, center- or right-aligned. Spaces around separators are optional and only there for human readability. For instance:
Column | Column
------ | ------
Cell | Cell

Letter|Digit|Character
---|---|---
a|4|$
|365|(
b| |^

Column | Column | Column
:----- | :----: | -----:
Left | Center | Right
align | align | align
The syntax for writing R code that will be executed is similar to the syntax of code blocks, except that we add `{r}` after the three opening backquotes. For instance:
```{r}
a <- 1:10
cat(a)
```
The length of the variable a is `r length(a)`.
Such a block is called a chunk. You can also execute small pieces of code inside text using the `r ` syntax. Note that the latter is most often used to insert the content of a variable inside a paragraph, not to perform a task.
Keep in mind that all variables declared inside a chunk are visible in all subsequent chunks. So you can split your code into as many chunks as you need, each chunk possibly separated from the next by Markdown text.
INFO
Rmarkdown also supports many other programming languages. So it is possible to declare a Python chunk (with the ```{python} syntax); the code inside the chunk must then be written in Python and will be run by a Python interpreter.
See https://bookdown.org/yihui/rmarkdown/language-engines.html for more information.
Warning. Objects are not shared between different programming languages. Variables declared in an R chunk are only available in subsequent R chunks. The same holds for Python and Julia, which also share one session across all their chunks. No other language shares a session between chunks, so objects declared in a chunk are only available within that chunk.
For each chunk, you can set options to control a number of elements. A complete list of options can be found at https://yihui.org/knitr/options/; we present below only the most frequently used:
Option | Default | Description |
---|---|---|
eval | TRUE | If FALSE, the code in the chunk will not be run. |
echo | TRUE | If FALSE, the code of the chunk is not displayed in the final document. |
results | 'markup' | If 'hide', do not display the code’s results in the final document. If 'hold', display all output pieces at the end of the chunk. If 'asis', pass results through without reformatting them (useful if results are raw HTML, etc.). |
error | TRUE | If TRUE, error messages are displayed in the document and compilation continues. If FALSE, compilation stops at the first error. Note that R Markdown changes this default to FALSE. |
message | TRUE | If FALSE, do not display any messages generated by the code. |
warning | TRUE | If FALSE, do not display any warning messages generated by the code. |
cache | FALSE | If TRUE, the results are cached for future reuse and are reused until the code of the chunk is altered. Warning: if you change an earlier chunk that affects a cached chunk, the cached chunk will not be recomputed. |
Moreover, a chunk can have a name, to reference it in the document or to call it again later. Options must be separated by commas `,` and their values must be written in R (i.e. it is also possible to provide R variables defined in a previous chunk or in the header section). For instance:
---
params:
  echo: no
---
```{r name1, echo=params$echo, results='asis'}
a <- 1:10
cat(a)
```
It is possible to set default options for all later chunks using this trick:
```{r}
knitr::opts_chunk$set(echo=FALSE)
```
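As a slightly fuller sketch of the same trick, several defaults can be set in a single call, typically from a first chunk at the top of the document, and the current value of an option can be read back with `knitr::opts_chunk$get()`. The particular options chosen below are only an example, and the code assumes the `knitr` package is installed.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Hide code, messages and warnings, and center figures,
  # for every chunk evaluated after this call.
  knitr::opts_chunk$set(
    echo = FALSE,
    message = FALSE,
    warning = FALSE,
    fig.align = "center"
  )

  # Read back the current default of a given option.
  knitr::opts_chunk$get("echo")
}
```

An option given explicitly in a chunk header still overrides these defaults for that chunk.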
WARNING
Chunk names must be unique. An error occurs when you compile a document containing two or more chunks with the same name… Be careful when you copy/paste.
Previously we have seen how to insert a picture with Markdown. But what about graphics generated in the document with R?
Each graphic generated inside a chunk is automatically included in the final document, at the place where the chunk is (its rendering can be adjusted through chunk options).
```{r}
library(ggplot2)
ggplot(data.frame(x=1:10, y=1:10), aes(x=x, y=y)) + geom_line()
```
To control figures, some additional chunk options are available:
Option | Default | Description |
---|---|---|
fig.align | 'default' | How to align graphics in the final document. One of 'left', 'right', or 'center'. |
fig.cap | NULL | A character string to be used as a figure caption. |
fig.height, fig.width | 7 | The height and width to use in R for plots created by the chunk (in inches). |
out.height, out.width | NULL | The height and width to scale plots to in the final output. |
As for figures, it is also possible to automatically generate tables from `data.frame` or `matrix` R objects. For this you can use the `kable` function of the `knitr` package (installed along with the `rmarkdown` package).
INFO
A variety of other R packages can help you generate tables from your data. Another frequently used package is `xtable`; see https://bookdown.org/yihui/rmarkdown-cookbook/table-other.html for a more exhaustive list.
You can extend the possibilities offered by `kable` with the `kableExtra` package.
For instance:
```{r}
knitr::kable(head(mtcars[, 1:4]))
```
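The `kable` function also accepts a few arguments to adjust the result, such as `digits`, `col.names` and `caption`. The sketch below is one possible use; the column labels and the caption are invented for this example, and the code assumes the `knitr` package is installed.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  tab <- knitr::kable(
    head(mtcars[, 1:4], n = 3),           # only the first three rows
    digits = 1,                           # round numeric columns
    col.names = c("Miles/gallon", "Cylinders",
                  "Displacement", "Horsepower"),
    caption = "First rows of the mtcars dataset."
  )
  print(tab)
}
```

Inside a chunk, calling `knitr::kable(...)` directly is enough; the explicit `print()` is only there so that the sketch also displays something in an interactive session.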
WARNING
Don’t try to print your whole dataset in the document…
To compile an Rmarkdown document from an R session, use the `render` function of the `rmarkdown` package:
rmarkdown::render('my_document.Rmd')
or, if you are in RStudio, you can use the button just above the file.
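The `render` function also lets you choose the output format and the location of the result at compile time, through its `output_format`, `output_file` and `output_dir` arguments. The following is a minimal sketch: the document content and the file name `demo.html` are made up for the example, and rendering is only attempted if `rmarkdown` and pandoc are available.

```r
# A tiny document written to a temporary file (hypothetical content).
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: Demo",
  "---",
  "",
  "Two plus two is `r 2 + 2`."
), rmd)

if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # Choose the format and the destination explicitly instead of
  # relying on the header of the document.
  out <- rmarkdown::render(
    rmd,
    output_format = "html_document",
    output_file = "demo.html",
    output_dir = tempdir(),
    quiet = TRUE
  )
  file.exists(out)
}
```

`render` returns the path of the generated file, which is convenient when the compilation is part of a larger script.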
INFO
During compilation, `rmarkdown` uses the `knitr` package (this is why chunk options depend on it) to generate a Markdown file, and then uses the `pandoc` software to transform this Markdown document into PDF, HTML, etc. (Figure 6.1).
Figure 6.1: Pipeline carried out by `rmarkdown` when a document is compiled. From https://rmarkdown.rstudio.com/lesson-2.html.
`sessionInfo()` command, which gives you the version of all R packages used to generate it.
You can visit many websites or read books for inspiration. To learn more about `rmarkdown` and associated packages, you can have a look at:
There is also a very useful cheat sheet that summarizes all the functions:
EXERCISE
Write and generate a document compiling the two previous exercise sessions.