Advanced R Markdown: from templates to Pandoc
This article picks up where the basic R Markdown guide leaves off. You already know what a chunk is, how YAML metadata works, and how to knit an .Rmd into HTML or PDF. The next questions are about scale and control: how do you render dozens of reports from one template, cache long computations safely, customise the output beyond theme: cosmo, and reach into the Pandoc layer to reshape the document itself? That is the territory this guide covers.
Parameterised reports
The params: block in your YAML header turns an R Markdown file into a template. Each parameter gets a label, a default value, and an input widget that controls how RStudio’s “Knit with Parameters” dialog presents it.
---
title: "Country Report: `r params$country`"
output:
  html_document:
    toc: true
    toc_float: true
    theme: cosmo
    df_print: paged
params:
  country:
    label: "Country:"
    value: "Norway"
    input: select
    choices: [Norway, Sweden, Denmark, Finland]
  year:
    label: "Year:"
    value: 2024
    input: slider
    min: 2010
    max: 2024
    step: 1
---
Inside the document, every parameter is available as params$<name>. Inline code (`r params$country`) and chunk code read from the same list, so headers, captions, and dataset filters all reflect the chosen values.
The real payoff comes when you call rmarkdown::render() from a script, where you can loop over parameters and render many variants in a single run without ever opening the file. The function signature you need most often looks like this:
render(
  input,
  output_format = NULL,
  output_file = NULL,
  output_dir = NULL,
  params = NULL,
  ...
)
params accepts a named list matching the YAML block. The loop below renders a separate HTML file per country, writing each one into out/ with a date-stamped filename and a quiet log so the console output stays clean when many reports run back-to-back:
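library(rmarkdown)

# Render one HTML report per country into out/
for (country in c("Norway", "Sweden", "Denmark")) {
  rmarkdown::render(
    input = "report.Rmd",
    output_file = paste0("report-", country, "-", Sys.Date(), ".html"),
    output_dir = "out/",
    params = list(country = country, year = 2024),
    envir = new.env(),  # fresh environment so objects do not leak between runs
    quiet = TRUE        # keep the console clean across many renders
  )
}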
Two things trip people up. First, the input: widget only matters for the RStudio dialog. Programmatic render(..., params = list(...)) skips widget validation entirely, so coerce types yourself. Second, a YAML value: 2024 becomes a numeric; value: "2024" becomes a string. Mismatches break downstream code silently.
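One way to guard against such mismatches is to normalise the parameter list before it reaches render(). The sketch below is only an illustration: coerce_params() is a hypothetical helper, not part of rmarkdown, and it assumes the two parameters (country, year) from the YAML block above.

```r
# Hypothetical helper (not part of rmarkdown): normalise parameter types
# before handing them to rmarkdown::render(). The names `country` and `year`
# mirror the YAML block shown earlier.
coerce_params <- function(params) {
  year <- suppressWarnings(as.integer(params$year))
  if (length(year) != 1 || is.na(year)) {
    stop("`year` must be a single whole number, got: ", params$year)
  }
  list(country = as.character(params$country), year = year)
}

# A year that arrived as a string, for example from a CSV or a CLI argument
clean <- coerce_params(list(country = "Norway", year = "2024"))
str(clean)
```

The cleaned list can then be passed as params = clean in the render() call, so a bad value fails loudly before any report is built.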
Controlling knitr globally
The first chunk of almost every real R Markdown document is a setup block that sets defaults for the rest of the file. The chunk header is ```{r setup, include=FALSE} and the body looks like this:
# ```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = FALSE,
  message = FALSE,
  warning = FALSE,
  fig.width = 7,
  fig.height = 4,
  fig.align = "center",
  dpi = 300,
  out.width = "85%"
)
# ```
opts_chunk$set() applies to every chunk in the document. Per-chunk options still win, so a single chunk can override the global default when needed. For knitr-package-level options such as working directory, progress bars, and upload functions, use opts_knit$set():
knitr::opts_knit$set(
  root.dir = "project/",
  progress = FALSE,
  verbose = FALSE
)
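Custom hooks are where this gets interesting. A chunk hook runs twice, once before and once after the chunk, and receives a before flag, the chunk options, and the evaluation environment; any character string it returns is written into the output. The following wraps a chunk in a Pandoc fenced div, ready for bookdown-style callout styling: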
# ```{r hooks, include=FALSE}
knitr::knit_hooks$set(
  box = function(before, options, envir) {
    if (before) {
      "\n::: {.callout-note}\n"
    } else {
      "\n:::\n"
    }
  }
)
# ```
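Then any chunk with box = TRUE gets wrapped automatically. knitr runs a chunk hook whenever the matching option is non-NULL, so leave box unset (rather than setting box = FALSE) on chunks you do not want wrapped. The same mechanism can produce callout boxes, themed code blocks, or any other wrapper your output format can style.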
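To see what the hook contributes, you can call the same function by hand. This is a sketch outside knitr: box_hook is just a local copy of the hook function, and the options list is a stand-in for the real chunk options knitr would supply.

```r
# Local copy of the hook function, called by hand to show the two phases.
box_hook <- function(before, options, envir) {
  if (before) "\n::: {.callout-note}\n" else "\n:::\n"
}

# Stand-in for the chunk options knitr would pass in (assumed values)
opts <- list(label = "demo", box = TRUE)

# knitr calls the hook once before and once after the chunk;
# the chunk's own output sits between the two returned strings.
wrapped <- paste0(
  box_hook(TRUE, opts, globalenv()),
  "Chunk output goes here.",
  box_hook(FALSE, opts, globalenv())
)
cat(wrapped)
```

The opening and closing strings bracket the chunk output, which is what lets Pandoc treat the whole chunk as one fenced div.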
Caching done right
cache = TRUE stores the chunk’s result and skips re-evaluation when the source code has not changed. The two options that keep caching honest are dependson and autodep:
# ```{r raw, cache=TRUE}
raw_data <- read.csv("large.csv") # slow
# ```
# ```{r clean, cache=TRUE, dependson="raw"}
clean_data <- clean(raw_data) # depends on raw
# ```
dependson declares that the current chunk depends on another chunk by label. The cache is invalidated if either chunk’s source changes. autodep = TRUE is a heuristic that scans for object names. It is convenient but conservative: it over-invalidates more often than necessary.
The honest caveat is that cache invalidation is genuinely hard. A cached chunk re-uses old data even if a global object outside the chunk changes. dependson only handles chunk-to-chunk dependencies. If you have a heavy pipeline, the targets package gives you stronger guarantees than chunk caching.
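External files are a common case of this blind spot: editing large.csv does not change the chunk's source, so the cache stays valid. One commonly described workaround, as I understand it, is to add an extra chunk option whose value changes with the file (the R Markdown Cookbook uses the name cache.extra for this), so the chunk's cache key changes too. The sketch below shows a fingerprint function you might use as that value; file_fingerprint() is a hypothetical helper built on tools::md5sum().

```r
# Hypothetical helper: a fingerprint that changes whenever the file's
# contents change. In a chunk header you might write, for example:
#   {r raw, cache=TRUE, cache.extra=file_fingerprint("large.csv")}
file_fingerprint <- function(path) {
  unname(tools::md5sum(path))
}

# Demonstration on a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("x", "1"), tmp)
before <- file_fingerprint(tmp)

writeLines(c("x", "2"), tmp)   # simulate an upstream data update
after <- file_fingerprint(tmp)

changed <- before != after     # a changed fingerprint would invalidate the cache
print(changed)
```

Hashing the contents is slower than checking file.mtime() on very large files, but it ignores touches that do not change the data.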
Child documents and dynamic content
The child = chunk option statically embeds another .Rmd at knit time. For dynamic content, use knitr::knit_child() inside a loop. The child file looks like a normal Rmd but reads its variables from the envir argument that the caller passes in, so a single template can produce many variants.
child-region.Rmd:
## Region: `r region`
The data for `r region` is summarised below.
The main Rmd loops over the regions, renders the child for each, and concatenates the rendered fragments back into the parent document. With results = 'asis', the stitched output flows in as if it had been written there by hand, which is what makes per-region or per-customer reports practical from one template:
# ```{r region-loop, echo=FALSE, results='asis'}
regions <- c("North", "South", "East", "West")
res <- character()
for (r in regions) {
  # envir must be an environment, so wrap the list with list2env()
  res <- c(res, knitr::knit_child(
    "child-region.Rmd",
    envir = list2env(list(region = r)),
    quiet = TRUE
  ))
}
cat(res, sep = "\n")
# ```
knit_child() returns the rendered child as a string. The envir argument, which must be an environment rather than a plain list, scopes the variables visible to the child; pass quiet = TRUE to suppress progress output. Package vignettes use a closely related approach; see the vignette-writing tutorial for the package-author version of this idea.
For templated inline text, knitr::knit_expand() substitutes {{var}} placeholders with values passed as named arguments; whatever sits between the braces is evaluated as R code. It is lighter than knit_child() when you do not need a whole mini-document.
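A minimal sketch of knit_expand() with an inline template follows. The template text and the names region and n are made up for illustration; the text = argument is used so that no template file is needed.

```r
library(knitr)

# Each {{...}} placeholder is filled from the named arguments that follow.
line <- knit_expand(
  text = "Region {{region}} had {{n}} orders.",
  region = "North",
  n = 42
)
cat(line, "\n")
```

Inside a results = 'asis' chunk, the returned string can be emitted with cat() just like the output of knit_child().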
Customising output
The two workhorses are rmarkdown::html_document() and rmarkdown::pdf_document(). Useful options to know:
| Option | Output | Purpose |
|---|---|---|
| theme | HTML | Bootstrap/bslib theme name ("cosmo", "flatly", "cerulean") |
| toc, toc_float | HTML | Table of contents, sticky on scroll |
| code_folding | HTML | "hide", "show", or "none" |
| df_print | HTML | "paged", "kable", "tibble" |
| includes | Both | Inject raw HTML/LaTeX in in_header, before_body, after_body |
| css | HTML | Path to a custom stylesheet |
| pandoc_args | Both | Pass through to Pandoc (--lua-filter=, --variable=, etc.) |
| latex_engine | PDF | "pdflatex", "xelatex", "lualatex", "tectonic" |
| keep_tex | PDF | Keep the intermediate .tex for debugging |
The includes option is the right tool for adding a corporate header, a custom footer, or a preamble. The pandoc_args option is the door into the Pandoc layer.
Pandoc Lua filters
A Lua filter is a small Lua script that walks the Pandoc AST and modifies elements in place. R Markdown passes filters through pandoc_args, and you can list several filters in order if you need a chain of small transformations:
---
output:
  html_document:
    pandoc_args:
      - "--lua-filter=raise-header.lua"
      - "--toc-depth=2"
---
The Lua file itself is a tiny program against the AST. This filter visits every Header element, raises an error for level-1 headers (which cannot be promoted any further), and otherwise decrements the level by 1. The result is that a ## Section becomes # Section in the output:
function Header(el)
  if el.level <= 1 then
    error("I don't know how to raise the level of h1")
  end
  el.level = el.level - 1
  return el
end
The rmarkdown cookbook has a full chapter on Lua filters. The practical takeaway is that any transformation you would otherwise do with regex on the output, such as fixing heading levels, cross-referencing, or code-block styling, can be done upstream in the AST where it is safer and more accurate.
Multi-format reports
One .Rmd can declare multiple output formats. The output: field accepts a list, and each format gets its own configuration block. Pandoc runs once per format, and the result is written next to the source file by default. If you want a different output directory, pass output_dir to render():
---
output:
  html_document:
    toc: true
    theme: cosmo
  pdf_document:
    latex_engine: xelatex
    keep_tex: true
  word_document:
    reference_docx: template.docx
---
Render every declared format with a single call by passing the "all" shortcut to output_format. rmarkdown iterates over the formats in the YAML, applies each format’s options, runs Pandoc once per format, and writes the output with a matching extension next to the source file:
rmarkdown::render("report.Rmd", output_format = "all")
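output_format = "all" is a string shortcut for “every format declared in YAML.” If you only want some, pass a character vector of format names, such as c("html_document", "pdf_document"); a single format can also be given as an output format object, such as html_document().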
Interactive R Markdown
For documents that need to react to user input, set runtime: shiny:
---
output: html_document
runtime: shiny
---
The catch is that a runtime: shiny document needs a live R process behind it: run it locally with rmarkdown::run() or deploy it to a Shiny server. It will not work as a static HTML file. For dashboards with multiple panels, use the flexdashboard output format instead; it accepts the same runtime: shiny setting and provides a layout grid. htmlwidgets packages such as plotly, leaflet, and DT work in static HTML and are usually the right answer for self-contained reports.
Common gotchas
A few things to watch for:
- Underscores in chunk labels cause trouble in some output formats and downstream packages. Stick to alphanumerics and dashes.
- include = FALSE versus echo = FALSE: include = FALSE runs the chunk and suppresses both source and output; echo = FALSE shows output but hides source. To run a setup chunk silently, use include = FALSE.
- results = "asis" is required when you want R-generated markdown to render as markdown rather than be wrapped in a code block.
- PDF float placement can be controlled with fig.pos = "H" (from the float package) if LaTeX insists on moving your figure.
- self_contained = FALSE generates a folder of dependencies. Good for production sites with caching, bad for emailing a report.
- Lua filters require Pandoc 2.0 or later. The output_format() API has been stable since 2018.
Conclusion
The basic R Markdown workflow covers most needs. The advanced features kick in when the same template has to serve many users, when the document has to look exactly right, or when the report is part of a longer pipeline. Parameterised reports, hooks, caching, Lua filters, and the pandoc_args door are the tools that turn R Markdown from a notebook into a publishing system.
If you are hitting the limits of R Markdown itself, the natural next step is Quarto; it absorbs most of the patterns above and adds multi-language support. For a side-by-side comparison, see Quarto vs R Markdown. For a worked parameterised example in Quarto, the Quarto parameterised reports tutorial maps directly onto the patterns in this article.
See Also
- R Markdown guide. The basics this article builds on.
- R Markdown getting started. Tutorial series entry.
- Quarto guide. The next-generation successor to R Markdown.
- Quarto vs R Markdown. When to migrate.
- Writing R package vignettes. Vignettes are R Markdown under the hood.