
Reproducible Analysis Reports with Quarto

If you have ever struggled with keeping your analysis, code, and narrative in sync, Quarto is about to make your life much easier. It is a publishing system that lets you combine markdown, code, and output into beautiful documents that anyone can reproduce.

What is Quarto?

Quarto is an open-source scientific and technical publishing system. Think of it as the evolution of R Markdown—same idea (code plus narrative equals document), but with a cleaner design, better defaults, and support for Python, R, Julia, and Observable JavaScript all in one project.

The key advantage for R users is that Quarto handles code chunks the same way R Markdown did, but with improved rendering and a more consistent syntax. Your existing R knowledge transfers directly.

Prerequisites

Before you start, make sure you have:

  • R installed (version 4.0 or later)
  • RStudio (optional but recommended) or a code editor
  • Quarto CLI installed on your system

You can check if Quarto is installed by running:

quarto --version

If it is not installed, grab the installer from quarto.org.
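
The same check can be run from an R session. This is a minimal sketch using only base R; it assumes the Quarto CLI is on your PATH, which may not be the case if you rely on a copy bundled with your editor.

```r
# Look for the Quarto CLI on the PATH; Sys.which() returns "" when nothing is found
quarto_bin <- unname(Sys.which("quarto"))

if (nzchar(quarto_bin)) {
  # Ask the CLI for its version, exactly as `quarto --version` does in a terminal
  quarto_version <- system2(quarto_bin, "--version", stdout = TRUE)
  message("Quarto ", quarto_version, " found at ", quarto_bin)
} else {
  message("Quarto CLI not found on PATH; install it from quarto.org.")
}
```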

Setting up a Quarto document

Creating a new Quarto project is straightforward. In RStudio, go to File > New File > Quarto Document. You will see options to create an HTML, PDF, or Word document. Start with HTML—it is the easiest to preview and debug.

A basic Quarto document starts with a YAML header:

---
title: "My Analysis"
format: html
---

Below the YAML, you write regular markdown. Here is a minimal example that includes an R code chunk:

# This is a simple R code chunk
message("Hello from Quarto!")

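An R code chunk opens with a fence followed by {r} and closes with a matching fence. Lines that start with #| directly below the opening fence are chunk options: echo: true shows the code in the output, and label gives the chunk a name (fig-cap, by contrast, sets a figure caption).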
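
To see the YAML header, the prose, and a complete chunk together, the sketch below writes a small .qmd file from R and prints it. The file name, the label hello-chunk, and the line of prose are illustrative assumptions; the fence is built with strrep() only so that this example can itself be shown inside a code block.

```r
# Build the three-backtick chunk fence programmatically
fence <- strrep("`", 3)

qmd_lines <- c(
  "---",
  'title: "My Analysis"',
  "format: html",
  "---",
  "",
  "Some introductory prose in regular markdown.",
  "",
  paste0(fence, "{r}"),            # opening fence of the R chunk
  "#| label: hello-chunk",         # chunk name (illustrative)
  "#| echo: true",                 # show the code in the output
  "# This is a simple R code chunk",
  'message("Hello from Quarto!")',
  fence                            # closing fence
)

# Write the document to a temporary folder and print it back
qmd_path <- file.path(tempdir(), "my-analysis.qmd")
writeLines(qmd_lines, qmd_path)
cat(readLines(qmd_path), sep = "\n")
```

Opening the written file in RStudio, or passing its path to quarto render, should produce an HTML page with the message shown beneath the code.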

Including R code chunks

Code chunks are where the magic happens. You can execute R code directly in your document, and the results appear inline or as figures. Here is a more complete example with data manipulation:

library(dplyr)

# Create sample data
data <- tibble(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  score = c(85, 92, 78)
)

# Transform the data
result <- data %>%
  mutate(score_pct = score / 100)

print(result)

Quarto executes the code when you render the document and captures the output automatically. This means your document is always up to date with your latest analysis—no copy-pasting required.

Adding data visualizations

Visualizations go in code chunks too. Quarto automatically captures ggplot2 plots and includes them in your output:

library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Car Weight vs. Miles Per Gallon",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  )

Key chunk options here:

  • echo: false hides the R code (only the plot shows)
  • fig-cap adds a caption below the figure

Write each option on its own #| line at the top of the chunk. You can also control figure dimensions and alignment with fig-width, fig-height, and fig-align.
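
As a sketch of how these options look in practice, here is the body of a plotting chunk with its options written as #| comment lines. The label, caption text, and dimensions are illustrative choices, not values from this guide. Because the option lines are ordinary R comments, the block also runs as plain R.

```r
#| label: fig-weight-mpg
#| echo: false
#| fig-cap: "Car weight versus fuel economy in the mtcars data."
#| fig-width: 6
#| fig-height: 4
#| fig-align: center

library(ggplot2)

# Same scatter plot as above, stored in an object and then printed
weight_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  )

weight_plot
```

The plot title is dropped here on the assumption that the caption now carries that information; keep both if you prefer.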

Rendering to HTML or PDF

When you are ready to publish, render your document with the Quarto CLI:

quarto render my-document.qmd

This creates my-document.html (or .pdf if you changed the format). For PDF output, you will need a LaTeX distribution installed—Quarto can use TinyTeX, which is easy to set up:

quarto install tinytex

RStudio users can also click the Render button in the toolbar for a one-click preview.

Best practices for reproducibility

A reproducible report is one that anyone (including future you) can re-run from scratch. Here is how to make that happen:

  1. Seed your random numbers. If your analysis involves randomness, set a seed:

set.seed(123)  # Makes results reproducible

  2. Use relative paths. Keep your data files in the same project folder so the document works anywhere.

  3. Document dependencies. Include a chunk that prints your R environment:

sessionInfo()

  4. Clean up intermediate files. Do not leave temporary objects hanging around between chunks unless necessary.

  5. Test your document. Render it on a fresh machine or in a container to catch missing dependencies.
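
The first three practices can be sketched in a few lines of base R. The data file name below is hypothetical and the file is never read; it only illustrates building a path relative to the project folder.

```r
# Seeding: the same seed reproduces the same random draws
set.seed(123)
first_draw <- rnorm(5)
set.seed(123)
second_draw <- rnorm(5)
identical(first_draw, second_draw)  # TRUE

# Relative paths: build the path from the project folder, not from C:/ or /home/...
data_path <- file.path("data", "survey.csv")  # hypothetical file name
data_path

# Dependencies: record the R version and loaded packages at the end of the report
sessionInfo()
```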

Report structure and reproducibility

A well-structured Quarto analysis report follows a standard pattern: (1) a data import and cleaning section, (2) an exploratory analysis section with summary statistics and plots, (3) the modeling or analysis section, and (4) a conclusions section. This structure maps to how analysts actually work and makes the report easy to navigate.

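Setting freeze: auto under the execute key, in _quarto.yml or in a document's own YAML, stores computation results and re-executes a document's code only when its source file changes. Freezing works at the level of whole documents during project renders, so it keeps a multi-document project fast to rebuild when only some files change. cache: true at the chunk level provides finer-grained caching: when only the prose changes, unchanged chunks are loaded from the cache rather than re-run, which is important for iterative writing.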

For large analysis projects, a targets pipeline can generate the key R objects (cleaned data, fitted models, summary tables), and the Quarto document can tar_read() these objects rather than executing long computations inline. This separates the compute pipeline from the presentation layer and avoids re-running hours of computation when the report prose changes.
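
A minimal sketch of that split is shown below, assuming the targets package is installed. The target names (clean_data, fitted_model, coef_table) and the toy model are illustrative stand-ins for a real pipeline.

```r
# _targets.R (hypothetical pipeline; target names are illustrative)
library(targets)

pipeline <- list(
  # Stand-in for a data cleaning step
  tar_target(clean_data, subset(mtcars, !is.na(mpg))),
  # Stand-in for a long-running model fit
  tar_target(fitted_model, lm(mpg ~ wt, data = clean_data)),
  # Summary table the report will display
  tar_target(coef_table, as.data.frame(coef(summary(fitted_model))))
)

# The script must end with the list of targets
pipeline

# After targets::tar_make() has built the pipeline, a chunk in the report
# reads the stored object instead of recomputing it:
#   coef_table <- targets::tar_read(coef_table)
#   knitr::kable(coef_table)
```

With this layout, editing the report prose does not touch the pipeline, and editing a target causes only that target and the ones downstream of it to be rebuilt.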

Parameterized reports

The params key in the YAML front matter declares report parameters that can be set at render time. A report with params: {year: 2024, region: "North"} uses those values as defaults and can be re-rendered with other values to produce a customized report for each combination. quarto::quarto_render("report.qmd", execute_params = list(year = 2025, region = "South")) renders with specific parameter values. Batch rendering of multiple parameter combinations can be automated with purrr::walk() over a parameter grid.
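
One way to sketch the batch step: build the grid with expand.grid(), turn each row into a set of quarto_render() arguments, and walk over them. The output file naming scheme is an assumption made for illustration, and the render loop only runs when report.qmd and the quarto and purrr packages are available. Inside the report, the knitr engine exposes the values as params$year and params$region.

```r
# All combinations of the two parameters
param_grid <- expand.grid(
  year = c(2024, 2025),
  region = c("North", "South"),
  stringsAsFactors = FALSE
)

# One list of quarto_render() arguments per row of the grid
render_jobs <- lapply(seq_len(nrow(param_grid)), function(i) {
  list(
    input = "report.qmd",
    # Hypothetical naming scheme, e.g. report-2024-north.html
    output_file = sprintf(
      "report-%d-%s.html",
      param_grid$year[i],
      tolower(param_grid$region[i])
    ),
    execute_params = list(
      year = param_grid$year[i],
      region = param_grid$region[i]
    )
  )
})

# Render each combination, but only if the report and packages are present
can_render <- file.exists("report.qmd") &&
  requireNamespace("quarto", quietly = TRUE) &&
  requireNamespace("purrr", quietly = TRUE)

if (can_render) {
  purrr::walk(render_jobs, function(job) do.call(quarto::quarto_render, job))
}
```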

For collaborative reports, keeping Quarto documents in a shared Git repository gives version-controlled collaboration. With freeze: auto set in _quarto.yml and the resulting _freeze directory committed to the repository, collaborators who do not have all the required R packages installed can still build the report from the stored computation results.

Analysis reports vs notebooks

An analysis report is a finished document: it has a narrative structure, hides implementation details, and presents conclusions for a specific audience. A notebook is a working document: it shows all the code, all the intermediate outputs, and the exploratory path. Quarto supports both modes, but they require different authoring choices.

For analysis reports, set echo: false globally to hide code from readers who care about conclusions, not implementation. Use inline code to embed computed values directly in prose; this keeps numbers in the text synchronized with the code that computes them. Structure the document around the analytical questions you are answering, not around the sequence of operations you performed.
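
As a small illustration of inline code, the sketch below computes two values in a chunk and shows, in a comment, how a sentence in the prose could reference them. The variable names and the example sentence are illustrative, and the inline syntax shown assumes the knitr engine.

```r
# Compute the values once, in a chunk
n_cars <- nrow(mtcars)
mean_mpg <- round(mean(mtcars$mpg), 1)

# Then reference them in the markdown prose with inline code, for example:
#   The `r n_cars` cars in the sample average `r mean_mpg` miles per gallon.
```

If the data change, the sentence updates on the next render without anyone editing the numbers by hand.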

Reproducibility and caching

Analysis reports should be fully reproducible: running the source file from scratch produces the same document. This requires that all data reading, all computations, and all figure generation happen inside the document rather than in separate scripts whose outputs are loaded. When a report loads pre-computed results rather than computing them, the connection between data and conclusions is broken, and the report can silently become stale when the underlying data changes.

Setting execute: cache: true in the YAML header stores the output of code chunks. Chunks that have not changed are not re-run on the next render, which speeds up iteration considerably for documents with slow computations. The cache is keyed on the chunk code, so changing any part of a chunk invalidates its cache and forces re-execution. A change in an upstream chunk does not automatically invalidate the chunks that use its results: when chunk B uses the output of chunk A, declare the dependency explicitly (with the knitr engine, through the dependson chunk option, or autodep to let knitr infer it) so that cache invalidation propagates correctly.
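
Caching can also be switched on for a single expensive chunk rather than for the whole document. The sketch below shows the body of such a chunk, with a small bootstrap standing in for a slow computation; the label and the computation are illustrative.

```r
#| label: bootstrap-mean
#| cache: true

# Stand-in for a slow computation: bootstrap the mean of mtcars$mpg
set.seed(123)
boot_means <- replicate(2000, mean(sample(mtcars$mpg, replace = TRUE)))

# 95% percentile interval from the bootstrap distribution
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
boot_ci
```

On the first render the chunk runs in full; on later renders it should be loaded from the cache until its code changes.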

Communicating uncertainty

Good analysis reports communicate uncertainty, not just point estimates. Include confidence intervals alongside effect sizes. When showing model predictions, show prediction intervals in addition to fitted values. Use language that reflects the strength of the evidence: “the data are consistent with” rather than “the data prove.” Quarto makes it easy to include formatted uncertainty quantifications inline: the same computed value that enters the table can enter the prose.
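
A minimal base R sketch of these ideas, using a simple linear model on mtcars as a stand-in for a real analysis. The model, the new weights, and the sentence format are illustrative assumptions.

```r
# Toy model: fuel economy as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)

# Effect size with its 95% confidence interval
effect_ci <- confint(fit, "wt", level = 0.95)

# Predictions for two hypothetical cars, with both kinds of interval
new_cars <- data.frame(wt = c(2.5, 3.5))
conf <- predict(fit, newdata = new_cars, interval = "confidence", level = 0.95)
pred <- predict(fit, newdata = new_cars, interval = "prediction", level = 0.95)

# One formatted string that can feed both a table cell and inline prose
slope_text <- sprintf(
  "%.2f (95%% CI %.2f to %.2f)",
  coef(fit)[["wt"]], effect_ci[1, 1], effect_ci[1, 2]
)
slope_text
```

The prediction interval is wider than the confidence interval at the same weight because it covers the scatter of individual cars, not only the uncertainty in the fitted line.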

Visualizations of uncertainty are often more effective than numerical intervals. Error bars, shaded bands on line charts, and violin plots all communicate distributional information that a single number cannot. Choose the visualization that matches the claim being made: error bars for a comparison, a prediction interval band for a fitted curve, a distribution for a Bayesian posterior.
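
As one example, a prediction interval band for a fitted line can be sketched with ggplot2 as follows. The model and the grid of weights are illustrative; the band is computed with predict() and drawn with geom_ribbon().

```r
library(ggplot2)

# Toy model and an evenly spaced grid of weights to predict over
fit <- lm(mpg ~ wt, data = mtcars)
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))

# predict() returns columns fit, lwr, upr; bind them to the grid
band <- cbind(
  grid,
  predict(fit, newdata = grid, interval = "prediction", level = 0.95)
)

band_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Shaded 95% prediction band
  geom_ribbon(
    data = band,
    aes(x = wt, ymin = lwr, ymax = upr),
    inherit.aes = FALSE,
    alpha = 0.2
  ) +
  # Fitted line on top of the band
  geom_line(data = band, aes(y = fit)) +
  # Observed data
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per Gallon")

band_plot
```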

Conclusion

Quarto gives you a clean, powerful way to combine analysis and narrative into shareable documents. Start with HTML to iterate quickly, then switch to PDF for final reports. The key is keeping your code in the document from day one—it is the easiest way to ensure your results are always reproducible.

The workflow takes a bit of adjustment if you are used to running R scripts separately, but the payoff is worth it. Your colleagues will thank you when they can reproduce your analysis with a single command.

See also

  • quarto-getting-started: Getting Started with Quarto tutorial
  • rmarkdown-guide: R Markdown: Reproducible Reports guide

quarto::quarto_render("report.qmd", output_format = "pdf") renders from R code, which allows integration into pipelines and automation scripts.