Benchmarking R Code

· 5 min read · Updated March 18, 2026 · intermediate
r performance benchmarking optimization microbenchmark

Benchmarking is essential for identifying performance bottlenecks and comparing alternative implementations in R. While you could use Sys.time() for rough timing, the microbenchmark package provides nanosecond-precision measurements that capture meaningful differences between code variants.

Installing microbenchmark

The microbenchmark package is available on CRAN and installs like any other R package:

install.packages("microbenchmark")
library(microbenchmark)

How microbenchmark Works

The microbenchmark() function runs your R expressions multiple times and records how long each execution takes. By default, it runs each expression 100 times, which gives you statistically meaningful results without taking excessive time.

Here’s the basic syntax:

microbenchmark(
  expression_name = some_r_code(),
  another_expression = alternative_code(),
  times = 100L  # number of iterations
)

The function returns a microbenchmark object that you can print directly or plot for visual comparison.

Your First Benchmark: Custom vs Built-in median()

Let’s start with a practical example. Imagine you’ve written a custom median function and want to compare it to R’s built-in median():

custom_median <- function(x) {
  sorted_x <- sort(x)
  n <- length(sorted_x)
  
  if (n %% 2 == 1) {
    return(sorted_x[(n + 1) / 2])
  } else {
    return((sorted_x[n / 2] + sorted_x[n / 2 + 1]) / 2)
  }
}

# Create test data
set.seed(42)
random_ints <- sample(1:1000, 500, replace = TRUE)

# Verify both produce the same result
all.equal(custom_median(random_ints), median(random_ints))
# [1] TRUE

# Run the benchmark
bench_median <- microbenchmark(
  custom_median = custom_median(random_ints),
  built_in_median = median(random_ints)
)

print(bench_median)

Typical output looks like:

Unit: microseconds
            expr     min      lq    mean  median      uq     max neval
   custom_median 150.212 165.103 180.456 172.345 188.234 350.123   100
 built_in_median  35.421  38.567  42.891  40.123  44.567 120.345   100

The built-in median() is consistently faster, about 4-5x in this case. This makes sense: the custom version fully sorts the vector with sort() (an O(n log n) operation), while R's median() only partially sorts it (via sort(x, partial = ...)) and delegates the heavy lifting to compiled C code.

Visualizing Results

The microbenchmark package integrates with ggplot2 through autoplot(), making it easy to visualize performance differences:

library(ggplot2)
autoplot(bench_median)

This creates a violin plot showing the distribution of timings across all iterations. You’ll typically see that built-in functions have tighter distributions and lower medians than custom implementations.
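If ggplot2 isn't available, the package also provides a base-graphics boxplot() method. A minimal, self-contained sketch (recreating the earlier benchmark):

```r
library(microbenchmark)

custom_median <- function(x) {
  s <- sort(x)
  n <- length(s)
  if (n %% 2 == 1) s[(n + 1) / 2] else (s[n / 2] + s[n / 2 + 1]) / 2
}

set.seed(42)
random_ints <- sample(1:1000, 500, replace = TRUE)

bench_median <- microbenchmark(
  custom_median   = custom_median(random_ints),
  built_in_median = median(random_ints)
)

# Base-graphics alternative to autoplot(); one box per expression
boxplot(bench_median)
```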

Real-World Example: Reading CSV Files

Benchmarking becomes critical when comparing packages that solve the same problem differently. A common scenario is reading CSV files—should you use base R’s read.csv() or data.table::fread()?

library(data.table)
library(microbenchmark)

# Benchmark CSV reading
bench_read <- microbenchmark(
  read_base = read.csv("your_data.csv"),
  read_data_table = fread("your_data.csv"),
  times = 50L
)

print(bench_read)

In benchmarks with millions of rows, fread() is typically 5-10x faster than read.csv(). The difference comes from fread() using multi-threaded file reading and intelligent parsing, while read.csv() is single-threaded and more general-purpose.
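To run this comparison without an existing file, you can write a throwaway CSV first. A sketch; the column names and row count here are arbitrary:

```r
library(data.table)
library(microbenchmark)

# Generate a throwaway CSV so the benchmark is self-contained
tmp <- tempfile(fileext = ".csv")
n <- 100000L
write.csv(
  data.frame(id = seq_len(n), value = rnorm(n), group = sample(letters, n, TRUE)),
  tmp, row.names = FALSE
)

bench_read <- microbenchmark(
  read_base       = read.csv(tmp),
  read_data_table = fread(tmp),
  times = 10L
)
print(bench_read)
unlink(tmp)  # clean up the temporary file
```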

Real-World Example: Data Aggregation

Another common bottleneck is data aggregation. Let’s compare dplyr and data.table for grouping and summarizing:

library(dplyr)
library(data.table)

# Assume data is already loaded as data_dt (data.table) and data_dplyr (data.frame)

agg_dplyr <- function(data) {
  data %>%
    group_by(category) %>%
    summarise(
      count = n(),
      mean_value = mean(value, na.rm = TRUE),
      sum_value = sum(value, na.rm = TRUE)
    )
}

agg_datatable <- function(data) {
  data[, .(
    count = .N,
    mean_value = mean(value, na.rm = TRUE),
    sum_value = sum(value, na.rm = TRUE)
  ), by = category]
}

bench_agg <- microbenchmark(
  dplyr_approach = agg_dplyr(data_dplyr),
  datatable_approach = agg_datatable(data_dt),
  times = 10L
)

print(bench_agg)

For large datasets (millions of rows), data.table is often 10-100x faster than dplyr for grouped aggregations. However, for small datasets (thousands of rows), the difference may be negligible.
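To make the aggregation benchmark self-contained, you can generate synthetic data_dplyr and data_dt with the assumed category and value columns. A sketch:

```r
library(data.table)

# Synthetic data matching the column names the benchmark assumes
set.seed(1)
n <- 1e6
data_dplyr <- data.frame(
  category = sample(LETTERS[1:10], n, replace = TRUE),
  value    = rnorm(n)
)
data_dt <- as.data.table(data_dplyr)
```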

Interpreting Benchmark Results

When reading microbenchmark output, focus on the median time rather than the mean. The median is more robust to outliers caused by system interrupts or garbage collection.

Key metrics:

  • min: Fastest execution time (best case)
  • median: Typical execution time (most reliable)
  • mean: Average (can be skewed by outliers)
  • max: Slowest execution (often system-related)

For decision-making, compare medians. If one approach is consistently 2x faster across multiple runs, that’s a real performance difference worth exploiting.
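These statistics can be pulled out programmatically: summary() on a microbenchmark object returns a data frame with one row per expression. A sketch:

```r
library(microbenchmark)

bench <- microbenchmark(
  vectorized = (1:1000)^2,
  sapply_way = sapply(1:1000, function(x) x^2),
  times = 100L
)

stats <- summary(bench)       # data frame: expr, min, lq, mean, median, uq, max, neval
stats[, c("expr", "median")]  # the medians are what you should compare

# Median-to-median ratio between the slowest and fastest approach
max(stats$median) / min(stats$median)
```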

Best Practices for Accurate Benchmarks

  1. Warm-up runs: microbenchmark performs warm-up iterations before timing begins (2 by default, via control = list(warmup = 2)) to bring the CPU out of idle and power-saving states. Don’t disable this unless you’re deliberately measuring cold-start performance.

  2. Control the environment: Close other applications and avoid system load. Background processes introduce variability.

  3. Use realistic data: Test with data sizes similar to production. Benchmarks on tiny vectors don’t reflect real-world performance.

  4. Increase iterations for stable results: For quick estimates, 10-100 iterations is fine. For publication or critical decisions, run 1000+ iterations.

  5. Repeat the benchmark: Run your benchmark multiple times to ensure consistency. If results vary wildly, investigate the cause.
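Several of these knobs are exposed directly through the times and control arguments. A sketch of a more careful run:

```r
library(microbenchmark)

x <- rnorm(1e4)

bench_stable <- microbenchmark(
  sum_loop    = { s <- 0; for (v in x) s <- s + v; s },
  sum_builtin = sum(x),
  times   = 1000L,                   # more iterations -> stabler medians
  control = list(warmup = 10,        # extra warm-up evaluations before timing
                 order  = "random")  # interleave expressions (the default)
)
print(bench_stable)
```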

Common Pitfalls

Comparing apples to oranges: Make sure your compared functions produce identical results. Use all.equal() or identical() to verify before benchmarking.

Benchmarking the wrong thing: If you’re measuring IO, include file loading in the benchmark. If you’re measuring computation, use pre-loaded data.

Ignoring memory allocation: Creating large objects repeatedly adds overhead that may dominate your timing. Consider whether your benchmark reflects realistic usage patterns.
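The IO-versus-computation distinction can be made concrete. A sketch (the file is a temporary placeholder):

```r
library(microbenchmark)

tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(value = rnorm(1e4)), tmp, row.names = FALSE)

# Measuring computation only: load the data ONCE, outside the benchmark
dat <- read.csv(tmp)
bench_compute <- microbenchmark(mean_only = mean(dat$value), times = 100L)

# Measuring IO + computation: the read happens inside every iteration
bench_io <- microbenchmark(read_and_mean = mean(read.csv(tmp)$value), times = 20L)

print(bench_compute)
print(bench_io)
unlink(tmp)
```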

Comparing Multiple Approaches

You can compare as many expressions as needed:

bench_compare <- microbenchmark(
  base_sapply = sapply(1:1000, function(x) x^2),
  base_vapply = vapply(1:1000, function(x) x^2, numeric(1)),
  purrr_map = purrr::map_dbl(1:1000, ~ .x^2),
  times = 1000L
)

print(bench_compare)

This pattern is useful for comparing multiple packages or approaches to the same problem.

When to Use microbenchmark

microbenchmark is ideal for:

  • Comparing two or more functions that do the same thing
  • Measuring the performance impact of code changes
  • Deciding between base R, tidyverse, or data.table approaches
  • Identifying which parts of your code are slowest

For profiling specific functions or understanding where time is spent in complex code, use R’s built-in Rprof() instead.
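A minimal Rprof() sketch (base R; the output file name is arbitrary, and the work being profiled is a stand-in):

```r
# Base R's sampling profiler writes stack samples to a file
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)

result <- replicate(200, sort(rnorm(1e5)))  # the work being profiled

Rprof(NULL)                       # stop profiling
summaryRprof(prof_file)$by.self   # self-time per function, heaviest first
unlink(prof_file)
```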

Conclusion

The microbenchmark package gives you accurate, repeatable performance measurements for R code. Start with simple comparisons to understand the performance characteristics of different approaches, then use those insights to optimize the parts of your code that matter most.

For most workflows, focus on I/O operations (reading/writing files) first—these typically offer the biggest performance gains. Then optimize in-memory computations where needed.
