Benchmarking R Code
Benchmarking is essential for identifying performance bottlenecks and comparing alternative implementations in R. While you could use Sys.time() for rough timing, the microbenchmark package provides nanosecond-precision measurements that capture meaningful differences between code variants.
Installing microbenchmark
The microbenchmark package is available on CRAN and installs like any other R package:
install.packages("microbenchmark")
library(microbenchmark)
How microbenchmark Works
The microbenchmark() function runs your R expressions multiple times and records how long each execution takes. By default, it runs each expression 100 times, which gives you statistically meaningful results without taking excessive time.
Here’s the basic syntax:
microbenchmark(
  expression_name = some_r_code(),
  another_expression = alternative_code(),
  times = 100L  # number of iterations
)
The function returns a microbenchmark object that you can print directly or plot for visual comparison.
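As a minimal concrete example, here is a runnable sketch timing one trivial expression (the exact numbers will differ on every machine):

```r
library(microbenchmark)

# Time a single trivial expression 100 times
mb <- microbenchmark(square = (1:100)^2, times = 100L)

print(mb)    # summary table: min, lq, mean, median, uq, max
summary(mb)  # the same statistics, returned as a data frame
```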
Your First Benchmark: Custom vs Built-in median()
Let’s start with a practical example. Imagine you’ve written a custom median function and want to compare it to R’s built-in median():
custom_median <- function(x) {
  sorted_x <- sort(x)
  n <- length(sorted_x)
  if (n %% 2 == 1) {
    return(sorted_x[(n + 1) / 2])
  } else {
    return((sorted_x[n / 2] + sorted_x[n / 2 + 1]) / 2)
  }
}
# Create test data
set.seed(42)
random_ints <- sample(1:1000, 500, replace = TRUE)
# Verify both produce the same result
all.equal(custom_median(random_ints), median(random_ints))
# [1] TRUE
# Run the benchmark
bench_median <- microbenchmark(
  custom_median = custom_median(random_ints),
  built_in_median = median(random_ints)
)
print(bench_median)
Typical output looks like:
Unit: microseconds
            expr     min      lq    mean  median      uq     max
   custom_median 150.212 165.103 180.456 172.345 188.234 350.123
 built_in_median  35.421  38.567  42.891  40.123  44.567 120.345
The built-in median() is consistently faster, about 4x in this case. This makes sense: the custom version fully sorts the vector (an O(n log n) operation), while R's median() only partially sorts it via sort(x, partial = ...) to locate the middle element(s), backed by optimized C code.
Visualizing Results
The microbenchmark package integrates with ggplot2 through autoplot(), making it easy to visualize performance differences:
library(ggplot2)
autoplot(bench_median)
This creates a violin plot showing the distribution of timings across all iterations. You’ll typically see that built-in functions have tighter distributions and lower medians than custom implementations.
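If ggplot2 is not installed, the package also ships a base-graphics boxplot() method that plots the same distributions without extra dependencies (this sketch assumes the bench_median object from the earlier example):

```r
library(microbenchmark)

# Re-create a benchmark object to plot (stand-in for bench_median)
bench_median <- microbenchmark(
  built_in_median = median(sample(1:1000, 500, replace = TRUE)),
  times = 50L
)

# Base-graphics alternative to autoplot(); timings are log-scaled by default
boxplot(bench_median)

# Or on a linear scale, in microseconds
boxplot(bench_median, unit = "us", log = FALSE)
```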
Real-World Example: Reading CSV Files
Benchmarking becomes critical when comparing packages that solve the same problem differently. A common scenario is reading CSV files—should you use base R’s read.csv() or data.table::fread()?
library(data.table)
library(microbenchmark)
# Benchmark CSV reading
bench_read <- microbenchmark(
  read_base = read.csv("your_data.csv"),
  read_data_table = fread("your_data.csv"),
  times = 50L
)
print(bench_read)
In benchmarks with millions of rows, fread() is typically 5-10x faster than read.csv(). The difference comes from fread() using multi-threaded file reading and intelligent parsing, while read.csv() is single-threaded and more general-purpose.
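The snippet above assumes a file named your_data.csv exists. For a self-contained run, one approach is to write a synthetic CSV to a temporary file first (illustrative size; the fread() advantage grows with file size):

```r
library(data.table)
library(microbenchmark)

# Generate a synthetic CSV so the benchmark is reproducible
tmp <- tempfile(fileext = ".csv")
set.seed(42)
fwrite(data.table(id = 1:1e5, value = rnorm(1e5)), tmp)

bench_read_tmp <- microbenchmark(
  read_base       = read.csv(tmp),
  read_data_table = fread(tmp),
  times = 10L
)
print(bench_read_tmp)

unlink(tmp)  # clean up the temporary file
```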
Real-World Example: Data Aggregation
Another common bottleneck is data aggregation. Let’s compare dplyr and data.table for grouping and summarizing:
library(dplyr)
library(data.table)
# Example data so the comparison is self-contained (illustrative size)
set.seed(1)
data_dplyr <- data.frame(
  category = sample(LETTERS[1:5], 1e5, replace = TRUE),
  value = rnorm(1e5)
)
data_dt <- as.data.table(data_dplyr)
agg_dplyr <- function(data) {
  data %>%
    group_by(category) %>%
    summarise(
      count = n(),
      mean_value = mean(value, na.rm = TRUE),
      sum_value = sum(value, na.rm = TRUE)
    )
}
agg_datatable <- function(data) {
  data[, .(
    count = .N,
    mean_value = mean(value, na.rm = TRUE),
    sum_value = sum(value, na.rm = TRUE)
  ), by = category]
}
bench_agg <- microbenchmark(
  dplyr_approach = agg_dplyr(data_dplyr),
  datatable_approach = agg_datatable(data_dt),
  times = 10L
)
print(bench_agg)
For large datasets (millions of rows), data.table is often 10-100x faster than dplyr for grouped aggregations. However, for small datasets (thousands of rows), the difference may be negligible.
Interpreting Benchmark Results
When reading microbenchmark output, focus on the median time rather than the mean. The median is more robust to outliers caused by system interrupts or garbage collection.
Key metrics:
- min: Fastest execution time (best case)
- median: Typical execution time (most reliable)
- mean: Average (can be skewed by outliers)
- max: Slowest execution (often system-related)
For decision-making, compare medians. If one approach is consistently 2x faster across multiple runs, that’s a real performance difference worth exploiting.
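To compare medians programmatically rather than by eye, summary() returns the printed statistics as a data frame. A sketch (using a fresh benchmark object for self-containment):

```r
library(microbenchmark)

mb <- microbenchmark(
  full_sort    = sort(runif(1000)),
  partial_sort = sort(runif(1000), partial = 500),
  times = 100L
)

# summary() converts raw timings into the statistics table, here in microseconds
stats <- summary(mb, unit = "us")

# Ratio of medians: how many times faster the fastest approach is
ratio <- max(stats$median) / min(stats$median)
cat("Slowest / fastest median ratio:", round(ratio, 1), "x\n")
```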
Best Practices for Accurate Benchmarks
- Warm-up runs: microbenchmark performs 2 warm-up iterations by default (the warmup entry of its control argument) to spin up the CPU from idle states. Don't disable this unless you're measuring cold-start performance.
- Control the environment: Close other applications and avoid system load. Background processes introduce variability.
- Use realistic data: Test with data sizes similar to production. Benchmarks on tiny vectors don't reflect real-world performance.
- Increase iterations for stable results: For quick estimates, 10-100 iterations is fine. For publication or critical decisions, run 1000+ iterations.
- Repeat the benchmark: Run your benchmark multiple times to ensure consistency. If results vary wildly, investigate the cause.
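The warm-up and iteration settings above map onto microbenchmark()'s arguments; a sketch (the control list is documented in ?microbenchmark):

```r
library(microbenchmark)

mb <- microbenchmark(
  sqrt_builtin = sqrt(1:1000),
  sqrt_power   = (1:1000)^0.5,
  times   = 1000L,                   # more iterations -> more stable medians
  control = list(warmup = 10,        # extra warm-up runs before timing starts
                 order  = "random")  # interleave expressions (the default)
)
print(mb)
```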
Common Pitfalls
Comparing apples to oranges: Make sure your compared functions produce identical results. Use all.equal() or identical() to verify before benchmarking.
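microbenchmark can perform this verification itself via its check argument, which stops with an error if the expressions return different results:

```r
library(microbenchmark)

# check = "equal" compares results with all.equal(); "identical" is stricter
mb <- microbenchmark(
  middle_by_sort = sort(c(3, 1, 2))[2],
  built_in       = median(c(3, 1, 2)),
  check = "equal",
  times = 100L
)
```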
Benchmarking the wrong thing: If you're measuring I/O, include file reading in the benchmark. If you're measuring computation, use pre-loaded data.
Ignoring memory allocation: Creating large objects repeatedly adds overhead that may dominate your timing. Consider whether your benchmark reflects realistic usage patterns.
Comparing Multiple Approaches
You can compare as many expressions as needed:
bench_compare <- microbenchmark(
  base_sapply = sapply(1:1000, function(x) x^2),
  base_vapply = vapply(1:1000, function(x) x^2, numeric(1)),
  purrr_map = purrr::map_dbl(1:1000, ~ .x^2),
  times = 1000L
)
print(bench_compare)
This pattern is useful for comparing multiple packages or approaches to the same problem.
When to Use microbenchmark
microbenchmark is ideal for:
- Comparing two or more functions that do the same thing
- Measuring the performance impact of code changes
- Deciding between base R, tidyverse, or data.table approaches
- Identifying which parts of your code are slowest
For profiling specific functions or understanding where time is spent in complex code, use R’s built-in Rprof() instead.
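A minimal Rprof() sketch (writes a profile to a temporary file, then summarizes where time was spent; the workload here is an arbitrary example):

```r
# Profile a block of code with R's built-in sampling profiler
prof_file <- tempfile()
Rprof(prof_file)

x <- replicate(200, sort(runif(1e5)))  # example workload to profile

Rprof(NULL)                       # stop profiling
summaryRprof(prof_file)$by.self   # time attributed to each function itself
```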
Conclusion
The microbenchmark package gives you accurate, repeatable performance measurements for R code. Start with simple comparisons to understand the performance characteristics of different approaches, then use those insights to optimize the parts of your code that matter most.
For most workflows, focus on I/O operations (reading/writing files) first—these typically offer the biggest performance gains. Then optimize in-memory computations where needed.
See Also
- R Documentation: microbenchmark — Official package documentation
- Advanced R: Performance — Deep dive into R performance characteristics
- data.table vs dplyr — Comprehensive data.table resources