Functional Iteration with purrr
If you’ve ever written a for loop in R to process each element of a vector, you’ve already done by hand what functional iteration automates. The purrr package makes this pattern cleaner, more consistent, and easier to compose: it replaces verbose loops with concise functions that iterate for you.
This tutorial covers the core purrr functions you’ll use daily: map() for single vectors, map2() for pairs, and pmap() for multiple arguments. You’ll also learn how to handle errors gracefully so a single failure doesn’t crash your entire pipeline.
Why purrr?
Base R has several iteration functions, such as lapply(), sapply(), and mapply(), but their interfaces are inconsistent and their output types can be hard to predict. purrr provides a unified system where:
- Every map variant returns a predictable type
- Functions are composable with the pipe operator
- Error handling is built-in
The core idea is functional programming: instead of writing loops, you pass a function to purrr, and it applies that function to each element.
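To make the point about predictable types concrete, here is a small sketch comparing sapply() with map_dbl() on an empty input. The doubling function is only an illustration; the contrast in return types is what matters.

```r
library(purrr)

double_it <- function(i) i * 2

# sapply() chooses its return type from the data it sees
class(sapply(1:3, double_it))
# [1] "numeric"
class(sapply(integer(0), double_it))
# [1] "list"

# map_dbl() returns a double vector even when the input is empty
class(map_dbl(1:3, double_it))
# [1] "numeric"
class(map_dbl(integer(0), double_it))
# [1] "numeric"
```

Code downstream of map_dbl() can rely on getting a numeric vector, whereas code downstream of sapply() has to guard against the empty-input case.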
The map() Family
The map() function applies a function to every element of a vector or list. It always returns a list.
map() for Single Vectors
library(purrr)
# Square each number
numbers <- c(1, 2, 3, 4, 5)
squared <- map(numbers, ~ .x^2)
squared
# [[1]]
# [1] 1
#
# [[2]]
# [1] 4
#
# [[3]]
# [1] 9
#
# [[4]]
# [1] 16
#
# [[5]]
# [1] 25
The formula ~ .x^2 is a shorthand anonymous function, and .x represents the current element. You could also write map(numbers, function(x) x^2) or, in R 4.1 and later, map(numbers, \(x) x^2).
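As a quick check that the three spellings are interchangeable, the sketch below runs each one and compares the results. The variable names are illustrative only.

```r
library(purrr)

numbers <- c(1, 2, 3, 4, 5)

via_formula  <- map(numbers, ~ .x^2)
via_function <- map(numbers, function(x) x^2)
via_lambda   <- map(numbers, \(x) x^2)  # needs R >= 4.1

identical(via_formula, via_function)
# [1] TRUE
identical(via_function, via_lambda)
# [1] TRUE
```

Which form to use is mostly a matter of taste. The formula shorthand is specific to purrr, while the other two are plain R functions and work anywhere a function is accepted.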
Typed Variants
purrr provides type-specific variants that return atomic vectors instead of lists:
# Return a double vector
map_dbl(numbers, ~ .x^2)
# [1] 1 4 9 16 25
# Return a character vector
map_chr(letters[1:5], toupper)
# [1] "A" "B" "C" "D" "E"
# Return a logical vector
map_lgl(c(1, 0, 1, 0), as.logical)
# [1] TRUE FALSE TRUE FALSE
Use these when you know the output type. They fail informatively if the function returns something unexpected.
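A minimal sketch of that failure mode: as.character() returns strings, so map_dbl() raises an error rather than silently coercing. The tryCatch() wrapper is only there so the example runs to completion, and the exact error text varies between purrr versions.

```r
library(purrr)

outcome <- tryCatch(
  map_dbl(c(1, 2, 3), as.character),
  error = function(e) "map_dbl() raised an error"
)

outcome
# [1] "map_dbl() raised an error"
```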
map2() for Two Inputs
When you have two vectors that need to be processed together, map2() is your tool:
# Element-wise addition
x <- c(1, 2, 3)
y <- c(10, 20, 30)
map2_dbl(x, y, `+`)
# [1] 11 21 31
# Combining first and last names
first_names <- c("John", "Jane", "Bob")
last_names <- c("Smith", "Doe", "Johnson")
map2_chr(first_names, last_names, ~ paste(.x, .y))
# [1] "John Smith" "Jane Doe" "Bob Johnson"
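Both inputs must be the same length, with one exception: a length-1 input is recycled to match the other. Any other length mismatch raises an error instead of recycling silently. This makes map2() a good fit for operations on paired data.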
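The length rules for map2() can be sketched as follows: a length-1 input is recycled, while lengths 3 and 2 together produce an error. The prices vector is made up for illustration, and tryCatch() is used only to record that the call failed.

```r
library(purrr)

prices <- c(10, 20, 30)

# A length-1 second input is recycled across all three prices
doubled <- map2_dbl(prices, 2, `*`)
doubled
# [1] 20 40 60

# Lengths 3 and 2 are not compatible, so map2_dbl() errors
mismatch_failed <- tryCatch(
  {
    map2_dbl(prices, c(1, 2), `*`)
    FALSE
  },
  error = function(e) TRUE
)
mismatch_failed
# [1] TRUE
```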
pmap() for Multiple Inputs
When you need to iterate over three or more vectors simultaneously, pmap() handles any number of inputs by passing them as a list:
# Three vectors: name, age, score
names <- c("Alice", "Bob", "Charlie")
ages <- c(25, 30, 35)
scores <- c(85, 92, 78)
# Create a data frame for each person
pmap(
  list(name = names, age = ages, score = scores),
  ~ data.frame(name = ..1, age = ..2, score = ..3)
)
# [[1]]
# name age score
# 1 Alice 25 85
#
# [[2]]
# name age score
# 1 Bob 30 92
#
# [[3]]
# name age score
# 1 Charlie 35 78
The ..1, ..2, ..3 syntax refers to the first, second, and third list elements. You can also use named arguments in your function.
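Here is a sketch of the named-argument style. Because the list elements are named, pmap matches them to the function's parameters by name rather than by position. The describe() helper and the people list are hypothetical names introduced for this example.

```r
library(purrr)

people <- list(
  name  = c("Alice", "Bob", "Charlie"),
  age   = c(25, 30, 35),
  score = c(85, 92, 78)
)

# Parameter names match the list names, so their order does not matter
describe <- function(score, name, age) {
  paste0(name, " (", age, ") scored ", score)
}

descriptions <- pmap_chr(people, describe)
descriptions
# [1] "Alice (25) scored 85"   "Bob (30) scored 92"     "Charlie (35) scored 78"
```

Matching by name tends to be more robust than ..1, ..2, ..3, since reordering the list no longer changes the result.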
Working with Lists
Lists are purrr’s native currency. The real power shows when processing complex data:
# A list of data frames (simulating grouped data)
mtcars_list <- split(mtcars, mtcars$cyl)
# Fit a model to each group
models <- map(mtcars_list, ~ lm(mpg ~ wt, data = .x))
# Extract R-squared from each model
map_dbl(models, ~ summary(.x)$r.squared)
# 4 6 8
# 0.5086326 0.4645102 0.4229655
This pattern—split, map, combine—is incredibly common in data analysis. You’re not just iterating; you’re applying a transformation pipeline to each piece.
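The combine step can go further than a named vector. The sketch below collects the wt slope of each per-cylinder model into a small data frame. The column names cyl and wt_slope are chosen for illustration.

```r
library(purrr)

# Split by cylinder count and fit one model per group
models <- split(mtcars, mtcars$cyl) |>
  map(\(df) lm(mpg ~ wt, data = df))

# Pull the wt coefficient out of each model; names carry the group labels
slopes <- map_dbl(models, \(m) coef(m)[["wt"]])

# Combine into one data frame, one row per group
slope_table <- data.frame(cyl = names(slopes), wt_slope = unname(slopes))
slope_table
```

The names on the split list ("4", "6", "8") travel through map() and map_dbl(), which is what lets the final table label each row with its group.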
Error Handling with Adverbs
What happens when your function fails on one element? Without protection, the entire map operation crashes. purrr provides “adverbs” that modify functions to handle errors gracefully.
safely() - Capture Errors and Results
safely() wraps a function so that it always returns a list with two elements, result and error. Exactly one of them is NULL:
safe_divide <- safely(function(a, b) {
  if (b == 0) stop("Division by zero")
  a / b
})
# This works
safe_divide(10, 2)
# $result
# [1] 5
# $error
# NULL
# This fails gracefully
safe_divide(10, 0)
# $result
# NULL
# $error
# <simpleError in .f(...): Division by zero>
Now you can map over risky operations without fear:
numbers <- c(10, 20, 0, 40)
results <- map(numbers, ~ safe_divide(100, .x))
# Check which succeeded
succeeded <- map_lgl(results, ~ is.null(.x$error))
succeeded
# [1] TRUE TRUE FALSE TRUE
# Extract results, filling failed elements (whose result is NULL) with NA
map_dbl(results, "result", .default = NA_real_)
# [1] 10.0  5.0   NA  2.5
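Beyond checking which calls succeeded, you will often want the error messages themselves. The sketch below separates failures from successes and pulls the message out of each captured error. It redefines safe_divide so that it runs on its own, and the failed, error_messages, and values names are illustrative.

```r
library(purrr)

safe_divide <- safely(function(a, b) {
  if (b == 0) stop("Division by zero")
  a / b
})

results <- map(c(10, 20, 0, 40), \(d) safe_divide(100, d))

# TRUE where the wrapped call raised an error
failed <- map_lgl(results, \(r) !is.null(r$error))

# Messages from the failures, values from the successes
error_messages <- map_chr(results[failed], \(r) conditionMessage(r$error))
values <- map_dbl(results[!failed], "result")

error_messages
# [1] "Division by zero"
values
# [1] 10.0  5.0  2.5
```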
possibly() - Return a Default Value
If you don’t care about the error details and just want a fallback value, possibly() is simpler:
safe_sqrt <- possibly(sqrt, otherwise = NA_real_)
# Works fine
safe_sqrt(16)
# [1] 4
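# sqrt() throws an error on non-numeric input; possibly() returns NA instead
safe_sqrt("sixteen")
# [1] NA
# A negative number only triggers a warning and yields NaN, so the fallback is not used
safe_sqrt(-16)
# [1] NaN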
This is useful when you’re willing to accept missing values rather than investigate each failure.
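Inside a map call, the wrapped function slots in like any other. This sketch assumes a mixed list in which one element is a string, which makes sqrt() throw an error; the inputs list is made up for illustration.

```r
library(purrr)

safe_sqrt <- possibly(sqrt, otherwise = NA_real_)

inputs <- list(16, "not a number", 25)

# The string makes sqrt() error, so that element falls back to NA
roots <- map_dbl(inputs, safe_sqrt)
roots
# [1]  4 NA  5
```

Choosing NA_real_ as the fallback keeps every element a double, which is what lets map_dbl() assemble the vector.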
quietly() - Capture Warnings and Messages
Sometimes you want to suppress output or capture warnings for later inspection:
quiet_log <- quietly(log)
# Returns a list with result, output, messages, and warnings
quiet_log(10)
# $result
# [1] 2.302585
#
# $output
# character(0)
#
# $messages
# character(0)
#
# $warnings
# character(0)
# With a warning
quiet_log(c(1, -1))
# $result
# [1]   0 NaN
# $output
# character(0)
# $messages
# character(0)
# $warnings
# [1] "NaNs produced"
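Mapped over several inputs, quietly() lets you find out afterwards which calls warned. A minimal sketch, with the input values picked so that exactly one call produces a warning:

```r
library(purrr)

quiet_log <- quietly(log)

results <- map(list(10, -1, 100), quiet_log)

# A call warned if its captured warnings vector is non-empty
had_warning <- map_lgl(results, \(r) length(r$warnings) > 0)
had_warning
# [1] FALSE  TRUE FALSE

# The captured warning text for the offending input
results[[2]]$warnings
# [1] "NaNs produced"
```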
Practical Example: Processing Multiple Files
Here’s a real-world workflow that combines these concepts:
library(readr)
library(dplyr)  # filter() comes from dplyr, which map_dfr() also requires
# Imagine you have multiple CSV files in a folder
file_paths <- c("data/sales_2021.csv", "data/sales_2022.csv", "data/sales_2023.csv")
# Safe reader that returns NULL when a file cannot be read
safe_read_csv <- possibly(read_csv, otherwise = NULL)
# Read all files, drop the failures, keep rows with a positive amount
all_data <- file_paths |>
  map(safe_read_csv) |>
  compact() |>
  map_dfr(~ filter(.x, amount > 0))
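possibly() turns any read failure, such as a missing file or a parse error, into NULL, and compact() drops those NULLs before the remaining data frames are filtered and combined. One unreadable file no longer crashes your script.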
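To try the workflow without real sales files, here is a self-contained sketch under a few assumptions: it writes two small CSVs to a temporary directory, leaves a third path deliberately missing, and uses base R's read.csv() and rbind() as stand-ins for read_csv() and map_dfr() so that only purrr is needed. The file contents are invented for the example.

```r
library(purrr)

# Set up a temporary folder with two of the three expected files
tmp_dir <- tempfile("sales_")
dir.create(tmp_dir)
write.csv(data.frame(amount = c(100, -5)),
          file.path(tmp_dir, "sales_2021.csv"), row.names = FALSE)
write.csv(data.frame(amount = c(0, 250)),
          file.path(tmp_dir, "sales_2022.csv"), row.names = FALSE)

# The 2023 file is never written, so reading it will fail
file_paths <- file.path(
  tmp_dir,
  c("sales_2021.csv", "sales_2022.csv", "sales_2023.csv")
)

safe_read <- possibly(read.csv, otherwise = NULL)

# Read each file (silencing the "cannot open file" warning),
# drop the NULL left by the missing file, keep positive amounts
kept <- file_paths |>
  map(\(p) suppressWarnings(safe_read(p))) |>
  compact() |>
  map(\(df) df[df$amount > 0, , drop = FALSE])

all_data <- do.call(rbind, kept)
all_data$amount
# [1] 100 250
```

Only the read step is wrapped here. If a file loaded successfully but lacked an amount column, the filtering step would still fail, so wrap that step too when the file layout is not guaranteed.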
Common Mistakes
A few things that trip people up:
Forgetting the suffix. map() returns a list, map_dbl() returns a double vector. If you pass map() output to code that expects an atomic vector, arithmetic such as sum() will fail with an error about an invalid list argument.
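Mismatched lengths with map2(). Only length-1 inputs are recycled; any other mismatch raises an error. The recycling of a length-1 input is silent, though, so check lengths explicitly if a vector you expect to be full-length might have collapsed to a single value.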
Ignoring errors. When iterating over external data—files, APIs, database queries—an error in one element stops everything. Always wrap risky functions with safely() or possibly().
Conclusion
purrr transforms iteration from a chore into a composable workflow. The key functions—map(), map2(), pmap()—handle single, paired, and multiple inputs. The error-handling adverbs—safely(), possibly(), quietly()—let you build robust pipelines that don’t fall apart when something goes wrong.
Start with map() for simple iterations, reach for map2() when you have paired data, and use pmap() for anything more complex. Wrap external functions with the error adverbs, and your code will handle edge cases gracefully.