Parallel purrr with furrr

· 4 min read · Updated March 12, 2026 · intermediate
r performance parallel purrr furrr future

The furrr package brings parallel processing to your tidyverse workflows. If you already use purrr, switching to furrr requires minimal changes but can give you massive speedups on CPU-intensive tasks.

Why furrr?

The purrr package gives you elegant functional iteration. But by default, purrr runs sequentially. Each iteration waits for the previous one to finish. When you have hundreds or thousands of items to process, this adds up.

furrr replaces purrr's sequential mapping functions with parallel versions. You swap map() for future_map() and, once you've set an execution plan, your code runs across multiple cores.

The magic comes from the future package, which handles the parallelism details. furrr translates your purrr-style code into future-based parallel execution.
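Under the hood, a future is a value that may still be computing in another process. Here is a minimal sketch using the future package directly; it is not required for furrr, but it shows what happens behind the scenes:

```r
library(future)
plan(multisession)  # Evaluate futures in background R sessions

f <- future(sqrt(16))  # Starts computing in a background session
value(f)               # Blocks until the result is ready, then returns 4
```

future_map() effectively creates one future per chunk of your input and collects the values for you.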

Setup

Install both packages from CRAN:

install.packages(c("furrr", "future"))

Load them together:

library(furrr)
library(future)
plan(multisession)  # Use multiple R sessions

The plan() function controls how futures are resolved. multisession creates separate R processes on your machine. For a quick test on a single machine, this is usually the right choice.
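You can inspect and change the strategy at any time; availableCores() reports how many cores future will use by default. A quick sketch:

```r
library(future)

availableCores()                 # Number of cores future detects on this machine
plan(multisession, workers = 2)  # Cap parallelism at two background R sessions
plan(sequential)                 # Revert to ordinary sequential evaluation
```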

Basic Parallel Mapping

Here’s the simplest example - transforming a numeric vector:

library(purrr)
library(furrr)
library(future)
plan(multisession)

# Sequential (standard purrr)
slow_function <- function(x) {
  Sys.sleep(0.1)  # Simulate work
  x * 2
}

# Sequential: ~1 second (10 iterations x 0.1 s each)
system.time(result <- map(1:10, slow_function))

# Parallel: ~0.3 seconds on a 4-core machine (10 tasks split across 4 workers)
system.time(result <- future_map(1:10, slow_function))

The interface is identical to purrr. Change the function name, get parallelism.

Different Output Types

Just like purrr, furrr provides variants for different output types:

  • future_map() - list output
  • future_map_lgl() - logical vector
  • future_map_int() - integer vector
  • future_map_dbl() - double vector
  • future_map_chr() - character vector
  • future_map_dfr() - row-bound data frame
  • future_map_dfc() - column-bound data frame
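
The typed variants behave exactly like their purrr counterparts, returning an atomic vector and erroring if any element has the wrong type. For example:

```r
library(furrr)
library(future)
plan(multisession)

future_map_dbl(1:4, ~ .x / 2)          # 0.5 1.0 1.5 2.0
future_map_chr(c("a", "b"), toupper)   # "A" "B"
future_map_lgl(1:4, ~ .x %% 2 == 0)    # FALSE TRUE FALSE TRUE
```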

Example with data frames:

library(tidyverse)
library(furrr)
library(future)
plan(multisession)

# Apply transformation to each group in parallel
results <- iris %>%
  split(.$Species) %>%
  future_map_dfr(~.x %>% mutate(sepal_area = Sepal.Length * Sepal.Width))

Progress Bars

Parallel code can feel slow if you don’t see progress. The progressr package integrates with furrr:

library(furrr)
library(future)
library(progressr)

plan(multisession)
handlers("txtprogressbar")  # Base-R style text progress bar (the default)

# Create a progressor and signal progress from inside the mapped function
with_progress({
  p <- progressor(steps = 100)
  results <- future_map(1:100, ~{
    p()
    .x^2
  })
})

If your mapped function uses random numbers, also pass .options = furrr_options(seed = TRUE) so each worker gets a parallel-safe, reproducible RNG stream.

Error Handling

furrr works with purrr’s safety functions. Wrap your function with safely() or possibly():

# safely() wraps the function so each call returns a list with $result and $error
checked_divide <- function(x, y) {
  if (y == 0) stop("division by zero")  # Base R returns Inf here, so error explicitly
  x / y
}
safe_divide <- safely(checked_divide, otherwise = NA_real_)

results <- future_map(1:10, ~safe_divide(.x, sample(0:1, 1)),
                      .options = furrr_options(seed = TRUE))

# Extract the results and flag the calls that errored
values <- map_dbl(results, "result")                 # NA_real_ where a call failed
failed <- map(results, "error") %>% map_lgl(~!is.null(.x))

The possibly() variant is simpler - it just returns a default value on error:

safe_log <- possibly(log, otherwise = NA_real_)
future_map(list(1, "oops", 2), safe_log)  # NA_real_ for the non-numeric input

Performance Tips

Chunking

For very large iterables, process in chunks:

future_map(1:10000, slow_function,
           .options = furrr_options(chunk_size = 100))
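
Related to chunking, furrr_options() also takes a scheduling argument: scheduling = 1 (the default) sends each worker a single chunk, while larger values split the work into more, smaller chunks that balance uneven workloads at the cost of extra messaging. A sketch:

```r
library(furrr)
library(future)
plan(multisession, workers = 4)

# One chunk per worker: minimal overhead, but one slow chunk can idle the rest
res1 <- future_map(1:1000, sqrt, .options = furrr_options(scheduling = 1))

# Four chunks per worker: more overhead, better load balancing
res2 <- future_map(1:1000, sqrt, .options = furrr_options(scheduling = 4))
```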

Limiting Workers

Don’t use more workers than you have cores:

plan(multisession, workers = 4)

Seed Setting

Always set a seed for reproducible results:

future_map(1:100, ~rnorm(1), .options = furrr_options(seed = TRUE))
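
Passing an integer instead of TRUE fixes the seed, so parallel draws are reproducible across runs (furrr pre-generates one RNG stream per element, so the result does not depend on the number of workers):

```r
library(furrr)
library(future)
plan(multisession, workers = 2)

a <- future_map_dbl(1:5, ~ rnorm(1), .options = furrr_options(seed = 123))
b <- future_map_dbl(1:5, ~ rnorm(1), .options = furrr_options(seed = 123))
identical(a, b)  # TRUE
```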

Common Pitfalls

Shared State

Each worker is a separate R process. furrr automatically detects and exports the globals your function reads, but every worker gets its own copy - assignments made inside future_map() never reach your main session:

# This doesn't work as expected
counter <- 0
future_map(1:3, ~{counter <<- counter + 1})
counter  # Still 0: each worker incremented its own copy

For large objects, automatic export also means shipping a copy to every worker, so pass only the data each task actually needs.
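For example, passing a lookup table as an explicit argument (anything after the mapped function is forwarded to every worker) keeps the dependency visible:

```r
library(furrr)
library(future)
plan(multisession)

lookup <- c(a = 1, b = 2)

# tbl is passed along to each worker rather than captured implicitly
vals <- future_map_dbl(c("a", "b"), function(key, tbl) tbl[[key]], tbl = lookup)
vals  # 1 2
```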

Side Effects

Writing to files or modifying global state from inside future_map() can cause race conditions. Return the data and write after mapping:

# Bad: every worker races to write the same file
future_map(df$path, ~write.csv(read.csv(.x), "output.csv"))

# Good: read in parallel, then write sequentially to distinct files
data <- future_map(df$path, read.csv)
walk2(data, paste0("output_", seq_along(data), ".csv"), write.csv)

Small Tasks

Parallelism adds overhead. If each iteration takes less than 10 milliseconds, sequential purrr might actually be faster.
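When in doubt, time both versions on your own workload. With a trivially cheap task like sqrt(), the sequential version usually wins:

```r
library(purrr)
library(furrr)
library(future)
plan(multisession)

system.time(seq_res <- map_dbl(1:10000, sqrt))         # Cheap per-task work
system.time(par_res <- future_map_dbl(1:10000, sqrt))  # Pays transfer + scheduling costs
```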

Conclusion

furrr makes parallel processing accessible to tidyverse users. The learning curve is minimal if you already know purrr. The speedup can be dramatic for CPU-bound tasks.

Start by replacing your map() calls with future_map(). Add progress bars with progressr. Use safely() or possibly() for error handling. Benchmark with system.time() or microbenchmark to verify you're actually gaining speed.

See Also