purrr::walk

Updated April 21, 2026 · Tidyverse

r purrr tidyverse side-effects functional-programming pipes

Overview

walk() is purrr’s function for when you want to call a function for its side effects, not its return value. While map() transforms data and returns results, walk() calls the function, discards the result, and returns the input unchanged. This makes walk() a natural fit inside a pipeline (%>% or |>) where you need to inspect or export data without breaking the chain.

walk() vs map()

The core difference:

library(purrr)

# map() returns the results of calling toupper() on each element
result <- c("apple", "banana") |> map(toupper)
result
#> [[1]]
#> [1] "APPLE"
#>
#> [[2]]
#> [1] "BANANA"

# walk() calls print() for its side effect, returns input invisibly
c("apple", "banana") |> walk(print)
#> [1] "apple"
#> [1] "banana"

walk() calls the function on each element, discards the return value, and returns the original input. The original vector passes through unchanged.
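As a rough mental model, the behaviour described above can be sketched as a loop followed by an invisible return. The helper name my_walk is hypothetical, and this is a simplified stand-in rather than purrr's actual implementation:

```r
# Simplified stand-in for purrr::walk(); `my_walk` is a hypothetical name
my_walk <- function(.x, .f, ...) {
  for (el in .x) {
    .f(el, ...)      # call the function; its return value is dropped
  }
  invisible(.x)      # hand the original input back without auto-printing
}

fruit <- my_walk(c("apple", "banana"), print)
#> [1] "apple"
#> [1] "banana"
identical(fruit, c("apple", "banana"))
#> [1] TRUE
```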

Basic Usage

Printing

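Inspect intermediate results in a pipeline without disrupting it. A data frame is a list of columns, so calling walk() on one visits each column in turn; split it into a list of data frames first so that each group is printed as a table:

library(dplyr)

mtcars |>
  filter(cyl > 4) |>
  split(~cyl) |>
  walk(\(df) print(head(df, 3))) |>  # prints, then passes the list through
  bind_rows() |>
  group_by(cyl) |>
  summarise(avg_mpg = mean(mpg))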

Saving files

Save plots or data in a pipeline:

library(ggplot2)

mtcars |>
  split(~cyl) |>
  walk(\(df) {
    ggsave(
      paste0("plot_", unique(df$cyl), ".png"),
      plot = ggplot(df, aes(x = wt, y = mpg)) + geom_point()
    )
  })

Each group gets its own PNG file. The original data frames pass through walk() unchanged, so the pipeline can continue.
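To make the pass-through concrete, here is a minimal sketch that writes each group to disk and then keeps working with the same list. It uses saveRDS() and a temporary directory instead of ggsave() so that it runs without ggplot2; the out_dir variable and the file names are assumptions for illustration:

```r
library(purrr)

out_dir <- tempdir()  # hypothetical output location

row_counts <- mtcars |>
  split(~cyl) |>
  # side effect: one .rds file per group
  walk(\(df) saveRDS(df, file.path(out_dir, paste0("cyl_", unique(df$cyl), ".rds")))) |>
  # the list of data frames is still available here
  map_int(nrow)

row_counts
#>  4  6  8
#> 11  7 14
```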

Saving CSVs with walk2()

When you have two parallel vectors — data and filenames — use walk2():

library(readr)
library(tibble)  # provides tibble()

df1 <- tibble(x = 1:5, y = rnorm(5))
df2 <- tibble(x = 1:5, y = rnorm(5))
df3 <- tibble(x = 1:5, y = rnorm(5))

list(df1, df2, df3) |>
  set_names(c("alpha", "beta", "gamma")) |>
  walk2(
    c("alpha.csv", "beta.csv", "gamma.csv"),
    \(df, path) write_csv(df, path)
  )
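walk2() iterates over both inputs in step, so they need the same length, except that a length-1 input is recycled. A small sketch of that recycling, using message() in place of a real file write (the file stems are illustrative):

```r
library(purrr)

# The single extension is recycled against both stems
stems <- walk2(c("alpha", "beta"), ".csv", \(stem, ext) message(stem, ext))
#> alpha.csv
#> beta.csv

# Like walk(), walk2() returns its first input
stems
#> [1] "alpha" "beta"
```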

Saving CSVs with pwalk()

When the inputs live in a tibble, pwalk() calls the function once per row, matching columns to arguments by name:

files <- tibble(
  df   = list(df1, df2, df3),
  path = c("alpha.csv", "beta.csv", "gamma.csv"),
  desc = c("First dataset", "Second dataset", "Third dataset")
)

files |> pwalk(\(df, path, desc) {
  message("Saving: ", desc)
  write_csv(df, path)
})
#> Saving: First dataset
#> Saving: Second dataset
#> Saving: Third dataset
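Because columns are matched to arguments by name, a column the function does not declare would cause an "unused argument" error. One way around this, sketched below with a hypothetical jobs table, is to add `...` to the signature so extra columns are absorbed:

```r
library(purrr)

# Hypothetical job table; `note` is a column the function does not need
jobs <- data.frame(
  path = c("alpha.csv", "beta.csv"),
  rows = c(5L, 8L),
  note = c("first", "second")
)

# `...` absorbs any columns not named in the signature
jobs |> pwalk(\(path, rows, ...) message(path, ": ", rows, " rows"))
#> alpha.csv: 5 rows
#> beta.csv: 8 rows
```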

Real-World Example: Export Multiple Sheets

With openxlsx, save each data frame to a separate sheet in one workbook. Create the workbook once, add a sheet per data frame with iwalk(), and save at the end:

library(openxlsx)

wb <- createWorkbook()

list(
  mtcars  = mtcars,
  iris    = iris,
  PlantGrowth = PlantGrowth
) |>
  set_names(c("Motor Trend Cars", "Fisher Iris", "Plant Growth")) |>
  iwalk(\(df, sheet_name) {
    addWorksheet(wb, sheet_name)
    writeData(wb, sheet_name, df)
  })

# "datasets.xlsx" is a placeholder filename
saveWorkbook(wb, "datasets.xlsx", overwrite = TRUE)

iwalk() feeds both the element (as df) and the name (as sheet_name) into the function. The workbook object is modified in place, so the function does not need to return anything.
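The element-and-name pairing is easiest to see in isolation with iwalk(), the side-effect counterpart of imap(). A minimal sketch with made-up inputs:

```r
library(purrr)

# Named input: the second argument is the element's name
c(a = 1, b = 2) |> iwalk(\(value, name) message(name, " = ", value))
#> a = 1
#> b = 2

# Unnamed input: the second argument is the position instead
c("x", "y") |> iwalk(\(value, idx) message(idx, ": ", value))
#> 1: x
#> 2: y
```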

Plotting in a Pipeline

Generate and save exploratory plots per group:

iris |>
  split(~Species) |>
  set_names(\(x) paste0("plot_", x, ".png")) |>
  iwalk(\(df, path) {  # iwalk() passes each element's name as the second argument
    p <- ggplot(df, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point() +
      ggtitle(as.character(unique(df$Species)))
    ggsave(path, plot = p, width = 5, height = 4)
  })

The pipe flows cleanly from data preparation through to plot generation and saving, with no intermediate objects cluttering the workspace.

Why Not Use map() for Side Effects?

You can use map() for side effects, but it collects every return value into a list. That is wasteful if you never use the list, and it signals the wrong intent to future readers:

# Works, but builds and prints a list of print()'s return values
# (print() returns its argument invisibly)
c("a", "b") |> map(print)
#> [1] "a"
#> [1] "b"
#> [[1]]
#> [1] "a"
#>
#> [[2]]
#> [1] "b"

# walk() returns the input invisibly: cleaner, and it signals intent
c("a", "b") |> walk(print)
#> [1] "a"
#> [1] "b"

The Invisible Return

walk() returns its input invisibly, which means the return value is not auto-printed when used interactively. This is intentional: it lets you place walk() in a pipeline without generating output beyond what the side-effect function itself produces:

x <- c("apple", "banana") |> walk(print)  # print() still runs; the returned input is assigned
#> [1] "apple"
#> [1] "banana"
x
#> [1] "apple"  "banana"

To confirm that the return value is the original input:

y <- c("a", "b")
identical(y, y |> walk(print))
#> [1] "a"
#> [1] "b"
#> [1] TRUE
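Base R's withVisible() exposes the visibility flag directly, which gives a quick way to check this behaviour. A minimal sketch:

```r
library(purrr)

res <- withVisible(walk(c("a", "b"), \(s) NULL))

res$visible  # FALSE: the value would not be auto-printed at the console
#> [1] FALSE
res$value    # the original input, unchanged
#> [1] "a" "b"
```

Wrapping a call in parentheses, as in `(x |> walk(f))`, forces the returned value to print if you do want to see it.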

Combining walk() with safely() and quietly()

Wrap the side-effect function with safely() so that an error on one element does not stop the rest. walk() discards the captured result, so report the failure inside the function:

files <- c("data1.csv", "data2.csv", "nonexistent.csv")

safe_read <- safely(read_csv)

files |>
  walk(\(f) {
    res <- safe_read(f)
    if (!is.null(res$error)) {
      warning("Failed to read ", f, ": ", conditionMessage(res$error))
    }
  })

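For functions whose printed output, messages, or warnings you want to suppress, wrap them in quietly(). It captures them in a list, which walk() then discards:

# `inputs` and `some_function` are placeholders
inputs |> walk(quietly(some_function))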
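A self-contained sketch of the safely() pattern, using a hypothetical risky() function that fails on one input, shows the iteration continuing past the error:

```r
library(purrr)

# Hypothetical side-effect function that fails on the value 2
risky <- function(x) {
  if (x == 2) stop("bad value: ", x)
  message("processed ", x)
}

# safely() turns the error into a captured value, so walk() keeps going
done <- 1:3 |> walk(safely(risky))
#> processed 1
#> processed 3

done
#> [1] 1 2 3
```

Without the safely() wrapper, the call would stop at the second element and the third would never run.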