dplyr::mutate

mutate(.data, ..., .by = NULL, .keep = c('all','used','unused','none'), .before = NULL, .after = NULL)

Returns tibble · Updated May 12, 2026 · Tidyverse

mutate() adds new columns to a tibble or data frame, or modifies existing ones. It returns a new object and leaves the input unchanged. It is one of dplyr’s core verbs and uses data masking, so you can refer to columns by bare name without quoting them or using $. Columns created in a mutate() call are immediately available to later expressions in the same call.

Syntax

mutate(.data, ..., .by = NULL, .keep = "all", .before = NULL, .after = NULL)

Parameters

Parameter	Type	Default	Description
`.data`	tibble / data.frame / lazy frame	Required	Input data to transform.
`...`	name-value pairs	Required	New columns to add or expressions that modify existing columns. Use `NULL` to remove a column.
`.by`	tidy-select	`NULL`	Columns to group by for this call only, as an alternative to `group_by()`. Added in dplyr 1.1.0.
`.keep`	string	`"all"`	Controls which columns from `.data` are retained. `"all"` keeps everything. `"used"` keeps only the columns referenced in `...`. `"unused"` keeps only the columns not referenced in `...`. `"none"` keeps only grouping columns and the columns created by `...`. New columns are always kept.
`.before`	tidy-select	`NULL`	Place new columns before this column. Added in dplyr 1.0.0.
`.after`	tidy-select	`NULL`	Place new columns after this column. Added in dplyr 1.0.0.

Examples

Basic column creation

Create a new column from existing ones. Columns can reference each other within the same mutate() call.

library(dplyr)

df <- tibble(
  name = c("Alice", "Bob", "Carol"),
  mass_kg = c(70, 85, 62),
  height_m = c(1.75, 1.82, 1.68)
)

df |> mutate(bmi = mass_kg / (height_m ^ 2))
#> # A tibble: 3 × 4
#>   name   mass_kg height_m   bmi
#>   <chr>    <dbl>    <dbl> <dbl>
#> 1 Alice       70     1.75  22.9
#> 2 Bob         85     1.82  25.7
#> 3 Carol       62     1.68  22.0

Reusing columns within a single call

Newly created columns are immediately available in the same mutate() call. Build up expressions step by step without repeating yourself.

df <- tibble(
  name = c("Luke", "Han", "Leia"),
  mass = c(77, 80, 55)
)

df |> mutate(
  mass_lb = mass * 2.205,
  mass_lb_sq = mass_lb ^ 2
)
#> # A tibble: 3 × 4
#>   name   mass mass_lb mass_lb_sq
#>   <chr> <dbl>   <dbl>      <dbl>
#> 1 Luke     77    170.     28827.
#> 2 Han      80    176.     31117.
#> 3 Leia     55    121.     14708.

Per-group operations with .by

The .by argument lets you compute per-group statistics without wrapping the whole pipeline in group_by(). Groups are active only for this mutate() call.

df <- tibble(
  species = c("Human", "Human", "Droid", "Human", "Droid"),
  name = c("Luke", "Han", "R2-D2", "Leia", "C-3PO"),
  mass = c(77, 80, 32, 55, 75)
)

df |> mutate(
  mass_rank = min_rank(desc(mass)),
  .by = species
)
#> # A tibble: 5 × 4
#>   species name   mass mass_rank
#>   <chr>   <chr> <dbl>     <int>
#> 1 Human   Luke     77         2
#> 2 Human   Han      80         1
#> 3 Droid   R2-D2    32         2
#> 4 Human   Leia     55         3
#> 5 Droid   C-3PO    75         1
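
As a rough sketch of what .by is doing here, the call should produce the same ranks as a group_by() / mutate() / ungroup() pipeline, with the difference that the .by result is never grouped. The tibble is the one defined above; the names by_arg and by_group are illustrative only, and .by assumes dplyr 1.1.0 or later.

```r
library(dplyr)

df <- tibble(
  species = c("Human", "Human", "Droid", "Human", "Droid"),
  name = c("Luke", "Han", "R2-D2", "Leia", "C-3PO"),
  mass = c(77, 80, 32, 55, 75)
)

# Per-call grouping: the result comes back ungrouped
by_arg <- df |>
  mutate(mass_rank = min_rank(desc(mass)), .by = species)

# Persistent grouping: has to be removed explicitly with ungroup()
by_group <- df |>
  group_by(species) |>
  mutate(mass_rank = min_rank(desc(mass))) |>
  ungroup()

identical(by_arg$mass_rank, by_group$mass_rank)
#> [1] TRUE
is_grouped_df(by_arg)
#> [1] FALSE
```

Note that .by cannot be combined with a tibble that is already grouped; pick one mechanism per call.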

Controlling column retention with .keep

The .keep argument filters which columns are kept in the output. This is useful for cleaning up intermediate columns.

df <- tibble(
  name = c("Alice", "Bob", "Carol"),
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)

# Keep only columns used in the computation
df |> mutate(total = x + y + z, .keep = "used")
#> # A tibble: 3 × 4
#>       x     y     z total
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     4     7    12
#> 2     2     5     8    15
#> 3     3     6     9    18
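
For comparison, here is a small sketch of the other two non-default values, using the same tibble. Only the resulting column names are shown, which is enough to see what each option retains.

```r
library(dplyr)

df <- tibble(
  name = c("Alice", "Bob", "Carol"),
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)

# "unused": drop the inputs (x, y, z), keep the rest plus the new column
unused_cols <- df |> mutate(total = x + y + z, .keep = "unused") |> names()
unused_cols
#> [1] "name"  "total"

# "none": keep only the new column (and grouping columns, if there are any)
none_cols <- df |> mutate(total = x + y + z, .keep = "none") |> names()
none_cols
#> [1] "total"
```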

Placing columns with .before and .after

New columns appear on the far right by default. Control placement with .before or .after.

df <- tibble(
  name = c("Alice", "Bob"),
  age = c(30, 25),
  city = c("NY", "LA")
)

df |> mutate(score = c(95, 88), .before = age)
#> # A tibble: 2 × 4
#>   name  score   age city
#>   <chr> <dbl> <dbl> <chr>
#> 1 Alice    95    30 NY
#> 2 Bob      88    25 LA

df |> mutate(rank = c(2, 1), .after = name)
#> # A tibble: 2 × 4
#>   name   rank   age city
#>   <chr> <dbl> <dbl> <chr>
#> 1 Alice     2    30 NY
#> 2 Bob       1    25 LA

Removing columns with NULL

Assign NULL to a column name to remove it from the tibble.

df <- tibble(
  name = c("Alice", "Bob"),
  age = c(30, 25),
  temp = c(NA, NA)
)

df |> mutate(temp = NULL)
#> # A tibble: 2 × 2
#>   name    age
#>   <chr> <dbl>
#> 1 Alice    30
#> 2 Bob      25

Using across() for multiple columns

Combine mutate() with across() to apply the same transformation to multiple columns at once.

df <- tibble(
  name = c("Alice", "Bob", "Carol"),
  score_a = c(85, 92, 78),
  score_b = c(90, 88, 95)
)

df |> mutate(across(starts_with("score"), round, .names = "{.col}_rounded"))
#> # A tibble: 3 × 5
#>   name   score_a score_b score_a_rounded score_b_rounded
#>   <chr>    <dbl>   <dbl>           <dbl>           <dbl>
#> 1 Alice       85      90              85              90
#> 2 Bob         92      88              92              88
#> 3 Carol       78      95              78              95
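
When the function needs extra arguments, one option is to wrap it in an anonymous function. The sketch below uses made-up scores with decimals so the rounding is visible; without .names, across() overwrites the matched columns in place. The name rounded is illustrative only.

```r
library(dplyr)

df <- tibble(
  name = c("Alice", "Bob"),
  score_a = c(85.26, 92.41),
  score_b = c(90.14, 88.77)
)

# Round every score column to one decimal place, replacing the originals
rounded <- df |>
  mutate(across(starts_with("score"), \(x) round(x, 1)))

rounded$score_a
#> [1] 85.3 92.4
rounded$score_b
#> [1] 90.1 88.8
```

Since dplyr 1.1.0, passing extra arguments through the `...` of across() is deprecated, so the anonymous function form is the safer default.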

Common Gotchas

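Grouped tibbles compute per group

When working with a grouped tibble, mutate() evaluates each expression within each group, so summary functions such as sum() or mean() return per-group values. If you need global statistics, call ungroup() first. If you only need grouping for a single call, use .by instead of group_by() so the result is not left grouped.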

# Per-species computation — groups from group_by() are respected
starwars |>
  select(name, mass, species) |>
  group_by(species) |>
  mutate(mass_pct = mass / sum(mass, na.rm = TRUE) * 100) |>
  filter(species == "Human")
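
To make the difference concrete, here is a minimal sketch on a small made-up tibble (not starwars) that computes a share within each species and then, after ungroup(), a share of the overall total. The column names pct_of_species and pct_of_total are illustrative.

```r
library(dplyr)

df <- tibble(
  species = c("Human", "Human", "Droid"),
  mass = c(60, 40, 100)
)

result <- df |>
  group_by(species) |>
  mutate(pct_of_species = mass / sum(mass) * 100) |>  # sum() is per species
  ungroup() |>
  mutate(pct_of_total = mass / sum(mass) * 100)       # sum() is over all rows

result$pct_of_species
#> [1]  60  40 100
result$pct_of_total
#> [1] 30 20 50
```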

Setting to NULL vs filtering rows

mutate(col = NULL) removes the column entirely. It does not set values to NA or change row count. Use filter() to remove rows.

.keep does not affect rows

.keep only controls column retention. It never drops rows. For row-level filtering, use filter() before or after mutate().
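
A short sketch of the distinction, using a made-up tibble: .keep changes the column count only, while filter() is what changes the row count. The names doubled_only and kept_rows are illustrative.

```r
library(dplyr)

df <- tibble(id = 1:4, value = c(10, NA, 30, NA))

# .keep = "none" drops columns but leaves all four rows in place
doubled_only <- df |> mutate(doubled = value * 2, .keep = "none")
dim(doubled_only)
#> [1] 4 1

# filter() removes the rows with missing values; all columns remain
kept_rows <- df |>
  filter(!is.na(value)) |>
  mutate(doubled = value * 2)
dim(kept_rows)
#> [1] 2 3
```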