dplyr::mutate
mutate(.data, ..., .by = NULL, .keep = c("all", "used", "unused", "none"), .before = NULL, .after = NULL)
tibble · Updated April 1, 2026 · Tidyverse
mutate() adds new columns to a tibble or data frame, or modifies existing ones, and returns the result; the input object is not changed in place. It is one of dplyr's core verbs, and it uses data masking so you can refer to columns by name without quoting or using $. Expressions are evaluated in order, so a column created earlier in a mutate() call can be used by later expressions in the same call.
Syntax
mutate(.data, ..., .by = NULL, .keep = "all", .before = NULL, .after = NULL)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| .data | tibble / data.frame / lazy frame | Required | Input data to transform. |
| ... | name-value pairs | Required | New columns to add or expressions that modify existing columns. Use NULL to remove a column. |
| .by | tidy-select | NULL | Columns to group by for this operation only, as an alternative to group_by(). Added in dplyr 1.1.0. |
| .keep | string | "all" | Controls which columns of .data are retained. "all" keeps every column. "used" keeps only the columns referenced in ..., plus the new ones. "unused" keeps only the columns not referenced in ..., plus the new ones. "none" keeps only the new columns and any grouping variables. |
| .before | tidy-select | NULL | Place new columns before this column. Added in dplyr 1.0.0. |
| .after | tidy-select | NULL | Place new columns after this column. Added in dplyr 1.0.0. |
Examples
Basic column creation
Create a new column from existing ones. Inside mutate(), refer to columns by their bare names.
library(dplyr)
df <- tibble(
name = c("Alice", "Bob", "Carol"),
mass_kg = c(70, 85, 62),
height_m = c(1.75, 1.82, 1.68)
)
df |> mutate(bmi = mass_kg / (height_m ^ 2))
#> # A tibble: 3 × 4
#> name mass_kg height_m bmi
#> <chr> <dbl> <dbl> <dbl>
#> 1 Alice 70 1.75 22.9
#> 2 Bob 85 1.82 25.7
#> 3 Carol 62 1.68 22.0
Reusing columns within a single call
Newly created columns are immediately available in the same mutate() call. Build up expressions step by step without repeating yourself.
df <- tibble(
name = c("Luke", "Han", "Leia"),
mass = c(77, 80, 55)
)
df |> mutate(
mass_lb = mass * 2.205,
mass_lb_sq = mass_lb ^ 2
)
#> # A tibble: 3 × 4
#> name mass mass_lb mass_lb_sq
#> <chr> <dbl> <dbl> <dbl>
#> 1 Luke 77 170. 28827.
#> 2 Han 80 176. 31117.
#> 3 Leia 55 121. 14708.
Per-group operations with .by
The .by argument lets you compute per-group statistics without wrapping the whole pipeline in group_by(). Groups are active only for this mutate() call.
df <- tibble(
species = c("Human", "Human", "Droid", "Human", "Droid"),
name = c("Luke", "Han", "R2-D2", "Leia", "C-3PO"),
mass = c(77, 80, 32, 55, 75)
)
df |> mutate(
mass_rank = min_rank(desc(mass)),
.by = species
)
#> # A tibble: 5 × 4
#> species name mass mass_rank
#> <chr> <chr> <dbl> <int>
#> 1 Human Luke 77 2
#> 2 Human Han 80 1
#> 3 Droid R2-D2 32 2
#> 4 Human Leia 55 3
#> 5 Droid C-3PO 75 1
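As a rough sketch of what .by saves you, the snippet below compares it with the equivalent group_by() and ungroup() pipeline. It assumes dplyr 1.1.0 or later; the tibble `masses` and the column `mean_mass` are made up for illustration.

```r
library(dplyr)

# Hypothetical data, for illustration only
masses <- tibble(
  species = c("Human", "Human", "Droid", "Human", "Droid"),
  mass    = c(77, 80, 32, 55, 75)
)

# Per-operation grouping: the result comes back ungrouped
by_result <- masses |>
  mutate(mean_mass = mean(mass), .by = species)

# Persistent grouping: the grouping has to be removed explicitly
grouped_result <- masses |>
  group_by(species) |>
  mutate(mean_mass = mean(mass)) |>
  ungroup()

is_grouped_df(by_result)  # FALSE
all.equal(by_result$mean_mass, grouped_result$mean_mass)
```

Both pipelines compute the same per-species mean; the .by form simply avoids leaving a grouped tibble behind for later steps.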
Controlling column retention with .keep
The .keep argument controls which columns of the input are retained in the output. This is useful for dropping inputs you no longer need once a derived column exists.
df <- tibble(
name = c("Alice", "Bob", "Carol"),
x = c(1, 2, 3),
y = c(4, 5, 6),
z = c(7, 8, 9)
)
# Keep only the columns used in the computation, plus the new one; name is dropped
df |> mutate(total = x + y + z, .keep = "used")
#> # A tibble: 3 × 4
#> x y z total
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 4 7 12
#> 2 2 5 8 15
#> 3 3 6 9 18
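To compare the four .keep settings side by side, here is a minimal sketch that runs the same expression under each one and inspects only the resulting column names. The tibble `scores` is hypothetical.

```r
library(dplyr)

# Hypothetical data, for illustration only
scores <- tibble(
  name = c("Alice", "Bob"),
  x = c(1, 2),
  y = c(4, 5),
  z = c(7, 8)
)

# total is computed from x and y; name and z are not referenced
names(mutate(scores, total = x + y, .keep = "all"))     # name, x, y, z, total
names(mutate(scores, total = x + y, .keep = "used"))    # x, y, total
names(mutate(scores, total = x + y, .keep = "unused"))  # name, z, total
names(mutate(scores, total = x + y, .keep = "none"))    # total
```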
Placing columns with .before and .after
New columns appear on the far right by default. Control placement with .before or .after.
df <- tibble(
name = c("Alice", "Bob"),
age = c(30, 25),
city = c("NY", "LA")
)
df |> mutate(score = c(95, 88), .before = age)
#> # A tibble: 2 × 4
#> name score age city
#> <chr> <dbl> <dbl> <chr>
#> 1 Alice 95 30 NY
#> 2 Bob 88 25 LA
df |> mutate(rank = c(2, 1), .after = name)
#> # A tibble: 2 × 4
#> name rank age city
#> <chr> <dbl> <dbl> <chr>
#> 1 Alice 2 30 NY
#> 2 Bob 1 25 LA
Removing columns with NULL
Assign NULL to a column name to remove it from the tibble.
df <- tibble(
name = c("Alice", "Bob"),
age = c(30, 25),
temp = c(NA, NA)
)
df |> mutate(temp = NULL)
#> # A tibble: 2 × 2
#> name age
#> <chr> <dbl>
#> 1 Alice 30
#> 2 Bob 25
Using across() for multiple columns
Combine mutate() with across() to apply the same transformation to multiple columns at once.
df <- tibble(
name = c("Alice", "Bob", "Carol"),
score_a = c(85, 92, 78),
score_b = c(90, 88, 95)
)
df |> mutate(across(starts_with("score"), round, .names = "{.col}_rounded"))
#> # A tibble: 3 × 5
#> name score_a score_b score_a_rounded score_b_rounded
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Alice 85 90 85 90
#> 2 Bob 92 88 92 88
#> 3 Carol 78 95 78 95
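When the function needs extra arguments, an anonymous function is one way to pass them. The sketch below rounds to one decimal place; the tibble `scores` and its values are invented for illustration, and the `\(x)` shorthand assumes R 4.1 or later.

```r
library(dplyr)

# Hypothetical data with fractional scores, for illustration only
scores <- tibble(
  name = c("Alice", "Bob"),
  score_a = c(85.26, 92.71),
  score_b = c(90.14, 88.87)
)

# Apply round(x, 1) to every column whose name starts with "score"
rounded <- scores |>
  mutate(across(starts_with("score"), \(x) round(x, 1), .names = "{.col}_rounded"))

names(rounded)
# "name" "score_a" "score_b" "score_a_rounded" "score_b_rounded"
```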
Common Gotchas
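Computations on grouped tibbles
When .data is a grouped tibble, mutate() evaluates each expression within each group, so summaries such as sum() or mean() return per-group values. If you need statistics over the whole table, call ungroup() first. The .by argument is not a workaround here: it also groups, and mutate() raises an error if you supply .by to a tibble that is already grouped.
# Per-species computation: groups from group_by() are respected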
starwars |>
select(name, mass, species) |>
group_by(species) |>
mutate(mass_pct = mass / sum(mass, na.rm = TRUE) * 100) |>
filter(species == "Human")
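The difference between per-group and whole-table results is easiest to see on a tiny example. This is a sketch on made-up data (the tibble `masses` is hypothetical), chosen so the arithmetic is easy to check by hand.

```r
library(dplyr)

# Hypothetical data, for illustration only
masses <- tibble(
  species = c("Human", "Human", "Droid", "Droid"),
  mass = c(60, 40, 30, 70)
)

grouped <- masses |> group_by(species)

# Share of each species' total: sum(mass) is computed per group
within_group <- grouped |>
  mutate(mass_pct = mass / sum(mass) * 100)

# Share of the overall total: drop the grouping first
overall <- grouped |>
  ungroup() |>
  mutate(mass_pct = mass / sum(mass) * 100)

within_group$mass_pct  # 60 40 30 70
overall$mass_pct       # 30 20 15 35
```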
Setting to NULL vs filtering rows
mutate(col = NULL) removes the column entirely. It does not set values to NA or change row count. Use filter() to remove rows.
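A short sketch of the distinction, using a hypothetical tibble `people`: assigning NULL changes the number of columns, while filter() changes the number of rows.

```r
library(dplyr)

# Hypothetical data, for illustration only
people <- tibble(
  name = c("Alice", "Bob", "Carol"),
  age = c(30, NA, 41)
)

# Removes the age column; all three rows remain
dropped_column <- people |> mutate(age = NULL)

# Removes the row with a missing age; both columns remain
dropped_rows <- people |> filter(!is.na(age))

dim(dropped_column)  # 3 1
dim(dropped_rows)    # 2 2
```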
.keep does not affect rows
.keep only controls column retention. It never drops rows. For row-level filtering, use filter() before or after mutate().