tidyr::pivot_longer()
Overview
pivot_longer() transforms wide data into long format. It takes two or more columns and gathers them into key-value pairs: column names land in a new column, and their values land in another. This is the inverse of pivot_wider().
The function is useful when your data has columns that represent values rather than variables — a common structure in survey data, repeated measurements, and any dataset where the column header encodes information.
Signature
pivot_longer(
data,
cols,
names_to = "name",
values_to = "value",
names_prefix = NULL,
names_sep = NULL,
names_pattern = NULL,
names_ptypes = NULL,
values_ptypes = NULL,
values_drop_na = FALSE,
...
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
data | tibble / data frame | — | Input data. |
cols | tidy-select | — | Columns to pivot into longer format. |
names_to | character | "name" | Name of the column that will hold the old column names. |
values_to | character | "value" | Name of the column that will hold the cell values. |
names_prefix | character | NULL | Strip this prefix from column names before pivoting. |
names_sep | character | NULL | Separator between multiple names_to columns. |
names_pattern | character | NULL | Regex pattern to extract column name parts for multiple names_to. |
names_ptypes | named list | NULL | Types for the new name columns. |
values_ptypes | named list | NULL | Types for the value columns. |
values_drop_na | logical | FALSE | If TRUE, drop rows where all value columns are NA. |
Basic Usage
Simple pivot
The most common case: you have columns that each represent a category, and you want to stack them.
library(tidyr)
df <- tibble(
id = c("A", "B"),
Jan = c(100, 200),
Feb = c(150, 250),
Mar = c(120, 220)
)
pivot_longer(df, c(Jan, Feb, Mar), names_to = "month", values_to = "sales")
# # A tibble: 6 × 3
# id month sales
# <chr> <chr> <dbl>
# 1 A Jan 100
# 2 A Feb 150
# 3 A Mar 120
# 4 B Jan 200
# 5 B Feb 250
# 6 B Mar 220
Pivot all columns except identifier
Use the ! operator to pivot all columns except the identifier:
df <- tibble(
id = c("A", "B"),
Jan = c(100, 200),
Feb = c(150, 250)
)
pivot_longer(df, !id, names_to = "month", values_to = "sales")
# # A tibble: 4 × 3
# id month sales
# <chr> <chr> <dbl>
# 1 A Jan 100
# 2 A Feb 150
# 3 B Jan 200
# 4 B Feb 250
Strip prefixes from column names
Use names_prefix to remove a common prefix from column names before pivoting:
df <- tibble(
id = c(1, 2),
product_2023 = c(300, 450),
product_2024 = c(320, 480)
)
pivot_longer(
df,
c(product_2023, product_2024),
names_to = "year",
values_to = "revenue",
names_prefix = "product_"
)
# # A tibble: 4 × 3
# id year revenue
# <dbl> <chr> <dbl>
# 1 1 2023 300
# 2 1 2024 320
# 3 2 2023 450
# 4 2 2024 480
Multiple Name Columns
When column names encode multiple variables, split them into separate columns with names_sep or names_pattern.
Using names_sep
When column names follow a consistent separator, names_sep splits on that character:
df <- tibble(
id = c(1, 2),
temp_2021 = c(22.1, 18.5),
temp_2022 = c(23.4, 19.2)
)
pivot_longer(
df,
c(temp_2021, temp_2022),
names_to = c("var", "year"),
values_to = "reading",
names_sep = "_"
)
# # A tibble: 4 × 4
# id var year reading
# <dbl> <chr> <chr> <dbl>
# 1 1 temp 2021 22.1
# 2 1 temp 2022 23.4
# 3 2 temp 2021 18.5
# 4 2 temp 2022 19.2
Using names_pattern
For irregular column name structures, use a regex with capture groups:
df <- tibble(
id = c(1, 2),
Alice_score = c(85, 92),
Bob_score = c(78, 88)
)
pivot_longer(
df,
c(Alice_score, Bob_score),
names_to = "student",
values_to = "score",
names_pattern = "(.+)_.+"
)
# # A tibble: 4 × 3
# id student score
# <dbl> <chr> <dbl>
# 1 1 Alice 85
# 2 1 Bob 78
# 3 2 Alice 92
# 4 2 Bob 88
For two capture groups:
df <- tibble(
site = c("A", "B"),
alpha_2021 = c(10, 20),
beta_2021 = c(15, 25)
)
pivot_longer(
df,
c(alpha_2021, beta_2021),
names_to = c("site_code", "year"),
values_to = "measurement",
names_pattern = "(.+)_(.+)"
)
# # A tibble: 4 × 4
# site site_code year measurement
# <chr> <chr> <chr> <dbl>
# 1 A alpha 2021 10
# 2 A beta 2021 15
# 3 B alpha 2021 20
# 4 B beta 2021 25
Multiple Value Columns
By default, all pivoted columns share the same values_to column. To split values across multiple columns, provide a named vector with .value as a placeholder:
df <- tibble(
id = c(1, 2),
group_low_n = c(10, 20),
group_high_n = c(5, 8),
group_low_mean = c(2.1, 3.4),
group_high_mean = c(5.6, 6.2)
)
pivot_longer(
df,
c(group_low_n, group_high_n, group_low_mean, group_high_mean),
names_to = c("group", ".value"),
values_to = c("n", "mean"),
names_pattern = "group_(.+)_(.+)"
)
# # A tibble: 4 × 4
# id group n mean
# <dbl> <chr> <dbl> <dbl>
# 1 1 low 10 2.1
# 2 1 high 5 5.6
# 3 2 low 20 3.4
# 4 2 high 8 6.2
The .value placeholder tells pivot_longer to create separate value columns from the matched groups.
Gotchas and Edge Cases
values_drop_na
Setting values_drop_na = TRUE removes rows where the pivoted values are all NA. This is useful after a join or for sparse data:
df <- tibble(
id = c(1, 2),
x = c(1, NA),
y = c(NA, 2)
)
pivot_longer(df, c(x, y), names_to = "var", values_to = "val", values_drop_na = TRUE)
# # A tibble: 2 × 3
# id var val
# <dbl> <chr> <dbl>
# 1 1 x 1
# 2 2 y 2
NA in original data
If a cell in the original data is NA, that NA propagates to the pivoted row:
df <- tibble(id = c("A", "B"), x = c(1, NA), y = c(NA, 4))
pivot_longer(df, c(x, y), names_to = "col", values_to = "val")
# # A tibble: 4 × 3
# id col val
# <chr> <chr> <dbl>
# 1 A x 1
# 2 A y NA
# 3 B x NA
# 4 B y 4
Specifying column types
Use names_ptypes and values_ptypes to enforce types on the new columns:
df <- tibble(
id = c(1, 2),
x_2021 = c("1", "2"),
x_2022 = c("3", "4")
)
pivot_longer(
df,
c(x_2021, x_2022),
names_to = "year",
values_to = "val",
names_ptypes = list(year = factor()),
values_ptypes = list(val = double())
)
See Also
- /reference/tidyverse/tidyr-pivot/ — the inverse operation (
pivot_wider) for widening data back out - /reference/tidyverse/dplyr-mutate/ — often used after
pivot_longerto transform the new columns - /cookbooks/how-to-convert-wide-to-long/ — step-by-step example of reshaping wide data to tidy format