How to Pivot Data from Wide to Long Format in R
When you pivot data in R from wide to long format, tidyr::pivot_longer() turns column names into a single key column and gathers the corresponding values into a value column. This transformation is a frequent step before using ggplot2 or grouped dplyr operations.
library(tidyr)
wide <- tibble(
id = c(1, 2, 3),
name = c("Alice", "Bob", "Carol"),
wave1 = c(10, 20, 15),
wave2 = c(12, 22, 18),
wave3 = c(14, 25, 20)
)
long <- wide |>
pivot_longer(
cols = c(wave1, wave2, wave3),
names_to = "wave",
values_to = "score"
)
Instead of listing every column, use tidyr selection helpers like starts_with("wave"), ends_with("score"), or matches("^q[0-9]") for regex patterns. To keep certain columns out of the pivot, prefix them with a minus sign: cols = -c(id, name). Use values_drop_na = TRUE to discard rows where the pivoted value is NA. The long format is the standard input for ggplot2, grouped dplyr operations, and most modeling functions.
For data with multiple value types per observation (e.g., revenue and units per month), use the .value token with names_sep to split column names into separate columns.
wide <- tibble(
id = 1:2,
rev_jan = c(100, 200), rev_feb = c(110, 220),
units_jan = c(10, 20), units_feb = c(12, 24)
)
long <- wide |>
pivot_longer(
cols = c(rev_jan:units_feb),
names_to = c(".value", "month"),
names_sep = "_"
)
The .value token tells pivot_longer() to use the first part of each split column name as a new column name, creating rev and units columns from rev_jan, rev_feb, and similar.
See also
- Reshaping Data with tidyr, Full tutorial on tidyr reshaping functions
- tidyr::pivot_longer() / tidyr::pivot_wider(), The opposite transformation