How to Convert Wide Data to Long Format in R
When you convert wide data to long format, you make it ready for ggplot2 and grouped analysis. The modern approach uses tidyr::pivot_longer(), which handles edge cases that the older gather() function could not. Long-format data is also the preferred shape for most tidyverse functions, since each variable occupies one column and each observation occupies one row.
library(tidyr)
wide_df <- data.frame(
id = 1:3,
q1 = c(100, 200, 300),
q2 = c(150, 250, 350),
q3 = c(180, 280, 380),
q4 = c(210, 310, 410)
)
# All quarter columns become rows; id stays as identifier
long_df <- wide_df %>%
pivot_longer(cols = q1:q4,
names_to = "quarter",
values_to = "sales")
# # A tibble: 12 × 3 — each id repeats for q1 through q4
Use starts_with("q"), ends_with(), or matches() for flexible column selection. For multiple value columns (e.g., revenue and profit per quarter), set names_to = c(".value", "quarter") with names_sep = "_" to create separate revenue and profit columns automatically.
# Multiple value columns: revenue_q1, profit_q1, revenue_q2, profit_q2
long_df <- wide_df %>%
pivot_longer(
cols = starts_with("revenue") | starts_with("profit"),
names_to = c(".value", "quarter"),
names_sep = "_"
)
# data.table melt for speed with large datasets
library(data.table)
dt <- as.data.table(wide_df)
long_dt <- melt(dt, id.vars = "id",
measure.vars = patterns("^q"),
variable.name = "quarter",
value.name = "sales")
The key decision is whether you have one value column or multiple. pivot_longer() handles both cleanly. data.table::melt() is the choice for datasets with millions of rows. Base R’s reshape() works without dependencies but has a complex interface that can be harder to debug in production code.
See also
- tidyr::pivot_longer() / tidyr::pivot_wider(), Full reference for tidyr pivoting
- Reshaping Data with tidyr, Complete guide to tidyr reshaping