rguides

How to Convert Wide Data to Long Format in R

When you convert wide data to long format, you make it ready for ggplot2 and grouped analysis. The modern approach uses tidyr::pivot_longer(), which handles edge cases that the older gather() function could not. Long-format data is also the preferred shape for most tidyverse functions, since each variable occupies one column and each observation occupies one row.

library(tidyr)

wide_df <- data.frame(
  id = 1:3,
  q1 = c(100, 200, 300),
  q2 = c(150, 250, 350),
  q3 = c(180, 280, 380),
  q4 = c(210, 310, 410)
)

# All quarter columns become rows; id stays as identifier
long_df <- wide_df %>%
  pivot_longer(cols = q1:q4,
               names_to = "quarter",
               values_to = "sales")
# # A tibble: 12 × 3 — each id repeats for q1 through q4

Use starts_with("q"), ends_with(), or matches() for flexible column selection. For multiple value columns (e.g., revenue and profit per quarter), set names_to = c(".value", "quarter") with names_sep = "_" to create separate revenue and profit columns automatically.

# Multiple value columns: revenue_q1, profit_q1, revenue_q2, profit_q2
long_df <- wide_df %>%
  pivot_longer(
    cols = starts_with("revenue") | starts_with("profit"),
    names_to = c(".value", "quarter"),
    names_sep = "_"
  )

# data.table melt for speed with large datasets
library(data.table)
dt <- as.data.table(wide_df)
long_dt <- melt(dt, id.vars = "id",
                measure.vars = patterns("^q"),
                variable.name = "quarter",
                value.name = "sales")

The key decision is whether you have one value column or multiple. pivot_longer() handles both cleanly. data.table::melt() is the choice for datasets with millions of rows. Base R’s reshape() works without dependencies but has a complex interface that can be harder to debug in production code.

See also