tidyr::unnest()
Overview
unnest() expands a list-column containing data frames into regular rows and columns. It is the inverse of nest(). Where nest() groups columns into data frames stored in a single cell, unnest() pulls those data frames back out so each row in the nested column becomes its own row in the output.
unnest() is most useful after you have applied a transformation to nested data — like fitting a model per group with mutate() and map() — and now want to work with the results in a flat format again.
Signature
unnest(data, cols, ..., keep_empty = FALSE, ptype = NULL,
names_sep = NULL, names_repair = "check_unique")
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | tibble / data frame | (required) | Input data. |
| cols | tidyselect | (required) | List-column(s) to unnest. Use bare column names or tidyselect helpers. |
| keep_empty | logical | FALSE | If FALSE, rows with empty list elements (NULL, empty tibble) are dropped. If TRUE, they produce a row of NA values. |
| ptype | list | NULL | Optional column prototype(s) to coerce the unnested columns to. |
| names_sep | character | NULL | If non-NULL, outer and inner names are pasted together with this separator. |
| names_repair | character or function | "check_unique" | How to repair output column names. |
Basic Usage
Unnesting a single list-column
library(tidyr)
df <- tibble(
x = 1:3,
y = list(
NULL,
tibble(a = 1, b = 2),
tibble(a = 1:3, b = 3:1, c = 4)
)
)
df %>% unnest(y)
# # A tibble: 4 x 4
# x a b c
# <int> <dbl> <dbl> <dbl>
# 1 2 1 2 NA
# 2 3 1 3 4
# 3 3 2 2 4
# 4 3 3 1 4
Row 1 (x = 1, y = NULL) is dropped because keep_empty = FALSE by default. Row 2 expands to one row. Row 3 expands to three rows.
Preserving empty elements
df %>%
unnest(y, keep_empty = TRUE)
# # A tibble: 5 x 4
# x a b c
# <int> <dbl> <dbl> <dbl>
# 1 1 NA NA NA
# 2 2 1 2 NA
# 3 3 1 3 4
# 4 3 2 2 4
# 5 3 3 1 4
Unnesting multiple columns
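When you unnest two columns at once, the elements in each row must have the same number of rows or exactly one row. One-row elements are recycled to match the longer one: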
df2 <- tibble(
id = c(1, 2),
x = list(tibble(a = 1:2), tibble(a = 3)),
y = list(tibble(p = 10, q = 20), tibble(p = 30))
)
df2 %>% unnest(c(x, y))
# # A tibble: 3 x 4
# id a p q
# <dbl> <dbl> <dbl> <dbl>
# 1 1 1 10 20
# 2 1 2 10 20
# 3 2 3 30 NA
Working With Nested Models
A common pattern: nest, fit models, extract results, then unnest to analyse coefficients:
library(dplyr)
library(purrr)
library(tidyr)
df <- tibble(
group = c("A", "A", "B", "B", "B"),
x = c(1, 2, 3, 4, 5),
y = c(10, 20, 30, 40, 50)
)
nested <- df %>%
nest(.by = group) %>%
mutate(model = map(data, ~ lm(x ~ y, data = .x)))
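nested <- nested %>%
  # coef() returns a named numeric vector; enframe() turns it into a
  # two-column tibble (name, estimate) so that unnest() yields one row per term
  mutate(coef = map(model, ~ tibble::enframe(coef(.x), name = "name", value = "estimate")))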
nested %>%
select(group, coef) %>%
unnest(coef)
# # A tibble: 4 x 3
# group name estimate
# <chr> <chr> <dbl>
# 1 A (Intercept) 0
# 2 A y 0.1
# 3 B (Intercept) 0
# 4 B y 0.1
Using names_sep
When inner column names might clash with existing columns, names_sep prefixes each inner name with the name of the list-column:
df3 <- tibble(
id = 1,
data = list(tibble(x = 1:2, y = 3:4))
)
df3 %>% unnest(data, names_sep = "_")
# Columns become data_x, data_y instead of x, y
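To make the conflict concrete, here is a minimal sketch with a hypothetical tibble `clash` whose outer column `x` collides with an inner column of the same name. It assumes the default names_repair = "check_unique", under which the plain call is expected to fail on the duplicate name.

```r
library(tidyr)

# Hypothetical data: outer column `x` clashes with inner column `x`
clash <- tibble(
  x = 1,
  data = list(tibble(x = 1:2, y = 3:4))
)

# Without names_sep, the duplicate name `x` is rejected under "check_unique"
plain <- tryCatch(unnest(clash, data), error = function(e) e)
inherits(plain, "error")

# With names_sep, the inner names are prefixed with the list-column name
fixed <- unnest(clash, data, names_sep = "_")
names(fixed)
# "x" "data_x" "data_y"
```

The outer `x` keeps its name, so downstream code can still tell the two apart.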
Common Use Cases
Expanding time series per entity
library(tidyr)
measurements <- tibble(
station = c("A", "A", "B"),
reading = list(
tibble(date = as.Date("2021-01-01"), temp = 5.1),
tibble(date = as.Date("2021-01-02"), temp = 5.8),
tibble(date = as.Date("2021-01-01"), temp = 4.2)
)
)
measurements %>% unnest(reading)
# # A tibble: 3 x 3
# station date temp
# <chr> <date> <dbl>
# 1 A 2021-01-01 5.1
# 2 A 2021-01-02 5.8
# 3 B 2021-01-01 4.2
Flattening survey responses
survey <- tibble(
respondent = 1:2,
responses = list(
tibble(q1 = "agree", q2 = "disagree"),
tibble(q1 = "neutral", q2 = "agree")
)
)
survey %>% unnest(responses)
# # A tibble: 2 x 3
# respondent q1 q2
# <int> <chr> <chr>
# 1 1 agree disagree
# 2 2 neutral agree
Gotchas
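Row ordering after unnesting. unnest() keeps the outer rows in their existing order and expands each one in the order of the rows stored in its list element. Because nest() gathers each group's rows together, a nest() and unnest() round trip may not reproduce the row order of the original, un-nested data. Use arrange() to restore a specific order: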
nested %>%
unnest(data) %>%
arrange(group, x)
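The effect is easiest to see when groups are interleaved in the input. The sketch below uses a hypothetical tibble `obs` and assumes that nest() orders groups by first appearance.

```r
library(tidyr)

# Hypothetical data with the two groups interleaved
obs <- tibble(group = c("A", "B", "A"), v = 1:3)

round_trip <- obs %>%
  nest(data = v) %>%  # one row per group, in order of first appearance
  unnest(data)        # each group's rows come back together

round_trip$group
# "A" "A" "B"
round_trip$v
# 1 3 2
```

Here `v` happens to record the original position, so sorting on it would restore the input order. Without such a column, the original order cannot be recovered after nesting.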
Missing elements drop rows by default. If a row has an empty list element (NULL, empty tibble), that row disappears from the output unless you pass keep_empty = TRUE.
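One way to check for this before unnesting is to count the rows in each list element. This is only a sketch: the tibble `empties` is hypothetical, and base R's NROW() is used because it returns 0 for both NULL and a zero-row tibble.

```r
library(tidyr)

# Hypothetical data: row 1 holds NULL, row 3 holds a zero-row tibble
empties <- tibble(
  x = 1:3,
  y = list(NULL, tibble(a = 1, b = 2), tibble(a = integer(), b = integer()))
)

# Number of rows each element would contribute to the output
sizes <- vapply(empties$y, NROW, integer(1))
dropped <- empties$x[sizes == 0]
dropped
# 1 3

nrow(unnest(empties, y))                     # 1
nrow(unnest(empties, y, keep_empty = TRUE))  # 3
```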
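Deprecated arguments. The old unnest(x, y, z) syntax is deprecated; use unnest(c(x, y, z)). The .id, .sep, and .drop arguments are deprecated too. Replace .sep with names_sep. Replace .id by storing names(x) in a column with mutate() before unnest(). Instead of .drop, remove unwanted list-columns with select() before unnesting.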
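As an illustration of replacing the deprecated .id argument, the sketch below copies the names of a list-column into a regular column before unnesting. The tibble `named` and the column name `id` are hypothetical, and the unname() call is only a precaution, not something unnest() is known to require.

```r
library(tidyr)
library(dplyr)

# Hypothetical list-column whose elements carry names
named <- tibble(
  g = 1:2,
  x = list(a = tibble(v = 1:2), b = tibble(v = 3L))
)

out <- named %>%
  # Store the element names in `id`, then drop them from the list (precaution)
  mutate(id = names(x), x = unname(x)) %>%
  unnest(x)

names(out)
# "g" "v" "id"
out$id
# "a" "a" "b"
```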
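Multiple list-columns with mismatched lengths. Within each row, only elements of size 1 are recycled to match the longer element. If two elements in the same row have different sizes and neither is size 1, unnest() raises an error rather than padding with NA.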
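The following sketch contrasts the two cases, assuming tidyr 1.0 or later, where recycling follows the tidyverse size rules. The tibbles `ok` and `bad` are hypothetical.

```r
library(tidyr)

# A one-row element is recycled against a three-row element in the same row
ok <- tibble(
  id = 1,
  x = list(tibble(a = 1:3)),
  y = list(tibble(b = "z"))
)
nrow(unnest(ok, c(x, y)))
# 3

# Sizes 3 and 2 cannot be recycled, so the call is expected to fail
bad <- tibble(
  id = 1,
  x = list(tibble(a = 1:3)),
  y = list(tibble(b = 1:2))
)
res <- tryCatch(unnest(bad, c(x, y)), error = function(e) e)
inherits(res, "error")
# TRUE
```

If the columns really do have unrelated lengths, unnesting them in separate unnest() calls is an alternative, at the cost of producing every combination of their rows.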
See Also
- /reference/tidyverse/tidyr_nest/ — the inverse, creates list-columns from rows
- /reference/tidyverse/tidyr_pivot_longer/ — reshaping wide to long format (different operation, similar spirit)
- /reference/tidyverse/dplyr-mutate/ — add and transform columns, often used before unnesting