separate()

Updated May 19, 2026 · Tidyverse

Overview

separate() splits a character column into multiple columns along a delimiter pattern. It is the inverse of unite(). The function takes one character column of a data frame, breaks each value apart at the separator, and distributes the pieces across new columns that you name in advance.

The default sep pattern matches any run of non-alphanumeric characters as the split point. This means common delimiters like spaces, hyphens, underscores, slashes, and dots all work without you needing to specify them explicitly.
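As a quick check of the default pattern, the sketch below splits values that each use a different delimiter. The data frame and its column names are made up for illustration.

```r
library(tidyr)

# Hypothetical data: every value uses a different delimiter.
df <- tibble(x = c("a b", "c-d", "e_f", "g/h", "i.j"))

# No sep argument: the default "[^[:alnum:]]+" handles every row.
out <- separate(df, x, into = c("left", "right"))

out$left   # "a" "c" "e" "g" "i"
out$right  # "b" "d" "f" "h" "j"
```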

Signature

separate(
  data,
  col,
  into,
  sep = "[^[:alnum:]]+",
  remove = TRUE,
  convert = FALSE,
  extra = "warn",
  fill = "warn",
  ...
)

Parameters

Parameter	Type	Default	Description
`data`	tibble / data frame	—	Input data frame or tibble.
`col`	column name or position	—	Column to separate, given as a bare (unquoted) name or a position.
`into`	character vector	—	Names for the new columns. Required — no default.
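`sep`	character or numeric	`"[^[:alnum:]]+"`	If character, a regular expression defining the split point; defaults to any run of non-alphanumeric characters. If numeric, the positions to split at, where negative values count from the right.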
`remove`	logical	`TRUE`	If `TRUE`, remove the original `col` from the output.
`convert`	logical	`FALSE`	If `TRUE`, apply `type.convert()` to the new columns so they take appropriate R types (integer, numeric, logical, etc.).
`extra`	character	`"warn"`	What to do when a row has more pieces than `length(into)`: `"warn"` (warn and drop the extras), `"drop"` (drop the extras without a warning), or `"merge"` (split at most `length(into)` times, so the remainder stays in the last column).
`fill`	character	`"warn"`	What to do when a row has fewer pieces than `length(into)`: `"warn"` (warn and fill with `NA` on the right), `"right"` (fill on the right), or `"left"` (fill on the left).
`...`	additional arguments	—	Passed to methods.
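None of the examples on this page exercises remove, so here is a minimal sketch of remove = FALSE, which keeps the source column next to the new ones. The data is illustrative.

```r
library(tidyr)

df <- tibble(full_name = c("John Doe", "Jane Smith"))

# remove = FALSE keeps full_name in the result alongside first and last.
out <- separate(df, full_name, into = c("first", "last"), remove = FALSE)

names(out)  # three columns: the original plus the two new ones
```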

Basic Usage

Simple split on default separator

library(tidyr)

df <- tibble(full_name = c("John Doe", "Jane Smith", "Bob Wilson"))

separate(df, full_name, into = c("first", "last"))
# # A tibble: 3 × 2
#   first last
#   <chr> <chr>
# 1 John  Doe
# 2 Jane  Smith
# 3 Bob   Wilson

Custom separator

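To split on one specific delimiter rather than on any non-alphanumeric run, pass sep explicitly. Because sep is a regular expression, escape characters that have a special meaning, such as "\\." for a literal dot.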

df <- tibble(date = c("2024-01-15", "2024-02-20", "2024-03-25"))

separate(df, date, into = c("year", "month", "day"), sep = "-")
# # A tibble: 3 × 3
#   year  month day
#   <chr> <chr> <chr>
# 1 2024  01    15
# 2 2024  02    20
# 3 2024  03    25
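An explicit sep matters most when the values contain other punctuation that should stay intact. In the sketch below (hypothetical data), the default pattern would also split on the decimal points, whereas sep = "-" splits only on the hyphen.

```r
library(tidyr)

# Hypothetical ranges whose endpoints contain decimal points.
df <- tibble(range = c("1.5-2.5", "10.0-12.75"))

# Split on the hyphen only, so the decimal points survive.
out <- separate(df, range, into = c("low", "high"), sep = "-")

out$low   # "1.5"  "10.0"
out$high  # "2.5"  "12.75"
```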

Convert types automatically

Set convert = TRUE to coerce the new columns to their natural types.

df <- tibble(date = c("2024-01-15", "2024-02-20", "2024-03-25"))

separate(df, date, into = c("year", "month", "day"), sep = "-", convert = TRUE)
# # A tibble: 3 × 3
#    year month   day
#   <int> <int> <int>
# 1  2024     1    15
# 2  2024     2    20
# 3  2024     3    25
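Conversion is decided per column, so a single call can produce several types. A small sketch with made-up data, checking the resulting types rather than printing the tibble:

```r
library(tidyr)

# Hypothetical records: a text key, a count, and a flag.
df <- tibble(record = c("a-1-TRUE", "b-2-FALSE"))

out <- separate(df, record, into = c("key", "n", "flag"),
                sep = "-", convert = TRUE)

is.character(out$key)  # TRUE: text stays character
is.integer(out$n)      # TRUE: whole numbers become integer
is.logical(out$flag)   # TRUE: "TRUE"/"FALSE" become logical
```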

Gotchas and Advanced

Handling extra pieces

With the default extra = "warn", overflow pieces are dropped and a warning is emitted.

df <- tibble(id = c("a-b-c", "x-y"))

separate(df, id, into = c("first", "second"))
# Warning message:
# Expected 2 pieces. Additional pieces discarded in 1 rows [1].
# # A tibble: 2 × 2
#   first second
#   <chr> <chr>
# 1 a     b
# 2 x     y

To discard extras without a warning, use extra = "drop". To keep them, use extra = "merge", which splits at most length(into) times so that the remainder stays in the last column.
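The two non-default settings can be compared side by side. This sketch uses its own small data frame; neither call emits a warning.

```r
library(tidyr)

df <- tibble(id = c("a-b-c", "x-y"))

# "drop": the third piece of "a-b-c" is discarded.
dropped <- separate(df, id, into = c("first", "second"), extra = "drop")
dropped$second  # "b" "y"

# "merge": split only once, so the remainder stays together.
merged <- separate(df, id, into = c("first", "second"), extra = "merge")
merged$second   # "b-c" "y"
```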

Handling too few pieces

Use fill = "right" or fill = "left" to control which side receives NA when a row has too few pieces. Setting fill explicitly also suppresses the warning that the default fill = "warn" emits.

df <- tibble(id = c("x-y", "z"))

separate(df, id, into = c("first", "second"), fill = "left")
# # A tibble: 2 × 2
#   first second
#   <chr> <chr>
# 1 x     y
# 2 NA    z
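For comparison, a short sketch of fill = "right" on illustrative data: the short row keeps its single piece in the first column and receives NA in the last one.

```r
library(tidyr)

# Hypothetical data: the second row has only one piece.
df <- tibble(id = c("x-y", "z"))

out <- separate(df, id, into = c("first", "second"), fill = "right")

out$first   # "x" "z"
out$second  # "y" NA
```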

Negative separator positions

Instead of a pattern, sep can be an integer position, and a negative value counts from the right. A value of -1 splits just before the last character; -3, as below, splits off the last three characters.

df <- tibble(code = c("abc123def", "xyz789uvw"))

separate(df, code, into = c("prefix", "suffix"), sep = -3)
# # A tibble: 2 × 2
#   prefix suffix
#   <chr>  <chr>
# 1 abc123 def
# 2 xyz789 uvw
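Positions can also be positive, counting from the left, and several can be passed at once. In this sketch the into vector needs one more name than there are positions; the column names are illustrative.

```r
library(tidyr)

df <- tibble(code = c("abc123def", "xyz789uvw"))

# Split after the 3rd and after the 6th character.
out <- separate(df, code, into = c("head", "middle", "tail"), sep = c(3, 6))

out$head    # "abc" "xyz"
out$middle  # "123" "789"
out$tail    # "def" "uvw"
```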

NA values propagate to all output columns

If the input cell is NA, every new column receives NA in that row.

df <- tibble(pair = c("apple-orange", NA, "red-blue"))

separate(df, pair, into = c("a", "b"))
# # A tibble: 3 × 2
#   a     b
#   <chr> <chr>
# 1 apple orange
# 2 NA    NA
# 3 red   blue

Omitting a column with NA in `into`

If one of the names in into is NA, that column is silently dropped from the output.

df <- tibble(code = c("2024-01-15", "2024-02-20"))

separate(df, code, into = c("year", NA, "day"), sep = "-")
# # A tibble: 2 × 2
#   year  day
#   <chr> <chr>
# 1 2024  15
# 2 2024  20