How to split a string column into multiple columns in R
· 3 min read · Updated March 14, 2026 · beginner
r stringr tidyr string-splitting data-transformation
Splitting strings into parts is a common data cleaning task. Here are the main approaches in R.
With base R: strsplit()
The base R function strsplit() splits each element of a character vector at a delimiter. The delimiter is treated as a regular expression unless you pass fixed = TRUE:
# Split by space
text <- "Hello world from R"
strsplit(text, " ")
# [[1]]
# [1] "Hello" "world" "from" "R"
# Split by comma
cities <- "New York,Los Angeles,Chicago"
strsplit(cities, ",")
# [[1]]
# [1] "New York" "Los Angeles" "Chicago"
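Because strsplit() treats the delimiter as a regular expression by default, characters such as "." need care. A minimal sketch, using a made-up version string, of two ways to split on a literal dot:

```r
# Hypothetical input for illustration
ver <- "4.3.2"

# fixed = TRUE matches the literal "." character
by_fixed <- strsplit(ver, ".", fixed = TRUE)
by_fixed
# [[1]]
# [1] "4" "3" "2"

# Escaping the dot inside a regular expression gives the same result
by_regex <- strsplit(ver, "\\.")
by_regex
# [[1]]
# [1] "4" "3" "2"
```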
strsplit() returns a list with one element per input string. Use unlist() to flatten it into a simple vector:
unlist(strsplit("a,b,c", ","))
# [1] "a" "b" "c"
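To turn a whole column into several columns with base R alone, one option is to stack the list that strsplit() returns into a matrix. This is only a sketch: the data frame and column names are made up, and it assumes every row splits into the same number of pieces.

```r
# Hypothetical example data: one "first last" string per row
df <- data.frame(full_name = c("John Doe", "Jane Smith"))

# strsplit() returns one list element per row
parts <- strsplit(df$full_name, " ", fixed = TRUE)

# Stack the pieces into a character matrix (one row per input string).
# This assumes every row has exactly two pieces.
parts_mat <- do.call(rbind, parts)

df$first_name <- parts_mat[, 1]
df$last_name <- parts_mat[, 2]
df$last_name
# [1] "Doe"   "Smith"
```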
With stringr: str_split()
The stringr package provides str_split(), a tidyverse-friendly version:
library(stringr)
# Basic split
str_split("one,two,three", ",")
# [[1]]
# [1] "one" "two" "three"
# Limit the number of pieces with the n argument
str_split("2026-03-14", "-", n = 3)
# [[1]]
# [1] "2026" "03" "14"
# Split and simplify to matrix
str_split_fixed("a,b,c", ",", n = 3)
# [,1] [,2] [,3]
# [1,] "a" "b" "c"
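str_split_fixed() also works on a whole vector, which makes it handy for building columns. A small sketch with made-up codes; note that rows with too few pieces are padded with empty strings rather than NA:

```r
library(stringr)

# Hypothetical input: the second string has no comma
codes <- c("a,b", "c")

# One matrix row per input string, always n columns
pieces <- str_split_fixed(codes, ",", n = 2)
pieces[, 1]
# [1] "a" "c"
pieces[, 2]
# [1] "b" ""

# Attach the matrix columns to a data frame
df <- data.frame(code = codes)
df$left <- pieces[, 1]
df$right <- pieces[, 2]
```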
With tidyr: separate()
For data frames, tidyr::separate() splits a column into several new columns in one step and removes the original column by default:
library(tidyr)
df <- tibble(full_name = c("John Doe", "Jane Smith", "Bob Wilson"))
# Split into two columns
df %>%
separate(full_name, into = c("first_name", "last_name"), sep = " ")
# # A tibble: 3 × 2
# first_name last_name
# <chr> <chr>
# 1 John Doe
# 2 Jane Smith
# 3 Bob Wilson
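In tidyr 1.3.0 and later, separate() is marked as superseded in favor of the separate_wider_*() family. If your installed version is recent enough, the same split can be sketched with separate_wider_delim(); the data below simply repeats the example above:

```r
library(tidyr)

df <- tibble(full_name = c("John Doe", "Jane Smith", "Bob Wilson"))

# Requires tidyr >= 1.3.0. Here delim is a fixed string, not a regular expression.
wide <- df %>%
  separate_wider_delim(full_name, delim = " ", names = c("first_name", "last_name"))

wide$first_name
# [1] "John" "Jane" "Bob"
```

Unlike separate(), this function raises an error by default when a row has too few or too many pieces, so mismatches are harder to miss.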
Multiple splits
Split into more than two columns:
df <- tibble(date = c("2026-01-15", "2026-03-14"))
df %>%
separate(date, into = c("year", "month", "day"), sep = "-")
# # A tibble: 2 × 3
# year month day
# <chr> <chr> <chr>
# 1 2026 01 15
# 2 2026 03 14
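The new columns are character by default. If you want numbers instead, separate() has a convert argument. A quick sketch on the same dates (the name parts is just for illustration):

```r
library(tidyr)

df <- tibble(date = c("2026-01-15", "2026-03-14"))

# convert = TRUE runs type conversion on each new column
parts <- df %>%
  separate(date, into = c("year", "month", "day"), sep = "-", convert = TRUE)

parts$month
# [1] 1 3
```

Keep in mind that leading zeros are lost once the pieces become integers.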
Handling extra pieces
By default, separate() drops extra pieces and issues a warning. Use extra = "merge" to keep them in the last column:
df <- tibble(name = c("John Michael Doe"))
df %>%
separate(name, into = c("first", "last"), sep = " ", extra = "merge")
# # A tibble: 1 × 2
# first last
# <chr> <chr>
# 1 John Michael Doe
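To see what the extra argument changes, it helps to run two of its values on the same input. A small sketch with a three-word name and only two target columns (the variable names are made up):

```r
library(tidyr)

# Hypothetical input: three space-separated pieces, two target columns
df <- tibble(name = c("John Michael Doe"))

# extra = "drop" silently discards pieces beyond the last column
dropped <- separate(df, name, into = c("first", "last"), sep = " ", extra = "drop")
dropped$last
# [1] "Michael"

# extra = "merge" splits at most length(into) - 1 times and keeps the remainder
merged <- separate(df, name, into = c("first", "last"), sep = " ", extra = "merge")
merged$last
# [1] "Michael Doe"
```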
Handling missing pieces
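When a row has fewer pieces than columns, separate() pads with NA on the right and warns. Use fill = "right" or fill = "left" to choose the side and suppress the warning: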
df <- tibble(name = c("John", "Jane Doe"))
df %>%
separate(name, into = c("first", "last"), sep = " ", fill = "right")
# # A tibble: 2 × 2
# first last
# <chr> <chr>
# 1 John <NA>
# 2 Jane Doe
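For comparison, here is a sketch of the same data with fill = "left", which may suit cases where a lone value should be treated as the last name:

```r
library(tidyr)

df <- tibble(name = c("John", "Jane Doe"))

# fill = "left" puts the NA in the leftmost column instead
left_filled <- separate(df, name, into = c("first", "last"), sep = " ", fill = "left")

left_filled$first
# [1] NA     "Jane"
left_filled$last
# [1] "John" "Doe"
```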
Using regex patterns
separate() interprets sep as a regular expression, so one pattern can match several delimiters:
df <- tibble(text = c("price: 100 USD", "price: 50 EUR"))
df %>%
separate(text, into = c("label", "value", "currency"), sep = ": | ")
# # A tibble: 2 × 3
# label value currency
# <chr> <chr> <chr>
# 1 price 100 USD
# 2 price 50 EUR
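A regex separator is also a convenient way to absorb stray whitespace around a delimiter. A sketch with made-up data where the delimiter is either a comma or a semicolon:

```r
library(tidyr)

# Hypothetical input with inconsistent delimiters and spacing
df <- tibble(pair = c("a , b", "c;d"))

# Match a comma or semicolon plus any surrounding whitespace
clean <- df %>%
  separate(pair, into = c("left", "right"), sep = "\\s*[,;]\\s*")

clean$left
# [1] "a" "c"
clean$right
# [1] "b" "d"
```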
Undoing splits with unite()
unite() does the reverse of separate(): it pastes several columns into one:
df <- tibble(
year = c("2026", "2026"),
month = c("03", "06"),
day = c("14", "15")
)
df %>%
unite("date", year, month, day, sep = "-")
# # A tibble: 2 × 1
# date
# <chr>
# 1 2026-03-14
# 2 2026-06-15
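unite() removes the input columns by default. If you want to keep them next to the combined column, one option is remove = FALSE, sketched here on the same data:

```r
library(tidyr)

df <- tibble(
  year = c("2026", "2026"),
  month = c("03", "06"),
  day = c("14", "15")
)

# remove = FALSE keeps year, month, and day alongside the new date column
out <- df %>%
  unite("date", year, month, day, sep = "-", remove = FALSE)

out$date
# [1] "2026-03-14" "2026-06-15"
```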
Common patterns
Split and expand into rows
Use tidyr::separate_rows() to split and create one row per element:
df <- tibble(id = 1:2, tags = c("r,stats,visualization", "python,ml"))
df %>%
separate_rows(tags, sep = ",")
# # A tibble: 5 × 2
# id tags
# <int> <chr>
# 1 1 r
# 2 1 stats
# 3 1 visualization
# 4 2 python
# 5 2 ml
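Real tag lists often have spaces around the commas. Since sep in separate_rows() is a regular expression, a pattern can strip them during the split. A sketch with made-up, untidy input:

```r
library(tidyr)

# Hypothetical input with inconsistent spacing around commas
df <- tibble(id = 1:2, tags = c("r, stats", "python ,ml"))

# Consume optional whitespace on both sides of each comma
long <- df %>%
  separate_rows(tags, sep = "\\s*,\\s*")

long$tags
# [1] "r"      "stats"  "python" "ml"
```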
Split file paths
path <- "~/projects/r-project/data/file.csv"
# Split by /
unlist(strsplit(path, "/"))
# [1] "~" "projects" "r-project" "data" "file.csv"
# Get just the filename (the last piece)
file_name <- tail(unlist(strsplit(path, "/")), 1)
file_name
# [1] "file.csv"
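For file paths specifically, base R ships helpers that avoid manual splitting. A short sketch using basename(), dirname(), and the tools package on a hypothetical relative path (a path starting with ~ would be tilde-expanded by these functions, so the example avoids it):

```r
# Hypothetical relative path for illustration
path <- "projects/r-project/data/file.csv"

file_name <- basename(path)
file_name
# [1] "file.csv"

dir_name <- dirname(path)
dir_name
# [1] "projects/r-project/data"

# File extension and name without extension (tools ships with R)
ext <- tools::file_ext(path)
ext
# [1] "csv"
stem <- tools::file_path_sans_ext(file_name)
stem
# [1] "file"
```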
See Also
- string-manipulation-stringr — Full guide to stringr
- stringr-strings — Tutorial on stringr
- stringr-strsplit — Reference for strsplit()