How to split a string column into multiple columns in R

3 min read · Updated March 14, 2026 · beginner
r stringr tidyr string-splitting data-transformation

Splitting strings into parts is a common data cleaning task. Here are the main approaches in R.

With base R: strsplit()

The base R function strsplit() splits each element of a character vector on a delimiter. The delimiter is treated as a regular expression unless you pass fixed = TRUE:

# Split by space
text <- "Hello world from R"
strsplit(text, " ")
# [[1]]
# [1] "Hello" "world" "from"  "R"

# Split by comma
cities <- "New York,Los Angeles,Chicago"
strsplit(cities, ",")
# [[1]]
# [1] "New York"     "Los Angeles"  "Chicago"

strsplit() returns a list with one element per input string. Use unlist() to flatten it into a simple vector:

unlist(strsplit("a,b,c", ","))
# [1] "a" "b" "c"
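Since the goal is usually new columns rather than a list, one base R sketch is to bind the pieces into a matrix and assign its columns back to the data frame. The data frame `people` and its column names are made up for illustration, and the approach assumes every row splits into the same number of pieces.

```r
# Hypothetical example data; every name has exactly two parts
people <- data.frame(full_name = c("John Doe", "Jane Smith"),
                     stringsAsFactors = FALSE)

# fixed = TRUE treats the delimiter literally instead of as a regex
pieces <- strsplit(people$full_name, " ", fixed = TRUE)

# Stack the per-row vectors into a character matrix (one row per input)
parts <- do.call(rbind, pieces)

people$first_name <- parts[, 1]
people$last_name  <- parts[, 2]
people
```

If rows can have different numbers of pieces, this shortcut is not safe, and the tidyr functions covered later handle that case more gracefully.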

With stringr: str_split()

The stringr package provides str_split(), a tidyverse-friendly version:

library(stringr)

# Basic split
str_split("one,two,three", ",")
# [[1]]
# [1] "one"   "two"   "three"

# Split into at most n pieces with the n argument
str_split("2026-03-14", "-", n = 3)[[1]]
# [1] "2026" "03"   "14"

# Split and simplify to matrix
str_split_fixed("a,b,c", ",", n = 3)
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c"
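A short sketch of how str_split_fixed() behaves when the inputs do not all contain n pieces, which matters when the matrix is going to become data frame columns. The vector `codes` is invented for this example.

```r
library(stringr)

codes <- c("a,b,c", "d,e")

# Always a 2 x 3 character matrix; short rows are padded with ""
parts <- str_split_fixed(codes, ",", n = 3)
parts[2, ]
# [1] "d" "e" ""

# With fewer columns than pieces, the last column keeps the remainder
str_split_fixed("a,b,c", ",", n = 2)[1, ]
# [1] "a"   "b,c"
```

Note that the padding is an empty string rather than NA, so missing pieces need an explicit check such as `parts == ""`.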

With tidyr: separate()

For data frames, tidyr::separate() is the most convenient:

library(tidyr)

df <- tibble(full_name = c("John Doe", "Jane Smith", "Bob Wilson"))

# Split into two columns
df %>%
  separate(full_name, into = c("first_name", "last_name"), sep = " ")
# # A tibble: 3 × 2
#   first_name last_name
#   <chr>      <chr>    
# 1 John       Doe      
# 2 Jane       Smith    
# 3 Bob        Wilson
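In tidyr 1.3.0 and later, separate() is marked as superseded in favour of separate_wider_delim() and its siblings. The sketch below repeats the same split with the newer function, assuming a recent tidyr is installed; the `result` variable is just a name chosen here.

```r
library(tidyr)

df <- tibble(full_name = c("John Doe", "Jane Smith", "Bob Wilson"))

# delim is matched as a literal string here, not as a regular expression
result <- df %>%
  separate_wider_delim(full_name, delim = " ",
                       names = c("first_name", "last_name"))
result
```

Unlike separate(), this function raises an error by default when a row has too few or too many pieces, unless the too_few or too_many arguments say otherwise.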

Multiple splits

Split into more than two columns:

df <- tibble(date = c("2026-01-15", "2026-03-14"))

df %>%
  separate(date, into = c("year", "month", "day"), sep = "-")
# # A tibble: 2 × 3
#   year  month day  
#   <chr> <chr> <chr>
# 1 2026  01    15   
# 2 2026  03    14
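The new columns above are all character. If numbers are wanted, separate() accepts convert = TRUE, sketched here on the same kind of date data (the names `dates` and `parsed` are illustrative only).

```r
library(tidyr)

dates <- tibble(date = c("2026-01-15", "2026-03-14"))

# convert = TRUE runs type conversion on each new column
parsed <- dates %>%
  separate(date, into = c("year", "month", "day"), sep = "-", convert = TRUE)

sapply(parsed, class)
```

Conversion drops leading zeros ("01" becomes 1), so keep the columns as character if the zero-padded form matters.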

Handling extra pieces

By default, separate() drops extra pieces with a warning. Use extra = "merge" to fold them into the last column instead:

df <- tibble(name = c("John Michael Doe"))

# Three pieces but only two columns: the remainder stays in "last"
df %>%
  separate(name, into = c("first", "last"), sep = " ", extra = "merge")
# # A tibble: 1 × 2
#   first last       
#   <chr> <chr>      
# 1 John  Michael Doe

Handling missing pieces

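When a row has too few pieces, separate() pads with NA and emits a warning. Use fill = "right" or fill = "left" to choose which side receives the NA and to silence the warning: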

df <- tibble(name = c("John", "Jane Doe"))

df %>%
  separate(name, into = c("first", "last"), sep = " ", fill = "right")
# # A tibble: 2 × 2
#   first last 
#   <chr> <chr>
# 1 John  <NA> 
# 2 Jane  Doe

Using regex patterns

separate() treats sep as a regular expression, so a single pattern can match several different delimiters:

df <- tibble(text = c("price: 100 USD", "price: 50 EUR"))

df %>%
  separate(text, into = c("label", "value", "currency"), sep = ": | ")
# # A tibble: 2 × 3
#   label  value currency
#   <chr>  <chr> <chr>   
# 1 price  100   USD     
# 2 price  50    EUR
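It can help to test a delimiter pattern on a plain vector before using it on a data frame. The sketch below uses base R's strsplit() for that, with a second, made-up input containing repeated spaces to show a more tolerant pattern.

```r
text <- c("price: 100 USD", "price:   50  EUR")

# ": | " splits on a colon-space pair or on a single space
strsplit(text[1], ": | ")[[1]]
# [1] "price" "100"   "USD"

# "[: ]+" treats any run of colons and spaces as one delimiter
strsplit(text[2], "[: ]+")[[1]]
# [1] "price" "50"    "EUR"
```

With the first pattern, the irregular spacing in the second string would produce empty pieces, which is why the character-class version is often the safer choice for messy input.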

Undoing splits with unite()

Combine columns back together:

df <- tibble(
  year = c("2026", "2026"),
  month = c("03", "06"),
  day = c("14", "15")
)

df %>%
  unite("date", year, month, day, sep = "-")
# # A tibble: 2 × 1
#   date  
#   <chr> 
# 1 2026-03-14
# 2 2026-06-15
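For comparison, the same join can be sketched in base R with paste(). The data frame name `parts` is chosen for this example only.

```r
parts <- data.frame(year = c("2026", "2026"),
                    month = c("03", "06"),
                    day = c("14", "15"),
                    stringsAsFactors = FALSE)

# paste() is vectorised, so this builds one string per row
parts$date <- paste(parts$year, parts$month, parts$day, sep = "-")
parts$date
# [1] "2026-03-14" "2026-06-15"
```

Here the source columns stay in the data frame, whereas unite() removes them unless remove = FALSE is set.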

Common patterns

Split and expand into rows

Use tidyr::separate_rows() to split and create one row per element:

df <- tibble(id = 1:2, tags = c("r,stats,visualization", "python,ml"))

df %>%
  separate_rows(tags, sep = ",")
# # A tibble: 5 × 2
#      id tags       
#   <int> <chr>      
# 1     1 r          
# 2     1 stats      
# 3     1 visualization
# 4     2 python     
# 5     2 ml

Split file paths

path <- "~/projects/r-project/data/file.csv"

# Split by /
unlist(strsplit(path, "/"))
# [1] "~"        "projects" "r-project" "data"      "file.csv"

# Get just the filename (stored as file_name to avoid masking base R's basename())
file_name <- tail(unlist(strsplit(path, "/")), 1)
file_name
# [1] "file.csv"
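Base R also ships dedicated helpers for paths, which avoid manual splitting altogether. A brief sketch, using a relative path invented for this example:

```r
path <- "projects/r-project/data/file.csv"

basename(path)          # last path component
# [1] "file.csv"

dirname(path)           # everything before the last separator
# [1] "projects/r-project/data"

tools::file_ext(path)   # extension without the dot
# [1] "csv"
```

These helpers are usually preferable to strsplit() for paths because they already handle details such as trailing separators.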

See Also