
How to Split String Columns into Multiple Columns in R

Split string columns into multiple columns in R with strsplit() for simple cases or tidyr::separate() for data frame pipelines. strsplit() returns a list with one character vector of pieces per input string, while separate() returns the data frame with the original column replaced by the new ones (pass remove = FALSE to keep it). Both approaches handle names, addresses, and other delimited text fields. When the separator appears a variable number of times, separate() by default drops the surplus pieces with a warning; use extra = "merge" to keep them in the last column.

# Base R: strsplit() returns a list
text <- "Hello world from R"
strsplit(text, " ")
# [[1]]
# [1] "Hello" "world" "from"  "R"

# tidyr: separate() works on data frame columns
library(tidyr)
df <- tibble(full_name = c("John Doe", "Jane Smith"))
df %>% separate(full_name, into = c("first", "last"), sep = " ")
#   first last
#   <chr> <chr>
# 1 John  Doe
# 2 Jane  Smith
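
The variable-separator case mentioned above can be sketched as follows. The people data frame and its values are made up for illustration; extra and fill are the separate() arguments named in this article.

```r
library(tidyr)

# Hypothetical data: names containing zero, one, or two spaces
people <- tibble(full_name = c("John Doe", "Mary Ann Smith", "Cher"))

# extra = "merge" splits at most length(into) - 1 times, so the surplus
# piece stays in the last column instead of being dropped.
# fill = "right" pads rows with too few pieces with NA on the right.
result <- separate(
  people, full_name,
  into = c("first", "last"),
  sep = " ",
  extra = "merge",
  fill = "right"
)
result
```

Here "Mary Ann Smith" keeps "Ann Smith" in the last column, and "Cher" gets NA there without a warning.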

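stringr::str_split() follows the same logic as strsplit() but fits stringr's consistent interface: the string comes first, the pattern second, and a missing input yields NA in the result. Use str_split_fixed() to return a character matrix with exactly n columns instead of a list; rows with fewer pieces are padded with empty strings.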

library(stringr)
str_split("one,two,three", ",")
# [[1]]
# [1] "one"   "two"   "three"

str_split_fixed("a,b,c", ",", n = 3)
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c"
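
Because str_split_fixed() returns a matrix, its result can be bound to a data frame as new columns. The following is a minimal sketch, assuming a hypothetical data frame df with a comma-delimited column code; the column names p1 to p3 are also illustrative.

```r
library(stringr)

# Hypothetical data frame with a comma-delimited column
df <- data.frame(code = c("a,b,c", "d,e", "f"), stringsAsFactors = FALSE)

# One row per input string, exactly n columns; short rows are padded with ""
parts <- str_split_fixed(df$code, ",", n = 3)
colnames(parts) <- c("p1", "p2", "p3")

# Attach the pieces to the original data frame as character columns
out <- cbind(df, as.data.frame(parts, stringsAsFactors = FALSE))
out
```

Unlike separate() with fill = "right", the padding here is an empty string rather than NA, so convert it afterwards if missing values are preferred.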

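To split a string and expand the pieces into rows instead of columns, use tidyr::separate_rows(). To combine columns back into one, use tidyr::unite(). strsplit() treats its split argument as a regular expression by default, so when the separator is a literal string, especially one containing regex metacharacters such as "." or "|", set fixed = TRUE. In separate(), the extra argument controls what happens when a value has more pieces than column names ("warn", "drop", or "merge"), and fill controls what happens when it has fewer ("warn", "right", or "left").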
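
A short sketch of the three remaining tools follows. The orders and name_parts data frames are hypothetical, and the variable names are illustrative only.

```r
library(tidyr)

# fixed = TRUE: "." is matched literally rather than as the regex "any character"
literal <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]
# Without fixed = TRUE every character counts as a separator,
# so only empty strings remain
as_regex <- strsplit("a.b.c", ".")[[1]]

# Hypothetical data: one row per order, items packed into a single string
orders <- tibble(id = c(1, 2), items = c("apple;pear", "fig"))

# separate_rows(): one output row per delimited piece; id is repeated
long <- separate_rows(orders, items, sep = ";")

# unite(): paste columns back into one; the input columns are dropped by default
name_parts <- tibble(first = c("John", "Jane"), last = c("Doe", "Smith"))
joined <- unite(name_parts, "full_name", first, last, sep = " ")
```

Comparing literal with as_regex shows why fixed = TRUE matters for separators such as "." that have a special meaning in regular expressions.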
