String Manipulation with stringr

3 min read · Updated March 7, 2026 · beginner

String manipulation is a fundamental skill for any R programmer. Whether you’re cleaning messy data, parsing text files, or building features from textual data, the stringr package provides a consistent, readable interface for common string operations. Part of the tidyverse, stringr is built on top of the stringi package and offers a unified alternative to base R’s less consistent string functions.

Installing and Loading stringr

If you’ve installed the Tidyverse, stringr is already available. Otherwise, install it separately:

install.packages("stringr")
library(stringr)

Nearly all stringr functions start with str_, making them easy to discover with autocompletion. The main exceptions are pattern modifiers such as regex() and fixed().

Basic String Operations

Measuring String Length

The str_length() function returns the number of characters in each string:

words <- c("hello", "world", "R programming")

str_length(words)
# [1]  5  5 13

This works like nchar(), with two differences at the edges: str_length() always returns NA for missing values, and it accepts factors, for which nchar() raises an error.
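To make the missing-value behavior concrete, here is a minimal sketch. The vector `sizes` is made up for illustration and is not part of the examples above.

```r
library(stringr)

# Hypothetical input with one missing value
sizes <- c("small", NA, "extra large")

# Missing values stay missing instead of being counted as text
str_length(sizes)
# [1]  5 NA 11

# Factors are accepted as well; nchar() would raise an error here
str_length(factor(c("low", "high")))
# [1] 3 4
```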

Extracting Substrings

Use str_sub() to extract or replace portions of a string:

text <- "R programming is fun"

# Extract characters from position 1 to 3
str_sub(text, 1, 3)
# [1] "R p"

# Replace the first character in place
str_sub(text, 1, 1) <- "Py"
text
# [1] "Py programming is fun"
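str_sub() also accepts negative positions, which count back from the end of the string. The sketch below reuses the original sentence under a new, illustrative name (`phrase`) so it runs on its own.

```r
library(stringr)

phrase <- "R programming is fun"

# Negative positions count from the end of the string
str_sub(phrase, -3, -1)
# [1] "fun"

# Omitting `end` extracts through to the end of the string
str_sub(phrase, 3)
# [1] "programming is fun"
```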

Combining Strings

The str_c() function concatenates strings, handling vectors elegantly:

first_name <- "John"
last_name <- "Doe"

str_c(first_name, " ", last_name)
# [1] "John Doe"

# For vectors, use sep and collapse
str_c(c("a", "b", "c"), 1:3, sep = "-")
# [1] "a-1" "b-2" "c-3"

str_c(c("a", "b", "c"), collapse = ", ")
# [1] "a, b, c"
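One behavior worth knowing when combining strings: a missing value in any input makes the corresponding result missing. The sketch below uses a hypothetical `ids` vector and shows one way to substitute a placeholder with str_replace_na() first.

```r
library(stringr)

# Hypothetical IDs, one of them missing
ids <- c("1", NA, "3")

# NA is contagious: the second result is NA, not "ID-NA"
str_c("ID-", ids)
# [1] "ID-1" NA     "ID-3"

# Swap NA for a visible placeholder before concatenating
str_c("ID-", str_replace_na(ids, "unknown"))
# [1] "ID-1"       "ID-unknown" "ID-3"
```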

Pattern Matching

Detecting Patterns

str_detect() returns TRUE where a pattern exists:

fruits <- c("apple", "banana", "cherry", "date")

# Find strings containing 'a'
str_detect(fruits, "a")
# [1]  TRUE  TRUE FALSE  TRUE
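Because str_detect() returns a logical vector, it can be used directly for filtering and counting. The following sketch repeats the `fruits` vector so it is self-contained; `has_a` is just an illustrative name.

```r
library(stringr)

fruits <- c("apple", "banana", "cherry", "date")
has_a <- str_detect(fruits, "a")

# Keep only the matching elements (str_subset() is a shortcut for this)
fruits[has_a]
# [1] "apple"  "banana" "date"
str_subset(fruits, "a")
# [1] "apple"  "banana" "date"

# Count the matches
sum(has_a)
# [1] 3

# Anchor the pattern to match only at the start of the string
str_detect(fruits, "^a")
# [1]  TRUE FALSE FALSE FALSE
```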

Extracting Matches

Extract the actual matched text with str_extract():

emails <- c("user@domain.com", "test@example.org", "invalid")

str_extract(emails, "@\\w+\\.\\w+")
# [1] "@domain.com" "@example.org" NA
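str_extract() returns only the first match in each string. When a string may contain several matches, str_extract_all() returns all of them as a list. A small sketch, with a made-up `note` string and a deliberately simple address pattern:

```r
library(stringr)

# Hypothetical text containing two addresses
note <- "Contact a@x.org or b@y.com"
address <- "\\w+@\\w+\\.\\w+"  # simplified pattern, for illustration only

# First match only
str_extract(note, address)
# [1] "a@x.org"

# Every match, as a list with one element per input string
str_extract_all(note, address)
# [[1]]
# [1] "a@x.org" "b@y.com"
```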

Replacing Patterns

Replace the first match of a pattern with str_replace(), or every match with str_replace_all():

text <- "The cat sat on the mat"

str_replace(text, "cat", "dog")
# [1] "The dog sat on the mat"

# Replace all matches
str_replace_all(text, "at", "ot")
# [1] "The cot sot on the mot"
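Replacements can also refer back to captured groups, and str_replace_all() accepts a named vector to apply several substitutions in one call. The sketch below uses illustrative inputs (`people`, `sentence_text`) that are not part of the original examples.

```r
library(stringr)

# Reorder "First Last" as "Last, First" using backreferences \\1 and \\2
people <- c("John Doe", "Jane Smith")
str_replace(people, "(\\w+) (\\w+)", "\\2, \\1")
# [1] "Doe, John"   "Smith, Jane"

# Several pattern = replacement pairs in a single call
sentence_text <- "The cat sat on the mat"
str_replace_all(sentence_text, c("cat" = "dog", "mat" = "rug"))
# [1] "The dog sat on the rug"
```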

Splitting Strings

Split each string into pieces with str_split(), which returns a list containing one character vector per input string:

sentence <- "one,two,three,four"

str_split(sentence, ",")
# [[1]]
# [1] "one"   "two"   "three" "four"
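Since the result is a list, a single input usually needs `[[1]]` to get at the pieces, and several inputs can be returned as a character matrix with `simplify = TRUE`. A brief sketch with a made-up `rows` vector:

```r
library(stringr)

# One input: pull the character vector out of the list
str_split("one,two,three,four", ",")[[1]]
# [1] "one"   "two"   "three" "four"

# Several inputs of different lengths (hypothetical data)
rows <- c("one,two", "three,four,five")
parts <- str_split(rows, ",", simplify = TRUE)

# A 2 x 3 character matrix; shorter rows are padded with ""
dim(parts)
# [1] 2 3
parts[1, 3]
# [1] ""
parts[2, 3]
# [1] "five"
```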

Working with Whitespace

Trimming Whitespace

Remove leading and trailing whitespace with str_trim():

messy <- "   clean this   "

str_trim(messy)
# [1] "clean this"
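By default str_trim() strips both ends. Its `side` argument restricts trimming to one end, as in this small sketch (the variable name `padded` is illustrative):

```r
library(stringr)

padded <- "   clean this   "

# Trim only the leading whitespace
str_trim(padded, side = "left")
# [1] "clean this   "

# Trim only the trailing whitespace
str_trim(padded, side = "right")
# [1] "   clean this"
```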

Squishing Whitespace

Collapse runs of whitespace into single spaces, and trim both ends, with str_squish():

ugly <- "too    many     spaces"

str_squish(ugly)
# [1] "too many spaces"
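str_squish() treats tabs and newlines as whitespace too, and it also removes leading and trailing whitespace. A minimal sketch with a made-up input:

```r
library(stringr)

# Hypothetical input mixing spaces, a tab, and a newline
ragged <- "  too \t many\n spaces  "

str_squish(ragged)
# [1] "too many spaces"
```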

Padding Strings

Add padding to strings with str_pad() for consistent widths:

numbers <- c("1", "25", "300")

str_pad(numbers, width = 5, side = "left", pad = "0")
# [1] "00001" "00025" "00300"
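Padding can go on the right as well, and strings that are already at least `width` characters long are returned unchanged rather than truncated. A short sketch with illustrative values:

```r
library(stringr)

# Pad on the right, for example to align labels
str_pad(c("id", "name"), width = 6, side = "right", pad = ".")
# [1] "id...." "name.."

# Strings longer than `width` are left as they are
str_pad("1234567", width = 5, pad = "0")
# [1] "1234567"
```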

Practical Examples

Validating Email Addresses

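Wrap str_detect() and a regular expression in a small helper for basic data validation. The pattern below is a pragmatic check, not a full implementation of the email address standard: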

is_valid_email <- function(email) {
  pattern <- "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$"
  str_detect(email, pattern)
}

emails <- c("user@domain.com", "invalid@", "test@site.org")
is_valid_email(emails)
# [1] TRUE FALSE TRUE
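One caveat: str_detect() returns NA for missing input, which can cause surprises when the result is used in a filter. The variant below is a sketch of one way to treat missing addresses as invalid; `is_valid_email_strict` is a hypothetical name, and the pattern is the same one used above.

```r
library(stringr)

# Hypothetical variant: missing values count as invalid
is_valid_email_strict <- function(email) {
  pattern <- "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$"
  matched <- str_detect(email, pattern)
  # NA from str_detect() becomes FALSE here
  !is.na(matched) & matched
}

is_valid_email_strict(c("user@domain.com", NA, "invalid@"))
# [1]  TRUE FALSE FALSE
```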

Extracting Numbers from Text

Pull numeric values from mixed text:

prices <- c("$19.99", "$25.50", "$9.99")

# Remove $ sign, convert to numeric
as.numeric(str_replace_all(prices, "\\$", ""))
# [1] 19.99 25.50  9.99
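When the number is surrounded by other text, removing a single symbol is not enough. One possible approach, sketched here with made-up `labels`, is to extract the first run of digits with an optional decimal part. It assumes plain numbers without thousands separators or signs.

```r
library(stringr)

# Hypothetical labels with text around the amount
labels <- c("Total: $19.99", "about 25.50 USD", "n/a")

# Digits, optionally followed by a decimal point and more digits
amounts <- as.numeric(str_extract(labels, "\\d+(\\.\\d+)?"))

# Strings with no number yield NA
amounts
```

Here the first two elements come out as 19.99 and 25.5, and the third is NA because "n/a" contains no digits.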

Cleaning Names

Standardize names with consistent formatting:

names <- c("  john doe ", "JANE SMITH", "Alice Bob")

names |>
  str_squish() |>
  str_to_title()
# [1] "John Doe"    "Jane Smith"  "Alice Bob"

Summary

The stringr package provides consistent, readable functions for string manipulation:

| Function | Purpose |
|---|---|
| str_length() | Count characters |
| str_sub() | Extract/replace substrings |
| str_c() | Concatenate strings |
| str_detect() | Find pattern matches |
| str_extract() | Pull matched text |
| str_replace() | Substitute patterns |
| str_trim() | Remove outer whitespace |
| str_squish() | Collapse internal whitespace |
| str_pad() | Add padding characters |

These functions form the foundation for text processing in R. Combined with regular expressions, stringr covers most of the string manipulation tasks you’ll encounter in everyday data cleaning.