String Manipulation with stringr

3 min read · Updated March 10, 2026 · beginner

stringr is part of the tidyverse and provides a consistent interface for string operations. If you've ever struggled with R's base string functions (paste(), substr(), grep(), gsub()), stringr offers replacements that are easier to remember and use: every function starts with str_ and takes the string as its first argument.

This guide covers the most useful stringr functions for everyday data work.

Installing stringr

install.packages("stringr")
library(stringr)

Or load the core tidyverse packages, which include stringr:

library(tidyverse)

Creating Strings

str_c(): Combining Strings

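str_c() joins strings element by element. Missing values are contagious: if any piece is NA, the result for that element is NA. Replace missing values first (for example with str_replace_na()) if you want to keep the rest of the string.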

str_c("Hello", " ", "World")
# [1] "Hello World"

str_c("x", 1:3, sep = "_")
# [1] "x_1" "x_2" "x_3"
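To see how str_c() treats missing values in practice, here is a small sketch. The vector parts is made up for illustration; str_replace_na() is the stringr helper that turns NA into the literal string "NA".

```r
library(stringr)

# Hypothetical input with one missing value
parts <- c("a", NA, "c")

# An NA in any argument makes the matching result NA
joined <- str_c("id_", parts)
joined
# [1] "id_a" NA     "id_c"

# Converting NA to the string "NA" first keeps every element
joined_safe <- str_c("id_", str_replace_na(parts))
joined_safe[2]
# [1] "id_NA"
```

Whether you want NA or a placeholder depends on the data; keeping the NA is usually the safer default because it stays visible downstream.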

str_flatten() collapses a vector into a single string:

str_flatten(c("a", "b", "c"), collapse = ", ")
# [1] "a, b, c"

str_dup(): Repeating Strings

str_dup("ha", 3)
# [1] "hahaha"

String Length

str_length(): Counting Characters

str_length(c("apple", "banana", "cherry"))
# [1] 5 6 6

str_length() counts characters (Unicode code points), not bytes, so the two numbers differ for strings that contain non-ASCII characters.
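A short sketch of the difference between characters and bytes, assuming the string is stored as UTF-8 (enc2utf8() is used here to make that explicit). The word is just an example.

```r
library(stringr)

# "\u00e9" is e with an acute accent; it takes two bytes in UTF-8
word <- enc2utf8("caf\u00e9")

n_chars <- str_length(word)             # characters
n_bytes <- nchar(word, type = "bytes")  # bytes in the UTF-8 encoding

n_chars
# [1] 4
n_bytes
# [1] 5
```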

Subsetting Strings

str_sub(): Extracting Parts

str_sub() extracts a substring by position.

x <- "abcdef"
str_sub(x, 1, 3)
# [1] "abc"

str_sub(x, -3, -1)
# [1] "def"

You can also use it to replace parts of a string:

x <- "apple"
str_sub(x, 1, 1) <- "A"
x
# [1] "Apple"
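str_sub() is vectorised, which makes it handy for fixed-width codes. The sketch below uses made-up identifiers (codes) in a hypothetical COUNTRY-YEAR-SEQUENCE layout.

```r
library(stringr)

# Hypothetical fixed-width identifiers
codes <- c("US-2024-001", "DE-2023-117")

country <- str_sub(codes, 1, 2)              # first two characters
year    <- as.integer(str_sub(codes, 4, 7))  # characters 4 to 7
seq_no  <- str_sub(codes, -3)                # last three characters

country
# [1] "US" "DE"
year
# [1] 2024 2023
seq_no
# [1] "001" "117"
```

Negative positions count from the end of the string, so the last line works even if the codes had different lengths.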

str_extract(): Pattern Extraction

str_extract() pulls out the first match of a pattern, or NA if there is no match:

str_extract("The price is $50.00", "\\d+\\.\\d+")
# [1] "50.00"

str_extract_all() returns all matches:

str_extract_all("abc123def456", "\\d+")
# [[1]]
# [1] "123" "456"
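Because str_extract_all() returns a list with one element per input string, you usually need one more step to work with the results. A minimal sketch with made-up input (notes):

```r
library(stringr)

# Hypothetical input: two strings with digits, one without
notes <- c("abc123def456", "no digits here", "7 up")

matches <- str_extract_all(notes, "\\d+")

# Number of matches found in each string
n_matches <- lengths(matches)
n_matches
# [1] 2 0 1

# Flatten the list and convert to numbers
all_numbers <- as.numeric(unlist(matches))
all_numbers
# [1] 123 456   7
```

Strings with no match contribute an empty character vector, so they simply vanish when the list is flattened.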

Pattern Detection

str_detect(): Finding Patterns

str_detect() returns TRUE if a pattern exists in a string:

fruits <- c("apple", "banana", "cherry", "apricot")
str_detect(fruits, "^a")
# [1]  TRUE FALSE FALSE  TRUE

This is useful with sum() to count matches or with filter() in dplyr:

# Count strings starting with 'a'
sum(str_detect(fruits, "^a"))
# [1] 2
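The logical vector from str_detect() can also be used for subsetting. This sketch sticks to base R indexing and str_subset() so it runs with only stringr loaded; the dplyr call in the last comment uses hypothetical names (df, name).

```r
library(stringr)

fruits <- c("apple", "banana", "cherry", "apricot")
starts_with_a <- str_detect(fruits, "^a")

# Keep only the matching strings
kept <- fruits[starts_with_a]
kept
# [1] "apple"   "apricot"

# str_subset() does the detect-and-subset step in one call
kept_too <- str_subset(fruits, "^a")

# mean() of a logical vector gives the share of matches
share <- mean(starts_with_a)
share
# [1] 0.5

# With dplyr loaded, the same test goes inside filter(), for example:
# filter(df, str_detect(name, "^a"))   # df and name are hypothetical
```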

str_starts() and str_ends()

Check if strings start or end with a pattern:

str_starts(fruits, "a")
# [1]  TRUE FALSE FALSE  TRUE

str_ends(fruits, "e")
# [1]  TRUE FALSE FALSE FALSE
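One thing to keep in mind is that the pattern in str_starts() and str_ends() is a regular expression by default, so characters like "." have special meaning. Here is a sketch with made-up file names, using fixed() to match literal text instead:

```r
library(stringr)

# Hypothetical file names
files <- c("report.csv", "datacsv", "notes.txt")

# As a regular expression, "." matches any character,
# so "datacsv" passes as well
loose <- str_ends(files, ".csv")
loose
# [1]  TRUE  TRUE FALSE

# fixed() compares against the literal text ".csv"
strict <- str_ends(files, fixed(".csv"))
strict
# [1]  TRUE FALSE FALSE
```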

String Replacement

str_replace(): Substituting Patterns

str_replace() replaces the first match:

str_replace("apple pie", "pie", "tart")
# [1] "apple tart"

str_replace_all() replaces all matches:

str_replace_all("aaa", "a", "b")
# [1] "bbb"

Use str_remove() and str_remove_all() as shorthand for replacing matches with an empty string:

str_remove_all("a-b-c-d", "-")
# [1] "abcd"
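Two details of the replacement functions are easy to miss: the first-match versus all-matches distinction also applies to str_remove(), and str_replace_all() accepts a named vector to perform several replacements in one call. A small sketch with made-up strings:

```r
library(stringr)

# str_remove() drops only the first match
first_only <- str_remove("a-b-c-d", "-")
first_only
# [1] "ab-c-d"

# A named vector maps each pattern (name) to its replacement (value)
spelled <- str_replace_all("1 apple and 2 pears", c("1" = "one", "2" = "two"))
spelled
# [1] "one apple and two pears"
```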

Splitting Strings

str_split(): Breaking Apart

str_split() splits a string into pieces:

str_split("a,b,c", ",")
# [[1]]
# [1] "a" "b" "c"

Add simplify = TRUE to get a matrix:

str_split("a,b,c", ",", simplify = TRUE)
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c"
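The matrix form is convenient when every string has the same number of pieces, and the n argument caps how many pieces you get back. A sketch with made-up date strings (dates):

```r
library(stringr)

# Hypothetical ISO-style dates
dates <- c("2024-01-15", "2023-12-31")

# One row per input string, one column per piece
parts <- str_split(dates, "-", simplify = TRUE)
years <- as.integer(parts[, 1])
years
# [1] 2024 2023

# n = 2 splits on the first separator only
head_and_rest <- str_split("a,b,c", ",", n = 2)[[1]]
head_and_rest
# [1] "a"   "b,c"
```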

str_glue(): String Interpolation

Going the other way, str_glue() and str_glue_data() build strings by interpolating R expressions written inside braces:

name <- "Alice"
age <- 30
str_glue("My name is {name} and I am {age} years old.")
# My name is Alice and I am 30 years old.
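str_glue_data() works the same way but looks up the names inside a data frame or list, producing one string per row. A minimal sketch with a made-up data frame (people):

```r
library(stringr)

# Hypothetical data
people <- data.frame(name = c("Alice", "Bob"), age = c(30, 25))

# {name} and {age} are evaluated as columns of people
sentences <- str_glue_data(people, "{name} is {age} years old.")
sentences
# Alice is 30 years old.
# Bob is 25 years old.
```

The result has class glue; wrap it in as.character() if you need a plain character vector.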

Whitespace Handling

str_trim(): Removing Extra Spaces

str_trim() removes whitespace from the start and end of a string:

str_trim("  hello  ")
# [1] "hello"

str_squish() also collapses repeated whitespace inside the string into a single space:

str_squish("  hello   world  ")
# [1] "hello world"
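If you only want to trim one end, str_trim() takes a side argument. A quick sketch:

```r
library(stringr)

padded <- "  hello  "

# Remove leading whitespace only
left_trimmed <- str_trim(padded, side = "left")
left_trimmed
# [1] "hello  "

# Remove trailing whitespace only
right_trimmed <- str_trim(padded, side = "right")
right_trimmed
# [1] "  hello"
```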

str_pad(): Adding Padding

str_pad("apple", width = 10, side = "left", pad = " ")
# [1] "     apple"

str_pad("5", width = 2, pad = "0")
# [1] "05"
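str_pad() is vectorised and never shortens a string, which is worth knowing when you pad identifiers of mixed length. A sketch with made-up IDs (ids):

```r
library(stringr)

# Hypothetical IDs of different lengths
ids <- c("7", "42", "12345")

# Strings already longer than width are returned unchanged
padded_ids <- str_pad(ids, width = 4, pad = "0")
padded_ids
# [1] "0007"  "0042"  "12345"

# side = "both" pads on the left and the right
centered <- str_pad("hi", width = 6, side = "both", pad = "*")
centered
# [1] "**hi**"
```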

Case Manipulation

str_to_upper(), str_to_lower(), str_to_title()

str_to_upper("Hello World")
# [1] "HELLO WORLD"

str_to_lower("Hello World")
# [1] "hello world"

str_to_title("hello world")
# [1] "Hello World"
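The case functions also accept a locale argument, because case rules are not the same in every language. The usual illustration is Turkish, which has a dotted and a dotless i. A brief sketch:

```r
library(stringr)

# Default locale ("en"): i becomes I
upper_en <- str_to_upper("i")
upper_en
# [1] "I"

# Turkish locale: i becomes a capital I with a dot above (U+0130)
upper_tr <- str_to_upper("i", locale = "tr")

# And capital I becomes a dotless lowercase i (U+0131)
lower_tr <- str_to_lower("I", locale = "tr")
```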

Sorting Strings

str_order() and str_sort()

x <- c("banana", "Apple", "cherry")
str_sort(x)
# [1] "Apple"  "banana" "cherry"

str_sort(x, locale = "en")
# [1] "Apple"  "banana" "cherry"

The locale argument sets the collation rules used for sorting. It defaults to "en" and matters most for accented and other non-English characters.
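The sketch below shows what str_order() returns, how numeric = TRUE changes the sorting of embedded numbers, and one case where the locale changes the result. The Swedish example assumes standard ICU collation data, where a with diaeresis sorts after z; the vectors are made up for illustration.

```r
library(stringr)

x <- c("banana", "Apple", "cherry")

# str_order() returns the positions that would sort x
idx <- str_order(x)
idx
# [1] 2 1 3

# By default digits are compared character by character
versions <- c("x10", "x2", "x1")
plain <- str_sort(versions)
plain
# [1] "x1"  "x10" "x2"

# numeric = TRUE compares runs of digits as numbers
natural <- str_sort(versions, numeric = TRUE)
natural
# [1] "x1"  "x2"  "x10"

# "\u00e4" is a with diaeresis
chars <- c("z", "\u00e4", "a")
sorted_en <- str_sort(chars, locale = "en")  # a, a-diaeresis, z
sorted_sv <- str_sort(chars, locale = "sv")  # a, z, a-diaeresis
```

x[idx] gives the same result as str_sort(x); str_order() is the one to reach for when you need to reorder something else, such as the rows of a data frame, by a string column.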

Common Patterns

Email Extraction

emails <- c("john@email.com", "jane.doe@company.org", "invalid")
str_extract(emails, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}")
# [1] "john@email.com"       "jane.doe@company.org" NA

Phone Number Formatting

phone <- "5551234567"
str_replace(phone, "(\\d{3})(\\d{3})(\\d{4})", "(\\1) \\2-\\3")
# [1] "(555) 123-4567"

Extracting Numbers from Text

text <- "The temperature is 25 degrees"
str_extract(text, "-?\\d+")
# [1] "25"
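In practice you often want the number itself rather than the matched text, and the values may include decimals. A sketch that extends the pattern with an optional decimal part and converts the result; the readings vector is made up.

```r
library(stringr)

# Hypothetical free-text readings
readings <- c(
  "The temperature is 25 degrees",
  "It dropped to -3.5 overnight",
  "No reading today"
)

# Optional minus sign, digits, then an optional decimal part
values <- as.numeric(str_extract(readings, "-?\\d+(\\.\\d+)?"))
values
# [1] 25.0 -3.5   NA
```

Strings without a number give NA, which as.numeric() passes through unchanged.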

When to Use stringr

stringr covers most everyday string manipulation tasks. Function names follow one pattern: the str_ prefix, then a verb (detect, extract, replace, split, and so on). The string is always the first argument, so the functions work well with pipes.

For very large text data or less common operations, consider stringi, the package stringr is built on. Both use the same ICU regular expression engine, so your patterns carry over unchanged.

Summary

Function          Purpose
str_c()           Combine strings
str_length()      Count characters
str_sub()         Extract by position
str_extract()     Extract by pattern
str_detect()      Check if pattern exists
str_replace()     Substitute patterns
str_split()       Split into pieces
str_trim()        Remove whitespace
str_to_upper()    Change case

Master these functions, and you’ll handle the vast majority of string manipulation tasks in R.