stringr::str_extract()

str_extract(string, pattern, group = NULL)
Returns: character · Updated March 16, 2026 · Tidyverse

The str_extract() function from stringr extracts the first match of a pattern from each element of a character vector. It returns the matched text itself, or NA where there is no match, rather than the TRUE/FALSE that str_detect() returns.
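To make the contrast with str_detect() concrete, here is a minimal sketch. The input vector x is made up for illustration; it includes one element without a match and one missing value.

```r
library(stringr)

# Illustrative input: one match, one non-match, one missing value
x <- c("order 42", "no digits here", NA)

# str_detect() only reports whether a match exists
detected <- str_detect(x, "[0-9]+")
detected
# [1]  TRUE FALSE    NA

# str_extract() returns the matched text, or NA where nothing matched
extracted <- str_extract(x, "[0-9]+")
extracted
# [1] "42" NA   NA
```

Because the result always has the same length as the input, it can be assigned directly to a new data frame column.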

Syntax

str_extract(string, pattern, group = NULL)

Parameters

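Parameter   Type        Default    Description
string      character   Required   A character vector to search
pattern     character   Required   A pattern to look for; a regular expression by default, or a pattern wrapped in fixed() or coll()
group       integer     NULL       If supplied, return the text matched by that capture group instead of the whole match (stringr 1.5.0 and later)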

Examples

Basic usage

library(stringr)

# Extract the first run of lowercase letters from each string
strings <- c("abc123def", "xyz456", "789uvw")
str_extract(strings, "[a-z]+")
# [1] "abc" "xyz" "uvw"

Extracting numbers

# Extract digits from strings
str_extract(c("item123", "price999", "qty42"), "[0-9]+")
# [1] "123" "999" "42"

Using anchors and negated character classes

# Extract the username (everything before the @) from an email
emails <- c("user@example.com", "admin@domain.org", "test@site.net")
str_extract(emails, "^[^@]+")
# [1] "user" "admin" "test"
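The same usernames can also be pulled out with actual capture groups. The sketch below assumes stringr 1.5.0 or later for the group argument; str_match() is shown as an alternative that works in older versions. The two-group pattern is written for this example only.

```r
library(stringr)

emails <- c("user@example.com", "admin@domain.org", "test@site.net")

# Pattern with two capture groups: (username) @ (domain)
email_pattern <- "^([^@]+)@(.+)$"

# group = 1 returns only the first capture group (stringr >= 1.5.0)
users <- str_extract(emails, email_pattern, group = 1)
users
# [1] "user"  "admin" "test"

# str_match() returns a character matrix:
# column 1 is the whole match, then one column per capture group
parts <- str_match(emails, email_pattern)
domains <- parts[, 3]
domains
# [1] "example.com" "domain.org"  "site.net"
```

str_match() is the more convenient choice when several groups are needed at once, since a single call returns all of them.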

Extracting dates

# Extract dates in YYYY-MM-DD format
dates <- c("2024-01-15", "2023-12-25", "2025-06-30")
str_extract(dates, "[0-9]{4}-[0-9]{2}-[0-9]{2}")
# [1] "2024-01-15" "2023-12-25" "2025-06-30"
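The same pattern also works when the date is embedded in longer text. The sketch below uses made-up note strings and passes the result to base R's as.Date(). An explicit format is supplied so that a string that merely looks like a date becomes NA instead of raising an error.

```r
library(stringr)

# Illustrative free-text notes; the last one contains no date
notes <- c("Shipped on 2024-01-15 via courier", "Due 2023-12-25", "No date given")

# Pull out the first YYYY-MM-DD substring from each note
found <- str_extract(notes, "[0-9]{4}-[0-9]{2}-[0-9]{2}")
found
# [1] "2024-01-15" "2023-12-25" NA

# Convert to Date; unmatched elements stay NA
parsed <- as.Date(found, format = "%Y-%m-%d")
class(parsed)
# [1] "Date"
```

Note that the regex only checks the shape of the string; it would also match something like "2024-13-45", which is why the conversion step is useful as a validity check.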

Using with dplyr mutate

library(dplyr)

df <- data.frame(
  text = c("Order #12345", "Invoice #67890", "Ref #ABCDE")
)

df %>%
  mutate(order_number = str_extract(text, "[0-9]+"))
#             text order_number
# 1   Order #12345        12345
# 2 Invoice #67890        67890
# 3     Ref #ABCDE         <NA>

Using fixed() for exact matching

# Extract literal strings (not regex)
str_extract("Hello World", fixed("World"))
# [1] "World"

# fixed() also supports case-insensitive matching via ignore_case = TRUE
str_extract("Hello WORLD", fixed("world", ignore_case = TRUE))
# [1] "WORLD"
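Where fixed() matters most is with patterns that contain regex metacharacters such as "." or "$". A small sketch with a made-up price string:

```r
library(stringr)

# Illustrative string containing regex metacharacters
price <- "Total: $5.00 (approx.)"

# As a regex, "." matches any character, so the first match is "T"
any_char <- str_extract(price, ".")
any_char
# [1] "T"

# fixed() treats the pattern as literal text
literal_dot <- str_extract(price, fixed("."))
literal_dot
# [1] "."

literal_price <- str_extract(price, fixed("$5.00"))
literal_price
# [1] "$5.00"

# regex() with ignore_case = TRUE is another way to ignore case
# while keeping full regex syntax available
ci <- str_extract("Hello WORLD", regex("world", ignore_case = TRUE))
ci
# [1] "WORLD"
```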

str_extract_all

For extracting all matches (not just the first), use str_extract_all():

strings <- c("abc123def456", "xyz789", "123")

# Extract all digit sequences
str_extract_all(strings, "[0-9]+")
# [[1]]
# [1] "123" "456"
# [[2]]
# [1] "789"
# [[3]]
# [1] "123"

# Return a character matrix instead of a list with simplify = TRUE
str_extract_all(strings, "[0-9]+", simplify = TRUE)
#      [,1]  [,2]
# [1,] "123" "456"
# [2,] "789" ""  
# [3,] "123" ""
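The list returned by str_extract_all() combines well with base R helpers. The sketch below, using a made-up input with one non-matching element, counts matches per string with lengths() and flattens them with unlist().

```r
library(stringr)

# Illustrative input; the last element has no digits
strings <- c("abc123def456", "xyz789", "none")

all_matches <- str_extract_all(strings, "[0-9]+")

# A string without matches yields a zero-length character vector
no_match <- all_matches[[3]]
length(no_match)
# [1] 0

# Number of matches per input string
counts <- lengths(all_matches)
counts
# [1] 2 1 0

# All matches as one flat character vector
flat <- unlist(all_matches)
flat
# [1] "123" "456" "789"
```

Flattening discards the link between each match and its source string, so keep the list form if that link is needed later.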

Common Use Cases

  • Parsing structured text (dates, emails, phone numbers)
  • Extracting IDs from log files
  • Pulling specific parts from formatted strings
  • Data cleaning and standardization
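
As one illustration of the log-file case, the sketch below extracts an ID that follows a known key. The log format and the request_id field are hypothetical, invented for this example. The pattern uses a lookbehind so that only the value after the key is returned, not the key itself.

```r
library(stringr)

# Hypothetical log lines; the format is made up for illustration
log_lines <- c(
  "2024-01-15 10:02:11 INFO request_id=ab12cd user=42",
  "2024-01-15 10:02:12 WARN request_id=ef34gh user=7",
  "2024-01-15 10:02:13 INFO heartbeat"
)

# (?<=request_id=) asserts that the match is preceded by the key
ids <- str_extract(log_lines, "(?<=request_id=)[a-z0-9]+")
ids
# [1] "ab12cd" "ef34gh" NA
```

Lines without the key come back as NA, which makes them easy to filter out afterwards with is.na().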

See Also