stringr::str_extract()
str_extract(string, pattern, group = NULL)
Returns: character · Updated March 16, 2026 · Tidyverse stringr string pattern regex tidyverse
The str_extract() function from stringr extracts the first match of a pattern from each element of a character vector. It returns the matched text itself (or NA where nothing matches), not just TRUE/FALSE like str_detect().
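As a minimal sketch of that difference, the two functions can be run side by side on the same input. The vector `x` is made up for illustration.

```r
library(stringr)

# Illustrative input: the second element contains no digits
x <- c("apple 12", "banana", "cherry 7")

# str_detect() reports whether each element contains a match
detected <- str_detect(x, "[0-9]+")
detected
# [1]  TRUE FALSE  TRUE

# str_extract() returns the first matched text, or NA where nothing matches
extracted <- str_extract(x, "[0-9]+")
extracted
# [1] "12" NA   "7"
```

The result always has the same length as the input, so it can be assigned straight into a column.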
Syntax
str_extract(string, pattern, group = NULL)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
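| string | character | Required | A character vector to search |
| pattern | character or modifier | Required | The pattern to look for. Interpreted as a regular expression by default; wrap it in fixed(), coll(), or boundary() for other matching modes |
| group | integer | NULL | If supplied, return the text of that capture group instead of the whole match (stringr 1.5.0 and later) |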
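Both `string` and `pattern` are vectorised. The following sketch, using made-up inputs, shows one pattern applied to every string and then one pattern per string.

```r
library(stringr)

codes <- c("a1", "b2")

# A single pattern is recycled across every element of `string`
same_pattern <- str_extract(codes, "[0-9]")
same_pattern
# [1] "1" "2"

# A pattern vector of the same length is paired up element by element
paired <- str_extract(codes, c("[a-z]", "[0-9]"))
paired
# [1] "a" "2"
```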
Examples
Basic usage
library(stringr)
# Extract the first run of lowercase letters from each string
strings <- c("abc123def", "xyz456", "789uvw")
str_extract(strings, "[a-z]+")
# [1] "abc" "xyz" "uvw"
Extracting numbers
# Extract digits from strings
str_extract(c("item123", "price999", "qty42"), "[0-9]+")
# [1] "123" "999" "42"
Using anchors and negated character classes
# Extract the username from an email: everything before the first @
emails <- c("user@example.com", "admin@domain.org", "test@site.net")
str_extract(emails, "^[^@]+")
# [1] "user" "admin" "test"
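The pattern above relies on an anchor and a negated character class rather than a parenthesised group. For actual capture groups, one option is the `group` argument, which assumes stringr 1.5.0 or later; str_match() is an alternative that works on older versions too. A short sketch:

```r
library(stringr)

emails <- c("user@example.com", "admin@domain.org", "test@site.net")

# group = 1 returns only the text of the first capture group
# (assumes stringr >= 1.5.0, where the `group` argument was added)
domains <- str_extract(emails, "@([^.]+)\\.", group = 1)
domains
# [1] "example" "domain"  "site"

# str_match() returns a matrix: column 1 is the full match,
# the remaining columns hold one capture group each
parts <- str_match(emails, "^([^@]+)@(.+)$")
usernames <- parts[, 2]
hosts <- parts[, 3]
```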
Extracting dates
# Extract dates in YYYY-MM-DD format
dates <- c("2024-01-15", "2023-12-25", "2025-06-30")
str_extract(dates, "[0-9]{4}-[0-9]{2}-[0-9]{2}")
# [1] "2024-01-15" "2023-12-25" "2025-06-30"
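The same pattern is more useful when the date is embedded in longer text. The sketch below uses made-up `notes` and converts the result with as.Date(); elements without a date stay NA.

```r
library(stringr)

# Illustrative free-text input; the second element has no date
notes <- c("Shipped on 2024-01-15.", "No date here", "Due 2025-06-30 at noon")

# Pull out the first YYYY-MM-DD substring from each element
raw_dates <- str_extract(notes, "[0-9]{4}-[0-9]{2}-[0-9]{2}")
raw_dates
# [1] "2024-01-15" NA           "2025-06-30"

# Convert to Date; NA is carried through unchanged
parsed <- as.Date(raw_dates)
```

The pattern only checks the shape of the text, so a string such as "2024-99-99" would still be extracted and would need validating separately.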
Using with dplyr mutate
library(dplyr)
df <- data.frame(
  text = c("Order #12345", "Invoice #67890", "Ref #ABCDE")
)
df %>%
  mutate(order_number = str_extract(text, "[0-9]+"))
#             text order_number
# 1   Order #12345        12345
# 2 Invoice #67890        67890
# 3     Ref #ABCDE         <NA>
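Because "[0-9]+" matches digits only, a row such as "Ref #ABCDE" yields NA for that pattern. The sketch below shows two possible follow-ups: converting the digits to integers, and using a lookbehind to capture whatever follows the "#". The column names `order_int` and `ref` are illustrative.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  text = c("Order #12345", "Invoice #67890", "Ref #ABCDE")
)

result <- df %>%
  mutate(
    # Digits only, converted to integer; rows without digits become NA
    order_int = as.integer(str_extract(text, "[0-9]+")),
    # Any word characters directly after "#", letters included
    ref = str_extract(text, "(?<=#)\\w+")
  )
```

Here `result$order_int` is NA for the third row, while `result$ref` holds "ABCDE".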
Using fixed() for exact matching
# Extract literal strings (not regex)
str_extract("Hello World", fixed("World"))
# [1] "World"
# fixed() also supports case-insensitive literal matching
str_extract("Hello WORLD", fixed("world", ignore_case = TRUE))
# [1] "WORLD"
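fixed() matters most when the pattern contains regex metacharacters. A small sketch with a made-up `price` string shows how "." behaves in each mode:

```r
library(stringr)

price <- "Total: $4.50 (approx.)"

# As a regex, "." matches any character, so the first character is returned
as_regex <- str_extract(price, ".")
as_regex
# [1] "T"

# fixed() treats "." as a literal full stop
as_fixed <- str_extract(price, fixed("."))
as_fixed
# [1] "."

# In regex mode, metacharacters must be escaped to match literally
amount <- str_extract(price, "\\$[0-9]+\\.[0-9]{2}")
amount
# [1] "$4.50"
```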
str_extract_all
For extracting all matches (not just the first), use str_extract_all():
strings <- c("abc123def456", "xyz789", "123")
# Extract all digit sequences
str_extract_all(strings, "[0-9]+")
# [[1]]
# [1] "123" "456"
# [[2]]
# [1] "789"
# [[3]]
# [1] "123"
# Return a character matrix instead of a list with simplify = TRUE
# (shorter rows are padded with "")
str_extract_all(strings, "[0-9]+", simplify = TRUE)
#      [,1]  [,2]
# [1,] "123" "456"
# [2,] "789" ""
# [3,] "123" ""
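The default list output can be processed with base R. A brief sketch follows; the third input is made up to show the no-match case, which yields a zero-length character vector.

```r
library(stringr)

strings <- c("abc123def456", "xyz789", "no digits")

matches <- str_extract_all(strings, "[0-9]+")

# Number of matches per input element
n_matches <- lengths(matches)
n_matches
# [1] 2 1 0

# Flatten every match into a single character vector
all_matches <- unlist(matches)
all_matches
# [1] "123" "456" "789"
```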
Common Use Cases
- Parsing structured text (dates, emails, phone numbers)
- Extracting IDs from log files
- Pulling specific parts from formatted strings
- Data cleaning and standardization
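As one illustration of the log-file case, the sketch below pulls several fields out of a few log lines. The log format, the `request_id=` key, and the column names are assumptions made up for this example.

```r
library(stringr)

# Hypothetical log lines; the last one has no request_id field
log_lines <- c(
  "2024-01-15 10:32:01 INFO request_id=ab12cd user=alice",
  "2024-01-15 10:32:07 ERROR request_id=ef34gh user=bob",
  "2024-01-15 10:33:12 WARN cache miss"
)

parsed_log <- data.frame(
  # Date at the start of the line
  date = str_extract(log_lines, "^[0-9]{4}-[0-9]{2}-[0-9]{2}"),
  # First of the known level keywords
  level = str_extract(log_lines, "INFO|WARN|ERROR"),
  # Lookbehind keeps only the value after "request_id="
  request_id = str_extract(log_lines, "(?<=request_id=)[a-z0-9]+")
)
```

Lines that lack a field get NA in that column, so the result keeps one row per log line.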