grepl()
grepl() searches for matches to a pattern in each element of a character vector and returns a logical vector indicating whether each element matches.
Syntax
grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| pattern | character | — | A regular expression pattern to match |
| x | character | — | A character vector to search in |
| ignore.case | logical | FALSE | If TRUE, ignore case when matching |
| perl | logical | FALSE | If TRUE, use Perl-compatible regular expressions |
| fixed | logical | FALSE | If TRUE, treat pattern as a literal string (faster) |
| useBytes | logical | FALSE | If TRUE, match byte-by-byte rather than character-by-character |
Examples
Basic usage
fruits <- c("apple", "banana", "cherry", "date", "elderberry")
grepl("a", fruits)
# [1] TRUE TRUE FALSE TRUE FALSE
Case-insensitive matching
The ignore.case = TRUE argument makes grepl() treat uppercase and lowercase letters as equivalent. A search for "red" matches "Red" without any pre-processing of the input vector, while "GREEN", "blue", and "YELLOW" still do not match because none of them contains the sequence r, e, d in any case. This is the simplest way to handle inconsistent capitalization in user-submitted data, log files, or merged datasets where the same value appears in different cases.
colors <- c("Red", "GREEN", "blue", "YELLOW")
# Ignore case when matching
grepl("red", colors, ignore.case = TRUE)
# [1] TRUE FALSE FALSE FALSE
Fixed string matching (faster)
When your pattern is a literal string with no regex metacharacters, pass fixed = TRUE to skip regex compilation. This is measurably faster on large vectors and also prevents accidental regex interpretation — searching for "@" with the default regex engine would work but is unnecessary overhead. Use fixed = TRUE whenever the pattern is a known substring rather than a pattern that needs wildcards or alternation.
emails <- c("user@example.com", "test@domain.org", "invalid")
grepl("@", emails, fixed = TRUE)
# [1] TRUE TRUE FALSE
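The risk of accidental regex interpretation is easier to see with a metacharacter such as ".". The following is a minimal sketch; the file names are hypothetical and exist only for illustration.

```r
# Hypothetical file names, used only for illustration
files <- c("report.csv", "report_csv", "notes.txt")

# Regex: "." matches any single character, so "report_csv" also matches
regex_hits <- grepl(".csv", files)
regex_hits
# [1] TRUE TRUE FALSE

# Literal: "." must be an actual dot
fixed_hits <- grepl(".csv", files, fixed = TRUE)
fixed_hits
# [1] TRUE FALSE FALSE
```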
Common patterns
The examples above show grepl() in isolation. In practice, you combine the logical vector it returns with other R functions. These three patterns handle the most frequent grepl-related tasks you will encounter in data analysis workflows.
Counting matches
Since grepl() returns a logical vector where TRUE is 1 and FALSE is 0, wrapping the result in sum() gives you a count of matching elements. This is useful for data quality checks — counting rows with missing values, records matching a validation rule, or entries containing a specific keyword.
# Count how many elements match
sum(grepl("pattern", strings))
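As a runnable illustration of the counting idiom, the sketch below uses a hypothetical vector of log lines (log_lines is not part of the original page). Taking mean() of the same logical vector gives the proportion of matches.

```r
# Hypothetical log lines, used only for illustration
log_lines <- c("ERROR: disk full", "INFO: started", "ERROR: timeout", "WARN: retry")

# TRUE where a line starts with "ERROR"
is_error <- grepl("^ERROR", log_lines)

sum(is_error)   # number of matching elements
# [1] 2
mean(is_error)  # proportion of matching elements
# [1] 0.5
```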
Filtering data frames
The logical vector from grepl() feeds directly into R’s [ subset operator to filter rows: df[grepl("pattern", df$column), ]. This is base R’s equivalent of dplyr::filter(df, grepl("pattern", column)). The logical-vector approach works on data frames, matrices, and atomic vectors alike, making it a versatile pattern that does not rely on any external packages.
# Filter rows containing a pattern
df[grepl("pattern", df$column), ]
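The template above can be tried on a small data frame. This is a sketch only; the data frame and its column names (id, email) are hypothetical.

```r
# Hypothetical data frame, used only for illustration
df <- data.frame(
  id = 1:4,
  email = c("ann@example.com", "bob@domain.org", "invalid", "cy@example.com"),
  stringsAsFactors = FALSE
)

# Keep rows whose email column contains the literal string "example.com"
matched <- df[grepl("example.com", df$email, fixed = TRUE), ]

matched$id
# [1] 1 4
```

The trailing comma inside the brackets matters: it selects rows and keeps all columns.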
Multiple patterns (OR logic)
The regex alternation operator | lets you match any of several patterns in a single grepl() call. The pattern "apple|orange" matches strings containing either word. This is cleaner than chaining multiple grepl() calls with | and runs as a single regex pass. For programmatic pattern construction, use paste(terms, collapse = "|") to build the alternation string from a vector of search terms.
# Match multiple patterns using alternation
fruits <- c("apple", "orange", "banana", "grape")
grepl("apple|orange", fruits)
# [1] TRUE TRUE FALSE FALSE
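The programmatic construction mentioned above might look like the following sketch. The terms vector is a hypothetical input; it assumes the terms contain no regex metacharacters, which would otherwise need escaping before being pasted together.

```r
# Hypothetical search terms, assumed to contain no regex metacharacters
terms <- c("apple", "orange")
fruits <- c("apple", "orange", "banana", "grape")

# Collapse the terms into one alternation pattern: "apple|orange"
pattern <- paste(terms, collapse = "|")

hits <- grepl(pattern, fruits)
hits
# [1] TRUE TRUE FALSE FALSE
```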
grepl() vs grep() and when to use each
grepl() returns a logical vector of the same length as x, making it natural for filtering: x[grepl(pattern, x)]. grep() returns indices or matched values directly and is useful when you need positions rather than a mask.
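A short side-by-side sketch of the three return shapes, using an illustrative vector:

```r
fruits <- c("apple", "banana", "cherry")

mask <- grepl("an", fruits)                 # logical mask, same length as x
mask
# [1] FALSE TRUE FALSE

idx <- grep("an", fruits)                   # integer positions of matches
idx
# [1] 2

vals <- grep("an", fruits, value = TRUE)    # the matching elements themselves
vals
# [1] "banana"
```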
For fixed-string matching (no wildcards), pass fixed = TRUE — this skips regex compilation and is considerably faster when checking many strings against a literal pattern. For case-insensitive matching without converting case, use ignore.case = TRUE.
The perl = TRUE flag switches the regex engine to PCRE, which supports lookaheads, lookbehinds, and named capture groups that are not available in the default TRE engine. For most string-detection tasks, the default engine is sufficient.
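As an illustration of a PCRE-only feature, the sketch below uses a negative lookahead to match ".csv" only when it is not followed by ".bak". The file names are hypothetical, and the pattern relies on perl = TRUE because the default engine does not support lookahead syntax.

```r
# Hypothetical file names, used only for illustration
files <- c("data.csv", "data.csv.bak", "notes.txt")

# Negative lookahead: ".csv" that is NOT immediately followed by ".bak"
not_backup <- grepl("\\.csv(?!\\.bak)", files, perl = TRUE)
not_backup
# [1] TRUE FALSE FALSE
```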
grepl() does not propagate NA values in x: a missing string returns FALSE rather than NA. In filtering contexts this means x[grepl(pattern, x)] already excludes missing values, so no extra is.na() check is needed.
# NA handling in grepl()
vals <- c("apple", NA, "banana", "cherry")
grepl("a", vals)
# [1] TRUE FALSE TRUE FALSE
# Missing values are dropped when filtering
vals[grepl("a", vals)]
# [1] "apple" "banana"
The stringr equivalent is str_detect(x, pattern), which uses the ICU regex engine (via stringi) rather than TRE or PCRE and returns NA for missing input.
A practical note on performance: for simple literal substring checks, grepl(pattern, x, fixed = TRUE) bypasses the regex engine and is significantly faster than matching with the default TRE engine. When checking whether a column contains a fixed string across millions of rows, it can be 5–10x faster than the regex variant, although the exact gain depends on the data and the pattern. When pattern is a user-supplied string that might contain regex metacharacters, using fixed = TRUE also prevents accidental regex injection.
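The injection risk can be sketched with a hypothetical user-supplied string that contains parentheses. As a regex, the parentheses form a group instead of matching literally, so the result differs from what the user most likely intended.

```r
# Hypothetical user input containing regex metacharacters
user_input <- "price (USD)"
x <- c("price (USD)", "price USD", "cost (EUR)")

# As a regex, "(USD)" is a group, so the pattern means "price USD"
as_regex <- grepl(user_input, x)
as_regex
# [1] FALSE TRUE FALSE

# As a literal string, the parentheses must appear in the text
as_literal <- grepl(user_input, x, fixed = TRUE)
as_literal
# [1] TRUE FALSE FALSE
```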
Because grepl() returns a logical vector rather than indices, it can also be used directly as the condition in dplyr::filter() and other tidyverse verbs that expect one logical value per row.
See also
startsWith(x, prefix) and endsWith(x, suffix) are faster alternatives to grepl('^prefix', x) for simple prefix/suffix checks.
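A brief sketch comparing the two approaches on hypothetical file names. Both startsWith() and endsWith() treat their second argument as a literal string, so no escaping is needed.

```r
# Hypothetical file names, used only for illustration
files <- c("data_2023.csv", "data_2024.csv", "notes.txt")

has_prefix <- startsWith(files, "data_")
has_prefix
# [1] TRUE TRUE FALSE

has_suffix <- endsWith(files, ".csv")
has_suffix
# [1] TRUE TRUE FALSE

# Same result as the anchored regex for these inputs
identical(has_prefix, grepl("^data_", files))
# [1] TRUE
```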
grepl(pattern, x) in base R is the closest equivalent of stringr's str_detect(x, pattern), with the first two arguments reversed. Note that grepl() has no value argument; that belongs to grep(). The perl = TRUE flag enables PCRE (Perl-Compatible Regular Expressions), which supports lookaheads and lookbehinds. grepl("(?i)pattern", x, perl = TRUE) performs case-insensitive PCRE matching; the (?i) modifier applies from the point where it appears to the end of the enclosing group, so placing it at the start affects the whole pattern. str_detect() handles the ignore_case option more readably via regex("pattern", ignore_case = TRUE).
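The scope of the (?i) modifier can be checked with a small sketch. The input vector is illustrative only.

```r
x <- c("ABC", "Abc", "aBC")

# (?i) at the start: the whole pattern is case-insensitive
whole <- grepl("(?i)abc", x, perl = TRUE)
whole
# [1] TRUE TRUE TRUE

# (?i) after "a": only "bc" is case-insensitive, "a" must be lowercase
partial <- grepl("a(?i)bc", x, perl = TRUE)
partial
# [1] FALSE FALSE TRUE
```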