dplyr::filter()
filter(.data, ...) Returns: tibble · Updated March 13, 2026
The filter() function from dplyr selects rows from a data frame or tibble based on logical conditions. It is a readable alternative to base R’s subsetting with [ and one of the most frequently used dplyr verbs. Conditions are evaluated with data masking, so columns can be referenced by bare name (age > 25 rather than df$age > 25) instead of repeating the data frame name. Unlike [, filter() also drops rows where a condition evaluates to NA.
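A minimal sketch comparing the two approaches on a small made-up data frame (scores and its columns are illustrative and not part of the original examples). It also shows how each approach treats a condition that evaluates to NA.

```r
library(dplyr)

# Hypothetical data with one missing value
scores <- data.frame(
  id = 1:4,
  score = c(10, NA, 30, 5)
)

# Base R: the data frame name is repeated, and the NA comparison
# produces an all-NA row in the result
base_result <- scores[scores$score > 8, ]

# dplyr: columns are referenced by bare name, and rows where the
# condition is NA are dropped
dplyr_result <- filter(scores, score > 8)

nrow(base_result)   # 3 (includes one all-NA row)
nrow(dplyr_result)  # 2
```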
Syntax
filter(.data, ...)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| .data | tibble / data.frame | Required | A tibble or data frame to filter |
| ... | logical expressions | Required | Conditions that must evaluate to TRUE for a row to be kept |
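As a brief illustration of how ... behaves, the sketch below passes two conditions as separate arguments. dplyr combines them with AND, so the result should match the single-expression form using &. The data frame staff is a made-up example.

```r
library(dplyr)

# Hypothetical data for illustration
staff <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  department = c("Sales", "Engineering", "Sales")
)

# Separate arguments are combined with AND
by_comma <- filter(staff, department == "Sales", age > 30)

# Equivalent single expression using &
by_and <- filter(staff, department == "Sales" & age > 30)

# Both forms keep only Charlie's row
all.equal(by_comma, by_and)
```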
Examples
Basic usage
library(dplyr)
# Create sample data
df <- data.frame(
name = c("Alice", "Bob", "Charlie", "Diana", "Eve"),
age = c(25, 30, 35, 28, 22),
department = c("Sales", "Engineering", "Sales", "Marketing", "Engineering")
)
# Filter rows where age is greater than 25
filter(df, age > 25)
# name age department
# 1 Bob 30 Engineering
# 2 Charlie 35 Sales
# 3 Diana 28 Marketing
Multiple conditions
# Filter with multiple conditions (AND logic)
filter(df, department == "Sales" & age > 30)
# name age department
# 1 Charlie 35 Sales
# Use OR logic with |
filter(df, department == "Sales" | department == "Marketing")
# name age department
# 1 Alice 25 Sales
# 2 Charlie 35 Sales
# 3 Diana 28 Marketing
Using helper functions
# Filter using grepl for pattern matching
filter(df, grepl("^A", name))
# name age department
# 1 Alice 25 Sales
# Using between for range checks
filter(df, between(age, 25, 35))
# name age department
# 1 Alice 25 Sales
# 2 Bob 30 Engineering
# 3 Charlie 35 Sales
# 4 Diana 28 Marketing
Common Patterns
- Chaining with pipe: df %>% filter(condition) %>% select(col1, col2)
- Using %in%: filter(df, name %in% c("Alice", "Bob"))
- Negating conditions: filter(df, !is.na(column))
- Filtering with slice: Combine with slice_head() or slice_sample() for subsetting
- Filtering across multiple columns: Use if_all() or if_any() with column ranges
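The patterns above can be sketched on a concrete data frame. The data frame people below is hypothetical and includes one missing age so that the NA check has something to remove.

```r
library(dplyr)

# Hypothetical data; Bob's age is missing
people <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Diana"),
  age = c(25, NA, 35, 28),
  department = c("Sales", "Engineering", "Sales", "Marketing")
)

# Membership test with %in%
in_set <- filter(people, name %in% c("Alice", "Bob"))

# Drop rows with a missing age
known_age <- filter(people, !is.na(age))

# Pipe: filter rows, keep two columns, then take the first two rows
first_two <- people %>%
  filter(!is.na(age)) %>%
  select(name, age) %>%
  slice_head(n = 2)
```

Here in_set keeps Alice and Bob, known_age drops Bob, and first_two holds the name and age columns for Alice and Charlie.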
Advanced Filtering Techniques
# Keep rows where every column is non-missing
filter(df, if_all(everything(), ~ !is.na(.x)))
# Keep rows where any numeric column exceeds 30
filter(df, if_any(where(is.numeric), ~ .x > 30))
# Treat a sentinel value (here 0) as missing with na_if() before comparing
filter(df, na_if(age, 0) > 25)
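Because the sample df has only one numeric column, the difference between if_all() and if_any() is easier to see with two. A small sketch, assuming a made-up exams data frame in which NA marks a missed exam:

```r
library(dplyr)

# Hypothetical exam scores; NA marks a missed exam
exams <- data.frame(
  student = c("A", "B", "C", "D"),
  math = c(80, 45, NA, 60),
  physics = c(70, 90, 55, 40)
)

# Keep students who scored above 50 in both score columns
all_above <- filter(exams, if_all(c(math, physics), ~ .x > 50))

# Keep students who scored above 85 in at least one score column
any_above <- filter(exams, if_any(c(math, physics), ~ .x > 85))

all_above$student  # "A"
any_above$student  # "B"
```

Student C is dropped in both cases: the missing math score makes the combined condition NA, and filter() keeps only rows where the condition is TRUE.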
Performance Considerations
The filter() function works on in-memory data frames as well as remote tables. For large in-memory data, filtering early in a pipeline and dropping unneeded columns with select() both reduce the amount of data that later steps have to process. When working with database backends via dbplyr, filter() translates your conditions to SQL WHERE clauses, so the filtering runs on the database server and only matching rows are returned to R.
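To inspect the SQL translation without a live database, one option is dbplyr's lazy_frame(), which simulates a remote table. This is a sketch that assumes the dbplyr package is installed; the columns are hypothetical, and the exact SQL text can vary between dbplyr versions.

```r
library(dplyr)
library(dbplyr)

# lazy_frame() simulates a remote table, so no connection is needed.
# The columns here are made up for illustration.
employees <- lazy_frame(
  name = "Alice",
  age = 30,
  con = simulate_sqlite()
)

query <- employees %>% filter(age > 25)

# Print the generated SQL; the condition should appear in a WHERE clause
show_query(query)

# The same SQL as a character string, e.g. for logging
sql_text <- as.character(sql_render(query))
```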