dplyr::filter()

Returns: an object of the same type as .data · Updated March 13, 2026 · Tidyverse

The filter() function from dplyr keeps the rows of a data frame or tibble for which every supplied logical condition is TRUE. It is a readable alternative to base R’s subsetting with [ and is one of the most frequently used dplyr verbs. filter() uses data masking, a form of tidy evaluation, so conditions refer to columns by bare name (age > 25 rather than df$age > 25). Unlike [, it also drops rows where a condition evaluates to NA instead of returning them as rows of missing values.
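As a minimal sketch of the difference from base R subsetting, the snippet below uses a small hypothetical data frame (scores, not part of the examples on this page) with one missing value. It compares how [ and filter() treat a condition that evaluates to NA.

```r
library(dplyr)

# Hypothetical data with one missing score
scores <- data.frame(
  id = 1:4,
  score = c(90, NA, 72, 55)
)

# Base R: the column must be qualified with scores$, and the NA
# comparison produces a row filled with NA
base_result <- scores[scores$score > 60, ]

# dplyr: the column is referenced by bare name, and the row whose
# comparison is NA is dropped
dplyr_result <- filter(scores, score > 60)

nrow(base_result)   # 3 rows, one of them entirely NA
nrow(dplyr_result)  # 2 rows
```

If the NA rows are wanted, they have to be requested explicitly, for example with filter(scores, score > 60 | is.na(score)).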

Syntax

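filter(.data, ..., .by = NULL, .preserve = FALSE)

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| .data | tibble / data.frame | Required | A tibble, data frame, or lazy table (for example from dbplyr) to filter |
| ... | logical expressions | Required | Conditions written in terms of the columns of .data. A row is kept only if every condition evaluates to TRUE; rows where a condition is NA are dropped |
| .by | tidy-select | NULL | Optional columns to group by for this call only (dplyr 1.1.0 and later) |
| .preserve | logical | FALSE | For grouped input, whether to keep the grouping structure as is (TRUE) or recompute it from the filtered rows (FALSE) |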

Examples

Basic usage

library(dplyr)

# Create sample data
df <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Diana", "Eve"),
  age = c(25, 30, 35, 28, 22),
  department = c("Sales", "Engineering", "Sales", "Marketing", "Engineering")
)

# Filter rows where age is greater than 25
filter(df, age > 25)
#      name age  department
# 1     Bob  30 Engineering
# 2 Charlie  35       Sales
# 3   Diana  28   Marketing

Multiple conditions

# Filter with multiple conditions (AND logic)
filter(df, department == "Sales" & age > 30)
#      name age department
# 1 Charlie  35      Sales

# Use OR logic with |
filter(df, department == "Sales" | department == "Marketing")
#      name age department
# 1   Alice  25      Sales
# 2 Charlie  35      Sales
# 3   Diana  28  Marketing
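Conditions passed to filter() as separate arguments are combined with AND, so the AND example above can also be written with a comma instead of &. A short sketch, re-creating the sample df so the block runs on its own:

```r
library(dplyr)

df <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Diana", "Eve"),
  age = c(25, 30, 35, 28, 22),
  department = c("Sales", "Engineering", "Sales", "Marketing", "Engineering")
)

# Two arguments: both conditions must hold for a row to be kept
with_comma <- filter(df, department == "Sales", age > 30)

# One argument combining the same conditions with &
with_amp <- filter(df, department == "Sales" & age > 30)

# Both calls select the same rows
all.equal(with_comma, with_amp)
```

The comma form tends to read better when there are many conditions, since each one can sit on its own line. OR logic still needs an explicit |.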

Using helper functions

# Filter using grepl for pattern matching
filter(df, grepl("^A", name))
#    name age department
# 1 Alice  25      Sales

# Using between for range checks
filter(df, between(age, 25, 35))
#      name age  department
# 1   Alice  25       Sales
# 2     Bob  30 Engineering
# 3 Charlie  35       Sales
# 4   Diana  28   Marketing

Common Patterns

  • Chaining with pipe: df %>% filter(condition) %>% select(col1, col2)
  • Using %in%: filter(df, name %in% c("Alice", "Bob"))
  • Negating conditions: filter(df, !is.na(column))
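  • Combining with slice: follow filter() with slice_head() or slice_sample() to take the first n rows, or a random sample, of the matching rows
  • Filtering across multiple columns: Use if_all() or if_any() with a column selection such as everything() or where(is.numeric)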
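The patterns above can be sketched as one runnable block. The data frame is a variant of the sample df in which one age has been set to NA (an assumption made here so that the !is.na() pattern has something to remove).

```r
library(dplyr)

# Variant of the sample data with one missing age (illustrative only)
df <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Diana", "Eve"),
  age = c(25, 30, NA, 28, 22),
  department = c("Sales", "Engineering", "Sales", "Marketing", "Engineering")
)

# Pipe: filter first, then keep two columns
engineers <- df %>%
  filter(department == "Engineering") %>%
  select(name, age)

# %in%: match against a set of values
picked <- filter(df, name %in% c("Alice", "Bob"))

# Negation: drop rows with a missing age
known_age <- filter(df, !is.na(age))

# slice_head(): first two of the rows that pass the filter
first_two <- df %>%
  filter(!is.na(age)) %>%
  slice_head(n = 2)
```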

Advanced Filtering Techniques

# Keep rows with no missing values in any column
filter(df, if_all(everything(), ~ !is.na(.x)))

# Keep rows where any numeric column exceeds 30
filter(df, if_any(where(is.numeric), ~ .x > 30))

# Treat 0 as a missing-value marker, then filter; rows where the
# comparison is NA are dropped
filter(df, na_if(age, 0) > 25)
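The sample df has no missing values, so these helpers have little visible effect on it. The sketch below uses a hypothetical readings data frame, where -1 is assumed to mark a failed measurement, to show what each call keeps.

```r
library(dplyr)

# Hypothetical sensor readings; NA is a gap, -1 a failed measurement
readings <- data.frame(
  sensor = c("a", "b", "c", "d"),
  temp = c(21.5, NA, 19.0, -1),
  humidity = c(40, 55, NA, 60)
)

# if_all(): every selected column must pass the test
complete_rows <- filter(readings, if_all(everything(), ~ !is.na(.x)))

# if_any(): at least one selected column must pass the test
any_high <- filter(readings, if_any(c(temp, humidity), ~ .x > 50))

# na_if(): turn the -1 marker into NA, so the comparison is NA for
# that row and filter() drops it
valid_low_temp <- filter(readings, na_if(temp, -1) < 20)

complete_rows$sensor   # "a" "d"
any_high$sensor        # "b" "d"
valid_low_temp$sensor  # "c"
```

Without na_if(), the last call would also keep sensor "d", because -1 < 20 is TRUE.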

Performance Considerations

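filter() evaluates its conditions as vectorized expressions over whole columns, so it handles large in-memory datasets efficiently. In a pipeline, filter as early as possible so that later steps such as mutate(), joins, or summarise() work on fewer rows; if select() comes first, make sure it keeps every column the conditions refer to. When working with database backends via dbplyr, filter() translates your conditions into a SQL WHERE clause, so the filtering runs on the database server and only the matching rows are transferred when the result is collected.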
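To see the SQL translation without connecting to a real database, dbplyr can build a lazy table on a simulated connection. The sketch below assumes the dbplyr package is installed; the employees table and its columns are hypothetical, and the exact SQL text depends on the dbplyr version and the backend being simulated.

```r
library(dplyr)
library(dbplyr)

# A lazy table on a simulated SQLite connection: nothing is executed,
# dbplyr only generates the SQL it would send
employees <- lazy_frame(
  name = "Alice",
  age = 25,
  con = simulate_sqlite()
)

# The same filter() call as for an in-memory data frame
query <- employees %>%
  filter(age > 25, !is.na(name))

# Render the translated query as text and print it
sql_text <- as.character(sql_render(query))
cat(sql_text, "\n")
```

Against a real connection, the query would only run when the result is requested, for example with collect().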

See Also