How to subset a data frame by multiple conditions in R
· 3 min read · Updated March 14, 2026 · beginner
Subsetting by multiple conditions lets you filter a data frame on several criteria at once, for example keeping only the rows for one department that are also above a salary threshold.
The Three Boolean Operators
Before diving in, understand the three key operators:
& (AND): Both conditions must be true
| (OR): At least one condition must be true
! (NOT): Negates a condition
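The operators work element by element on logical vectors, which is what makes them usable for row filtering. A minimal sketch in base R follows; the vectors `high_salary` and `in_sales` are hypothetical and stand in for the result of comparing two columns.

```r
# Hypothetical logical vectors, one element per row of a data frame
high_salary <- c(TRUE, TRUE, FALSE, FALSE)
in_sales    <- c(TRUE, FALSE, TRUE, FALSE)

both   <- high_salary & in_sales   # TRUE FALSE FALSE FALSE
either <- high_salary | in_sales   # TRUE TRUE TRUE FALSE
not_hs <- !high_salary             # FALSE FALSE TRUE TRUE
```

Use the single forms `&` and `|` when filtering. The doubled forms `&&` and `||` are meant for a single TRUE/FALSE value, such as the condition of an `if` statement.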
With dplyr
The tidyverse filter() function handles multiple conditions clearly:
library(dplyr)
# AND: both conditions must be TRUE
subset <- df %>%
  filter(salary > 50000 & department == "Engineering")
# OR: at least one condition must be TRUE
subset <- df %>%
  filter(salary > 70000 | years_experience > 5)
# Mix AND and OR with parentheses
subset <- df %>%
  filter((salary > 50000 | bonus > 10000) & department == "Sales")
Using ! (NOT):
# Exclude certain departments
subset <- df %>%
  filter(!department %in% c("HR", "Legal"))
# Keep rows where salary is NOT missing
subset <- df %>%
  filter(!is.na(salary))
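Missing values interact with these conditions: `filter()` keeps only rows where the condition is TRUE, so rows where it evaluates to NA are dropped. The sketch below shows this on a small sample data frame; the data and the names `high_paid` and `high_or_unknown` are made up for illustration.

```r
library(dplyr)

# Hypothetical sample data; column names follow the examples above
df <- data.frame(
  name       = c("Ana", "Ben", "Cy", "Dee"),
  salary     = c(60000, 45000, NA, 80000),
  department = c("Engineering", "Engineering", "Sales", "HR"),
  stringsAsFactors = FALSE
)

# Cy's salary is NA, so the comparison is NA and the row is dropped
high_paid <- df %>%
  filter(salary > 50000)

# Add is.na() with OR to keep rows whose salary is unknown
high_or_unknown <- df %>%
  filter(salary > 50000 | is.na(salary))
```

Here `high_paid` contains Ana and Dee, while `high_or_unknown` also keeps Cy.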
With Base R
Base R uses square bracket notation with the same operators:
# AND: both conditions
subset <- df[df$salary > 50000 & df$department == "Engineering", ]
# OR: at least one condition
subset <- df[df$salary > 70000 | df$years_experience > 5, ]
# Complex conditions with parentheses
subset <- df[(df$salary > 50000 | df$bonus > 10000) & df$department == "Sales", ]
# NOT operator
subset <- df[!df$department %in% c("HR", "Legal"), ]
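The comma after the condition is required: the expression before the comma selects rows, and leaving the position after it empty keeps all columns. Without the comma, R treats the logical vector as a column selector.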
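One difference from `filter()` is worth checking when a column contains missing values: a logical index with NA produces a row filled with NA rather than dropping it. The following sketch uses a hypothetical three-row data frame to show the behavior and two common ways around it.

```r
# Hypothetical sample data with one missing salary
df <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  salary = c(60000, 45000, NA),
  stringsAsFactors = FALSE
)

# The NA in the logical index yields an extra row filled with NA
with_na_row <- df[df$salary > 50000, ]

# which() returns only the positions that are TRUE, so NA is dropped
clean <- df[which(df$salary > 50000), ]

# The base R subset() function also treats NA as FALSE
also_clean <- subset(df, salary > 50000)
```

`with_na_row` has two rows (Ana plus an all-NA row), whereas `clean` and `also_clean` each contain only Ana.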
With data.table
The data.table package uses similar syntax but is more concise: columns are referenced by name inside the brackets, without the df$ prefix or the trailing comma:
library(data.table)
dt <- as.data.table(df)
# AND
subset <- dt[salary > 50000 & department == "Engineering"]
# OR
subset <- dt[salary > 70000 | years_experience > 5]
# NOT
subset <- dt[!department %in% c("HR", "Legal")]
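Parentheses work in data.table the same way as in the other two approaches. As a small sketch, assuming a hypothetical four-row table, the example below mixes AND and OR and negates a compound condition.

```r
library(data.table)

# Hypothetical sample data; column names follow the examples above
dt <- data.table(
  name       = c("Ana", "Ben", "Cy", "Dee"),
  salary     = c(60000, 45000, 52000, 80000),
  bonus      = c(0, 12000, 0, 5000),
  department = c("Sales", "Sales", "Engineering", "HR")
)

# Mix AND and OR with parentheses
sales_rewarded <- dt[(salary > 50000 | bonus > 10000) & department == "Sales"]

# Negate a whole compound condition by wrapping it in parentheses
not_low_paid_sales <- dt[!(department == "Sales" & salary < 50000)]
```

`sales_rewarded` keeps Ana and Ben; `not_low_paid_sales` drops only Ben.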
Practical Examples
Multiple Numeric Conditions
library(dplyr)
# Filter between a range
subset <- df %>%
  filter(age >= 25 & age <= 45)
# Equivalent using between(), which includes both endpoints
subset <- df %>%
  filter(between(age, 25, 45))
Multiple String Conditions
library(dplyr)
# Match multiple exact values
subset <- df %>%
  filter(city %in% c("New York", "Los Angeles", "Chicago"))
# Starts with (using grepl)
subset <- df %>%
  filter(grepl("^Eng", department))
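For a plain prefix match, base R's `startsWith()` gives the same result as the anchored regular expression, and `grepl()` accepts `ignore.case = TRUE` when capitalization varies. A quick sketch on a hypothetical character vector:

```r
# Hypothetical department names with inconsistent capitalization
department <- c("Engineering", "engineering", "Sales", "English Studies")

by_regex  <- grepl("^Eng", department)                      # TRUE FALSE FALSE TRUE
by_prefix <- startsWith(department, "Eng")                  # TRUE FALSE FALSE TRUE
any_case  <- grepl("^eng", department, ignore.case = TRUE)  # TRUE TRUE FALSE TRUE
```

Any of these logical vectors can be used inside `filter()` or base R brackets and combined with other conditions using `&` and `|`.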
Combining Date and Numeric Conditions
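library(dplyr)
# Convert to Date first so the comparison is chronological, not alphabetical
df$hire_date <- as.Date(df$hire_date)
# Comma-separated conditions in filter() are combined with AND
subset <- df %>%
  filter(
    hire_date > as.Date("2020-01-01"),
    salary > 50000,
    department %in% c("Engineering", "Data")
  )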
Common Mistakes
Forgetting parentheses when mixing AND and OR gives unexpected results, because R evaluates & before |:
# WRONG - & binds first, so every Engineering row is kept regardless of salary
subset <- df %>% filter(salary > 50000 & department == "Sales" | department == "Engineering")
# RIGHT - parentheses apply the salary condition to both departments
subset <- df %>% filter(salary > 50000 & (department == "Sales" | department == "Engineering"))
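R evaluates `&` before `|`, and this can be checked directly on plain vectors without any package. The values below are hypothetical; the sketch compares an expression written without parentheses against one that groups the two department checks.

```r
# Hypothetical values, one element per row
salary     <- c(40000, 60000, 40000, 60000)
department <- c("Sales", "Sales", "Engineering", "Engineering")

# Parsed as (salary > 50000 & Sales) | Engineering
unparenthesized <- salary > 50000 & department == "Sales" | department == "Engineering"

# The salary condition now applies to both departments
grouped <- salary > 50000 & (department == "Sales" | department == "Engineering")

unparenthesized  # FALSE TRUE TRUE TRUE
grouped          # FALSE TRUE FALSE TRUE
```

The third element differs: the low-paid Engineering row passes the unparenthesized condition but not the grouped one.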