How to Replace Values in Data Frame Columns in R
The right tool for replacing values in a data frame column depends on context: in dplyr pipelines, mutate() with ifelse() or case_when() reads cleanly; base R subsetting suits one-off replacements; and data.table's := operator is the better choice when speed matters on large datasets.
library(dplyr)
# Simple replacement with ifelse()
df <- df %>% mutate(status = ifelse(status == "active", 1, 0))
# Multiple conditions with case_when()
df <- df %>% mutate(
  grade = case_when(score >= 90 ~ "A", score >= 80 ~ "B", TRUE ~ "C")
)
# Replace NA with coalesce()
df <- df %>% mutate(price = coalesce(price, 0))
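The coalesce() call above only fills values that are already NA. The following is a minimal sketch, not a definitive recipe, of two related dplyr helpers: na_if() to turn a sentinel value into NA before filling it, and between() as a range test inside case_when(). The sample data, the -1 sentinel, the band column, and the whole-number score ranges are all assumptions made for illustration.

```r
library(dplyr)

# Hypothetical sample data; -1 is assumed to mark a missing price
df <- data.frame(
  score = c(95, 85, 72, NA),
  price = c(10, -1, NA, 25)
)

df <- df %>%
  mutate(
    # Turn the -1 sentinel into NA, then fill every NA with 0
    price = coalesce(na_if(price, -1), 0),
    # between() is inclusive on both ends; the ranges assume whole-number
    # scores. Rows with an NA score match no range and reach the last branch.
    band = case_when(
      between(score, 90, 100) ~ "high",
      between(score, 80, 89) ~ "mid",
      TRUE ~ "other"
    )
  )
```

Doing the na_if() step first keeps the "what counts as missing" decision separate from the "what to fill with" decision, which makes either one easier to change later.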
For base R, df$status[df$status == "active"] <- 1 updates the matching rows without loading any packages. The column type does not change: if status is a character column, the 1 is stored as "1". The data.table equivalent, dt[status == "active", status := 1], modifies the table by reference and so avoids copying large datasets.
For string replacements within values, use gsub() in base R or stringr::str_replace_all().
case_when() is preferred over nested ifelse() for multiple conditions because it stays readable as the logic grows. Use na_if() to convert specific values to NA before replacing them. For numeric columns, between() from dplyr tests an inclusive range and is a readable alternative to chaining >= and <= comparisons in case_when().
When working with factor columns, convert to character with as.character() before replacing, then convert back with as.factor(). A factor only accepts values that are already among its levels, so assigning a new value directly produces NA with a warning.
Both the base R and data.table forms are vectorized, so neither needs an explicit loop.
# Base R subsetting
df$status[df$status == "active"] <- 1
# data.table by reference
library(data.table)
dt <- as.data.table(df)  # convert a copy of df; df itself is left unchanged
dt[status == "active", status := 1]
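The factor and string cases can be sketched in base R as follows. This is an illustrative example under assumed data: the sample data frame, the city column, and the "archived" label are hypothetical and not part of the snippets above.

```r
# Hypothetical sample data with a factor column and a character column
df <- data.frame(
  status = factor(c("active", "inactive", "active")),
  city = c("New_York", "Los_Angeles", "San_Jose")
)

# Factor column: convert to character, replace, then convert back.
# Assigning "archived" directly to the factor would give NA with a warning,
# because "archived" is not one of its existing levels.
status_chr <- as.character(df$status)
status_chr[status_chr == "inactive"] <- "archived"
df$status <- as.factor(status_chr)

# String replacement inside values: gsub() replaces every match.
# fixed = TRUE treats the pattern as a literal string rather than a regex.
df$city <- gsub("_", " ", df$city, fixed = TRUE)

levels(df$status)
df$city
```

Rebuilding the factor from the character vector also drops the unused "inactive" level, which is usually what you want after a rename.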