How to Select Columns by Name in R with dplyr::select()
Select columns by name in R with dplyr::select(), base R bracket notation, or data.table column syntax. Each approach extracts the columns you need while preserving row order. The dplyr method is the most readable for exploratory work; base R needs no extra packages, and data.table is built for speed on large tables. For renaming while selecting, use new_name = old_name syntax inside select().
library(dplyr)
# Select multiple columns by name
df %>% select(name, salary, department)
# Exclude columns with !
df %>% select(!id)
# Select by pattern or predicate
df %>% select(starts_with("sal"), where(is.numeric))
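To try these calls end to end, here is a minimal sketch on a small made-up data frame. The data, and the object names by_name, renamed, no_id, and mixed, are assumptions for illustration only. It also shows the new_name = old_name form for renaming while selecting.

```r
library(dplyr)

# Hypothetical sample data, used only for illustration
df <- tibble(
  id         = 1:3,
  name       = c("Ana", "Ben", "Cy"),
  salary     = c(52000, 61000, 58000),
  department = c("Ops", "IT", "IT")
)

# Keep three columns, in the order they are listed
by_name <- df %>% select(name, salary, department)

# Rename while selecting: new_name = old_name
renamed <- df %>% select(employee = name, pay = salary)

# Drop a single column and keep everything else
no_id <- df %>% select(!id)

# Union of a name pattern and a type predicate.
# A column matched by both (salary) appears only once.
mixed <- df %>% select(starts_with("sal"), where(is.numeric))

names(renamed)
names(mixed)
```

Note that select() returns columns in the order the selections are written, so the pattern match comes before the remaining numeric columns in mixed.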
Base R uses df[, c("name", "salary")] for simple column lists. For programmatic selection when column names are in a variable, use all_of(col_vector) (strict) or any_of(col_vector) (skips missing columns): df |> select(all_of(keep_cols)).
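The difference between the strict and lenient helpers is easiest to see side by side. The sketch below assumes a small invented data frame and two hypothetical name vectors, keep_cols and wanted (the latter includes a column, bonus, that does not exist).

```r
library(dplyr)

# Hypothetical sample data, used only for illustration
df <- data.frame(
  id     = 1:3,
  name   = c("Ana", "Ben", "Cy"),
  salary = c(52000, 61000, 58000)
)

keep_cols <- c("name", "salary")

# Base R: a character vector of names inside the brackets
base_sel <- df[, keep_cols]

# Base R drops to a plain vector for a single column unless drop = FALSE
base_one <- df[, "name", drop = FALSE]

# dplyr: all_of() requires every name to exist
strict <- df |> select(all_of(keep_cols))

# any_of() silently skips names that are missing
wanted  <- c("name", "bonus")
lenient <- df |> select(any_of(wanted))

# all_of() raises an error for the same vector
strict_fails <- tryCatch(
  {
    df |> select(all_of(wanted))
    FALSE
  },
  error = function(e) TRUE
)
```

As a rough guide, all_of() fits code where a missing column signals a bug, while any_of() fits code that has to tolerate inputs with varying columns.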
# Reorder while selecting: put key columns first
df %>% select(id, name, department, everything())
data.table provides fast selection inside its own bracket syntax: dt[, .(name, salary, department)] for a list of names, or dt[, .SD, .SDcols = patterns("^s")] for pattern-based selection. Both forms return a new data.table and leave dt unchanged.
In dplyr, selection helpers include starts_with(), ends_with(), contains(), matches(), and where(). The num_range() helper is useful when column names follow a predictable numbering scheme such as x1, x2, x3, and last_col() picks the final column by position. To drop columns instead of keeping them, prefix names with - or use ! inside select(). Use rename() instead of select() when you only want to rename columns without dropping any.
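The dplyr helpers mentioned above can be sketched as follows. The scores table and its wk1, wk2, wk3 columns are invented for illustration, as are the result names.

```r
library(dplyr)

# Hypothetical data with numbered columns, used only for illustration
scores <- tibble(
  id    = 1:2,
  wk1   = c(10, 12),
  wk2   = c(11, 13),
  wk3   = c(9, 14),
  notes = c("a", "b")
)

# num_range() builds names from a prefix plus a numeric range: wk1, wk2
first_two <- scores %>% select(num_range("wk", 1:2))

# last_col() picks the final column by position, whatever its name
tail_col <- scores %>% select(last_col())

# Two equivalent ways to drop columns
drop_minus <- scores %>% select(-id, -notes)
drop_bang  <- scores %>% select(!c(id, notes))

# rename() keeps every column and changes only the names given
renamed <- scores %>% rename(comment = notes)

names(first_two)
names(renamed)
```

Both drop forms leave the same three wk columns; which one to use is largely a matter of style.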