Tidyverse
Common dplyr, purrr, and tibble workflows.
dplyr::*_join()
Join two data frames by matching rows based on key columns. Learn how to use dplyr join functions to combine datasets in R.
left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
dplyr::across()
Apply functions across multiple columns in dplyr: across() applies the same function(s) to a selection of columns inside verbs such as mutate() and summarise(), while where() selects columns by type or condition.
across(.cols = everything(), .fns = NULL, ..., .names = NULL)
dplyr::arrange
Sort rows of a data frame by column values with dplyr::arrange(), dplyr's row-ordering verb.
arrange(.data, ..., .by_group = FALSE)
dplyr::arrange()
Sort rows of a data frame by column values in ascending or descending order using dplyr.
arrange(.data, ..., .by_group = FALSE)
dplyr::bind_rows() / dplyr::bind_cols()
Combine data frames by stacking rows or joining columns horizontally.
bind_rows(..., .id = NULL)
dplyr::count()
Count the number of observations in each group. count() is a convenient shorthand for group_by() followed by summarise(n = n()).
count(x, ..., wt = NULL, sort = FALSE, .drop = TRUE)
dplyr::distinct
Remove duplicate rows from an R data frame, keeping only unique combinations of specified columns.
distinct(.data, ..., .keep_all = FALSE)
dplyr::filter()
Subset rows of a data frame based on logical conditions using expressive dplyr syntax.
filter(.data, ...)
dplyr::filter()
Subset rows of a data frame or tibble using logical conditions with dplyr's filter() function.
filter(.data, ...)
dplyr::group_by
Group a data frame by one or more columns for per-group operations with summarise() and mutate().
group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))
dplyr::group_by()
Group data by one or more columns and compute summary statistics using summarise().
group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))
dplyr::mutate
Add or modify columns in a tibble or data frame with mutate(), dplyr's column-wise transformation verb.
mutate(.data, ..., .by = NULL, .keep = c("all", "used", "unused", "none"), .before = NULL, .after = NULL)
dplyr::mutate()
Create new columns or modify existing ones in a tibble or data frame using vectorised operations.
mutate(.data, ..., .by = NULL, .keep = c("all", "used", "unused", "none"), .before = NULL, .after = NULL)
dplyr::pull
Extract a single column from a data frame as a vector. Defaults to the last column, supports negative indexing from the right, and can produce named vectors.
pull(.data, var = -1, name = NULL, ...)
dplyr::rename
Rename columns in a data frame using new_name = old_name syntax. Also covers rename_with() for batch renaming with functions.
rename(.data, ...)
dplyr::rename() / dplyr::relocate()
Rename columns in a tibble or data frame with new_name = old_name pairs, and reorder them with relocate() and its .before and .after arguments.
rename(.data, ...)
relocate(.data, ..., .before = NULL, .after = NULL)
dplyr::select
Select specific columns from a data frame using flexible tidyselect helpers and syntax.
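A minimal sketch of selecting by name, by prefix, and by type, assuming dplyr is installed. The data frame `df` and its column names are hypothetical and exist only for illustration.

```r
library(dplyr)

# Hypothetical example data; column names are made up for illustration.
df <- data.frame(
  id = 1:3,
  name = c("a", "b", "c"),
  score_math = c(90, 85, 70),
  score_art = c(60, 75, 88)
)

# Select columns by bare name.
by_name <- select(df, id, name)

# Select columns whose names start with a prefix (tidyselect helper).
by_prefix <- select(df, starts_with("score_"))

# Select columns by type using a predicate function.
by_type <- select(df, where(is.numeric))

names(by_name)
names(by_prefix)
names(by_type)
```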
dplyr::select()
Select columns from a data frame by name, position, or pattern.
select(.data, ...)
dplyr::slice()
Select rows by position, head, tail, random sampling, or rank using dplyr slice functions.
slice(.data, ...)
dplyr::summarise()
Collapse a tibble to one row per group using summary functions. Use .by (dplyr 1.1+) or group_by().
summarise(.data, ..., .by = NULL, .groups = NULL)
lubridate::ymd
Parse dates in year-month-day format from character or numeric input with automatic separator detection and flexible truncation support.
ymd(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"), truncated = 0)
purrr::keep() / purrr::discard() / purrr::compact()
Filter elements of a list or vector by keeping those matching a predicate, discarding those that don't, or removing NULL and empty elements.
keep(.x, .p, ...)
discard(.x, .p, ...)
compact(.x, ...)
purrr::map()
Apply a function to each element of a list or vector, returning a list, vector, or other type.
map(.x, .f, ...)
purrr::reduce()
Iteratively combine elements of a vector or list into a single value (reduce) or return the intermediate results (accumulate).
reduce(.x, .f, ..., .init)
purrr::safely() / purrr::possibly() / purrr::quietly()
Wrap functions to handle errors gracefully, capture them, and continue execution without failing.
safely(.f, otherwise = NULL, quiet = TRUE)
possibly(.f, otherwise = NULL, quiet = TRUE)
quietly(.f)
purrr::walk()
Apply a function for its side effects, returning the input invisibly. Use walk() for single vectors and walk2() for parallel iteration over two vectors.
walk(.x, .f, ...)
readr::write_csv()
Write a data frame to a CSV file with readr. Covers parameters, NA handling, quoting, appending, compression, and common gotchas.
write_csv(x, file, na = "NA", append = FALSE, col_names = !append, quote = "needed", escape = "double", eol = "\n", num_threads = readr_threads(), progress = show_progress())
stringr::str_c()
Join multiple strings into one string with optional separators.
str_c(..., sep = "", collapse = NULL)
stringr::str_detect()
Detect the presence or absence of a pattern in a string.
str_detect(string, pattern, negate = FALSE)
stringr::str_extract()
Extract the first matching pattern from a string.
str_extract(string, pattern, group = NULL)
stringr::str_length()
Get the length of a string in characters.
str_length(string)
stringr::str_pad()
Pad a string to a specified width by adding characters.
str_pad(string, width, side = c("left", "right", "both"), pad = " ")
stringr::str_replace()
Replace the first occurrence of a pattern in a string with str_replace(), or all occurrences with str_replace_all().
str_replace(string, pattern, replacement)
stringr::str_trim()
Remove leading and trailing whitespace from strings.
str_trim(string, side = "both")
tidyr::fill
Fill missing values in selected columns using the previous or next value. Supports down, up, and bidirectional filling within groups.
fill(data, ..., .by = NULL, .direction = c("down", "up", "downup", "updown"))
tidyr::pivot_longer() / tidyr::pivot_wider()
Reshape data between long and wide formats using tidyr's pivot functions.
pivot_longer(data, cols, names_to = "name", values_to = "value")
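As a rough illustration of the round trip between wide and long formats, the sketch below assumes tidyr is installed. The `wide` data frame and its column names (`country`, `y2020`, `y2021`) are hypothetical.

```r
library(tidyr)

# Hypothetical wide data: one column per year (names are illustrative).
wide <- data.frame(
  country = c("A", "B"),
  y2020 = c(1.5, 2.0),
  y2021 = c(1.7, 2.4)
)

# Wide to long: the year columns become rows, with the old column
# names stored in "year" and the cell values stored in "value".
long <- pivot_longer(
  wide,
  cols = c(y2020, y2021),
  names_to = "year",
  values_to = "value"
)

# Long back to wide: one column per distinct value of "year".
back <- pivot_wider(long, names_from = year, values_from = value)

long
back
```

Here `long` has one row per country and year, and `back` has the same columns as the starting data.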