Tidyverse

Common dplyr, purrr, stringr, tidyr, readr, and lubridate workflows.

dplyr::*_join()

Join two data frames by matching rows on key columns with dplyr's *_join() functions.

left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
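
As a minimal sketch of the join described above, the following uses two small made-up tables (`orders` and `customers` are hypothetical names, not part of dplyr) to show how unmatched keys behave:

```r
library(dplyr)

# Hypothetical example data: orders reference customers by id.
orders <- tibble(order_id = 1:3, customer_id = c(10, 20, 30))
customers <- tibble(customer_id = c(10, 20), name = c("Ana", "Ben"))

# Keep every row of `orders`; keys with no match get NA in columns from `customers`.
joined <- left_join(orders, customers, by = "customer_id")
```

Here `joined` has three rows, and `name` is NA for customer 30 because that key has no match in `customers`.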

dplyr::across()

Apply functions to multiple columns inside dplyr verbs such as mutate() and summarise(): across() applies the functions to the selected columns, while where() selects columns by type or condition.

across(.cols = everything(), .fns = NULL, ..., .names = NULL)
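
A short illustration of across() combined with where(), assuming a made-up tibble `df` with one character and two numeric columns:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(id = c("a", "b"), height = c(1.234, 5.678), weight = c(10.11, 20.22))

# Round every numeric column; the character column `id` is left untouched.
rounded <- df %>% mutate(across(where(is.numeric), ~ round(.x, 1)))
```

With the default `.names = NULL` and a single function, the results replace the original columns in the returned data frame; `df` itself is not changed.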

dplyr::arrange

Sort rows of a data frame by column values with dplyr::arrange(), dplyr's row-ordering verb.

arrange(.data, ..., .by_group = FALSE)

dplyr::arrange()

Sort rows of a data frame by column values in ascending or descending order using dplyr.

arrange(.data, ..., .by_group = FALSE)
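
A brief sketch of ascending and descending sorts, using a made-up tibble `df`:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(name = c("a", "b", "c"), score = c(2, 3, 1))

ascending <- arrange(df, score)         # order of name: c, a, b
descending <- arrange(df, desc(score))  # order of name: b, a, c
```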

dplyr::bind_rows() / dplyr::bind_cols()

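Combine data frames by stacking rows (matched by column name) or binding columns side by side (matched by position).

bind_rows(..., .id = NULL) bind_cols(..., .name_repair = c("unique", "universal", "check_unique", "minimal"))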
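
The sketch below shows both functions on made-up tibbles (`a`, `b`, and `extra` are hypothetical). Columns missing from one input of bind_rows() are filled with NA:

```r
library(dplyr)

# Hypothetical example data.
a <- tibble(x = 1:2)
b <- tibble(x = 3L, y = "z")
extra <- tibble(w = c("p", "q"))

# Stack rows; `.id` records which named input each row came from.
stacked <- bind_rows(first = a, second = b, .id = "source")

# Bind columns by position; inputs must have the same number of rows.
side_by_side <- bind_cols(a, extra)
```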

dplyr::count()

Count the number of observations in each group. count() is a shortcut for group_by() followed by summarise(n = n()).

count(x, ..., wt = NULL, sort = FALSE, .drop = TRUE)
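
To make the grouping behaviour concrete, this sketch compares count() with the longer group_by() and summarise() form on a made-up tibble `df`:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(species = c("cat", "dog", "cat", "cat"))

# One row per species, largest group first.
by_count <- df %>% count(species, sort = TRUE)

# The same result written out with explicit grouping.
by_group <- df %>%
  group_by(species) %>%
  summarise(n = n()) %>%
  arrange(desc(n))
```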

dplyr::distinct

Remove duplicate rows from an R data frame, keeping only unique combinations of specified columns.

distinct(.data, ..., .keep_all = FALSE)
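
A small sketch of the effect of `.keep_all`, using a made-up tibble `df`:

```r
library(dplyr)

# Hypothetical example data: x has a duplicate value.
df <- tibble(x = c(1, 1, 2), y = c("a", "b", "c"))

# Only the columns named in `...` are returned by default.
unique_x <- distinct(df, x)

# With .keep_all = TRUE, the first row for each distinct x is kept in full.
first_rows <- distinct(df, x, .keep_all = TRUE)
```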

dplyr::filter()

Subset rows of a data frame based on logical conditions using expressive dplyr syntax.

filter(.data, ...)

dplyr::filter()

Subset rows of a data frame or tibble using logical conditions with dplyr's filter() function.

filter(.data, ...)
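
A minimal sketch with a made-up tibble `df`. Multiple conditions are combined with AND, and rows where a condition evaluates to NA are dropped:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(x = c(1, 5, NA, 8), g = c("a", "b", "a", "b"))

# Keep rows where x > 2 AND g == "b".
kept <- filter(df, x > 2, g == "b")

# The row with x = NA is dropped because NA > 2 is NA, not TRUE.
above_two <- filter(df, x > 2)
```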

dplyr::group_by

Group a data frame by one or more columns for per-group operations with summarise() and mutate().

group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))

dplyr::group_by()

Group data by one or more columns and compute summary statistics using summarise().

group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))

dplyr::mutate

Add or modify columns in a tibble or data frame with mutate, dplyr's column-wise transformation function.

mutate(.data, ..., .by = NULL, .keep = c('all','used','unused','none'), .before = NULL, .after = NULL)

dplyr::mutate()

Create new columns or modify existing ones in a tibble or data frame using vectorised operations.

mutate(.data, ..., .by = NULL, .keep = c("all", "used", "unused", "none"), .before = NULL, .after = NULL)
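
The following sketch shows the `.before` and `.keep` arguments on a made-up tibble `df`:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(a = 1:3, b = c(10, 20, 30))

# Add a column and place it before `a`.
added <- mutate(df, total = a + b, .before = a)

# Keep only the newly created column.
only_new <- mutate(df, ratio = b / a, .keep = "none")
```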

dplyr::pull

Extract a single column from a data frame as a vector. Defaults to the last column, supports negative indexing from the right, and can produce named vectors.

pull(.data, var = -1, name = NULL, ...)
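
The defaults described above can be sketched as follows, with a made-up tibble `df`:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(id = c("a", "b"), value = c(1, 2))

last_col <- pull(df)                    # default var = -1: the last column
from_right <- pull(df, -2)              # second column from the right: id
named <- pull(df, value, name = id)     # values named by the id column
```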

dplyr::rename

Rename columns in a data frame using new_name = old_name syntax. Also covers rename_with() for batch renaming with functions.

rename(.data, ...)

dplyr::rename() / dplyr::relocate()

Rename columns with new_name = old_name pairs and reorder them with relocate() in a tibble or data frame.

rename(.data, ...) relocate(.data, ..., .before = NULL, .after = NULL)

dplyr::select

Select specific columns from a data frame using flexible tidyselect helpers and syntax.

dplyr::select()

Select columns from a data frame by name, position, or pattern.

select(.data, ...)

dplyr::slice()

Select rows by position, head, tail, random sampling, or rank using dplyr slice functions.

slice(.data, ...)

dplyr::summarise()

Collapse a tibble to one row per group using summary functions. Use .by (dplyr 1.1+) or group_by().

summarise(.data, ..., .by = NULL, .groups = NULL)
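
A sketch comparing the two grouping styles on a made-up tibble `df`. The `.by` form assumes dplyr 1.1 or later:

```r
library(dplyr)

# Hypothetical example data.
df <- tibble(g = c("a", "a", "b"), x = c(1, 3, 10))

# Per-operation grouping; the result is ungrouped.
with_by <- df %>% summarise(mean_x = mean(x), .by = g)

# Persistent grouping with group_by().
with_group_by <- df %>% group_by(g) %>% summarise(mean_x = mean(x))
```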

lubridate::ymd

Parse dates in year-month-day format from character or numeric input with automatic separator detection and flexible truncation support.

ymd(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale('LC_TIME'), truncated = 0)
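
A few illustrative calls, assuming lubridate is installed. The input values are made up:

```r
library(lubridate)

d_dash <- ymd("2024-03-15")    # dash separators
d_slash <- ymd("2024/03/15")   # slash separators
d_num <- ymd(20240315)         # numeric input

# truncated = 1 allows the day component to be missing.
d_trunc <- ymd("2024-03", truncated = 1)
```

The first three calls return the same Date; the truncated call fills the missing day with the first of the month.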

purrr::keep() / purrr::discard() / purrr::compact()

Filter elements of a list or vector: keep() retains the elements that match a predicate, discard() drops them, and compact() removes NULL and empty elements.

keep(.x, .p, ...) discard(.x, .p, ...) compact(.x, .p = identity)
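
The three functions can be compared on one made-up list `x`:

```r
library(purrr)

# Hypothetical example data with a NULL and an empty vector.
x <- list(1, "a", NULL, 3, character(0))

nums <- keep(x, is.numeric)         # elements where the predicate is TRUE
non_nums <- discard(x, is.numeric)  # elements where the predicate is not TRUE
non_empty <- compact(x)             # drops NULL and zero-length elements
```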

purrr::map()

Apply a function to each element of a list or vector. map() always returns a list; typed variants such as map_dbl() and map_chr() return atomic vectors.

map(.x, .f, ...)

purrr::reduce()

Iteratively combine elements of a vector or list into a single value (reduce) or return the intermediate results (accumulate).

reduce(.x, .f, ..., .init)
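
A minimal sketch of the difference between reduce() and accumulate(), using addition on a made-up numeric vector:

```r
library(purrr)

x <- c(1, 2, 3, 4)

# Fold the vector into one value: ((1 + 2) + 3) + 4.
total <- reduce(x, `+`)

# Keep every intermediate result of the same fold.
running <- accumulate(x, `+`)
```

Here `total` is 10 and `running` holds the running sums 1, 3, 6, 10.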

purrr::safely() / purrr::possibly() / purrr::quietly()

Wrap functions to handle errors gracefully, capture them, and continue execution without failing.

safely(.f, otherwise = NULL, quiet = TRUE) possibly(.f, otherwise = NULL, quiet = TRUE) quietly(.f)
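
As a sketch of the wrappers described above, the following wraps base R's log(), which errors on character input. The names `safe_log` and `log_or_na` are illustrative only:

```r
library(purrr)

# safely() returns a list with `result` and `error`; one of them is NULL.
safe_log <- safely(log)
ok <- safe_log(10)
bad <- safe_log("a")

# possibly() substitutes a default value when the call fails.
log_or_na <- possibly(log, otherwise = NA_real_)
values <- map_dbl(list(1, "a", 100), log_or_na)
```

In this sketch `ok$error` and `bad$result` are both NULL, and the second element of `values` is NA while the others are computed normally.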

purrr::walk()

Apply a function for its side effects, returning the input invisibly. Use walk() for single vectors and walk2() for parallel iteration over two vectors.

walk(.x, .f, ...)

readr::write_csv()

Write a data frame to a CSV file with readr. Covers parameters, NA handling, quoting, appending, compression, and common gotchas.

write_csv(x, file, na = "NA", append = FALSE, col_names = !append, quote = "needed", escape = "double", eol = "\n", num_threads = readr_threads(), progress = show_progress())

stringr::str_c()

Join multiple strings into one string with optional separators.

str_c(..., sep = "", collapse = NULL)

stringr::str_detect()

Detect the presence or absence of a pattern in a string.

str_detect(string, pattern, negate = FALSE)

stringr::str_extract()

Extract the first matching pattern from a string.

str_extract(string, pattern, group = NULL)
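
A short sketch of detection and extraction on a made-up character vector `x`, using a regular expression for one or more digits:

```r
library(stringr)

# Hypothetical example data.
x <- c("apple 12", "banana", "cherry 7")

has_digit <- str_detect(x, "\\d+")                 # TRUE FALSE TRUE
no_digit <- str_detect(x, "\\d+", negate = TRUE)   # FALSE TRUE FALSE
first_num <- str_extract(x, "\\d+")                # "12" NA "7"
```

str_extract() returns NA for elements with no match, so the output always has the same length as the input.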

stringr::str_length()

Get the length of a string in characters.

str_length(string)

stringr::str_pad()

Pad a string to a specified width by adding characters.

str_pad(string, width, side = c("left", "right", "both"), pad = " ")

stringr::str_replace()

Replace the first match of a pattern in a string with str_replace(), or every match with str_replace_all().

str_replace(string, pattern, replacement)
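
The difference between the two functions, sketched on a made-up string:

```r
library(stringr)

first_only <- str_replace("a-b-c", "-", "+")       # "a+b-c"
every_match <- str_replace_all("a-b-c", "-", "+")  # "a+b+c"
```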

stringr::str_trim()

Remove leading and trailing whitespace from strings.

str_trim(string, side = "both")

tidyr::fill

Fill missing values in selected columns using the previous or next value. Supports down, up, and bidirectional filling within groups.

fill(data, ..., .by = NULL, .direction = c("down", "up", "downup", "updown"))
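
A minimal sketch of downward and upward filling on a made-up data frame `df`. It leaves out `.by`, which may not be available in older tidyr versions:

```r
library(tidyr)

# Hypothetical example data: year is recorded only on some rows.
df <- data.frame(quarter = 1:4, year = c(2023, NA, NA, 2024))

# Carry the last non-missing value forward.
filled_down <- fill(df, year, .direction = "down")

# Carry the next non-missing value backward.
filled_up <- fill(df, year, .direction = "up")
```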

tidyr::pivot_longer() / tidyr::pivot_wider()

Reshape data between long and wide formats using tidyr's pivot functions.

pivot_longer(data, cols, names_to = "name", values_to = "value") pivot_wider(data, names_from = name, values_from = value)
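
A round-trip sketch on a made-up data frame `wide`, going to long format and back:

```r
library(tidyr)

# Hypothetical example data: one row per id, one column per measurement.
wide <- data.frame(id = c(1, 2), x = c(10, 20), y = c(30, 40))

# One row per id and measurement; column names go to `name`, cells to `value`.
long <- pivot_longer(wide, cols = c(x, y), names_to = "name", values_to = "value")

# Reverse the reshape: spread `name` back into columns.
back <- pivot_wider(long, names_from = name, values_from = value)
```

Here `long` has four rows, and `back` has the same columns and values as `wide`.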