Tidyverse

Common dplyr, purrr, and tibble workflows.

  1. dplyr::*_join()

    Join two data frames by matching rows based on key columns. Learn how to use dplyr join functions to combine datasets in R.

  2. dplyr::across()

    Apply functions across multiple columns in dplyr: across() applies the same transformation to several columns inside verbs such as mutate() and summarise(), while where() selects columns by type or condition.

  3. dplyr::arrange

    Sort rows of a data frame by column values with dplyr::arrange(), dplyr's row-ordering verb.

  4. dplyr::arrange()

    Sort rows of a data frame by column values in ascending or descending order using dplyr.

  5. dplyr::bind_cols

    Combine data frames by adding columns side by side. Matches rows by position, not by key.

  6. dplyr::bind_rows() / dplyr::bind_cols()

    Combine data frames by stacking rows or joining columns horizontally.

  7. dplyr::case_when

    Create a new column using vectorized conditional logic with case_when(), handling multiple conditions in order — the dplyr equivalent of SQL's CASE WHEN.

  8. dplyr::count()

    Count the number of observations in each group. count() provides a convenient way to summarise data by grouping variables.

  9. dplyr::distinct

    Remove duplicate rows from an R data frame, keeping only unique combinations of specified columns.

  10. dplyr::filter()

    Subset rows of a data frame based on logical conditions using expressive dplyr syntax.

  11. dplyr::filter()

    Subset rows of a data frame or tibble using logical conditions with dplyr's filter() function.

  12. dplyr::group_by

    Group a data frame by one or more columns for per-group operations with summarise() and mutate().

  13. dplyr::group_by()

    Group data by one or more columns and compute summary statistics using summarise().

  14. dplyr::if_else

    Type-strict vectorized if-else for R. Stricter than base ifelse(), handles NAs explicitly, preserves types. Used inside mutate() for conditional columns.

  15. dplyr::mutate

    Add or modify columns in a tibble or data frame with mutate, dplyr's column-wise transformation function.

  16. dplyr::mutate()

    Create new columns or modify existing ones in a tibble or data frame using vectorised operations.

  17. dplyr::pull

    Extract a single column from a data frame as a vector. Defaults to the last column, supports negative indexing from the right, and can produce named vectors.

  18. dplyr::relocate

    Move data frame columns to new positions using tidy-select syntax with relocate(), including .before, .after, and renaming during the move.

  19. dplyr::rename

    Rename columns in a data frame using new_name = old_name syntax. Also covers rename_with() for batch renaming with functions.

  20. dplyr::rename() / dplyr::relocate()

    Rename and reorder columns in a tibble or data frame using tidy-select syntax.

  21. dplyr::select

    Select specific columns from a data frame using flexible tidyselect helpers and syntax.

  22. dplyr::select()

    Select columns from a data frame by name, position, or pattern.

  23. dplyr::slice()

    Select rows by position, head, tail, random sampling, or rank using dplyr slice functions.

  24. dplyr::summarise()

    Collapse a tibble to one row per group using summary functions. Use .by (dplyr 1.1+) or group_by().

  25. fct_lump

    Collapse uncommon factor levels into an Other category. Covers fct_lump_n, fct_lump_prop, fct_lump_min, and fct_lump_lowfreq.

  26. fct_reorder

    Reorder factor levels by a summary statistic of a second variable.

  27. ggplot2::aes()

    Map variables to visual aesthetics in ggplot2. Covers aes(), aes_string(), aes_q(), and how column names become plot labels.

  28. ggplot2::coord_flip

    Flip horizontal and vertical axes in ggplot2. Swaps x and y so horizontal bar charts, boxplots, and histograms display cleanly without re-coding aesthetics.

  29. ggplot2::facet_wrap

    Wrap a 1D sequence of panels into a 2D grid with ggplot2. Control nrow, ncol, scales, strip position, and which axes are displayed.

  30. ggplot2::geom_bar()

    Draw bars with height proportional to count or value. geom_bar uses stat_count by default, mapping x to categories and y to frequencies.

  31. ggplot2::geom_boxplot

    Create box-and-whisker plots to visualise the distribution of a continuous variable across groups.

  32. ggplot2::geom_histogram()

    Draw histograms to show the distribution of a continuous variable. geom_histogram bins the data and draws bars proportional to the count in each bin.

  33. ggplot2::geom_line()

    Connect observations with a line, ordered by the x variable. geom_line() sorts the data by x before connecting the points (geom_path() follows the order in which rows appear), which suits time series and sequential data.

  34. ggplot2::geom_point()

    Add a scatter plot layer with geom_point(). Covers position, size, colour, shape, alpha aesthetics, and position_jitter for overplotting.

  35. ggplot2::labs

    Set axis labels, legend title, plot title, subtitle, caption, and tag for a ggplot. All in one place.

  36. ggplot2::scale_color_manual

    Define your own colour mappings for discrete variables. Map factor levels to exact colours using a named or unnamed vector.

  37. ggplot2::theme

    Control non-data plot elements in ggplot2: titles, axis labels, legend, panel background, grid lines, and more.

  38. glue()

    Format and interpolate strings in R with expressions inside braces.

  39. interval

    Create an Interval object in lubridate, representing a time span between two specific datetime endpoints with calendar awareness.

  40. lubridate::ymd

    Parse dates in year-month-day format from character or numeric input with automatic separator detection and flexible truncation support.

  41. now

    Get the current system time in R as a POSIXct object. Control the timezone with the tzone argument using IANA timezone strings like 'UTC' or 'America/New_York'.

  42. purrr::discard

    Drop elements from a list or vector that match a predicate. discard() removes items where the predicate is TRUE, the opposite of keep(). Works with pipes.

  43. purrr::keep

    Keep elements of a list or vector that satisfy a predicate. discard() does the opposite, and compact() removes empty elements. All three work with the pipe.

  44. purrr::keep() / purrr::discard() / purrr::compact()

    Filter elements of a list or vector: keep those that match a predicate, discard those that match it, or remove NULL and empty elements.

  45. purrr::map

    Apply a function to each element of a vector or list. map returns a list; type-specific variants return atomic vectors directly.

  46. purrr::map()

    Apply a function to each element of a list or vector. map() returns a list; variants such as map_dbl() and map_chr() return atomic vectors.

  47. purrr::map2

    Iterate over two vectors in parallel, applying a function pairwise. Type-specific variants return atomic vectors of the corresponding type.

  48. purrr::pmap

    Iterate over multiple inputs simultaneously using a list of parameters. pmap feeds corresponding elements from each list as arguments to your function.

  49. purrr::possibly

    Wrap any function with possibly() to return a default value instead of crashing. Use the quiet argument to suppress errors or let them surface. The counterpart to safely().

  50. purrr::reduce

    Apply a binary function cumulatively to a list or vector with purrr reduce. Fold left, fold right, provide an initial value, and inspect intermediate results.

  52. purrr::safely

    Wrap any function with safely() to return a list with result and error components. Inspect errors without crashing. The counterpart to possibly().

  53. purrr::safely() / purrr::possibly() / purrr::quietly()

    Wrap functions to handle errors gracefully, capture them, and continue execution without failing.

  54. purrr::walk

    Use purrr walk to perform side effects — save files, print output, plot — while keeping the pipe flowing. Returns input invisibly.

  55. purrr::walk()

    Apply a function for its side effects, returning the input invisibly. Use walk() for single vectors and walk2() for parallel iteration over two vectors.

  56. read_csv

    Read a CSV file into a tibble with automatic type inference. Part of the readr package in the tidyverse.

  57. readr::write_csv()

    Write a data frame to a CSV file with readr. Covers parameters, NA handling, quoting, appending, compression, and common gotchas.

  58. replace_na

    Replace NA values in R vectors and data frames with tidyr::replace_na().

  59. separate()

    Split a character column into multiple columns on a delimiter pattern. Part of the tidyr package in the tidyverse.

  60. str_sub

    Extract a substring from a character vector using inclusive start/end positions, with support for negative indexing from the end of the string.

  61. stringr::str_c()

    Join multiple strings into one string with optional separators.

  62. stringr::str_detect()

    Detect the presence or absence of a pattern in a string.

  63. stringr::str_extract()

    Extract the first matching pattern from a string.

  64. stringr::str_length()

    Get the length of a string in characters.

  65. stringr::str_pad()

    Pad a string to a specified width by adding characters.

  66. stringr::str_replace()

    Replace the first occurrence or all occurrences of a pattern in a string.

  67. stringr::str_trim()

    Remove leading and trailing whitespace from strings.

  68. tibble

    Create a tibble, a modern reimagining of the data frame in R, with better printing, stricter subsetting, and consistent behavior.

  69. tidyr::complete()

    Fill in missing combinations of values in a data frame. Use complete() to expose implicit gaps in your data and turn them into explicit rows.

  70. tidyr::drop_na()

    Drop rows containing any missing values from a data frame. Use drop_na() to quickly clean data before analysis or modelling.

  71. tidyr::fill

    Fill missing values in selected columns using the previous or next value. Supports down, up, and bidirectional filling within groups.

  72. tidyr::nest()

    Nest columns into a list-column of data frames. Use nest() to create nested tidy data for per-group operations.

  73. tidyr::pivot_longer()

    Lengthen data by pivoting columns into rows. Transform wide data into tidy format where each row is a single observation.

  74. tidyr::pivot_longer() / tidyr::pivot_wider()

    Reshape data between long and wide formats using tidyr's pivot functions.

  75. tidyr::pivot_wider()

    Widen data by spreading key-value pairs across columns. The inverse of pivot_longer().

  76. tidyr::unnest()

    Expand list-columns back into rows and regular columns. Use unnest() to flatten nested tidy data for analysis.

  77. tribble

    Create a tibble using a readable row-by-row layout with tribble(), the tidyverse alternative to data.frame() for small, human-readable tables.

  78. unite

    Unite multiple columns into one by pasting strings together.
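
The predicate semantics summarised in the purrr::keep, purrr::discard, and compact entries above are easy to mix up, so here is a minimal sketch of how the three differ. The input list `x` and the result names are made up for illustration; only `keep()`, `discard()`, and `compact()` come from purrr.

```r
library(purrr)

# Hypothetical input: a mix of numbers, a string, NULL, and an empty vector
x <- list(1, "a", NULL, 3, character(0))

# keep() retains the elements for which the predicate returns TRUE
nums <- keep(x, is.numeric)

# discard() drops the elements for which the predicate returns TRUE
non_nums <- discard(x, is.numeric)

# compact() removes NULL and zero-length elements, with no predicate needed
filled <- compact(x)
```

Here `nums` holds the two numbers, `non_nums` holds the three remaining elements (including the NULL and the empty vector), and `filled` holds the three non-empty elements.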