How to Read CSV Files and Summarise with dplyr in R

To explore a CSV file quickly in R, read it with readr and summarise it with dplyr. Use readr::read_csv() to load the data as a tibble with automatic column type inference, then chain dplyr verbs to aggregate by group. This covers most exploratory analysis tasks in a handful of lines. Call problems(data) on the result of read_csv() to list the values that could not be parsed as their column's type. For large files, use read_csv("file.csv", n_max = 1000) to load only the first thousand rows during exploration.
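
A minimal sketch of both checks, assuming a small hypothetical file written to a temporary path (the column names and values are invented for illustration). The revenue column is declared as numeric through col_types so that the non-numeric value is recorded as a parsing problem; with pure type inference readr would likely read the whole column as text and report nothing.

```r
library(readr)

# Hypothetical sample data, written to a temp file so the sketch is self-contained.
# The second data row holds a non-numeric revenue value.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "region,revenue",
  "north,100",
  "south,abc",
  "north,250"
), path)

# n_max caps the number of data rows read; col_types declares revenue as numeric
preview <- read_csv(
  path,
  col_types = cols(region = col_character(), revenue = col_double()),
  n_max = 1000
)

# One row per value that failed conversion; the value itself is stored as NA
issues <- problems(preview)
issues
```

An empty problems() result means every value was converted to its column type.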

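library(readr)
library(dplyr)

# Load the file as a tibble; column types are inferred
data <- read_csv("sales.csv")

# One row per region: order count and total revenue
summary_table <- data %>%
  group_by(region) %>%
  summarise(
    orders = n(),
    total_revenue = sum(revenue, na.rm = TRUE),  # ignore missing revenue values
    .groups = "drop"                             # return an ungrouped tibble
  )

summary_table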

After reading, inspect the structure with glimpse() for a compact column overview. If a column is read with the wrong type, override the inference with the col_types argument of read_csv(). The .groups argument of summarise() controls the grouping of the result: "drop" removes all grouping, "keep" preserves it, and "drop_last" drops the last (innermost) grouping variable. You can summarise multiple columns at once with across(): summarise(across(where(is.numeric), \(x) mean(x, na.rm = TRUE))) computes the mean of every numeric column in one call. Wrap the function in a lambda (R 4.1 or later) as shown; passing extra arguments such as na.rm through across() itself is deprecated as of dplyr 1.1.0.
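
The options above can be sketched together as follows. The file contents, the column names (product, units) and the object names are hypothetical and exist only to make the example self-contained.

```r
library(readr)
library(dplyr)

# Hypothetical sample data written to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c(
  "region,product,revenue,units",
  "north,widget,100,2",
  "north,gadget,250,5",
  "south,widget,80,1",
  "south,widget,NA,3"
), path)

# Override inference for one column: read units as integer.
# Columns not named in cols() are still guessed.
sales <- read_csv(path, col_types = cols(units = col_integer()))
glimpse(sales)

# Two grouping variables: "drop_last" leaves the result grouped by region only
by_product <- sales %>%
  group_by(region, product) %>%
  summarise(orders = n(), .groups = "drop_last")
group_vars(by_product)

# Mean of every numeric column per region; the lambda passes na.rm explicitly
numeric_means <- sales %>%
  group_by(region) %>%
  summarise(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE)),
    .groups = "drop"
  )
numeric_means
```

Grouping columns are excluded from across(), so only revenue and units are averaged here.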