dplyr::count()
count() and tally() are dplyr verbs for counting observations. count() is a convenience wrapper that combines group_by(), summarise(n = n()), and ungroup() in one step. tally() is the lower-level verb: it takes no grouping variables of its own and counts rows within whatever groups the data already has.
Syntax
count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = group_by_drop_default(x))
tally(x, wt = NULL, sort = FALSE, name = NULL)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| x | tibble/data.frame | required | The data to count |
| ... | variables | optional | Variables to group by (count() only) |
| wt | variable | NULL | Optional weighting variable; if supplied, n is the sum of wt per group instead of the row count |
| sort | logical | FALSE | If TRUE, sort output by count in descending order |
| .drop | logical | TRUE (unless x was grouped with .drop = FALSE) | If FALSE, keep unused factor levels as rows with a count of zero (count() only) |
| name | string | NULL | Name of the count column; if omitted, it defaults to "n" |
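The effect of .drop is easiest to see with a factor that has an unused level. The following is a minimal sketch, not part of the original examples: the cars data frame and the cyl_f column are made up for illustration, built from mtcars with an extra level "10" that never occurs in the data.

```r
library(dplyr)

# Hypothetical setup: cyl as a factor with an extra, unused level "10"
cars <- mtcars |>
  mutate(cyl_f = factor(cyl, levels = c(4, 6, 8, 10)))

# Default (.drop = TRUE): only levels that occur in the data get a row
dropped <- cars |> count(cyl_f)

# .drop = FALSE: the unused level is kept, with a count of zero
kept <- cars |> count(cyl_f, .drop = FALSE)

nrow(dropped)  # 3
kept$n         # 11 7 14 0
```

Keeping zero-count rows matters mostly when the result feeds a plot or a join that expects every level to be present.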
Examples
Basic usage
Calling tally() on an ungrouped data frame without arguments counts all rows and returns a single row with an n column. Because mtcars is a plain data frame rather than a tibble, the result is also a plain data frame. This is the simplest possible usage and produces the same result as count() with no grouping variables:
library(dplyr)
# Simple count of all rows
mtcars |> tally()
#    n
# 1 32
Count by group
Passing a column name to count() groups by that column, counts the rows within each group, and then ungroups the result automatically. The output contains one row per unique value of the grouping variable plus an n column with the frequency:
# Count cars by number of cylinders
mtcars |> count(cyl)
#   cyl  n
# 1   4 11
# 2   6  7
# 3   8 14
Count by multiple groups
When you supply more than one column to count(), it creates a frequency table for every unique combination of those columns. The resulting tibble has one row per combination of values, which can quickly grow large if the columns have many distinct levels:
# Count by cylinders and gears
mtcars |> count(cyl, gear)
#   cyl gear  n
# 1   4    3  1
# 2   4    4  8
# 3   4    5  2
# 4   6    3  2
# 5   6    4  4
# 6   6    5  1
# 7   8    3 12
# 8   8    5  2
Weighted counts
The wt argument transforms count() from a row counter into a sum aggregator. Instead of counting how many rows belong to each group, it sums the values of the specified weight column. This is the equivalent of group_by(cyl) |> summarise(n = sum(hp)) but expressed more concisely:
# Weight by horsepower: sum of hp per cylinder group
mtcars |> count(cyl, wt = hp)
#   cyl    n
# 1   4  909
# 2   6  856
# 3   8 2929
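To check the equivalence described above, the sketch below computes the weighted count both ways and compares the values. The object names weighted and manual are illustrative only.

```r
library(dplyr)

# Weighted count: sums hp within each cyl group
weighted <- mtcars |> count(cyl, wt = hp)

# The longer spelling with an explicit group_by() and summarise()
manual <- mtcars |>
  group_by(cyl) |>
  summarise(n = sum(hp)) |>
  ungroup()

# Compare the values rather than the whole objects,
# since the two results can differ in class
all.equal(weighted$n, manual$n)
```

The per-group sums also add up to sum(mtcars$hp), which is a quick sanity check when weighting by a column.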
Using tally() with existing groups
While count() creates its own grouping, tally() works on data that is already grouped. After calling group_by(), tally() collapses each group to a single row with an n column containing the row count for that group. This pattern is useful when the data has already been grouped for other steps in the pipeline. Because group_by() returns a tibble, the result here prints as a tibble:
# First group, then tally
mtcars |> group_by(cyl) |> tally()
# # A tibble: 3 × 2
#     cyl     n
#   <dbl> <int>
# 1     4    11
# 2     6     7
# 3     8    14
Sort by count
Setting sort = TRUE orders the output by the count column in descending order, putting the largest groups first. This is a common pattern for exploratory analysis where you want to see which categories dominate the dataset without running a separate arrange(desc(n)) step:
# Sort output in descending order
mtcars |> count(cyl, sort = TRUE)
#   cyl  n
# 1   8 14
# 2   4 11
# 3   6  7
Common patterns
Proportions from counts
A common follow-up to count() is computing each group's share of the total. After counting, you can pipe the result into mutate() and divide each row's count by the sum of the n column. This gives you the fraction each group contributes to the whole dataset:
# Add a proportion column
mtcars |>
  count(cyl) |>
  mutate(prop = n / sum(n))
#   cyl  n    prop
# 1   4 11 0.34375
# 2   6  7 0.21875
# 3   8 14 0.43750
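When counting by two variables, dividing by sum(n) gives shares of the whole table. For shares within one variable, one option is to group the counted result before calling mutate(). This is a sketch; the names gear_share and prop_in_cyl are illustrative choices.

```r
library(dplyr)

gear_share <- mtcars |>
  count(cyl, gear) |>
  group_by(cyl) |>                       # proportions are computed per cyl
  mutate(prop_in_cyl = n / sum(n)) |>
  ungroup()                              # drop the grouping again afterwards
```

Within each cyl value the prop_in_cyl column sums to 1, so each row reads as "the share of cars with this cylinder count that have this many gears".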
Using with filter
Chaining count() with filter() lets you identify groups that meet a minimum frequency threshold. This pattern is especially useful when cleaning categorical data, where you might want to collapse rare levels into an “other” category or focus analysis on only the most frequent groups:
# Find groups with more than 10 observations
mtcars |> count(cyl) |> filter(n > 10)
#   cyl  n
# 1   4 11
# 2   8 14
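Collapsing rare levels, as mentioned above, can be sketched with add_count() and if_else(). The threshold of 5, the carb column, and the names carb_n and carb_group are assumptions chosen for illustration; packages such as forcats offer dedicated helpers for the same task.

```r
library(dplyr)

lumped <- mtcars |>
  add_count(carb, name = "carb_n") |>    # group size on every row
  mutate(
    # Levels seen fewer than 5 times are relabelled "other"
    carb_group = if_else(carb_n < 5, "other", as.character(carb))
  ) |>
  count(carb_group, sort = TRUE)
```

The relabelled groups still account for every row, so the counts in lumped sum to nrow(mtcars).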
Naming the count column
The default output column from count() is named n. You can change this with the name argument, which is helpful when the result is being fed into a join or another pipeline step where a generic n column would be ambiguous. A descriptive name like total_cars makes the output self-documenting:
# Custom name for the count column
mtcars |> count(cyl, name = "total_cars")
#   cyl total_cars
# 1   4         11
# 2   6          7
# 3   8         14
dplyr::count() in practice
count() counts the number of rows for each combination of grouping variables, returning a data frame with an n column. count(df, group) is shorthand for df |> group_by(group) |> summarise(n = n()) |> ungroup(). For ungrouped input, the result is also ungrouped.
count() accepts wt for weighted counts: count(df, category, wt = amount) sums amount rather than counting rows. This produces a frequency table of totals by category.
add_count() adds the count as a column to the original data frame rather than summarising: every row gets its group's count in a new column n. This is useful for filtering: add_count(df, group) |> filter(n > 10) keeps only rows from groups with more than 10 members. It selects the same rows as df |> group_by(group) |> filter(n() > 10) |> ungroup(), but also keeps the count as a column for later steps.
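A minimal sketch of the add_count() filtering pattern on mtcars, alongside the group_by() spelling. The object names are illustrative.

```r
library(dplyr)

# add_count() keeps all 32 rows and adds the group size as column n
with_n <- mtcars |> add_count(cyl)

# Keep only cars whose cylinder group has more than 10 members
big_groups <- with_n |> filter(n > 10)

# The group_by() spelling selects the same rows but adds no n column
same_rows <- mtcars |>
  group_by(cyl) |>
  filter(n() > 10) |>
  ungroup()

nrow(big_groups)  # 25 (11 four-cylinder + 14 eight-cylinder cars)
```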
The sort = TRUE argument orders results by n descending, so the most common values appear first. count(df, word, sort = TRUE) is a typical starting point for a frequency analysis because it shows the most common words at the top. The name argument renames the count column: count(df, x, name = "total") uses total instead of n.
count() vs tally()
count() takes its grouping variables as arguments, while tally() takes none and counts within whatever groups the data frame already has. On ungrouped data, tally() returns a single row holding the total number of rows. count() is the more concise choice for building frequency tables from ungrouped data. add_count() instead adds the count as a new column while keeping all original rows.
Both count() and tally() accept wt, which sums a column instead of counting rows. count(data, category, wt = sales) gives the total sales per category rather than the number of rows, which covers a common summary pattern without the full group_by() + summarise() syntax.
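The sketch below runs both verbs on the same two-variable grouping. The counts agree, but the grouping of the result differs: tally() summarises the grouped input and so drops only the last grouping level, while count() on ungrouped input returns an ungrouped result. The object names are illustrative.

```r
library(dplyr)

by_count <- mtcars |> count(cyl, gear)
by_tally <- mtcars |> group_by(cyl, gear) |> tally()

# Same counts either way
identical(by_count$n, by_tally$n)

# Grouping left on each result
group_vars(by_count)  # character(0)
group_vars(by_tally)  # "cyl"
```

If later steps should not be affected by the leftover grouping, adding ungroup() after tally() makes the two results behave alike.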
See also
- dplyr::group_by()
- dplyr::across()
- dplyr::filter()
count(df, col, wt = weight_col) computes a weighted count: the sum of weight_col per group rather than the number of rows. count() with name = "count" renames the output column from the default n. add_count(df, col) appends the count as a new column without collapsing rows, which is useful for computing proportions: after add_count(), mutate(proportion = n / n()) gives each row its group's share of all rows. On ungrouped data, tally() is equivalent to count() with no grouping columns.