dplyr::count()

count() and tally() are dplyr verbs for counting observations. count() is a convenience wrapper that combines group_by(), summarise(n = n()), and ungroup() in one step. tally() is the lower-level verb: it counts rows in data that is already grouped, collapsing each group to a single row.

Syntax

count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = group_by_drop_default(x))
tally(x, wt = NULL, sort = FALSE, name = NULL)

Parameters

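| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| x | tibble/data.frame | required | The data to count |
| ... | variables | optional | Variables to group by (count() only) |
| wt | variable | NULL | Optional weighting variable; if supplied, each group's n is sum(wt) instead of the row count |
| sort | logical | FALSE | If TRUE, sort output by count in descending order |
| .drop | logical | TRUE | If FALSE, keep unobserved factor levels as rows with a count of zero (count() only; the actual default, group_by_drop_default(x), is TRUE for ungrouped data) |
| name | string | NULL | Name of the count column in count() and tally(); when NULL it defaults to "n" |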
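
The effect of .drop is easiest to see with a factor that has a level no row uses. The following is a minimal sketch; the shirts data frame and its size column are made up for illustration and are not part of dplyr or of the examples below.

```r
library(dplyr)

# Hypothetical data: "medium" is a declared factor level that never occurs
shirts <- data.frame(
  size = factor(c("small", "large", "small"),
                levels = c("small", "medium", "large"))
)

# Default behaviour: only the levels that actually occur get a row
observed_only <- shirts |> count(size)

# .drop = FALSE: the unobserved level is kept with a count of zero
all_levels <- shirts |> count(size, .drop = FALSE)

print(observed_only)
print(all_levels)
```

Note that .drop only has an effect on factor columns; for character or numeric columns there is no list of declared levels to fall back on.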

Examples

Basic usage

Calling tally() on an ungrouped data frame with no arguments counts all rows and returns a single row with an n column. Because mtcars is a plain data.frame, the result is a data.frame too. This is the simplest usage and gives the same result as count() with no grouping variables:

library(dplyr)

# Simple count of all rows
mtcars |> tally()
#    n
# 1 32

Count by group

Passing a column name to count() groups by that column, counts the rows within each group, and then ungroups the result automatically. The output contains one row per unique value of the grouping variable plus an n column with the frequency:

# Count cars by number of cylinders
mtcars |> count(cyl)
#   cyl  n
# 1   4 11
# 2   6  7
# 3   8 14

Count by multiple groups

When you supply more than one column to count(), it creates a frequency table with one row for every combination of values that occurs in the data. The result can grow large quickly if the columns have many distinct levels:

# Count by cylinders and gears
mtcars |> count(cyl, gear)
#   cyl gear  n
# 1   4    3  1
# 2   4    4  8
# 3   4    5  2
# 4   6    3  2
# 5   6    4  4
# 6   6    5  1
# 7   8    3 12
# 8   8    5  2

Weighted counts

The wt argument turns count() from a row counter into a sum aggregator. Instead of counting how many rows belong to each group, it sums the values of the specified weight column. This is equivalent to group_by(cyl) |> summarise(n = sum(hp)), but more concise:

# Weight by horsepower - sum of hp per cylinder
mtcars |> count(cyl, wt = hp)
#   cyl    n
# 1   4  909
# 2   6  856
# 3   8 2929
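
As a quick check of the equivalence described above, the sketch below computes the same totals both ways and compares them. The variable names weighted and explicit are chosen here for illustration only.

```r
library(dplyr)

# Weighted count: n is the sum of hp within each cyl group
weighted <- mtcars |> count(cyl, wt = hp)

# The longer form that count(wt = ...) stands in for
explicit <- mtcars |>
  group_by(cyl) |>
  summarise(n = sum(hp))

# Both approaches should yield the same per-group totals
all.equal(weighted$n, explicit$n)

# The group totals add up to the overall sum of the weight column
sum(weighted$n) == sum(mtcars$hp)
```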

Using tally() with existing groups

While count() creates its own grouping, tally() works on data that is already grouped. After group_by(), tally() collapses each group to a single row with an n column holding that group’s row count. This is useful when the data has already been grouped for other steps in the pipeline. Because group_by() returns a tibble, the output here is printed as a tibble:

# First group, then tally
mtcars |> group_by(cyl) |> tally()
# # A tibble: 3 × 2
#     cyl     n
#   <dbl> <int>
# 1     4    11
# 2     6     7
# 3     8    14
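
To make the relationship between the two verbs concrete, the following sketch counts the same groups three ways and compares the resulting n columns. The object names are illustrative only.

```r
library(dplyr)

# One step: count() groups, counts, and ungroups
by_count <- mtcars |> count(cyl)

# Two steps: group first, then tally()
by_tally <- mtcars |> group_by(cyl) |> tally()

# Fully spelled out with summarise()
by_summarise <- mtcars |> group_by(cyl) |> summarise(n = n())

# All three approaches should agree on the counts per group
all.equal(by_count$n, by_tally$n)
all.equal(by_count$n, by_summarise$n)
```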

Sort by count

Setting sort = TRUE orders the output by the count column in descending order, putting the largest groups first. This is a common pattern for exploratory analysis where you want to see which categories dominate the dataset without running a separate arrange(desc(n)) step:

# Sort output in descending order
mtcars |> count(cyl, sort = TRUE)
#   cyl  n
# 1   8 14
# 2   4 11
# 3   6  7

Common patterns

Proportions from counts

A common follow-up to count() is computing proportions. Because count() returns an ungrouped result, you can pipe it into mutate() and divide each row’s count by the sum of the n column. This gives the fraction each group contributes to the whole dataset:

# Add a proportion column
mtcars |>
  count(cyl) |>
  mutate(prop = n / sum(n))
#   cyl  n    prop
# 1   4 11 0.34375
# 2   6  7 0.21875
# 3   8 14 0.43750
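
The proportions above are shares of the whole dataset. If the share should instead be taken within a higher-level group, one possible approach is to regroup the counted result before dividing. The sketch below computes the share of each gear value within its cyl group; the name gear_share is an illustrative choice.

```r
library(dplyr)

gear_share <- mtcars |>
  count(cyl, gear) |>           # one row per cyl/gear combination
  group_by(cyl) |>              # regroup so sum(n) is taken per cyl
  mutate(prop = n / sum(n)) |>  # share of each gear within its cyl group
  ungroup()

print(gear_share)
```

With this grouping, the prop values add up to 1 within each cyl value rather than across the whole table.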

Using with filter

Chaining count() with filter() lets you identify groups that meet a minimum frequency threshold. This pattern is especially useful when cleaning categorical data, where you might want to collapse rare levels into an “other” category or focus analysis on only the most frequent groups:

# Keep only groups with more than 10 observations
mtcars |> count(cyl) |> filter(n > 10)
#   cyl  n
# 1   4 11
# 2   8 14
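
The idea of collapsing rare levels into an “other” category could be sketched as follows. This is only one possible approach: the threshold of 3, the helper column n_carb, and the new column carb_group are assumptions made for this illustration.

```r
library(dplyr)

carb_grouped <- mtcars |>
  # Attach each row's group size without collapsing rows
  add_count(carb, name = "n_carb") |>
  # Relabel values of carb that occur fewer than 3 times (assumed threshold)
  mutate(carb_group = if_else(n_carb < 3, "other", as.character(carb))) |>
  # Count again on the collapsed labels, most frequent first
  count(carb_group, sort = TRUE)

print(carb_grouped)
```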

Naming the count column

The default output column from count() is named n. You can change this with the name argument, which is helpful when the result is being fed into a join or another pipeline step where a generic n column would be ambiguous. A descriptive name like total_cars makes the output self-documenting:

# Custom name for the count column
mtcars |> count(cyl, name = "total_cars")
#   cyl total_cars
# 1   4         11
# 2   6          7
# 3   8         14

dplyr::count() in practice

count() counts the number of rows for each combination of grouping variables and returns a data frame with an n column. count(df, group) is shorthand for df |> group_by(group) |> summarise(n = n()) |> ungroup(), so the result is always ungrouped when the input is ungrouped.

count() accepts wt for weighted counts: count(df, category, wt = amount) sums amount rather than counting rows. This produces a frequency table of totals by category.

add_count() adds the count as a column to the original data frame rather than summarising: every row gets its group’s count in a new column n. This is useful for filtering: add_count(df, group) |> filter(n > 10) keeps only rows from groups with more than 10 members. It selects the same rows as df |> group_by(group) |> filter(n() > 10) |> ungroup(), but also keeps the n column in the result.
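
A short sketch of this filtering pattern on mtcars, comparing it with the group_by() form. The object names are illustrative only.

```r
library(dplyr)

# add_count() keeps all 32 rows and appends each row's group size as n
with_counts <- mtcars |> add_count(cyl)

# Keep only rows whose cyl group has more than 10 members
large_groups <- with_counts |> filter(n > 10)

# The group_by() form selects the same rows but has no n column
same_rows <- mtcars |>
  group_by(cyl) |>
  filter(n() > 10) |>
  ungroup()

nrow(with_counts)   # row count is unchanged by add_count()
nrow(large_groups) == nrow(same_rows)
```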

The sort = TRUE argument orders results by n in descending order, so the most common values appear first. count(df, word, sort = TRUE) is a typical starting point for a frequency analysis because it shows the most common words at the top. The name argument renames the count column: count(df, x, name = "total") uses total instead of n.

add_count() is the mutate() counterpart of count(): it appends the count as a new column without collapsing rows, so each row’s group size stays available for row-level calculations.

count() vs tally()

count() groups, counts, and ungroups in one call. tally() performs only the counting step: it counts rows within whatever grouping the data frame already has. Use count() to build frequency tables from ungrouped data, tally() when the data is already grouped, and add_count() when you need the count as a new column while keeping all original rows.

The wt argument turns count() into a weighted count that sums a column instead of counting rows. count(data, category, wt = sales) gives the total sales per category rather than the number of rows. This covers a common aggregation pattern without the full group_by() plus summarise() syntax.

See also

  • dplyr::group_by()
  • dplyr::across()
  • dplyr::filter()