How to Extract Unique Values from a Vector in R
To extract unique values in R, start with unique(), which removes duplicate elements from a vector while preserving the order of first occurrence. It works on numeric, character, and logical vectors as well as factors, which makes it the default tool for deduplication tasks.
x <- c(1, 2, 2, 3, 3, 3, 4, 5, 5)
unique(x)
# [1] 1 2 3 4 5
colors <- c("red", "blue", "green", "red", "yellow", "blue")
unique(colors)
# [1] "red" "blue" "green" "yellow"
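As a small illustration of the "order of first occurrence" behavior, the sketch below uses a made-up vector that is deliberately out of order. It also shows that, for a plain vector like this one, filtering with duplicated() gives the same result.

```r
x <- c(3, 1, 3, 2, 1)

# unique() keeps each value at the position where it first appears
u <- unique(x)
u
# [1] 3 1 2

# duplicated() flags repeats of an earlier element, so dropping them is equivalent
identical(u, x[!duplicated(x)])
# [1] TRUE

# wrap the result in sort() if ascending order is needed instead
sort(u)
# [1] 1 2 3
```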
unique() also works directly on data frames, removing duplicate rows. For more control, such as deduplicating on a subset of columns, use dplyr::distinct(). distinct(df, id, name) compares rows on id and name only and returns one row per unique combination. By default it returns just those columns; add .keep_all = TRUE to keep the remaining columns, taken from the first row of each combination.
library(dplyr)
df <- data.frame(
  id   = c(1, 2, 2, 3, 4, 1),
  name = c("A", "B", "B", "C", "D", "A")
)
distinct(df)
#   id name
# 1  1    A
# 2  2    B
# 3  3    C
# 4  4    D
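Deduplicating on a subset of columns can be sketched as follows. The data frame and its score column are hypothetical, added only so there is a column that does not take part in the comparison. The last line is a base R approximation using duplicated(), for readers who prefer not to load dplyr.

```r
library(dplyr)

# Hypothetical data: `score` is an illustrative extra column
scores <- data.frame(
  id    = c(1, 2, 2, 3),
  name  = c("A", "B", "B", "C"),
  score = c(10, 20, 25, 30)
)

# Only id and name are compared, and only those columns are returned
by_key <- distinct(scores, id, name)
nrow(by_key)
# [1] 3

# .keep_all = TRUE keeps every column, using the first row of each combination
first_rows <- distinct(scores, id, name, .keep_all = TRUE)
first_rows$score
# [1] 10 20 30

# Base R sketch of the same idea: drop rows whose (id, name) pair was seen before
base_rows <- scores[!duplicated(scores[c("id", "name")]), ]
```

Note that the base R version keeps the original row names (1, 2, 4 here), while distinct() renumbers the rows.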
To count unique values instead of listing them, use length(unique(x)) or dplyr::n_distinct(x). n_distinct() is more concise and accepts na.rm = TRUE to skip missing values. Its speed relative to length(unique(x)) varies with the dplyr version and the data, so benchmark both if performance matters. unique() treats NA as a value: however many NAs the vector contains, a single NA appears in the output. To exclude NAs, filter first with unique(x[!is.na(x)]) or na.omit(x) |> unique() (the native pipe |> requires R 4.1 or later).
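The effect of NA on both the result and the count can be checked with a short base R sketch. The vector is made up for illustration.

```r
x <- c(4, NA, 4, 7, NA, 9)

# NA is kept once, at the position of its first occurrence
unique(x)
# [1]  4 NA  7  9

# so the count includes NA as one distinct value
length(unique(x))
# [1] 4

# dropping NAs first gives only the observed values
no_na <- unique(x[!is.na(x)])
no_na
# [1] 4 7 9

length(no_na)
# [1] 3
```

With dplyr loaded, n_distinct(x) and n_distinct(x, na.rm = TRUE) should give the same two counts.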
See also
- unique(): base R function for extracting unique values
- duplicated(): find duplicate elements
- table(): count the frequency of each value