High-Performance Vectors with vctrs
The vctrs package provides a consistent framework for working with vectors in R. It solves a fundamental problem: R’s base functions handle different vector types inconsistently. The package gives you type-safe operations, automatic recycling, and a system for creating custom vector types that integrate seamlessly with the tidyverse.
What vctrs solves
Base R treats vectors differently depending on their type. length() returns the number of elements of an atomic vector but the number of columns of a data frame, while nrow() returns NULL for a plain vector. Recycling rules vary between functions, and coercion in functions such as ifelse() can silently drop classes. These inconsistencies force you to write defensive code that checks vector types at runtime.
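A few of these inconsistencies can be reproduced directly. The snippet below is a small illustration using the built-in mtcars data; the variable name `today` is only a placeholder for the example.

```r
library(vctrs)

# length() counts columns for a data frame, not rows
length(mtcars)
#> [1] 11

# nrow() returns NULL for a plain vector
nrow(1:3)
#> NULL

# ifelse() silently drops the Date class
today <- as.Date("2024-01-15")
class(ifelse(TRUE, today, today))
#> [1] "numeric"

# vec_size() counts observations for both kinds of input
vec_size(mtcars)
#> [1] 32
vec_size(1:3)
#> [1] 3
```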
vctrs establishes a unified vector protocol. It defines operations that work identically across all vector types, including your own custom classes. The package powers dplyr, tidyr, and ggplot2 under the hood.
Core functions
vec_size and vec_size_common
vec_size() returns the length of any vector, treating data frames as if they were vectors of rows:
library(vctrs)
vec_size(1:10)
#> [1] 10
vec_size(mtcars)
#> [1] 32
vec_size_common() computes the common size of several vectors. Only size-1 vectors are recycled; any other size mismatch is an error:
vec_size_common(1:10, TRUE, "a")
#> [1] 10
vec_slice
vec_slice() extracts a subset using integer indices. It preserves vector attributes:
x <- c(a = 1, b = 2, c = 3)
vec_slice(x, c(1, 3))
#> a c
#> 1 3
For data frames, vec_slice() selects rows while preserving columns:
vec_slice(mtcars, 1:3)
#>                mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
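Because vec_size() and vec_slice() treat data frames as vectors of rows, one helper can serve both kinds of input. The function below is a minimal sketch; `first_n` is a hypothetical name, not part of vctrs.

```r
library(vctrs)

# Hypothetical helper: return the first n observations of any vector
first_n <- function(x, n) {
  # Clamp n so that short inputs do not trigger an out-of-bounds error
  n <- min(n, vec_size(x))
  vec_slice(x, seq_len(n))
}

first_n(letters, 2)
#> [1] "a" "b"

# For a data frame, the same call returns the first rows
vec_size(first_n(mtcars, 3))
#> [1] 3
```

The same body written with length() and `[` would select columns rather than rows when given a data frame.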
vec_recycle
vec_recycle() recycles a single vector to a given size. Only size-1 vectors can be recycled; any other size mismatch throws an error:
vec_recycle(1L, 3)
#> [1] 1 1 1
# This throws an error
try(vec_recycle(1:3, 4))
#> Error: Can't recycle input of size 3 to size 4.
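To recycle several vectors to their common size in one step, vctrs also provides vec_recycle_common(). The sketch below shows both the successful case and the error case; the variable names `out` and `res` are placeholders.

```r
library(vctrs)

# Size-1 inputs are recycled to match the longest input
out <- vec_recycle_common(x = 1:3, y = 10L)
out$y
#> [1] 10 10 10

# Any other mismatch is an error rather than a silent partial recycle
res <- tryCatch(
  vec_recycle_common(1:3, 1:2),
  error = function(e) "incompatible sizes"
)
res
#> [1] "incompatible sizes"
```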
vec_cast
vec_cast() converts a vector to a target type. The cast succeeds only when the conversion is supported and loses no information:
# Successful cast
vec_cast(1:3, double())
#> [1] 1 2 3
# Failed cast - throws error
try(vec_cast(c("a", "b"), integer()))
#> Error: Can't convert `c("a", "b")` <character> to <integer>.
vec_cast() has no option that turns a failed conversion into a missing value; its x_arg and to_arg arguments only label the inputs in error messages. To fall back to NA, catch the error yourself:
tryCatch(
  vec_cast("a", integer()),
  error = function(e) NA_integer_
)
#> [1] NA
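Casts can also fail because they would lose information, not only because the types are unrelated. The sketch below catches the lossy-cast condition by its class and uses vec_ptype2() to ask for a common type without converting anything; the variable name `lossy` is a placeholder.

```r
library(vctrs)

# A lossless double -> integer cast succeeds
vec_cast(c(1, 2), integer())
#> [1] 1 2

# A fractional value cannot be represented as an integer, so the cast fails
lossy <- tryCatch(
  vec_cast(1.5, integer()),
  vctrs_error_cast_lossy = function(e) "lossy cast"
)
lossy
#> [1] "lossy cast"

# vec_ptype2() reports the common type of two vectors as an empty prototype
vec_ptype2(1L, 2.5)
#> numeric(0)
```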
The vctr class
The vctr class lets you create custom vector types that behave like built-in vectors. You define a class, then implement the required methods.
library(vctrs)
# Define a new vector type
new_percent <- function(x) {
  stopifnot(is.numeric(x), x >= 0, x <= 1)
  new_vctr(x, class = "percent")
}
# format() method, used when printing
format.percent <- function(x, ...) {
  paste0(round(vec_data(x) * 100, 1), "%")
}
# vec_cast() uses double dispatch: methods are named vec_cast.<to>.<from>
vec_cast.percent.percent <- function(x, to, ...) x
vec_cast.character.percent <- function(x, to, ...) format(x)
# Test it
p <- new_percent(c(0.25, 0.5, 0.75))
p
#> <percent[3]>
#> [1] 25% 50% 75%
vec_size(p)
#> [1] 3
Objects built with new_vctr() inherit length(), subsetting, and printing behavior from the vctrs_vctr base class. You implement methods like format(), vec_ptype2(), vec_cast(), and vec_math() to customize behavior.
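Coercion rules for a custom class are declared with vec_ptype2() and vec_cast() methods. The sketch below redefines a percent constructor like the one above so that it runs on its own, and assumes one possible design choice: combining a percent with a plain double yields a double.

```r
library(vctrs)

# Percent class, mirroring the constructor above (default added for prototypes)
new_percent <- function(x = double()) {
  stopifnot(is.numeric(x), x >= 0, x <= 1)
  new_vctr(x, class = "percent")
}

# Common type: percent with percent stays percent, percent with double is double
vec_ptype2.percent.percent <- function(x, y, ...) new_percent()
vec_ptype2.percent.double <- function(x, y, ...) double()
vec_ptype2.double.percent <- function(x, y, ...) double()

# Casts needed to reach those common types
vec_cast.percent.percent <- function(x, to, ...) x
vec_cast.double.percent <- function(x, to, ...) vec_data(x)

p <- new_percent(c(0.25, 0.5))
both <- vec_c(p, new_percent(0.75))  # still a percent vector
mixed <- vec_c(p, 1.5)               # falls back to a plain double vector
```

Returning double() for the mixed case keeps values above 1 legal; a stricter design could instead leave those methods undefined so that the combination is an error.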
Performance comparison
vctrs operations are designed for performance: the core is implemented in C, and sizes and types are checked up front, so mismatches surface as immediate errors rather than warnings or silently wrong results. Here's a quick comparison of recycling behavior:
# Base R - implicit recycling with warning
x <- 1:3
y <- 1:2
x + y
#> [1] 2 4 4
#> Warning message:
#> In x + y : longer object length is not a multiple of shorter object length
# vctrs - explicit recycling with error on mismatch
vec_recycle_common(x, y)
#> Error: Can't recycle `..1` (size 3) to match `..2` (size 2).
The difference is most visible in data frame pipelines. vctrs-powered functions like dplyr::mutate() apply the same size and type rules to every new column, whereas base R functions each follow their own:
library(dplyr)
# mutate() checks each new column with vctrs size and type rules
result <- mtcars %>%
  mutate(
    disp_l = disp / 61.0237,  # cubic inches to liters
    wt_kg = wt * 453.592      # thousands of pounds to kilograms
  ) %>%
  head(3)
# Base R equivalent with transform()
base_result <- transform(mtcars,
  disp_l = disp / 61.0237,
  wt_kg = wt * 453.592
)
head(base_result, 3)
The tidyverse uses vctrs to ensure that type conversions happen at the right time: explicitly when you request them, not silently when you least expect it.
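The contrast becomes concrete when a new column has the wrong length. The sketch below adds a length-2 column to the 32-row mtcars data with both approaches; the column name `flag` and the result names are placeholders.

```r
library(dplyr)

# Base R: a length-2 column is silently recycled across all 32 rows
base_out <- transform(mtcars, flag = 1:2)
nrow(base_out)
#> [1] 32

# dplyr: the same input is rejected because only size-1 values recycle
dplyr_out <- tryCatch(
  mutate(mtcars, flag = 1:2),
  error = function(e) "size error"
)
dplyr_out
#> [1] "size error"
```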
Practical examples
Validating input in a function
Use vctrs to validate and normalize function inputs:
normalize <- function(x) {
  # Cast to double; fall back to NA if the input cannot be converted
  x <- tryCatch(
    vec_cast(x, double()),
    error = function(e) NA_real_
  )
  x / sum(x, na.rm = TRUE)
}
normalize(c(1, 2, 3))
#> [1] 0.1666667 0.3333333 0.5000000
normalize(1)
#> [1] 1
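The same idea extends to functions with several arguments, where sizes need checking as well as types. The function below is a minimal sketch; `safe_weighted_mean` is a hypothetical name, and it lets cast and recycling errors propagate to the caller instead of converting them to NA.

```r
library(vctrs)

# Hypothetical helper: weighted mean with vctrs type and size checks
safe_weighted_mean <- function(x, w = 1) {
  # Both inputs must be losslessly convertible to double
  x <- vec_cast(x, double(), x_arg = "x")
  w <- vec_cast(w, double(), x_arg = "w")
  # Recycle to a common size; only size-1 inputs are stretched
  args <- vec_recycle_common(x = x, w = w)
  sum(args$x * args$w) / sum(args$w)
}

safe_weighted_mean(c(1, 2, 3), c(1, 1, 2))
#> [1] 2.25

# A single weight is recycled, giving the ordinary mean
safe_weighted_mean(1:4)
#> [1] 2.5
```

A call such as safe_weighted_mean(1:3, 1:2) stops with a recycling error instead of silently reusing the weights.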
Creating a date vector type
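new_fiscal_year <- function(year, quarter) {
  stopifnot(
    is.integer(year),
    is.integer(quarter),
    quarter >= 1,
    quarter <= 4
  )
  # Recycle both fields to a common size, then store them as a record
  fields <- vec_recycle_common(year = year, quarter = quarter)
  new_rcrd(fields, class = "fiscal_year")
}
# Format each element as "<year> Q<quarter>"
format.fiscal_year <- function(x, ...) {
  paste0(field(x, "year"), " Q", field(x, "quarter"))
}
fy <- new_fiscal_year(2024L, 1:4)
fy
#> <fiscal_year[4]>
#> [1] 2024 Q1 2024 Q2 2024 Q3 2024 Q4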
See also
- Data Table — fast in-memory tabular data operations
- Polars for R — blazing-fast DataFrames using Rust
- Functional Programming with purrr — list and vector iteration patterns