Rcpp and C++ Integration in R
What is Rcpp?
R is excellent for statistics and data analysis, but it can be slow when you’re doing heavy computations. That’s where Rcpp comes in.
Rcpp is a package that lets you write C++ code directly inside R. C++ is a compiled language, and compiled code can run much faster than interpreted R for certain tasks, especially explicit loops. Rcpp acts as a bridge between the two languages, handling all the tricky details of converting data back and forth.
Think of Rcpp like a translator—you write C++ (the fast language), and Rcpp translates it so R can understand and use it.
Installing Rcpp
Before you begin, make sure you have a C++ compiler installed on your system:
- Windows: Install Rtools from https://cran.r-project.org/bin/windows/Rtools/
- macOS: Install the Xcode Command Line Tools by running xcode-select --install in Terminal
- Linux: You likely already have g++ installed; if not, use your package manager
Now install Rcpp in R:
# Install Rcpp from CRAN
install.packages("Rcpp")
# Load the library
library(Rcpp)
# Check it's installed
packageVersion("Rcpp")
Writing Your First C++ Function
Here’s a simple example: adding two numbers together. In R, you’d write:
add_r <- function(a, b) {
  a + b
}
Here’s the same function written in C++ using Rcpp:
// Load Rcpp functionality
#include <Rcpp.h>
// The attribute below tells Rcpp to expose the function that follows to R.
// It is conventionally kept directly above the function definition.
// [[Rcpp::export]]
double add_cpp(double a, double b) {
  return a + b;
}
To use this in R, you have two main options. The first is using sourceCpp() to compile code directly in your R session:
library(Rcpp)
# This compiles and loads the C++ function into your R environment
sourceCpp(code = '
#include <Rcpp.h>
// [[Rcpp::export]]
double add_cpp(double a, double b) {
return a + b;
}
')
# Now call it like any R function
add_cpp(5, 3) # Returns 8
You can also save your C++ code to a file and load it:
# Save the C++ code to a file (add_cpp.cpp is just an example name), then:
sourceCpp("add_cpp.cpp")
Understanding Rcpp Data Types
C++ needs to know what type of data it’s working with. Rcpp provides types that match R’s data structures:
| R type | Rcpp type | What it holds |
|---|---|---|
| numeric() | Rcpp::NumericVector | Decimal numbers |
| integer() | Rcpp::IntegerVector | Whole numbers |
| character() | Rcpp::CharacterVector | Text strings |
| logical() | Rcpp::LogicalVector | TRUE/FALSE |
| list() | Rcpp::List | Mixed data |
| data.frame() | Rcpp::DataFrame | Tabular data |
Here’s an example using different vector types:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::List example_types(
    Rcpp::NumericVector nums,
    Rcpp::IntegerVector ints,
    Rcpp::CharacterVector strs
) {
  // Return all three vectors bundled as a named list
  return Rcpp::List::create(
      Rcpp::Named("numbers") = nums,
      Rcpp::Named("integers") = ints,
      Rcpp::Named("strings") = strs
  );
}
Notice the Rcpp::Named() function—this gives names to list elements, just like c(a = 1) in R.
A More Practical Example: Adding Vectors
A more practical case: adding two vectors element-by-element.
# Pure R version
add_vectors_r <- function(x, y) {
x + y
}
// C++ version with Rcpp
#include <Rcpp.h>
// [[Rcpp::export]]
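Rcpp::NumericVector add_vectors_cpp(
    Rcpp::NumericVector x,
    Rcpp::NumericVector y
) {
  // Unlike R, this loop does not recycle a shorter vector,
  // so refuse inputs whose lengths differ
  if (x.size() != y.size()) {
    Rcpp::stop("x and y must have the same length");
  }
  // Create output vector of same length
  Rcpp::NumericVector result(x.size());
  // Loop through and add each element
  for (int i = 0; i < x.size(); i++) {
    result[i] = x[i] + y[i];
  }
  return result;
}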
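The C++ version looks longer, and for this particular operation it will not beat R: x + y is already vectorized and runs as compiled code. The same pattern pays off for loops that R cannot vectorize, for example when each step depends on the previous one. Key things to notice:
- x.size() gets the vector length
- Indexing starts at 0 (unlike R’s 1-based indexing)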
- We return the result vector, and Rcpp handles converting it back to R
Working with Matrices
Rcpp also handles matrices. The type is Rcpp::NumericMatrix. One important difference: C++ uses zero-based indexing, while R uses one-based.
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericMatrix matrix_multiply(
    Rcpp::NumericMatrix A,
    Rcpp::NumericMatrix B
) {
  // The inner dimensions must agree, as with A %*% B in R
  if (A.ncol() != B.nrow()) {
    Rcpp::stop("Non-conformable matrices: ncol(A) must equal nrow(B)");
  }
  // Get dimensions
  int rows = A.nrow();
  int cols = B.ncol();
  int mid = A.ncol();
  // Create result matrix
  Rcpp::NumericMatrix C(rows, cols);
  // Matrix multiplication
  for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
      double sum = 0;
      for (int k = 0; k < mid; k++) {
        sum += A(i, k) * B(k, j);
      }
      C(i, j) = sum;
    }
  }
  return C;
}
Use it from R:
# First, save the C++ code above to matrix_multiply.cpp
sourceCpp("matrix_multiply.cpp")
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)
matrix_multiply(A, B)
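R stores a matrix as one flat vector in column-major order, and an access such as A(i, k) resolves to an offset into that storage. As a rough illustration only, the sketch below redoes the same multiplication on flat std::vector<double> buffers in plain C++, so it compiles without R. The names col_major_index and multiply_col_major are hypothetical and not part of Rcpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: offset of element (i, j) in a column-major matrix
// with `nrow` rows, using zero-based indices.
std::size_t col_major_index(std::size_t i, std::size_t j, std::size_t nrow) {
  return i + j * nrow;
}

// Hypothetical sketch: multiply A (rows x mid) by B (mid x cols), both stored
// as flat column-major vectors, which is the layout R uses for matrices.
std::vector<double> multiply_col_major(const std::vector<double>& A,
                                       const std::vector<double>& B,
                                       std::size_t rows, std::size_t mid,
                                       std::size_t cols) {
  if (A.size() != rows * mid || B.size() != mid * cols) {
    throw std::invalid_argument("dimensions do not match the data");
  }
  std::vector<double> C(rows * cols, 0.0);
  for (std::size_t j = 0; j < cols; ++j) {
    for (std::size_t k = 0; k < mid; ++k) {
      const double b = B[col_major_index(k, j, mid)];
      // Innermost loop walks down a column, i.e. through contiguous memory
      for (std::size_t i = 0; i < rows; ++i) {
        C[col_major_index(i, j, rows)] += A[col_major_index(i, k, rows)] * b;
      }
    }
  }
  return C;
}
```

With the inputs from the R example, matrix(1:4, nrow = 2) is the flat buffer {1, 2, 3, 4} and matrix(5:8, nrow = 2) is {5, 6, 7, 8}; the sketch returns {23, 34, 31, 46}, which is the matrix with rows (23, 31) and (34, 46).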
Performance Comparison
When should you use Rcpp? Here’s a benchmark to see the speed difference:
# Define the R version
add_vectors_r <- function(x, y) x + y
# Define the C++ version
library(Rcpp)
sourceCpp(code = '
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector add_vectors_cpp(Rcpp::NumericVector x, Rcpp::NumericVector y) {
Rcpp::NumericVector result(x.size());
for (int i = 0; i < x.size(); i++) result[i] = x[i] + y[i];
return result;
}
')
# Run the benchmark
library(microbenchmark)
x <- runif(10000)
y <- runif(10000)
microbenchmark(
  r_version = add_vectors_r(x, y),
  cpp_version = add_vectors_cpp(x, y)
)
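For small vectors (under 1000 elements), call overhead dominates and any difference is barely noticeable. In this particular benchmark the two versions should stay close even for large inputs, because R’s + is already vectorized compiled code. The big wins, on the order of 10 to 100 times depending on the operation, tend to come from replacing explicit R loops that cannot be vectorized.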
A good rule: only rewrite in C++ the parts of your code that are actually slow. Use R’s built-in profiling tools first to find bottlenecks.
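As one illustration of a loop that may be worth moving to C++, here is a minimal sketch of a computation where each output depends on the previous one, which is awkward to express with vectorized arithmetic. It is plain C++ on std::vector<double> so it compiles without R; the function name exp_smooth and the parameter alpha are hypothetical. Rcpp can convert std::vector<double> to and from R numeric vectors, so adding #include <Rcpp.h> and the // [[Rcpp::export]] attribute should be enough to call it from R, at the cost of a copy.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical example: exponential smoothing with
//   s[0] = x[0]
//   s[i] = alpha * x[i] + (1 - alpha) * s[i - 1]
// Each step needs the previous result, so the loop is inherently sequential.
std::vector<double> exp_smooth(const std::vector<double>& x, double alpha) {
  if (alpha < 0.0 || alpha > 1.0) {
    throw std::invalid_argument("alpha must be in [0, 1]");
  }
  std::vector<double> s(x.size());
  if (x.empty()) {
    return s;  // nothing to smooth
  }
  s[0] = x[0];
  for (std::size_t i = 1; i < x.size(); ++i) {
    s[i] = alpha * x[i] + (1.0 - alpha) * s[i - 1];
  }
  return s;
}
```

A loop like this, written in R with a for statement, is the kind of code to profile first and then compare against a compiled version.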
Debugging Your Rcpp Code
When something goes wrong in your C++ code, you need ways to debug it. Two useful tools:
Printing output:
#include <Rcpp.h>
// [[Rcpp::export]]
void debug_example(int x) {
  Rcpp::Rcout << "The value is: " << x << std::endl;
}
Rcpp::Rcout writes to the R console, much like cat() in R. Prefer it to std::cout inside Rcpp code.
Raising errors:
#include <Rcpp.h>
// [[Rcpp::export]]
double safe_divide(double a, double b) {
  if (b == 0) {
    Rcpp::stop("Cannot divide by zero!");
  }
  return a / b;
}
Rcpp::stop() works like stop() in R—it halts execution and shows an error message.
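Under the hood, Rcpp::stop() throws a C++ exception, and functions exported through Rcpp also turn standard exceptions derived from std::exception into R errors. A minimal sketch of the same guard in plain C++, written so that it compiles without R; safe_divide_std is a hypothetical name:

```cpp
#include <stdexcept>

// Hypothetical plain-C++ counterpart of safe_divide: reports the problem
// by throwing a standard exception instead of returning inf or NaN.
double safe_divide_std(double a, double b) {
  if (b == 0.0) {
    throw std::invalid_argument("Cannot divide by zero!");
  }
  return a / b;
}
```

Because the error travels as an exception, local C++ objects are destroyed normally on the way out.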
Common Beginner Mistakes
A few things to watch out for when starting out:
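- Forgetting the export attribute: Without // [[Rcpp::export]], your function won’t be visible to R.
- Type mismatches: Rcpp converts where it can, so an integer vector passed to a NumericVector parameter is coerced to double, which costs a copy. Types that cannot be converted, such as a character vector passed where numbers are expected, raise an error. Be explicit about types.
- Indexing errors: Remember C++ uses 0-based indexing, not R’s 1-based.
- Memory leaks: Rcpp handles memory for its own types automatically, but anything you allocate yourself with new or malloc() is still yours to free.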
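For the indexing point, a small sketch of translating an R-style 1-based position into a C++ offset with an explicit bounds check. It uses std::vector<double> so that it compiles without R, and element_at_r_index is a hypothetical helper name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: fetch the element that R would call x[r_index].
// R counts from 1, C++ from 0, so the valid range is 1..x.size().
double element_at_r_index(const std::vector<double>& x, long r_index) {
  if (r_index < 1 || static_cast<std::size_t>(r_index) > x.size()) {
    throw std::out_of_range("index out of bounds");
  }
  return x[static_cast<std::size_t>(r_index - 1)];
}
```

Doing the subtraction in one place, with a check, avoids scattering off-by-one arithmetic through the rest of the code.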
Using Rcpp in Packages
Once you’re comfortable with sourceCpp(), you might want to use Rcpp in a proper R package. Rcpp provides helper functions to set this up:
# Create a new package with Rcpp
library(Rcpp)
# Package names may contain only letters, numbers, and dots (no underscores)
Rcpp.package.skeleton("mypackage")
# This creates the basic package structure with:
# - src/ for C++ source code
# - R/ for R code
# - Imports: Rcpp and LinkingTo: Rcpp in DESCRIPTION
For more advanced usage, the Rcpp attributes // [[Rcpp::depends()]] and // [[Rcpp::plugins()]] let you specify package dependencies and compiler plugins.
Wrapping Up
Rcpp lets you bring C++ speed into your R workflow without rewriting everything. Start small: find a slow function in your code, rewrite just that part in C++ using Rcpp, and compare the performance.
The Rcpp package documentation at https://cran.r-project.org/web/packages/Rcpp/index.html has many more examples. The Rcpp Gallery at https://gallery.rcpp.org is especially useful for real-world recipes.
Give it a try with something simple first, then gradually tackle more complex problems as you get comfortable.
See Also
- Rcpp attributes — Advanced export attributes and dependencies
- Rcpp Gallery — Hundreds of working examples for common tasks
- Rcpp Quick Reference Guide — Complete Rcpp API reference