R5 Reference Classes in R
Reference Classes (often abbreviated RC or refclasses, and informally called R5) are R’s third object-oriented system. They sit alongside S3 and S4, but behave quite differently. If you have used languages like Java or C#, Reference Classes will feel familiar. If you have only worked with S3 and S4, the difference will surprise you.
The Key Difference: Reference Semantics
S3 and S4 objects follow what R programmers call copy-on-modify semantics. When you assign an object to a new variable, the new variable behaves like an independent copy (R delays the actual copying until one of them is modified). Modify the copy, and the original stays unchanged.
# S3 object - copy on modify
person <- function(name, age) {
structure(list(name = name, age = age), class = "person")
}
p1 <- person("Alice", 30)
p2 <- p1
p2$age <- 31
p1$age # Still 30
p2$age # 31
Reference Classes break this pattern. They use reference semantics: when you assign a refclass object to a new variable, both variables point to the same underlying object. Modify one, and you modify both.
# Reference Class - same object
Person <- setRefClass("Person",
fields = c("name", "age"),
methods = list(
greet = function() paste("Hello, I'm", name)
)
)
p1 <- Person$new(name = "Alice", age = 30)
p2 <- p1
p2$age <- 31
p1$age # Now 31 - both point to same object
p2$age # 31
This is the single most important thing to understand about Reference Classes. It affects everything: how you pass objects to functions, how you think about equality, and when refclasses are appropriate.
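To make the consequences for function calls and equality concrete, here is a minimal sketch. The Counter class and the increment() function are hypothetical names introduced only for this illustration.

```r
# Hypothetical Counter class, defined only for this illustration.
Counter <- setRefClass("Counter", fields = list(count = "numeric"))

# A plain function that mutates its argument in place.
# Nothing is returned, yet the caller still sees the change.
increment <- function(counter) {
  counter$count <- counter$count + 1
  invisible(NULL)
}

a <- Counter$new(count = 0)
b <- a                          # a second name for the same object
other <- Counter$new(count = 0) # a separate object

increment(a)

a$count              # 1: the caller's object was modified
b$count              # 1: b is the same object as a
other$count          # 0: a different object is unaffected

identical(a, b)      # TRUE: both names refer to one object
identical(a, other)  # FALSE: separate objects
```

With an S3 or S4 object, increment() would have to return the modified value and the caller would have to reassign it.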
Defining a Reference Class
You create refclasses with setRefClass(). Unlike S4’s setClass(), you keep the return value around because it is your generator function.
Person <- setRefClass("Person")
class(Person)
# [1] "refObjectGenerator"
The generator has several important methods:
- $new() - create instances of the class
- $fields() - list defined fields
- $methods() - add or modify methods
- $help() - get help on methods
- $lock() - lock fields so they can only be set once
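A short sketch of some of these generator methods in use. The Config class and its field names are hypothetical, and $lock() is called here before any objects exist, which is the usual pattern.

```r
# Hypothetical Config class, used only to illustrate the generator methods.
Config <- setRefClass("Config",
  fields = list(path = "character", retries = "numeric")
)

# $fields() reports field names and their declared classes.
Config$fields()

# Lock a field on the generator: it can then be set only once per object.
Config$lock("path")

cfg <- Config$new(path = "settings.yml", retries = 3)

# Unlocked fields can still be changed freely.
cfg$retries <- 5

# A second assignment to the locked field raises an error,
# which is caught here so the script keeps running.
lock_result <- tryCatch({
  cfg$path <- "other.yml"
  "changed"
}, error = function(e) "locked")

lock_result  # "locked"
cfg$path     # "settings.yml"
```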
Adding Fields
Fields hold data. You specify them with the fields argument, either as a character vector of names or as a named list with types:
# Just names - defaults to "ANY"
Person <- setRefClass("Person",
fields = c("name", "age")
)
# With types
BankAccount <- setRefClass("BankAccount",
fields = list(
holder = "character",
balance = "numeric",
transactions = "list"
)
)
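A field type can be the name of any class R knows about: basic classes such as character, numeric, integer, logical, list, function, and environment, as well as S4 classes and other reference classes. Use ANY to allow anything.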
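Declared types are enforced when a field is assigned. The sketch below repeats the BankAccount definition so it runs on its own, and uses tryCatch() only to capture the error instead of stopping the script.

```r
BankAccount <- setRefClass("BankAccount",
  fields = list(
    holder = "character",
    balance = "numeric",
    transactions = "list"
  )
)

# Fields that are not supplied start out as an empty value of their class.
empty <- BankAccount$new()
length(empty$balance)   # 0

account <- BankAccount$new(holder = "Bob", balance = 1000)

# Assigning a value of the wrong class is rejected with an error.
type_check <- tryCatch({
  account$balance <- "lots"
  "accepted"
}, error = function(e) "rejected")

type_check        # "rejected"
account$balance   # 1000, unchanged
```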
When you create an instance, pass initial values to $new():
account <- BankAccount$new(
holder = "Bob",
balance = 1000,
transactions = list()
)
account$holder # "Bob"
account$balance # 1000
You can also read and write fields by name with the $field() method, which is useful when the field name is stored in a variable:
account$field("balance") # 1000
account$field("balance", 2000) # set balance to 2000
account$balance # 2000
Adding Methods
Methods are functions that operate on object fields. You define them in the methods argument to setRefClass():
BankAccount <- setRefClass("BankAccount",
fields = list(
holder = "character",
balance = "numeric",
transactions = "list"
),
methods = list(
deposit = function(amount) {
if (amount <= 0) stop("Deposit must be positive")
balance <<- balance + amount
transactions <<- c(transactions, list(deposit = amount))
invisible(.self)
},
withdraw = function(amount) {
if (amount > balance) stop("Insufficient funds")
balance <<- balance - amount
transactions <<- c(transactions, list(withdrawal = amount))
invisible(.self)
},
get_balance = function() balance
)
)
Notice the <<- assignment operator. This is how methods modify fields: a method's enclosing environment is the object itself, so <<- finds the field there and updates it. A regular <- would only create a local variable inside the method and leave the field untouched.
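The difference between the two operators is easy to check. In this sketch the Counter class and its method names are hypothetical; setRefClass() is wrapped in suppressWarnings() because R itself warns when it detects a local assignment to a field name.

```r
# Hypothetical Counter class contrasting <- and <<- inside methods.
Counter <- suppressWarnings(setRefClass("Counter",
  fields = list(count = "numeric"),
  methods = list(
    bump_wrong = function() {
      # Creates a local variable called count; the field is not touched.
      count <- count + 1
      invisible(.self)
    },
    bump_right = function() {
      # Updates the field stored in the object.
      count <<- count + 1
      invisible(.self)
    }
  )
))

ctr <- Counter$new(count = 0)

ctr$bump_wrong()
after_wrong <- ctr$count   # 0: the field did not change

ctr$bump_right()
after_right <- ctr$count   # 1: the field was updated
```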
You can also add methods after class creation using the generator:
Person$methods(
celebrate_birthday = function() {
age <<- age + 1
message("Happy birthday!")
}
)
Inheritance with contains
Reference Classes support inheritance through the contains argument:
Employee <- setRefClass("Employee",
contains = "Person",
fields = c("employee_id", "department"),
methods = list(
greet = function() {
paste("Hi, I'm", name, "from", department)
}
)
)
The child class inherits all fields and methods from the parent. You can override methods by redefining them.
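An overriding method can still reach the parent's version through callSuper(). The sketch below redefines both classes so it runs on its own; the example values are made up.

```r
Person <- setRefClass("Person",
  fields = c("name", "age"),
  methods = list(
    greet = function() paste("Hello, I'm", name)
  )
)

Employee <- setRefClass("Employee",
  contains = "Person",
  fields = c("department"),
  methods = list(
    greet = function() {
      # callSuper() runs Person's greet(); the child adds to its result.
      paste(callSuper(), "from", department)
    }
  )
)

e <- Employee$new(name = "Carol", age = 41, department = "Finance")

e$greet()        # "Hello, I'm Carol from Finance"
is(e, "Person")  # TRUE: an Employee is also a Person
```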
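The contains argument also accepts a vector of class names, so a class can list more than one parent. In the example below, Person is already an ancestor of Employee, so naming it again is redundant and only illustrates the syntax: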
Manager <- setRefClass("Manager",
contains = c("Employee", "Person"),
fields = "team_size"
)
Common Methods
All Reference Class objects inherit from envRefClass and get several built-in methods:
account$copy() # Copy the object
account$field("balance") # Get field value
account$initFields() # Set fields from named arguments (normally called inside an initialize method)
account$trace("deposit") # Trace method calls
account$untrace("deposit") # Stop tracing
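Of these, $copy() is the one you will reach for most often, because it is how you get an independent object instead of another reference. A minimal sketch, using a hypothetical Account class:

```r
# Hypothetical Account class, used only to contrast aliasing with copying.
Account <- setRefClass("Account", fields = list(balance = "numeric"))

original <- Account$new(balance = 100)
alias    <- original          # another name for the same object
snapshot <- original$copy()   # a separate object with the same field values

original$balance <- 50

alias$balance     # 50: follows the original
snapshot$balance  # 100: unaffected by the later change
```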
When to Use Reference Classes
Reference Classes shine in specific situations:
- Simulation and modelling - when you are modelling complex state that changes over time, like game state or statistical simulations
- GUI programming - when you need objects that persist and mutate in response to user actions
- State machines - when you have objects that transition through defined states
- Caching and memoization - when you need objects that can update their cached values in place
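As one illustration of the caching case in the list above, here is a small sketch of an object that fills its cache in place. The SquareCache class, its fields, and the lookup() method are hypothetical; squaring stands in for a computation that would be expensive in practice.

```r
# Hypothetical cache: remembers results it has already computed.
SquareCache <- setRefClass("SquareCache",
  fields = list(store = "list", misses = "numeric"),
  methods = list(
    lookup = function(x) {
      key <- as.character(x)
      if (is.null(store[[key]])) {
        # Cache miss: compute the value and remember it in the object.
        misses <<- misses + 1
        store[[key]] <<- x^2
      }
      store[[key]]
    }
  )
)

cache <- SquareCache$new(misses = 0)

first  <- cache$lookup(4)   # computed and stored
second <- cache$lookup(4)   # served from the cache
third  <- cache$lookup(5)   # a new key, so computed again

cache$misses  # 2: only two values were actually computed
```

Because the cache is mutated in place, every piece of code holding a reference to it benefits from values computed elsewhere.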
When Not to Use Reference Classes
Most R code should avoid Reference Classes:
- Data analysis pipelines - prefer data frames and the tidyverse; functional pipelines are cleaner
- Statistical modelling - use S3/S4 for model objects; they fit R’s ecosystem better
- Package development - unless you specifically need mutation, S3 is usually the right choice
- Parallel computing - refclass objects can cause headaches because of their reference semantics
The majority of your R code should be functional and side-effect free. That is easier to test, reason about, and share with other R programmers. Use Reference Classes only where mutable state is genuinely required.
Limitations
Reference Classes have some constraints:
- You cannot add fields after creation (that would invalidate existing objects)
- Field names starting with . are reserved for internal use
- The enclosing environment is used for the object itself, so you cannot use closures in the usual way
- Reference semantics can be surprising if you are expecting R's usual copy-on-modify behavior; call $copy() when you need an independent object
Summary
Reference Classes give R something it historically lacked: true mutable objects with reference semantics. They behave like objects in mainstream OOP languages, which can be either a benefit or a curse depending on context. The key is recognizing when you actually need mutation - and when you do not.
For most R programming, S3 remains the right tool. But when you are building simulations, GUIs, or stateful systems, Reference Classes are exactly what you need.
See Also
- S3 Classes in R — R’s simplest OOP system
- S4 Classes in R — R’s formal OOP system with multiple dispatch