Object-Oriented R with R6

R6 is a class system for R that brings true object-oriented programming with encapsulated, mutable objects. If you’ve worked with R for any length of time, you’ve probably encountered S3 and S4 classes—the traditional OO systems in R. They’re functional in nature, but they can feel awkward when you’re trying to build something that behaves like objects in other languages. That’s where R6 comes in.

What is R6?

R6 provides classes with reference semantics that behave like objects in languages such as Python or Java. (It is a separate package, not the same thing as base R’s built-in Reference Classes created with setRefClass().) Unlike S3 and S4, R6 objects are mutable: they change in place rather than being copied when you modify them. This makes them particularly useful for building stateful systems, caches, database connections, or anything where you need to maintain internal state across function calls.

The package was created by Winston Chang, and Shiny uses it internally. If you’ve used Shiny, you’ve indirectly used R6.

Here’s why you might choose R6 over other OO systems:

  • You need mutable state that persists across operations
  • You want true encapsulation—controlling what can be accessed and modified
  • You’re building something that feels more natural as an object with methods
  • You need inheritance to share behavior across related classes

Creating R6 classes

Creating an R6 class is straightforward. You define a class using R6Class() and specify its fields and methods:

library(R6)

Person <- R6Class("Person",
  public = list(
    name = NULL,
    age = NULL,
    
    initialize = function(name, age) {
      self$name <- name
      self$age <- age
    },
    
    greet = function() {
      paste0("Hello, my name is ", self$name)
    }
  )
)

# Create an instance
alice <- Person$new("Alice", 30)
alice$greet()

Notice the use of self to refer to the current object. This is similar to this in other languages. You’ll use self to access both fields and methods.

Fields, methods, and initialization

Fields are the data stored in your object. In the example above, name and age are fields. You can provide default values directly in the field definition:

Counter <- R6Class("Counter",
  public = list(
    count = 0,
    
    increment = function() {
      self$count <- self$count + 1
      invisible(self)
    },
    
    get_count = function() {
      self$count
    }
  )
)

The initialize method is special—it’s called automatically when you create a new instance with $new(). It’s your constructor, where you set up the initial state of the object.
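
To make the constructor's role concrete, here is a minimal sketch of a counter whose initialize method accepts a starting value and validates it. The class name StartCounter and the start argument are hypothetical, introduced only for this illustration:

```r
library(R6)

# Hypothetical variant of Counter whose constructor accepts a starting value
StartCounter <- R6Class("StartCounter",
  public = list(
    count = NULL,

    initialize = function(start = 0) {
      # Fail early on bad input instead of storing it
      stopifnot(is.numeric(start), length(start) == 1)
      self$count <- start
    },

    increment = function() {
      self$count <- self$count + 1
      invisible(self)
    }
  )
)

a <- StartCounter$new()            # uses the default start of 0
b <- StartCounter$new(start = 10)  # arguments to $new() are passed to initialize()
b$increment()
a$count  # 0
b$count  # 11
```

Arguments given to $new() are forwarded to initialize, so the validation runs before any object with a bad state can exist.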

Inheritance and class hierarchies

R6 supports inheritance, allowing you to create child classes that inherit fields and methods from parent classes:

Employee <- R6Class("Employee",
  inherit = Person,
  public = list(
    salary = NULL,
    
    initialize = function(name, age, salary) {
      super$initialize(name, age)
      self$salary <- salary
    },
    
    greet = function() {
      paste0(super$greet(), " and I earn $", self$salary)
    }
  )
)

bob <- Employee$new("Bob", 35, 50000)
bob$greet()

The super object lets you call methods from the parent class. This is essential when you want to extend behavior without completely replacing it.
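
The following sketch shows what inheritance looks like from the outside: a child class that defines no initialize of its own reuses the parent's, and the class vector records the whole hierarchy. Animal and Dog are hypothetical classes, kept small so the example stands alone:

```r
library(R6)

# Hypothetical parent class
Animal <- R6Class("Animal",
  public = list(
    name = NULL,
    initialize = function(name) {
      self$name <- name
    },
    speak = function() {
      paste0(self$name, " makes a sound")
    }
  )
)

# Hypothetical child class: overrides speak() but inherits initialize()
Dog <- R6Class("Dog",
  inherit = Animal,
  public = list(
    speak = function() {
      # Reuse the parent's text, then extend it
      paste0(super$speak(), ": Woof")
    }
  )
)

rex <- Dog$new("Rex")
rex$speak()              # "Rex makes a sound: Woof"
class(rex)               # "Dog" "Animal" "R6"
inherits(rex, "Animal")  # TRUE
```

Because the parent class name appears in class(rex), code that checks inherits(x, "Animal") accepts any subclass as well.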

Active bindings

Active bindings provide a way to define getters and setters that look like fields but actually run code when accessed. They’re computed properties:

Temperature <- R6Class("Temperature",
  public = list(
    celsius = 0,
    
    get_fahrenheit = function() {
      self$celsius * 9/5 + 32
    },
    
    set_fahrenheit = function(value) {
      self$celsius <- (value - 32) * 5/9
    }
  ),
  
  active = list(
    fahrenheit = function(value) {
      if (missing(value)) {
        self$get_fahrenheit()
      } else {
        self$set_fahrenheit(value)
      }
    }
  )
)

temp <- Temperature$new()
temp$celsius <- 100
temp$fahrenheit  # Returns 212
temp$fahrenheit <- 32
temp$celsius    # Returns 0

Active bindings are fantastic for creating read-only properties, computed fields, or validating assignments.
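
As a sketch of the read-only and validating cases, the hypothetical Circle class below stores its radius in a private field (the name .radius is just a convention chosen here) and exposes it through active bindings:

```r
library(R6)

Circle <- R6Class("Circle",
  public = list(
    initialize = function(radius) {
      # Assignment goes through the validating binding below
      self$radius <- radius
    }
  ),
  private = list(
    .radius = NULL
  ),
  active = list(
    # Validated property: rejects anything but a single non-negative number
    radius = function(value) {
      if (missing(value)) return(private$.radius)
      if (!is.numeric(value) || length(value) != 1 || value < 0) {
        stop("radius must be a single non-negative number")
      }
      private$.radius <- value
    },
    # Read-only computed property: any attempt to assign is an error
    area = function(value) {
      if (!missing(value)) stop("area is read-only")
      pi * private$.radius^2
    }
  )
)

c1 <- Circle$new(2)
c1$area              # pi * 2^2
c1$radius <- 3
c1$area              # pi * 3^2
try(c1$area <- 10)   # error: area is read-only
try(c1$radius <- -1) # error: radius must be a single non-negative number
```

The missing(value) check is what distinguishes a read from a write, so a read-only property is simply one that refuses the write branch.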

Reference semantics

This is the most important concept to understand about R6, and it trips up many newcomers. R6 objects use reference semantics, not value semantics. What does that mean?

With ordinary R objects, including S3 and S4 objects, assigning to a new variable gives you what behaves like an independent copy (R copies the data as soon as you modify it):

df1 <- data.frame(x = 1:5)
df2 <- df1
df2$x <- 10
df1$x  # Still 1:5 - df1 wasn't modified

With R6, when you assign an object to a new variable, both variables point to the same underlying object:

counter1 <- Counter$new()
counter2 <- counter1
counter2$increment()
counter1$get_count()  # Returns 1 - both reference the same object

This is powerful but dangerous. It means:

  1. Changes to one reference affect all references to that object
  2. The usual R pattern x <- modify(x) is no longer needed, and a function that modifies an R6 object also changes the caller’s object
  3. You need to be explicit about when you’re creating copies
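
The consequences for function calls can be sketched as follows. Tally, bump_list, and bump_r6 are hypothetical names used only for this comparison between an ordinary list and an R6 object:

```r
library(R6)

# Hypothetical minimal class for illustration
Tally <- R6Class("Tally",
  public = list(
    n = 0,
    bump = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  )
)

# Ordinary list: the function modifies its own local copy and returns it
bump_list <- function(x) {
  x$n <- x$n + 1
  x
}

# R6 object: the function modifies the caller's object and returns nothing
bump_r6 <- function(x) {
  x$bump()
  invisible(NULL)
}

plain <- list(n = 0)
ignored <- bump_list(plain)
plain$n  # 0: the caller's list is unchanged unless the result is reassigned

t1 <- Tally$new()
bump_r6(t1)
t1$n     # 1: the caller's object was modified in place
```

This is why functions that accept R6 objects should document whether they mutate their arguments.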

Here’s how you actually create an independent copy:

counter1 <- Counter$new()
counter2 <- counter1$clone()
counter2$increment()
counter1$get_count()  # Returns 0 - counter2 is independent

Practical example: a simple calculator class

Let’s put together everything we’ve learned into a working example—a calculator class with history:

Calculator <- R6Class("Calculator",
  public = list(
    history = character(0),
    
    add = function(a, b) {
      result <- a + b
      self$record("add", a, b, result)
      result
    },
    
    subtract = function(a, b) {
      result <- a - b
      self$record("subtract", a, b, result)
      result
    },
    
    multiply = function(a, b) {
      result <- a * b
      self$record("multiply", a, b, result)
      result
    },
    
    divide = function(a, b) {
      if (b == 0) stop("Cannot divide by zero")
      result <- a / b
      self$record("divide", a, b, result)
      result
    },
    
    get_history = function() {
      self$history
    },
    
    clear_history = function() {
      self$history <- character(0)
      invisible(self)
    },
    
    record = function(operation, a, b, result) {
      entry <- sprintf("%s(%g, %g) = %g", operation, a, b, result)
      self$history <- c(self$history, entry)
    }
  )
)

calc <- Calculator$new()
calc$add(2, 3)
calc$multiply(4, 5)
calc$get_history()
calc$clear_history()

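Notice that clear_history() returns self invisibly, which lets it be chained with other calls. The arithmetic methods return their numeric result instead, since that is what callers of a calculator want. The record() helper is public here for simplicity; in real code it would be a good candidate for a private method.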
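
Chaining only works through methods that return self. A minimal sketch, using a hypothetical Accumulator class in which every mutating method returns invisible(self):

```r
library(R6)

# Hypothetical class: each method updates the total and returns the object
Accumulator <- R6Class("Accumulator",
  public = list(
    total = 0,
    add = function(x) {
      self$total <- self$total + x
      invisible(self)
    },
    multiply = function(x) {
      self$total <- self$total * x
      invisible(self)
    }
  )
)

acc <- Accumulator$new()
acc$add(2)$multiply(5)$add(1)  # (0 + 2) * 5 + 1
acc$total  # 11
```

Using invisible() keeps the object from being printed at the console after each call while still making it available to the next method in the chain.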

Common pitfalls and best practices

Here are some things I’ve learned the hard way:

1. Forgetting that R6 objects are mutable. Always remember: assigning an R6 object doesn’t copy it. If you need a copy, call $clone(). This is especially important when passing R6 objects to functions or storing them in lists.

2. Using public fields when you should use private. R6 supports private fields that can’t be accessed from outside the class. Use them for internal state that external code shouldn’t touch:

SecureCounter <- R6Class("SecureCounter",
  public = list(
    get_count = function() private$count,
    increment = function() {
      private$count <- private$count + 1
      invisible(self)
    }
  ),
  private = list(
    count = 0
  )
)

3. Not handling NULL fields properly. Initialize fields explicitly or handle NULL cases in your methods. R6 doesn’t enforce types, so your code needs to be defensive.

4. Overusing R6. Just because you can use R6 doesn’t mean you should. For simple cases where you just need a function that returns a list with some behavior, S3 is often sufficient and lighter weight.

5. Forgetting to return something useful. Methods that modify state should usually return self if they’re intended to be chained, or return the modified value explicitly. Don’t leave users guessing.
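
Several of these points can be seen together in one small sketch. The Session class below is hypothetical: it keeps a counter in a private field, validates input, handles a field that is still NULL, and returns self from its mutating method:

```r
library(R6)

# Hypothetical class combining private state with defensive NULL handling
Session <- R6Class("Session",
  public = list(
    user = NULL,  # stays NULL until login() is called

    login = function(user) {
      # R6 does not enforce types, so check the input explicitly
      stopifnot(is.character(user), length(user) == 1)
      self$user <- user
      private$logins <- private$logins + 1
      invisible(self)
    },

    describe = function() {
      # Handle the NULL case instead of pasting NULL into a string
      if (is.null(self$user)) return("no user logged in")
      paste0(self$user, " (logins: ", private$logins, ")")
    }
  ),
  private = list(
    logins = 0
  )
)

s <- Session$new()
s$describe()                 # "no user logged in"
s$logins                     # NULL: private fields are not visible from outside
s$login("alice")$describe()  # "alice (logins: 1)"
```

Note that accessing a private field from outside does not raise an error; it quietly returns NULL, so a typo in a field name can go unnoticed.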

R6 vs S3 and S4

R6 uses reference semantics: when you modify an R6 object, all variables pointing to it see the change. This differs from S3 and S4, which use copy-on-modify semantics. In S3, modified <- original; modified$x <- 5 does not change original$x. In R6, modified <- original; modified$x <- 5 modifies the same underlying object, so original$x also becomes 5.

This reference behavior makes R6 appropriate for objects with shared mutable state: database connections, cache objects, progress trackers, and configuration managers. It is inappropriate for data containers that should behave like values; use tibbles or lists for those.

The clone() method

Since R6 objects are references, object2 <- object1 does not copy the object; both variables point to the same instance. To get an independent copy, use object1$clone(). By default the clone is shallow, so any R6 objects stored in fields are still shared; copying those as well requires object1$clone(deep = TRUE). Forgetting to use clone() when you need a copy is a common R6 bug: the “copy” and original share state silently.
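
The difference between a shallow and a deep clone only shows up when a field holds another R6 object. A minimal sketch, with hypothetical Engine and Car classes:

```r
library(R6)

# Hypothetical nested pair: a Car holds a reference to an Engine
Engine <- R6Class("Engine", public = list(hp = 100))

Car <- R6Class("Car",
  public = list(
    engine = NULL,
    initialize = function() {
      # Create the nested object per instance, not in the field definition,
      # so that separate Car instances do not share one Engine
      self$engine <- Engine$new()
    }
  )
)

car <- Car$new()
shallow <- car$clone()           # copies the Car, shares the Engine
deep <- car$clone(deep = TRUE)   # copies the Car and its Engine

car$engine$hp <- 250
shallow$engine$hp  # 250: still the same Engine as car
deep$engine$hp     # 100: independent Engine
```

For fields holding other kinds of reference objects, such as plain environments, the default deep clone does not copy them; R6 lets a class define a private deep_clone method to control that behavior.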

When to use R6 for package development

R6 is useful in package development when you need objects that accumulate state over multiple method calls: a builder pattern, a connection pool, or an event emitter. The tidyverse uses R6 internally for several such purposes (e.g., R6 classes in httr2). For public APIs, the tidyverse style guide recommends S3 over R6 because S3 is more familiar to most R users and plays better with generic functions like print() and summary().

Summary

R6 provides reference semantics with a clean class definition syntax. Use it when you need mutable objects that change in place: connection pools, stateful iterators, or objects shared across function calls. For most data analysis work, functional approaches with S3 dispatch are simpler and more idiomatic in R. R6’s value is in systems programming within R: building abstractions that manage state over time, like a file writer that buffers output or a rate limiter that tracks request timing.

R6 fills an important gap in R’s OO landscape. It’s not the right tool for every job, but when you need true objects with mutable state, it’s exactly what you’re looking for.