
Object-Oriented Programming with R6

R6 is the package most R developers reach for when they want encapsulated, mutable, reference-based objects. If you have ever built an S3 generic and wished you could keep the state and the methods in one place, or if you have been bitten by copy-on-modify semantics in a long pipeline, R6 classes are the answer.

This guide walks through the R6 API in the order you will actually use it: defining a class, exposing public and private members, building fields that compute on access, inheriting from a parent, and avoiding the three reference-semantics traps that catch everyone at least once. R6 2.6.1 (the current CRAN release) is the version used for every example below.

Why R6, and where it fits in R’s OOP zoo

R ships with three object systems, and most R users have met at least one of them:

  • S3 dispatches a generic on the class of its first argument. State lives wherever you put it (a list, an environment, a column in a data frame), and methods are normal functions. Cheap, flexible, and the lingua franca of base R and the tidyverse.
  • S4 adds formal class definitions, multiple dispatch, and slot checking. Powerful, but the syntax is heavier and the gotchas (method selection, setGeneric collisions) are legendary.
  • RC (Reference Classes, a.k.a. R5) is base R’s built-in mutable-object system. Methods live on the object. The catch: it is built on S4, which makes cross-package inheritance painful and adds noticeable overhead.

R6 gives you the RC mental model (methods on the object, mutable state, obj$method() syntax) without the S4 baggage. It is implemented on top of R environments rather than S4, which keeps it fast, portable across packages, and friendly to debugging. Trade-off: R6 is from CRAN, not base R, so you add R6 to your DESCRIPTION. The official comparison lives in the Advanced R chapter on OOP, and the package documentation is at r6.r-lib.org.

If you want a tour of the other systems first, see the S3 classes guide, the S4 classes guide, or the R5 reference classes guide. For a broader “OOP in R” framing that includes design trade-offs, see the existing Object-Oriented R6 overview.

Defining your first R6 class

You build a class with R6Class(), then call Generator$new() to create an instance. Methods become members of the public list, and any non-function in that list is treated as a field. The self and private bindings are provided automatically inside every method. You never pass them as arguments.

The minimal example, lifted from the official introduction:

library(R6)

Accumulator <- R6Class("Accumulator", list(
  sum = 0,
  add = function(x = 1) {
    self$sum <- self$sum + x
    invisible(self)
  }
))

x <- Accumulator$new()
x$add(4)
x$sum
#> [1] 4
x$add(10)$add(10)$sum
#> [1] 24

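Two conventions to settle on from day one: UpperCamelCase for class names (they show up in print() output and in error messages), and snake_case for fields and methods. Returning self at the end of mutating methods is what enables the x$add(10)$add(10)$sum chain, and wrapping it in invisible() keeps the object from auto-printing at the console. Drop the return and the chain fails at the second call, because the first add() hands back a number instead of the object.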
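As a minimal sketch of that failure mode, the hypothetical NoChain class below (not part of the original example) omits the return of self. The first add() still mutates the object, but the next link in the chain has nothing to call a method on.

```r
library(R6)

# Hypothetical class for illustration: add() forgets to return self.
NoChain <- R6Class("NoChain", list(
  sum = 0,
  add = function(x = 1) {
    self$sum <- self$sum + x   # the function's value is the number assigned
  }
))

y <- NoChain$new()
res <- try(y$add(10)$add(10), silent = TRUE)   # second $add is applied to a number
inherits(res, "try-error")
#> [1] TRUE
y$sum   # the first add() ran before the chain failed
#> [1] 10
```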

Public, private, and the $ accessor

Real classes need a constructor that validates input and a way to keep internal state hidden. R6 supports both through initialize() and the private list.

BankAccount <- R6Class("BankAccount",
  public = list(
    initialize = function(owner, balance = 0) {
      stopifnot(is.character(owner), length(owner) == 1)
      stopifnot(is.numeric(balance), length(balance) == 1, balance >= 0)
      private$owner  <- owner
      private$balance <- balance
    },
    deposit = function(amount) {
      stopifnot(is.numeric(amount), amount > 0)
      private$balance <- private$balance + amount
      invisible(self)
    },
    withdraw = function(amount) {
      stopifnot(is.numeric(amount), amount > 0)
      if (amount > private$balance) stop("insufficient funds")
      private$balance <- private$balance - amount
      invisible(self)
    },
    describe = function() {
      cat(sprintf("%s has $%.2f\n", private$owner, private$balance))
      invisible(self)
    }
  ),
  private = list(
    owner   = NULL,
    balance = 0
  )
)

acct <- BankAccount$new("Ada", 100)
acct$deposit(50)$withdraw(25)$describe()
#> Ada has $125.00
acct$balance      # NULL (private, not visible from outside)

A note on the private$ boundary: nothing in R is truly private. A determined caller can still walk the enclosure environment and grab private$balance directly. What private$ actually buys you is refactoring safety: you can rename or reshape internal fields without worrying that some external caller was reaching into them, because the access pattern is mediated by methods. Treat it as a barrier against accidental misuse, not against adversarial access, and never store secrets in private fields. The stop() and is.numeric() reference pages cover the validation helpers used here.
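The sketch below shows how thin the boundary is. Vault is a hypothetical class, and .__enclos_env__ is an internal detail of R6 objects rather than a supported interface, so treat this as a demonstration of the point, not a pattern to rely on.

```r
library(R6)

# Hypothetical class used only to illustrate the privacy boundary.
Vault <- R6Class("Vault",
  public = list(
    peek = function() private$contents
  ),
  private = list(contents = "not really hidden")
)

v <- Vault$new()
v$contents                            # not reachable through the public interface
#> NULL
v$.__enclos_env__$private$contents    # reachable through the enclosing environment
#> [1] "not really hidden"
```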

Active bindings: fields that compute

Sometimes you want a member that looks like a field but runs a function on read or write. Active bindings fill that role. They are always public, and they take a single argument (value); use missing(value) to distinguish reads from writes.

Person <- R6Class("Person",
  private = list(.age = NA, .name = NULL),
  active = list(
    age = function(value) {
      if (missing(value)) return(private$.age)
      stop("`$age` is read only", call. = FALSE)
    },
    name = function(value) {
      if (missing(value)) return(private$.name)
      stopifnot(is.character(value), length(value) == 1)
      private$.name <- value
      self
    }
  ),
  public = list(
    initialize = function(name, age = NA) {
      private$.name <- name
      private$.age  <- age
    }
  )
)

p <- Person$new("Lin", 30)
p$name
#> [1] "Lin"
p$age
#> [1] 30
p$age <- 40
#> Error: `$age` is read only

Two things to remember. First, an active binding with no value argument will error on assignment (the call site passes the right-hand side as an argument, and the function does not accept any). Second, you can mix read-only and read/write bindings in the same active list. The pattern of if (missing(value)) is the whole API.
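Both points can be seen in a small sketch. Rect is a hypothetical class: area is a read-only binding computed from two public fields, and broken is deliberately declared without a value argument to show the assignment error.

```r
library(R6)

# Hypothetical class: `area` is computed on read; `broken` takes no argument.
Rect <- R6Class("Rect",
  public = list(
    w = 1,
    h = 1,
    initialize = function(w, h) {
      self$w <- w
      self$h <- h
    }
  ),
  active = list(
    area = function(value) {
      if (missing(value)) return(self$w * self$h)
      stop("`$area` is read only", call. = FALSE)
    },
    broken = function() "fine on read"
  )
)

r <- Rect$new(2, 3)
r$area
#> [1] 6
r$broken
#> [1] "fine on read"
res <- try(r$broken <- 1, silent = TRUE)   # the write passes an argument the function rejects
inherits(res, "try-error")
#> [1] TRUE
```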

Inheritance and method chaining

R6 supports single inheritance with the inherit argument. Override a method by giving the subclass its own version, and call the parent's version with super$method(...). Note the surprising-but-useful fact: a subclass can reach its superclass’s private members, with inherited fields available through private$ and the parent's methods through super$, which is friendlier than the strict privacy rules in C++ or Java.

Queue <- R6Class("Queue",
  public = list(
    initialize = function(...) {
      for (item in list(...)) self$add(item)  # go through add() so subclass overrides see each item
    },
    add = function(x) {
      private$queue <- c(private$queue, list(x))
      invisible(self)
    },
    remove = function() {
      if (private$length() == 0) return(NULL)
      head <- private$queue[[1]]
      private$queue <- private$queue[-1]
      head
    }
  ),
  private = list(
    queue  = list(),
    length = function() base::length(private$queue)
  )
)

CountingQueue <- R6Class("CountingQueue",
  inherit = Queue,
  public = list(
    add = function(x) {
      private$total <- private$total + 1
      super$add(x)
    },
    get_total = function() private$total
  ),
  private = list(total = 0)
)

cq <- CountingQueue$new("x", "y")
cq$get_total()
#> [1] 2

Method chaining flows through inheritance: as long as the superclass method ends with invisible(self), the subclass can call super$add(x) and then continue the chain.
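A minimal sketch of that behaviour, using a hypothetical Logger/LoudLogger pair kept separate from the queue example so it runs on its own:

```r
library(R6)

# Hypothetical parent class with a chainable method.
Logger <- R6Class("Logger",
  public = list(
    lines = character(),
    log = function(msg) {
      self$lines <- c(self$lines, msg)
      invisible(self)
    }
  )
)

# The override ends with super$log(), so it returns self as well.
LoudLogger <- R6Class("LoudLogger",
  inherit = Logger,
  public = list(
    log = function(msg) {
      super$log(toupper(msg))
    }
  )
)

l <- LoudLogger$new()
l$log("a")$log("b")$lines
#> [1] "A" "B"
```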

Reference semantics: the part that bites

R6 objects are environments under the hood, which means assignment does not copy. Three rules cover almost every surprise you will hit:

  1. b <- a is not a copy. Both names point to the same object. Use a$clone() (or a$clone(deep = TRUE) for nested R6 fields) when you need an independent copy.
  2. Fields with reference semantics must be built inside initialize(). If you put public = list(e = SomeR6$new()) in the class body, every instance shares the same nested object. This is the most common R6 bug.
  3. Class edits are not retroactive. Generator$set("public", "foo", value) only affects objects created by future Generator$new() calls. Already-built instances keep the methods they had at construction.
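Rules 1 and 3 can be sketched in a few lines. Counter and the twice method are hypothetical names chosen for illustration.

```r
library(R6)

# Hypothetical class for illustration.
Counter <- R6Class("Counter", public = list(n = 0))

# Rule 1: assignment aliases, clone() copies.
a <- Counter$new()
b <- a            # same object under two names
d <- a$clone()    # independent copy
a$n <- 5
b$n
#> [1] 5
d$n
#> [1] 0

# Rule 3: set() changes the generator, not objects that already exist.
Counter$set("public", "twice", function() self$n * 2)
is.null(a$twice)          # built before set(): no such member
#> [1] TRUE
Counter$new()$twice()     # built after set(): method is present
#> [1] 0
```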

The Bad/Good pattern below shows rule 2 in action:

Simple <- R6Class("Simple", public = list(x = NULL))

# BAD: shared field (every instance points to the same Simple)
Bad <- R6Class("Bad", public = list(e = Simple$new()))

# GOOD: each instance gets its own Simple
Good <- R6Class("Good",
  public = list(
    e = NULL,
    initialize = function() self$e <- Simple$new()
  )
)

b1 <- Bad$new();  b2 <- Bad$new();  b1$e$x <- 9
b2$e$x   # [1] 9 (leaked across instances)
g1 <- Good$new(); g2 <- Good$new(); g1$e$x <- 9
g2$e$x   # [1] NULL (independent)

The same trap applies to environments and lists of R6 objects: anything with reference semantics. The purrr::map() reference page is a good starting point for iterating safely over collections of R6 instances.

Cloning, finalize, and locking

Three more pieces round out the API:

  • Cloning. $clone() is a shallow copy; $clone(deep = TRUE) recurses into fields that are themselves R6 objects. It does not recurse into environments, lists, or R5/RC objects. If you need custom deep-clone logic (say, to copy an external connection or a list of R6 objects), define private$deep_clone = function(name, value) { ... } and it will run once per field when $clone(deep = TRUE) fires. If you never need to clone, set cloneable = FALSE in R6Class() to skip the ~84 kB of method machinery on the class.
  • $finalize(). A private$finalize runs once: when the object is garbage-collected or, if it has not been collected by then, when R exits (the underlying reg.finalizer() uses onexit = TRUE). Use it to close files, disconnect from databases, or unregister callbacks. As of R6 2.6.0, a public finalize prints a deprecation warning and is on the path to removal, so keep finalizers in private.
  • Locking. Set lock_objects = TRUE (the default) to freeze the per-instance public and private environments so fields can’t be added at runtime. Set lock_class = TRUE, or call Generator$lock(), to freeze the class definition itself: future $new() calls still work, but you cannot add members via Generator$set(). The old lock= argument was removed in R6 2.6.0, so write lock_objects if you need to spell it out.
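The cloning and locking points can be sketched together. Item and Basket are hypothetical classes; Basket keeps its R6 objects in a list, which is the case where the default deep clone is not enough and a private deep_clone method is needed.

```r
library(R6)

Item <- R6Class("Item", public = list(v = 0))

# Hypothetical container holding a list of R6 objects.
Basket <- R6Class("Basket",
  public = list(
    items = NULL,
    initialize = function() self$items <- list(Item$new(), Item$new())
  ),
  private = list(
    # Called once per field during clone(deep = TRUE).
    deep_clone = function(name, value) {
      if (name == "items") lapply(value, function(it) it$clone()) else value
    }
  )
)

b1 <- Basket$new()
b2 <- b1$clone(deep = TRUE)
b1$items[[1]]$v <- 99
b2$items[[1]]$v          # unaffected, because deep_clone copied each Item
#> [1] 0

# lock_objects = TRUE (the default) rejects new fields on an instance.
res <- try(b1$extra <- 1, silent = TRUE)
inherits(res, "try-error")
#> [1] TRUE
```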

Common pitfalls

These come up over and over, mostly in the R6 issue tracker and on RStudio Community:

  • “Why does my field appear linked across all instances?” You assigned an R6 object or environment in the class body. Move construction into initialize().
  • “I changed the class definition but my existing object still has the old methods.” Expected. R6 binds methods to the instance at $new() time, not to the generator. Run Generator$new() again to get the new methods.
  • “Why is $print() printing twice?” Your print() method returns a value without invisible(). End it with invisible(self).
  • “Private members are leaking.” In R there is no enforced privacy. The barrier is the private$ accessor. Do not store secrets in private fields.
  • “I tried lock = TRUE and got an error.” That argument was removed in R6 2.6.0, so R6Class() no longer accepts it. Use lock_objects = TRUE.
  • “Active binding with no arguments errors on assignment.” Give the function a value parameter, and branch on missing(value) for read vs. write.
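For the print() pitfall, a minimal sketch of a well-behaved method follows. Temp is a hypothetical class; capture.output() is used only to show that a single line is produced when the method is called directly.

```r
library(R6)

# Hypothetical class with a custom print method.
Temp <- R6Class("Temp",
  public = list(
    deg = 20,
    print = function(...) {
      cat("<Temp> ", self$deg, " C\n", sep = "")
      invisible(self)   # a visible return would be auto-printed after a direct t1$print() call
    }
  )
)

t1 <- Temp$new()
out <- capture.output(t1$print())
out
#> [1] "<Temp> 20 C"
```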

Conclusion

R6 classes give R a clean way to write encapsulated, reference-based code without dragging in S4. The core API is small (R6Class(), public, private, active, inherit, $new(), $clone()), but the reference semantics take some getting used to. Internalize the three rules (use $clone(), build reference fields in initialize(), remember that class edits are not retroactive) and you will avoid most of the surprises.

From here, the natural next step is to wrap an R6 class in an R package so you can version it, document it, and share it. The package needs R6 in Imports and either fully qualified R6::R6Class() calls or @import R6 in roxygen. The class needs no special NAMESPACE directive of its own: export the generator like any other object (for example with @export) if users should be able to call $new() on it. S3 generics dispatch on R6 objects automatically, because every instance carries the class attribute c("YourClass", "R6").
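The S3 dispatch point can be sketched with an ordinary method definition. Celsius and format.Celsius are hypothetical names; the example assumes the default class = TRUE in R6Class(), which is what sets the class attribute.

```r
library(R6)

# Hypothetical class; the default class = TRUE sets the S3 class attribute.
Celsius <- R6Class("Celsius", public = list(
  deg = NULL,
  initialize = function(deg) self$deg <- deg
))

# An ordinary S3 method, dispatched on the R6 class name.
format.Celsius <- function(x, ...) paste0(x$deg, " C")

t1 <- Celsius$new(21)
class(t1)
#> [1] "Celsius" "R6"
format(t1)
#> [1] "21 C"
```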

See Also

  • class(): introspect an R6 object’s S3 class chain.
  • stop(): signal errors from initialize() and finalize().
  • Environments as fields: the underlying mechanism R6 is built on.