Async R with future and promises
R is traditionally single-threaded, meaning your code executes one statement at a time. When you run a long computation (training a model, downloading a large dataset, or querying a database) your entire R session blocks until it completes. This creates a poor user experience in interactive applications like Shiny apps, where the interface becomes unresponsive during heavy processing and every concurrent user waits in a queue behind the slow operation.
The future and promises packages provide a practical solution to this problem. They enable asynchronous programming in R, allowing you to delegate expensive operations to background processes while your main R session remains free to handle other work. This async model is familiar to JavaScript and Python developers but brings its own R-specific patterns worth understanding before you adopt it in production code.
Understanding futures
A future is an abstraction for a value that may not be available yet. When you create a future, R immediately returns a promise-like object while the actual computation happens in the background. The moment you try to access the value, R either retrieves the result (if ready) or blocks until it is.
The future package handles the “invoking” side of async programming: launching computations in separate R processes. Here’s a basic example:
library(future)
# Set up parallel processing
plan(multisession, workers = 2)
# Create a future - computation runs in background
f <- future({
Sys.sleep(2) # Simulate expensive operation
sqrt(144)
})
# This prints immediately - future runs in background
cat("Future created, doing other work...\n")
# Accessing the value blocks until ready
result <- value(f)
print(result) # 12
Calling value(f) is the simplest way to retrieve a future’s result, but it blocks the current session until the computation finishes. The %<-% operator is a convenient shorthand that creates a future and binds its eventual value to a variable. The assignment itself returns immediately, so subsequent code keeps running; R blocks only when the variable is first read and the result is not ready yet:
plan(multisession)
result %<-% {
  Sys.sleep(1)
  sum(1:1000)
}
# Reading `result` blocks until the future has resolved
print(result) # 500500
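If blocking on value() is not acceptable, one option is to poll with resolved(), which reports whether a future has finished without waiting for it. The following is a minimal sketch, assuming a two-worker multisession plan; the 0.1-second polling interval and the `polls` counter are illustrative choices, not recommendations.

```r
library(future)

plan(multisession, workers = 2)

f <- future({
  Sys.sleep(1)  # Simulate expensive operation
  sqrt(144)
})

# Poll without blocking on the result; other work could run inside this loop
polls <- 0
while (!resolved(f)) {
  polls <- polls + 1
  Sys.sleep(0.1)
}

# The future is already resolved, so value() returns without waiting
result <- value(f)
print(result)

plan(sequential)  # Shut down the background workers
```

In an interactive application you would typically replace the loop with promise callbacks, but polling can be enough for simple scripts.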
Future backends
The future package supports multiple backends for executing async code:
| Backend | Description |
|---|---|
| sequential | Default; runs in the current R process (not async) |
| multisession | Runs in background R sessions on the same machine |
| multicore | Uses forked processes (faster, but not supported on Windows) |
| cluster | Runs on remote machines or an ad-hoc cluster |
# Use multiple background R processes
plan(multisession, workers = 4)
# Use cluster for distributed computing
plan(cluster, workers = c("node1", "node2", "node3"))
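To check that a backend really runs code outside the main session, you can compare process IDs. This sketch assumes a local machine with at least two available cores; the variable names are illustrative. It also relies on plan() invisibly returning the previous plan, which makes it easy to restore afterwards.

```r
library(future)

# plan() invisibly returns the previous plan, so it can be restored later
old_plan <- plan(multisession, workers = 2)

n_workers <- nbrOfWorkers()

# The future body runs in a different R process than the main session
main_pid <- Sys.getpid()
worker_pid <- value(future(Sys.getpid()))

cat("Workers:", n_workers, "\n")
cat("Ran in a separate process:", worker_pid != main_pid, "\n")

plan(old_plan)  # Restore the previous backend
```

Under plan(sequential) the two process IDs would be identical, which is a quick way to confirm that a given plan is not actually asynchronous.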
Understanding promises
The promises package handles the “handling” side of async programming: working with the results once they’re ready. While future creates the async tasks, promises provides the API for registering callbacks and chaining operations. Together, they form a complete async toolkit where future handles execution and promises manages the flow of data through callbacks. This separation of concerns means you can swap backends without changing your promise chain logic.
A promise represents the eventual result of an async operation. It can be:
- Fulfilled: The operation completed successfully with a value
- Rejected: The operation failed with an error
- Pending: The operation hasn’t completed yet
library(promises)
# Create a promise; the function must call resolve() or reject() to settle it
p <- promise(function(resolve, reject) {
  resolve("Hello, async world!")
})
# Register a callback to run when the promise resolves
then(p, onFulfilled = function(value) {
print(value) # "Hello, async world!"
})
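The fulfilled and rejected states can be explored without any background work by using promise_resolve() and promise_reject(), which create promises that are already settled. The sketch below is illustrative: the `outcomes` vector is a hypothetical helper for recording which callbacks ran, and the explicit later::run_now() loop is only needed in non-interactive scripts, where nothing else drives the event loop.

```r
library(promises)

outcomes <- character()

# promise_resolve() and promise_reject() create already-settled promises
ok <- promise_resolve("Hello, async world!")
bad <- promise_reject(simpleError("something failed"))

then(ok, onFulfilled = function(value) {
  outcomes <<- c(outcomes, paste("fulfilled:", value))
})
then(bad, onRejected = function(error) {
  outcomes <<- c(outcomes, paste("rejected:", conditionMessage(error)))
})

# Callbacks are scheduled on the event loop rather than run immediately;
# in a plain script, run the loop manually until nothing is pending
while (!later::loop_empty()) later::run_now()

print(outcomes)
```

In Shiny or at an idle interactive console the event loop runs on its own, so the manual loop at the end is not required there.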
Combining future and promises
The real power comes from combining both packages. The future_promise() function from the promises package wraps an expression in a future and returns a promise that resolves when the background computation finishes. This bridge between the two packages means you can launch heavy computations in background workers while using promises to orchestrate what happens next:
library(future)
library(promises)
plan(multisession, workers = 2)
# Create a future and convert it to a promise
fp <- future_promise({
Sys.sleep(2)
iris |> head()
})
# Use then() to handle the result
then(fp, onFulfilled = function(df) {
cat("Data received:\n")
print(head(df))
})
This pattern is especially useful in Shiny applications, where you want to keep the UI responsive while performing heavy computations.
Chaining async operations
One of the most powerful features of promises is chaining. Each then() call returns a new promise, allowing you to compose multiple async operations:
library(future)
library(promises)
plan(multisession)
# Create an async pipeline
future_promise({
data.frame(x = 1:10, y = rnorm(10))
}) |>
then(function(df) {
df$z <- df$x + df$y
df
}) |>
then(function(df) {
mean(df$z)
}) |>
then(function(avg) {
cat("Average:", avg, "\n")
avg
})
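A then() callback may itself return a promise, in which case the chain waits for that promise before moving on. This makes it possible to run several background steps in sequence. The following is a small sketch under the same multisession assumption as above; `final` is a hypothetical variable used only to capture the result for inspection.

```r
library(future)
library(promises)

plan(multisession, workers = 2)

final <- NULL

future_promise({
  1:10                       # Step 1 runs in a background worker
}) |>
  then(function(x) {
    # Returning a promise from a callback adds another async step;
    # the chain waits for it before calling the next then()
    future_promise(sum(x))   # Step 2 also runs in a background worker
  }) |>
  then(function(total) {
    final <<- total          # Receives the number, not a promise object
  })

# In a plain script, run the event loop until the chain has settled
while (!later::loop_empty()) later::run_now(timeoutSecs = 0.1)

print(final)

plan(sequential)
```

Callbacks that return plain values, as in the pipeline above, run in the main session, so only the future_promise() bodies should contain heavy work.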
Error handling
Promises provide reliable error handling through the onRejected callback. When a future computation throws an error, the promise is rejected instead of fulfilled, and you can catch that rejection in a handler without the entire pipeline crashing. This keeps your async workflows resilient even when individual steps fail unexpectedly:
library(future)
library(promises)
plan(multisession)
fp <- future_promise({
stop("Something went wrong!")
})
# Handle success and failure separately
then(fp,
onFulfilled = function(value) {
cat("Success:", value, "\n")
},
onRejected = function(error) {
cat("Error caught:", error$message, "\n")
NA # Return a default value
}
)
The catch() function provides a simpler syntax for error handling when you only care about the failure path. Instead of registering both onFulfilled and onRejected callbacks, catch() attaches a single error handler to the promise chain. It returns a default value that downstream then() calls can process, keeping the pipeline flowing even after a rejection:
future_promise({
log("not a number")
}) |>
catch(function(err) {
cat("Caught error:", err$message, "\n")
0
})
Use finally() to run cleanup code regardless of success or failure. This is essential for releasing resources like file handles, network connections, or temporary files that would otherwise leak. When placed at the end of a chain, the finally() callback runs after any then() and catch() handlers before it, so cleanup happens whether the promise was fulfilled or rejected:
temp_file <- tempfile()
future_promise({
write.csv(mtcars, temp_file, row.names = FALSE)
read.csv(temp_file)
}) |>
finally(function() {
# Cleanup runs whether promise succeeded or failed
unlink(temp_file)
})
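To see how then(), catch(), and finally() interact on a failing computation, it can help to record which handlers run. This is an illustrative sketch: the `events` vector is a hypothetical logging helper, and the manual event-loop drain is only needed outside Shiny or an interactive console.

```r
library(future)
library(promises)

plan(multisession, workers = 2)

events <- character()

future_promise({
  stop("Something went wrong!")
}) |>
  then(function(value) {
    events <<- c(events, "then")      # Skipped: the promise was rejected
    value
  }) |>
  catch(function(err) {
    events <<- c(events, paste("catch:", conditionMessage(err)))
    NA                                # Recover with a default value
  }) |>
  finally(function() {
    events <<- c(events, "finally")   # Runs on success and on failure
  })

# Run the event loop until every pending callback has fired
while (!later::loop_empty()) later::run_now(timeoutSecs = 0.1)

print(events)

plan(sequential)
```

The rejection skips the then() step, is handled by catch(), and finally() runs last.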
Practical example: parallel data processing
Here is a realistic example that processes multiple datasets in parallel. It combines two patterns covered above, launching futures in a background plan and chaining operations with then(), with promise_all(), which returns a single promise that is fulfilled once every input promise is fulfilled and rejected as soon as any of them is rejected. This pattern is common in ETL pipelines where you need to process independent data chunks concurrently:
library(future)
library(promises)
library(purrr)
plan(multisession, workers = 4)
# Simulate processing multiple files
process_file <- function(i) {
Sys.sleep(0.5) # Simulate reading/processing
data.frame(
file = paste0("file_", i),
rows = sample(100:1000, 1),
avg_value = runif(1)
)
}
# Launch all processing tasks in parallel; each call returns a promise.
# seed = TRUE gives each task a parallel-safe random number stream.
tasks <- map(1:8, function(i) future_promise(process_file(i), seed = TRUE))
# Combine the results once every task has finished
promise_all(.list = tasks) |>
  then(function(results) {
    do.call(rbind, results)
  }) |>
  then(function(final_df) {
    cat("Processed", nrow(final_df), "rows across", length(unique(final_df$file)), "files\n")
    print(head(final_df))
  })
Key limitations to remember
- Globals: Variables from the parent environment are automatically exported to background processes. Large data can significantly slow down execution. Use the globals parameter to manually specify what’s needed.
- Package loading: Packages must be available in the background process. Use the packages parameter to explicitly load required packages.
- Native resources: Database connections and network sockets created in the parent process cannot be used in child processes; create them within the future block instead.
- Return values: Large return values require serialization, which takes time. For very large datasets, consider writing results to disk instead.
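The first two points can be sketched with the globals and packages arguments of future(). The object names below (`big_data`, `threshold`) are hypothetical and exist only to show that the exported set can be stated explicitly instead of being detected automatically.

```r
library(future)

plan(multisession, workers = 2)

big_data <- rnorm(1e6)   # Large object the future does not need
threshold <- 0.5         # Small object the future does need

f <- future({
  # Only the globals listed below are exported to the worker
  median(c(threshold, 1, 2))
}, globals = "threshold", packages = "stats")

result <- value(f)
print(result)

plan(sequential)
```

Listing globals explicitly skips the automatic scan and makes the exported set predictable, at the cost of having to keep the list in sync with the code inside the future.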
Choosing your async R strategy
Start simple and add complexity only when you need it. For most users, plan(multisession) with future_promise() covers the common case of keeping a Shiny app responsive during heavy computation. The future package alone is enough for parallel batch processing where you do not need promise chains.
When you do need promise chains (composing multiple async operations, branching on results, collecting parallel tasks), the promises package adds the then(), catch(), and finally() primitives that make those workflows readable. The key insight is that future handles where code runs and promises handles what happens with the result. Keeping those concerns separate makes your async code easier to test, debug, and refactor as your application grows. For data science workflows, the furrr package is often the best middle ground: it parallelizes purrr-style iteration with a familiar API and runs on whichever future backend you select with plan().
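As a rough illustration of the furrr option, the sketch below assumes the furrr package is installed and uses future_map_dbl(), its parallel counterpart to purrr::map_dbl(). The squaring function and the short sleep are placeholders for real per-element work.

```r
library(future)
library(furrr)

plan(multisession, workers = 2)

# Each element is processed on one of the background workers
squares <- future_map_dbl(1:8, function(x) {
  Sys.sleep(0.1)  # Simulate per-element work
  x^2
})

print(squares)

plan(sequential)
```

Unlike future_promise(), this call blocks until all elements are done, which suits batch scripts better than interactive applications.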
See also
- Parallel Computing in R: traditional parallel backends with foreach, snow, and multicore
- Parallel Mapping with furrr: purrr-style parallel iteration across CPU cores
- Shiny Modules: build modular Shiny apps that stay responsive with async patterns