Async R with future and promises
R is traditionally single-threaded, meaning your code executes one statement at a time. When you run a long computation—training a model, downloading a large dataset, or querying a database—your entire R session blocks until it completes. This creates a poor user experience in interactive applications like Shiny apps, where the interface becomes unresponsive during heavy processing.
The future and promises packages provide a solution. They enable asynchronous programming in R, allowing you to delegate expensive operations to background processes while your main R session remains free to handle other work.
Understanding futures
A future is an abstraction for a value that may not be available yet. When you create a future, R immediately returns a promise-like object while the actual computation happens in the background. The moment you try to access the value, R either retrieves the result (if ready) or blocks until it is.
The future package handles the “invoking” side of async programming—launching computations in separate R processes. Here’s a basic example:
library(future)
# Set up parallel processing
plan(multisession, workers = 2)
# Create a future - computation runs in background
f <- future({
  Sys.sleep(2) # Simulate expensive operation
  sqrt(144)
})
# This prints immediately - future runs in background
cat("Future created, doing other work...\n")
# Accessing the value blocks until ready
result <- value(f)
print(result) # 12
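If blocking on value() is not what you want, the future package also exports resolved(), which reports whether a future has finished without waiting for it. The following is a minimal sketch, assuming a multisession plan; the polling loop is only a stand-in for whatever other work the main session would do.

```r
library(future)

plan(multisession, workers = 2)

f <- future({
  Sys.sleep(1) # Simulate expensive operation
  sqrt(144)
})

# resolved() returns TRUE or FALSE immediately, so the main session
# stays free to do other work between checks
while (!resolved(f)) {
  Sys.sleep(0.1) # Placeholder for useful work
}

# The future has finished, so value() returns without waiting
result <- value(f)

# Shut down the background workers
plan(sequential)
```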
The %<-% operator provides a convenient shorthand for creating futures:
plan(multisession)
result %<-% {
  Sys.sleep(1)
  sum(1:1000)
}
print(result) # 500500
Future backends
The future package supports multiple backends for executing async code:
| Backend | Description |
|---|---|
| sequential | Default, runs in current R process (not async) |
| multisession | Runs in background R sessions on the same machine |
| multicore | Uses forked processes (faster but doesn’t work on Windows) |
| cluster | Runs on remote machines or an ad-hoc cluster |
# Use multiple background R processes
plan(multisession, workers = 4)
# Use cluster for distributed computing
plan(cluster, workers = c("node1", "node2", "node3"))
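Because multicore is not available everywhere, a script that has to run on several platforms can pick its backend at run time. The sketch below uses availableCores() and supportsMulticore(), both exported by future; leaving one core free for the main session is an illustrative choice, not a recommendation from the package.

```r
library(future)

# Leave one core free for the main R session (an illustrative choice)
n_workers <- max(1L, availableCores() - 1L)

# supportsMulticore() is FALSE on Windows and inside RStudio,
# where forked processes are unavailable or unsafe
if (supportsMulticore()) {
  plan(multicore, workers = n_workers)
} else {
  plan(multisession, workers = n_workers)
}

# Run a trivial task on whichever backend was selected
f <- future(Sys.getpid())
worker_pid <- value(f)

# Shut down the background workers
plan(sequential)
```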
Understanding promises
The promises package handles the “handling” side of async programming—working with the results once they’re ready. While future creates the async tasks, promises provides the API for registering callbacks and chaining operations.
A promise represents the eventual result of an async operation. It can be:
- Fulfilled: The operation completed successfully with a value
- Rejected: The operation failed with an error
- Pending: The operation hasn’t completed yet
library(promises)
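# Create a promise; the action function must call resolve() or reject()
p <- promise(function(resolve, reject) {
  # Resolve after one second without blocking the main R session
  later::later(function() resolve("Hello, async world!"), delay = 1)
})
# Register a callback to run when the promise resolves
then(p, onFulfilled = function(value) {
  print(value) # "Hello, async world!"
})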
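To make the three states concrete, here is a small sketch that fulfills one promise and rejects another. The safe_divide() helper is hypothetical and exists only for illustration. Callbacks run on the event loop provided by the later package, so a plain script has to drain that loop explicitly; Shiny does this for you.

```r
library(promises)

# Hypothetical helper: returns a promise that is fulfilled with x / y,
# or rejected when y is zero
safe_divide <- function(x, y) {
  promise(function(resolve, reject) {
    if (y == 0) {
      reject(simpleError("division by zero"))
    } else {
      resolve(x / y)
    }
  })
}

outcomes <- list()

# Fulfilled: the onFulfilled callback receives the value
safe_divide(10, 2) |>
  then(function(value) outcomes$ok <<- value)

# Rejected: the catch() callback receives the error condition
safe_divide(1, 0) |>
  catch(function(err) outcomes$err <<- conditionMessage(err))

# Outside Shiny, run the event loop until no callbacks remain
while (!later::loop_empty()) later::run_now()
```

After the loop finishes, outcomes$ok holds the quotient and outcomes$err holds the error message.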
Combining future and promises
The real power comes from combining both packages. The promises package provides future_promise(), which launches a future and returns it as a promise:
library(future)
library(promises)
plan(multisession, workers = 2)
# Create a future and convert it to a promise
fp <- future_promise({
  Sys.sleep(2)
  iris |> head()
})
# Use then() to handle the result
then(fp, onFulfilled = function(df) {
  cat("Data received:\n")
  print(head(df))
})
This pattern is especially useful in Shiny applications, where you want to keep the UI responsive while performing heavy computations.
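As a rough sketch of how this might look inside a Shiny app: render functions such as renderTable() accept a promise and update the output once it resolves. The input and output IDs below are made up for illustration. Reactive inputs are read in the main session before the future starts, since reactive values cannot be read from a background process.

```r
library(shiny)
library(future)
library(promises)

plan(multisession, workers = 2)

# "n" and "preview" are illustrative IDs, not taken from any real app
ui <- fluidPage(
  numericInput("n", "Rows to show", value = 5, min = 1, max = 50),
  tableOutput("preview")
)

server <- function(input, output, session) {
  output$preview <- renderTable({
    # Read the reactive input here, in the main session
    n <- input$n
    future_promise({
      Sys.sleep(2) # Simulate a slow computation
      head(iris, n)
    })
  })
}

# Run interactively with: shinyApp(ui, server)
```

One caveat worth keeping in mind: this mainly keeps the R process free to serve other sessions while the task runs. A single session still waits for its own outputs to finish before it updates.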
Chaining async operations
One of the most powerful features of promises is chaining. Each then() call returns a new promise, allowing you to compose multiple async operations:
library(future)
library(promises)
plan(multisession)
# Create an async pipeline
future_promise({
  data.frame(x = 1:10, y = rnorm(10))
}, seed = TRUE) |> # seed = TRUE requests parallel-safe random numbers
  then(function(df) {
    df$z <- df$x + df$y
    df
  }) |>
  then(function(df) {
    mean(df$z)
  }) |>
  then(function(avg) {
    cat("Average:", avg, "\n")
    avg
  })
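A callback may itself return a promise, in which case the next step in the chain waits for that inner promise to settle and receives its value. The sketch below uses promise_resolve() to stand in for real async work so that it runs without background workers. The two helper functions are hypothetical.

```r
library(promises)

# Hypothetical async steps; each one returns a promise
fetch_numbers <- function() promise_resolve(1:10)
square_async <- function(x) promise_resolve(x^2)

total <- NULL

fetch_numbers() |>
  then(square_async) |>             # Returns a promise; the chain waits for it
  then(function(squares) sum(squares)) |>
  then(function(s) total <<- s)

# Outside Shiny, run the event loop until no callbacks remain
while (!later::loop_empty()) later::run_now()
```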
Error handling
Promises provide robust error handling through the onRejected callback:
library(future)
library(promises)
plan(multisession)
fp <- future_promise({
  stop("Something went wrong!")
})
# Handle success and failure separately
then(fp,
  onFulfilled = function(value) {
    cat("Success:", value, "\n")
  },
  onRejected = function(error) {
    cat("Error caught:", error$message, "\n")
    NA # Return a default value
  }
)
The catch() function provides a simpler syntax for error handling:
future_promise({
  log("not a number")
}) |>
  catch(function(err) {
    cat("Caught error:", err$message, "\n")
    0
  })
Use finally() to run cleanup code regardless of success or failure:
temp_file <- tempfile()
future_promise({
  write.csv(mtcars, temp_file, row.names = FALSE)
  read.csv(temp_file)
}) |>
  finally(function() {
    # Cleanup runs whether promise succeeded or failed
    unlink(temp_file)
  })
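These pieces compose: a value returned from catch() becomes the fulfilled value of the next promise, so the chain continues as if nothing had failed. Here is a minimal sketch of that recovery flow, using a hypothetical risky() helper and a plain promise instead of a background process so the order of events is easy to follow.

```r
library(promises)

# Hypothetical helper: rejects for non-numeric input, otherwise resolves
risky <- function(x) {
  promise(function(resolve, reject) {
    if (!is.numeric(x)) {
      reject(simpleError("input must be numeric"))
    } else {
      resolve(log(x))
    }
  })
}

events <- character()
recovered <- NULL

risky("not a number") |>
  catch(function(err) {
    events <<- c(events, conditionMessage(err))
    0 # Default value passed on to the next step
  }) |>
  then(function(value) recovered <<- value) |>
  finally(function() events <<- c(events, "cleanup"))

# Outside Shiny, run the event loop until no callbacks remain
while (!later::loop_empty()) later::run_now()
```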
Practical example: parallel data processing
Here’s a realistic example that processes multiple datasets in parallel:
library(future)
library(promises)
library(purrr)
plan(multisession, workers = 4)
# Simulate processing multiple files
process_file <- function(i) {
  Sys.sleep(0.5) # Simulate reading/processing
  data.frame(
    file = paste0("file_", i),
    rows = sample(100:1000, 1),
    avg_value = runif(1)
  )
}
# Launch all processing tasks in parallel
# seed = TRUE is forwarded to future() for parallel-safe random numbers
futures <- map(1:8, ~future_promise(process_file(.x), seed = TRUE))
# Wait for every task to finish, then combine the results
promise_all(.list = futures) |>
  then(function(results) {
    do.call(rbind, results)
  }) |>
  then(function(final_df) {
    cat("Processed", sum(final_df$rows), "rows across", length(unique(final_df$file)), "files\n")
    print(head(final_df))
  })
Key limitations to remember
- Globals: Variables from the parent environment are automatically exported to background processes. Large data can significantly slow down execution. Use the globals parameter to manually specify what’s needed.
- Package loading: Packages must be available in the background process. Use the packages parameter to explicitly load required packages.
- Native resources: Database connections and network sockets created in the parent process cannot be used in child processes. Create them within the future block instead.
- Return values: Large return values require serialization, which takes time. For very large datasets, consider writing results to disk instead.
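The first two points can be sketched with the globals and packages arguments of future(). In this example the variable names and the file name string are made up for illustration. Only small is exported to the worker, and the tools package is attached there so that file_ext() can be called without a namespace prefix.

```r
library(future)

plan(multisession, workers = 2)

big <- rnorm(1e6) # Large object the worker does not need
small <- 1:10     # The only data the worker needs

f <- future({
  list(
    ext = file_ext("report.csv"), # Found because tools is attached in the worker
    med = median(small)
  )
}, globals = "small", packages = "tools")

result <- value(f)

# Shut down the background workers
plan(sequential)
```

Listing globals by name turns off automatic detection, so anything the expression needs from the parent session has to be named explicitly.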
Summary
The future package provides a unified interface for parallel and asynchronous execution in R, while promises offers a clean API for handling async results. Together, they enable:
- Non-blocking computations that keep your R session responsive
- Parallel execution across multiple R processes
- Composable async workflows with chaining
- Robust error handling
These tools are essential for building responsive Shiny applications, processing large datasets efficiently, and integrating with async APIs. Start with plan(multisession) and future_promise() for the simplest path to async R programming.