Running R in Production: Best Practices for 2026
Running R in production is fundamentally different from running it interactively in RStudio. You need reliability, scheduling, monitoring, and often containerization. When an analysis script runs automatically at 2 AM without anyone watching, every assumption your interactive workflow takes for granted, such as the correct working directory, the installed packages, and the environment variables, must be stated explicitly. This guide covers the essential tools and patterns for running R in production in 2026, from simple cron jobs to containerized microservices.
TL;DR
- Schedule with cron for simple daily or weekly R scripts; use targets for multi-step dependency-aware pipelines.
- Containerize with Docker to lock the R version, system libraries, and package versions into a single deployable image.
- Serve predictions via plumber REST APIs; wrap them in Docker and deploy to Kubernetes or AWS ECS.
- Pin dependencies with renv so production restores the exact package versions used during development.
- Log everything and add health checks. An unmonitored cron job that fails silently can go unnoticed for weeks.
Why run R in production?
R was designed for interactive analysis, not server-side execution. When you work in RStudio, the environment provides a working directory, packages already attached to your session, and immediate visual feedback on errors. None of those conveniences exist when a script runs unattended on a server at 2 AM. Yet organizations increasingly need R to run without human supervision in order to:
- Run scheduled reports and dashboards
- Execute ML model inference at scale
- Power APIs that serve predictions
- Automate data pipelines that run without human intervention
Modern R tooling makes all of this possible, often with minimal overhead. The same script you developed in RStudio can be scheduled, containerized, and monitored with a modest investment in learning the surrounding tooling. Running R in production is less about changing how you write R code and more about adding the operational layer that makes that code reliable when nobody is watching. The tools covered in this guide (cron, targets, Docker, plumber, and renv), combined with logging and health checks, form a production stack that handles the scheduling, environment, serving, and observability concerns that interactive RStudio does not provide.
Scheduling R scripts
The simplest production pattern is scheduling an R script to run at a fixed interval. This is the gateway for running R in production: you write a script the same way you would for an interactive session, then configure a scheduler to invoke it automatically. The challenge is that the scheduler provides none of the conveniences of an interactive R session: no working directory, no loaded namespaces, no visible error output unless you capture it explicitly.
cron jobs
On Linux servers, cron is the standard scheduler:
# Edit crontab
crontab -e
# Run an R script every day at 2 AM
0 2 * * * /usr/bin/Rscript /home/user/scripts/daily_report.R >> /var/log/r_jobs.log 2>&1
Key tips for cron with R:
- Use absolute paths everywhere
- Redirect output to a log file
- Set the working directory explicitly in your script
When the cron daemon launches Rscript, it does so in a minimal environment without the working directory or library paths you would have in an interactive R session. Your script must therefore set up its own execution context at the top, before any package loads or data reads happen. The boilerplate below handles this reliably for any scheduled script:
# Start of your scheduled R script
setwd("/home/user/projects/myproject")
# Load required packages
library(dplyr)
library(glue)
# Your production logic here
Placing setwd() and your library calls right at the top ensures that every subsequent line of the script runs with predictable paths and loaded dependencies. Without this, a crontab-run script can fail silently because cron’s default working directory is the user’s home folder, not the project directory.
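One way to make that boilerplate fail loudly is to run the job body inside tryCatch() and translate the outcome into an exit status, so the log file and the scheduler can tell success from failure. The following is a minimal base R sketch under that assumption; run_job is a hypothetical helper name, not part of any package.

```r
# Run a job function inside a known working directory and map the outcome to
# an exit status (0 = success, 1 = failure). `run_job` is a hypothetical helper.
run_job <- function(project_dir, job) {
  if (!dir.exists(project_dir)) {
    message("Project directory not found: ", project_dir)
    return(1L)
  }
  old_wd <- setwd(project_dir)
  on.exit(setwd(old_wd), add = TRUE)  # restore the caller's directory on exit

  tryCatch(
    {
      job()
      message("Job finished at ", format(Sys.time(), "%Y-%m-%d %H:%M:%S"))
      0L
    },
    error = function(e) {
      # message() writes to stderr, which the crontab entry above redirects to the log
      message("Job failed: ", conditionMessage(e))
      1L
    }
  )
}
```

At the bottom of a scheduled script, a call such as quit(status = run_job("/home/user/projects/myproject", main)) would hand the status back to cron, where main stands for whatever function holds your production logic.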
The targets package
For complex pipelines with multiple interdependent steps, cron alone becomes brittle. Each script must manually track which input files changed and which outputs need regeneration. The targets package solves this by defining a directed acyclic graph of computation targets and automatically detecting which ones are out of date:
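# _targets.R
library(targets)
library(tarchetypes)  # provides tar_render()

tar_option_set(packages = c("dplyr", "ggplot2", "readr"))

list(
  # Track the input file itself so edits to the CSV invalidate downstream targets
  tar_target(input_file, "data/input.csv", format = "file"),
  tar_target(raw_data, read_csv(input_file)),
  tar_target(cleaned_data, raw_data |> filter(!is.na(value))),
  # x and y stand in for whichever columns you want to plot
  tar_target(plot, ggplot(cleaned_data, aes(x, y)) + geom_point()),
  tar_render(report, "report.Rmd")
)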
Run with tar_make() to execute only out-of-date targets. This is ideal for data pipelines where you want to avoid re-running expensive steps. Targets also handles parallel execution, tracks runtime performance per target, and provides a visual graph of your pipeline dependencies. When a daily pipeline grows from three steps to thirty, targets keeps the execution time proportional to the changes, not the total pipeline size.
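When a scheduler rather than a person invokes the pipeline, it helps to log which targets are about to be rebuilt and to return an exit status. The sketch below assumes the script is run from the directory that contains _targets.R; run_pipeline is a hypothetical wrapper, and its two arguments default to the real targets::tar_make() and targets::tar_outdated() functions so that they can be swapped for stand-ins when testing the wrapper itself.

```r
# Hypothetical wrapper around a targets pipeline for scheduled runs.
# Returns 0L on success and 1L if checking or building the pipeline fails.
run_pipeline <- function(make = targets::tar_make,
                         outdated = targets::tar_outdated) {
  tryCatch(
    {
      pending <- outdated()  # names of targets that are out of date
      if (length(pending) == 0) {
        message("Pipeline is up to date; nothing to rebuild")
      } else {
        message(sprintf(
          "Rebuilding %d target(s): %s",
          length(pending), paste(pending, collapse = ", ")
        ))
        make()
      }
      0L
    },
    error = function(e) {
      message("Pipeline failed: ", conditionMessage(e))
      1L
    }
  )
}
```

A cron-invoked script could then end with quit(status = run_pipeline()).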
Containerizing R with Docker
Docker has become essential for reproducible R deployments. Building a Docker image captures the operating system, system libraries, R version, and all installed packages into a single artefact that behaves identically on a developer laptop, a CI runner, and a production server. This eliminates the “it works on my machine” class of deployment failures that plague R scripts which rely on locally installed packages.
The key advantage for running R in production is that a Docker image is self-contained. You do not need to install R on the production server, configure system libraries, or worry about version conflicts with other applications sharing the same machine. The image contains everything the R script needs, from the base operating system packages up through the specific CRAN package versions. When you need to update a dependency or change the R version, you rebuild the image and redeploy. The old and new versions never conflict because they run in separate containers.
Basic Dockerfile for R
FROM r-base:4.3.2
RUN apt-get update && apt-get install -y libcurl4-openssl-dev libssl-dev libxml2-dev && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
RUN Rscript -e "install.packages(c('tidyverse', 'plumber'))"
CMD ["Rscript", "app.R"]
Multi-stage build for smaller images
The first Dockerfile ships the full build toolchain inside the final image. Compilers and development libraries bloat the image and expand the attack surface. A multi-stage build avoids this: a build stage installs packages against the -dev libraries, and a second runtime stage starts from a fresh base image and copies in only the compiled R library. Compilers and headers never ship to production, although the runtime stage still needs the non-dev shared libraries that the compiled packages link against. Multi-stage builds are worth the extra configuration whenever your image needs system build dependencies that are only required during package installation.
REST APIs with plumber
The plumber package turns R functions into REST API endpoints by converting roxygen-style comments into HTTP route definitions. This is the standard way to serve R model predictions, data transformations, or reporting logic to other services that do not speak R. Instead of wrapping R in a Python or Node.js service, plumber lets you build the API directly in R. Each roxygen comment above a function defines an HTTP method, route path, expected parameters, and serialization format.
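library(plumber)

# Load the model once at startup rather than on every request
model <- readRDS("model.rds")

#* @get /predict
#* @param age:numeric
#* @param income:numeric
#* @serializer json
function(age, income) {
  new_data <- data.frame(age = as.numeric(age), income = as.numeric(income))
  # confidence is a hard-coded placeholder; replace it with a value derived from the model
  list(prediction = predict(model, new_data), confidence = 0.87)
}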
Run with plumber::plumb("api.R")$run(host = "0.0.0.0", port = 8000). This starts a web server on all network interfaces. For production, wrap plumber in a Docker container alongside the model file and any required packages. A minimal plumber Dockerfile looks like the earlier R Dockerfile but exposes port 8000 and sets the CMD to launch the API directly. When deploying to Kubernetes or AWS ECS, the container image is the deployable unit and the orchestrator handles process lifecycle and health checking for you.
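The /predict endpoint above passes as.numeric(age) straight to the model, so a request such as ?age=abc would produce NA inputs rather than an error. One possible guard is sketched below. The helper names parse_number and validate_inputs and the route /predict-checked are hypothetical, and the sketch assumes plumber's usual behaviour of injecting the response object when a handler declares a res argument.

```r
# Convert a query-string value to a single finite number, or signal an error
# that names the offending parameter. `parse_number` is a hypothetical helper.
parse_number <- function(value, name) {
  number <- suppressWarnings(as.numeric(value))
  if (length(number) != 1 || !is.finite(number)) {
    stop(sprintf("Parameter '%s' must be a single number", name), call. = FALSE)
  }
  number
}

# Validate both inputs and return an HTTP status code plus a response body.
validate_inputs <- function(age, income) {
  tryCatch(
    list(
      status = 200L,
      body = list(
        age = parse_number(age, "age"),
        income = parse_number(income, "income")
      )
    ),
    error = function(e) list(status = 400L, body = list(error = conditionMessage(e)))
  )
}

#* @get /predict-checked
#* @serializer json
function(age, income, res) {
  checked <- validate_inputs(age, income)
  res$status <- checked$status  # 400 Bad Request when validation fails
  # On success, checked$body holds clean numbers ready to pass to predict()
  checked$body
}
```

Keeping the validation in plain functions means it can be unit tested without starting the API.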
Environment management with renv
The renv package ensures consistent package versions across development and production. Each project gets its own library, so when a colleague updates a package or CRAN ships a new release, your project does not pick up the change until you explicitly install or update that package. The lockfile changes only when you run renv::snapshot(), and renv::restore() reinstalls exactly what the lockfile records. This isolation is the foundation of reproducible R deployments:
# Initialize renv in your project
renv::init()
# Snapshot your current state
renv::snapshot()
# In production, restore exact same state
renv::restore()
Commit the lockfile, renv.lock, to version control so every environment restores the same package versions. renv records the R version but does not install it, and it does not manage system libraries, so pin those through the Docker image. With renv handling package versions, the next concern is knowing what your code is doing while it runs. Production services need logs to diagnose failures, track performance, and alert operations teams when something goes wrong. Unlike interactive R where you see errors immediately in the console, a production script running at 2 AM must write its own audit trail.
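One thing worth recording at startup is whether the installed packages match what the job expects. The base R sketch below is a complement to renv::restore(), not a replacement for it: version_mismatches is a hypothetical helper, and the expected versions in the usage note are made-up values for illustration.

```r
# Compare installed package versions against expected ones.
# `expected` is a named character vector: names are packages, values are versions.
# Returns the installed versions (NA if not installed) for packages that differ.
version_mismatches <- function(expected) {
  installed <- vapply(
    names(expected),
    function(pkg) {
      tryCatch(
        as.character(utils::packageVersion(pkg)),
        error = function(e) NA_character_  # package is not installed
      )
    },
    character(1)
  )
  differs <- is.na(installed) | installed != unname(expected)
  installed[differs]
}
```

A scheduled script might call version_mismatches(c(dplyr = "1.1.4", glue = "1.7.0")) and stop with an error if the result is not empty, so that drift is caught before any data is touched.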
Monitoring and logging
Production R needs observability. An unmonitored scheduled script is a time bomb: it can fail silently for weeks before anyone notices the reports stopped generating. Observability requires both logging (recording what happened) and monitoring (alerting when something goes wrong). For R in production, start with structured logging and build alerting from there.
Basic logging
logger <- function(message, level = "INFO") {
timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
cat(sprintf("[%s] [%s] %s\n", timestamp, level, message))
}
logger("Starting data processing")
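As a step toward the structured logging mentioned above, the same idea can be extended with severity levels and key=value fields, which log shippers can parse more reliably than free text. This is a minimal base R sketch; format_log_line, log_event, and the LOG_LEVEL environment variable are hypothetical names chosen for illustration.

```r
LOG_LEVELS <- c(DEBUG = 1L, INFO = 2L, WARN = 3L, ERROR = 4L)

# Build one log line as key=value pairs. Extra named arguments become fields.
format_log_line <- function(level, message, ..., time = Sys.time()) {
  fields <- list(...)
  extras <- if (length(fields) > 0) {
    paste(
      sprintf("%s=%s", names(fields), vapply(fields, as.character, character(1))),
      collapse = " "
    )
  } else {
    ""
  }
  line <- sprintf(
    "time=%s level=%s msg=\"%s\"",
    format(time, "%Y-%m-%dT%H:%M:%S"), level, message
  )
  trimws(paste(line, extras))
}

# Write the line to stderr only if `level` is at or above `threshold`.
# Returns TRUE invisibly when the line was emitted, FALSE otherwise.
log_event <- function(level, message, ...,
                      threshold = Sys.getenv("LOG_LEVEL", "INFO")) {
  if (LOG_LEVELS[[level]] < LOG_LEVELS[[threshold]]) {
    return(invisible(FALSE))
  }
  cat(format_log_line(level, message, ...), "\n", sep = "", file = stderr())
  invisible(TRUE)
}
```

For example, log_event("INFO", "rows loaded", rows = 120) writes a single line containing level=INFO, the message, and rows=120.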
Error handling in production
Logging records events, but error handling determines what happens when those events are failures. In production, an unhandled error stops the R process: Rscript exits with a non-zero status, but nothing is logged beyond the raw error message and no fallback or cleanup runs. Wrapping operations in error handlers gives you control over the failure path. The safely() function from purrr wraps a function and returns a new function whose return value is always a list with a result element and an error element, one of which is NULL:
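library(purrr)
library(readr)
library(dplyr)

safe_process <- safely(function() {
  read_csv("data.csv") |> summarise(mean = mean(value))
}, otherwise = NULL)

outcome <- safe_process()  # call once, then inspect the error element
if (!is.null(outcome$error)) {
  logger(conditionMessage(outcome$error), "ERROR")
  quit(status = 1)  # non-zero exit so the scheduler records the failure
}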
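Some failures are transient, such as a network share that is briefly unavailable, and for those a controlled failure path can include a few retries before giving up. The sketch below uses base R's tryCatch() rather than purrr; with_retry is a hypothetical helper, and the attempt count and wait time are arbitrary defaults.

```r
# Call `task` up to `attempts` times, pausing between failures.
# Returns the first successful value; re-raises the last error if all attempts fail.
with_retry <- function(task, attempts = 3, wait_seconds = 1) {
  for (attempt in seq_len(attempts)) {
    outcome <- tryCatch(
      list(ok = TRUE, value = task()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (outcome$ok) {
      return(outcome$value)
    }
    message(sprintf(
      "Attempt %d/%d failed: %s",
      attempt, attempts, conditionMessage(outcome$error)
    ))
    if (attempt < attempts) Sys.sleep(wait_seconds)
  }
  stop(outcome$error)  # the caller still sees a failure after the final attempt
}
```

Because the last error is re-raised, a script that does not handle it still exits with a non-zero status.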
Health checks for plumber APIs
With logging and error handling in place, the last piece of operational readiness is giving your infrastructure a way to determine whether the service is alive. A health check endpoint answers a yes-or-no question: is this service ready to handle requests right now? Orchestrators call this endpoint every few seconds and will stop routing traffic to instances that fail.
#* @get /health
function() {
list(status = "healthy", timestamp = Sys.time())
}
Health check endpoints are essential for container orchestration. Kubernetes and similar platforms can be configured to call /health at regular intervals. In Kubernetes, a failed readiness probe removes the instance from load balancing, while a failed liveness probe restarts the container. Without a health endpoint, the orchestrator can only tell whether the R process is still running, not whether your plumber API can actually answer requests.
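The /health endpoint shown earlier always reports healthy, even if the service cannot do useful work. A variant can check a dependency and answer with HTTP 503 when it is missing, since probes generally act on the status code. In this sketch, health_status and the /ready route are hypothetical names, and "ready" is assumed to mean only that the model file exists.

```r
# Decide whether the service can serve requests. Here "ready" only means the
# model file exists; `model_path` defaults to the model.rds used by /predict.
health_status <- function(model_path = "model.rds") {
  ready <- file.exists(model_path)
  list(
    code = if (ready) 200L else 503L,
    body = list(
      status = if (ready) "healthy" else "unavailable",
      timestamp = format(Sys.time(), "%Y-%m-%dT%H:%M:%S")
    )
  )
}

#* @get /ready
#* @serializer json
function(res) {
  health <- health_status()
  res$status <- health$code  # probes act on the status code, not the body
  health$body
}
```

Real services usually check more than one dependency, for example a database connection, but the shape stays the same.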
Performance optimization
The R interpreter is single-threaded. Unless a package runs multithreaded native code, a single R process uses one CPU core, which means a script that processes ten thousand CSV files sequentially leaves the other cores on the server idle. For running R in production at any meaningful scale, you need to either parallelise the work within R or run multiple independent R processes. Production optimizations include:
Parallel processing
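library(furrr)
library(future)
library(readr)

plan(multisession, workers = 4)

# files: CSV paths to process (the "data" directory is an example location)
# process_data(): your own per-file transformation
files <- list.files("data", pattern = "\\.csv$", full.names = TRUE)

# Parallel map: each worker reads and processes a share of the files
results <- future_map_dfr(files, ~ {
  read_csv(.x) |> process_data()
})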
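The other option mentioned above, running multiple independent R processes, needs a deterministic way to divide the work so that no file is processed twice. A minimal base R sketch follows; files_for_worker is a hypothetical helper, and it assumes each process is started with its own worker number and the total worker count.

```r
# Return the subset of `files` that worker `worker_id` (1-based) should handle
# when the work is split round-robin across `n_workers` independent processes.
files_for_worker <- function(files, worker_id, n_workers) {
  stopifnot(n_workers >= 1, worker_id >= 1, worker_id <= n_workers)
  files <- sort(files)  # same order in every process, so the split is consistent
  assignment <- rep_len(seq_len(n_workers), length(files))
  files[assignment == worker_id]
}

# In a script started as `Rscript process_files.R 1 4` (hypothetical file name):
# args <- as.integer(commandArgs(trailingOnly = TRUE))
# my_files <- files_for_worker(list.files("data", full.names = TRUE), args[1], args[2])
```

Each process then handles only its own share, and the scheduler or container platform decides how many processes to launch.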
Database connections
Parallel processing handles compute-bound workloads, but many production R scripts are IO-bound, spending most of their time waiting on database queries. Efficient database access matters more than CPU parallelism for these workloads. The DBI package provides a uniform interface across database backends, and managing connections explicitly prevents the leaks that accumulate over long-running scheduled jobs. Always store credentials in environment variables, never hardcoded in the script.
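# run_query() is an illustrative wrapper; on.exit() only takes effect inside a function
run_query <- function(sql) {
  con <- DBI::dbConnect(RPostgres::Postgres(),
    host = "db.example.com", dbname = "analytics",
    user = Sys.getenv("DB_USER"), password = Sys.getenv("DB_PASSWORD"))
  on.exit(DBI::dbDisconnect(con), add = TRUE)  # closes the connection even on error
  DBI::dbGetQuery(con, sql)
}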
Use connection pooling for high-traffic APIs with the pool package. Pooling reuses connections across requests instead of opening a new connection for every query, which becomes critical when your plumber API handles a steady stream of requests. Without pooling, each request opens its own database session, which adds latency and can exhaust server connection limits under load.
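On the credentials point above: Sys.getenv() returns an empty string for a variable that is not set, so a missing DB_PASSWORD tends to surface later as a confusing authentication error. A small check at startup makes the cause explicit. This is a base R sketch, and require_env is a hypothetical helper name.

```r
# Fetch required environment variables, failing fast if any are unset or empty.
# Returns a named list of values.
require_env <- function(vars) {
  values <- Sys.getenv(vars, unset = NA_character_, names = TRUE)
  absent <- vars[is.na(values) | !nzchar(values)]
  if (length(absent) > 0) {
    stop(
      "Missing environment variables: ", paste(absent, collapse = ", "),
      call. = FALSE
    )
  }
  as.list(values)
}
```

A script could call creds <- require_env(c("DB_USER", "DB_PASSWORD")) before connecting and pass creds$DB_USER and creds$DB_PASSWORD to dbConnect().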
CI/CD for R projects
Testing and deploying R code through a CI/CD pipeline catches regressions before they reach production. GitHub Actions is the most common choice for R projects because the r-lib/actions repository maintains reusable workflow steps for installing R, caching packages, and running checks. A CI pipeline for an R project typically installs dependencies from renv.lock, runs testthat tests, checks for linting issues with lintr, and optionally builds a Docker image for deployment. Making CI part of the workflow for running R in production means every push runs the same checks and produces a deployable artefact. The workflow below is a minimal version that runs tests on every push; it installs testthat directly rather than restoring from renv.lock.
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - run: Rscript -e "install.packages('testthat')"
      - run: Rscript -e "testthat::test_dir('tests')"
Common production patterns
| Pattern | Tool | When to Use |
|---|---|---|
| Scheduled scripts | cron + Rscript | Simple daily/weekly jobs |
| Complex pipelines | targets | DAG workflows with dependencies |
| REST API | plumber | Real-time predictions |
| Web app | Shiny | Interactive dashboards |
| Containerized | Docker + plumber | Scalable microservices |
R code runs in production in several patterns. plumber REST APIs expose R functions as HTTP endpoints, a common pattern for serving model predictions or data processing pipelines. Posit Connect (formerly RStudio Connect) deploys plumber APIs, Shiny apps, and Quarto documents with built-in authentication, scheduling, and monitoring. For cloud-native deployments, Docker containers running plumber are deployed to AWS ECS, Google Cloud Run, or Kubernetes.
For batch processing pipelines, targets provides dependency-aware pipeline execution: it only re-runs steps whose inputs have changed, tracks all outputs, and provides pipeline visualization and progress reporting. Combined with scheduled execution (GitHub Actions, AWS Lambda, or Airflow), targets pipelines provide a reliable framework for daily data processing jobs.
Production checklist
Deploying R to production means your code runs without anyone watching, and a failure discovered three days later through a missing report is a failure in your monitoring, not in your R code. Before you deploy, walk through each of these items and confirm the answer is yes. If any item is unclear or untested, fix it before the first production run:
- Reproducibility: renv.lock is committed and renv::restore() succeeds in CI. Use the rocker project for Docker base images with explicit version tags like rocker/r-ver:4.5.0. Pin the R version and avoid latest tags that introduce silent changes when upstream images rebuild.
- Process management: Use systemd or pm2 to restart on failure and redirect logs. Set memory limits on long-running Shiny apps that can leak memory over time, and configure stdout and stderr to feed into your logging pipeline.
- Structured logging: The logger package emits JSON-formatted logs that integrate with Loki, Datadog, or CloudWatch. Include request method, path, status code, and duration in every plumber API log entry so you can track performance trends and error spikes.
- Error handling: Wrap external calls in tryCatch() or safely() so failures never go silent. Every error should either be handled explicitly with a fallback path or bubble up with full diagnostic context for investigation.
- Monitoring: Alert on scheduled job failures via email. Track API request duration and error rates over time with APM tools. Set up alerts for data drift in model predictions so retraining happens before accuracy degrades.
- Rollback path: Test the full Docker build in CI, pin the R version in your Dockerfile, and document how to revert a deployment. A rollback path turns a potential outage into a brief inconvenience.
Running R in production means treating your R code with the same discipline as any production service. Version dependencies, containerise the runtime, log structured output, and alert on failure. Those four practices applied consistently will take you further than any single tool.
See also
- Building REST APIs with plumber - Creating APIs from R functions
- Reproducible Pipelines with targets - DAG-based workflow scheduling
- Reproducible Environments with renv - Managing package versions
- Running R in GitHub Actions - CI/CD for R projects
- Parallel Computing in R - Performance optimization