
Model Deployment with Vetiver in R

Vetiver is Posit’s official package for MLOps workflows in R. It provides a standardized way to version, deploy, and monitor machine learning models through a unified model deployment pipeline. If you’re already using tidymodels for model building, vetiver fits naturally into your pipeline for serving predictions in production.

This article walks through an end-to-end workflow: training a model, creating a model card, versioning with pins, deploying as a Plumber API, and serving predictions over HTTP.

Training and preparing a model

First, you need a trained model. Vetiver supports tidymodels fits (parsnip models and workflows) as well as several other model types, such as plain lm() and glm() fits. Here’s a simple example using a linear regression on the mtcars dataset:

library(tidymodels)
library(vetiver)
library(pins)

# Train a simple model
data(mtcars)
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp + cyl, data = mtcars)

# Create a vetiver model object
v <- vetiver_model(lm_model, "mpg_predictor")
v

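The vetiver_model() function wraps your trained model with the metadata needed for deployment. This includes the model name, a description, the packages required for prediction, and a prototype of the input data. A version identifier is assigned later, when the model is pinned to a board.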
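To see what the wrapper holds, the sketch below builds a vetiver model from a plain lm() fit and prints its fields. The name "mpg_lm" is made up for illustration, and the prototype field name assumes a recent vetiver release (older releases called it ptype).

```r
library(vetiver)

# A plain lm() fit keeps the sketch small; tidymodels fits expose the same fields
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)
v_lm <- vetiver_model(fit, "mpg_lm")  # "mpg_lm" is a hypothetical name

v_lm$model_name   # name that is later used as the pin name
v_lm$description  # auto-generated description of the model type
v_lm$metadata     # list of metadata, including required packages
v_lm$prototype    # zero-row data frame describing the expected predictors
```

The prototype has no rows. It only records the predictor names and types that incoming data is checked against.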

Using workflows with custom preprocessing

For real-world models, you’ll often need custom preprocessing steps beyond simple formula syntax. Vetiver supports tidymodels workflows that bundle preprocessing recipes with your model:

library(recipes)

# Define a preprocessing recipe
mpg_recipe <- recipe(mpg ~ wt + hp + cyl, data = mtcars) %>%
  step_log(hp) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

# Create a workflow combining recipe and model
lm_wf <- workflow() %>%
  add_recipe(mpg_recipe) %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  fit(data = mtcars)

# Wrap in vetiver. The name "mpg_predictor" is the pin name used in the
# versioning and deployment steps that follow
v <- vetiver_model(lm_wf, "mpg_predictor")
v

The workflow object preserves your preprocessing steps when deployed, ensuring predictions use the same transformations as training. Packaging preprocessing alongside the model eliminates a common source of production bugs: training-serving skew, where different transformations are applied at prediction time than were used during training. The recipe inside the workflow handles centering, scaling, and any other feature engineering steps automatically.
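As a small illustration of why bundling matters, the sketch below predicts once through a fitted workflow using raw inputs, and once by applying the trained recipe by hand before calling the bare model. The object names (wf, new_car) are made up for this example. Both paths should agree, but only the first cannot be skipped by mistake.

```r
library(tidymodels)

rec <- recipe(mpg ~ wt + hp + cyl, data = mtcars) %>%
  step_log(hp) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  fit(data = mtcars)

# Raw, untransformed input, as a client would send it
new_car <- data.frame(wt = 2.5, hp = 120, cyl = 6)

# Path 1: the workflow applies the trained recipe automatically
pred_wf <- predict(wf, new_data = new_car)

# Path 2: apply the trained recipe manually, then call the underlying model
baked <- bake(extract_recipe(wf), new_data = new_car, all_predictors())
pred_manual <- predict(extract_fit_parsnip(wf), new_data = baked)

all.equal(pred_wf$.pred, pred_manual$.pred)
```

Forgetting the bake() step in the second path would feed unscaled values to the model, which is exactly the skew the workflow guards against.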

Creating model cards

Model cards document your model for reproducibility and governance. Vetiver generates model cards from templates using rmarkdown:

# Generate a model card from the vetiver template
rmarkdown::draft(
  "model-card.Rmd",
  template = "vetiver_model_card",
  package = "vetiver"
)

This creates an R Markdown file you can customize with your model description, performance metrics, and training data information. Edit the generated file to add:

  • Model description and purpose
  • Input/output specifications
  • Performance metrics (if available)
  • Training data information
  • Any known limitations

Render the model card to HTML or PDF for sharing:

rmarkdown::render("model-card.Rmd")

Versioning with pins

Vetiver uses the pins package for model versioning. This stores your model in a versioned registry, making it easy to track changes over time. Each time you write a model to a versioned board, pins records a new version, identified by a timestamp and a content hash rather than an incrementing number, and preserves the previous versions, giving you a complete history of every model you have pinned. See the pins guide for more on board backends and version management:

# Connect to a board (local folder in this case).
# board_folder() is unversioned by default, so versioning is enabled explicitly
board <- board_folder("models/", versioned = TRUE)

# Pin the model with version tracking
board %>% vetiver_pin_write(v)

When you update your model and re-run vetiver_pin_write() on a versioned board, pins stores the new model as a new version. The pin's metadata includes the names of the R packages required for prediction and a prototype of the input data for schema validation. Passing check_renv = TRUE to vetiver_pin_write() additionally records an renv lockfile with the package versions. You can retrieve specific versions:

# Get the latest version
latest_model <- board %>% vetiver_pin_read("mpg_predictor")

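# Get a specific version. Version identifiers are strings listed by pin_versions()
versions <- board %>% pin_versions("mpg_predictor")
older_model <- board %>% vetiver_pin_read("mpg_predictor", version = versions$version[1])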

The pins board keeps a record of every version created, including its identifier, creation time, and content hash. You can inspect the full version history to understand when each model was registered and how many iterations exist. pin_versions() lists them:

board %>% pin_versions("mpg_predictor")
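The versioning round trip can be sketched end to end on a throwaway board. This is an illustrative example, not part of the pipeline above: the pin name "mpg_demo" and the two small lm() fits are made up, and board_temp() is used so nothing is written to your project.

```r
library(vetiver)
library(pins)

board_demo <- board_temp(versioned = TRUE)  # temporary, versioned board

# First version: a single predictor
v_a <- vetiver_model(lm(mpg ~ wt, data = mtcars), "mpg_demo")
vetiver_pin_write(board_demo, v_a)
first_version <- pin_versions(board_demo, "mpg_demo")$version[1]

# Second version: same pin name, different model
v_b <- vetiver_model(lm(mpg ~ wt + hp, data = mtcars), "mpg_demo")
vetiver_pin_write(board_demo, v_b)

history <- pin_versions(board_demo, "mpg_demo")  # one row per stored version
history

# Roll back by reading the first version explicitly
rolled_back <- vetiver_pin_read(board_demo, "mpg_demo", version = first_version)
names(rolled_back$prototype)
```

Capturing the version string right after the first write avoids guessing which row of the history is the older one.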

Serving predictions via HTTP

Once your model is versioned and stored in a pins board, the next step is making it available to applications that need predictions. Vetiver generates a Plumber API that wraps your model and exposes a /predict endpoint over HTTP. The examples below assume the API is already running locally on port 8080. Clients send prediction requests with the input features as JSON in the request body, one record per row:

library(httr)
library(jsonlite)

# Example prediction request payload: a one-row data frame,
# which toJSON() serializes as an array of records
prediction_request <- data.frame(
  wt = 2.5,
  hp = 120,
  cyl = 6
)

# Send request to the API
httr::POST(
  "http://localhost:8080/predict",
  body = jsonlite::toJSON(prediction_request),
  content_type_json()
)

The vetiver API processes the JSON payload, validates that the input columns match the model’s expected schema, runs the prediction pipeline, and returns the results. Input validation rejects requests with missing or mismatched columns before the model runs, preventing silent errors in production. The API returns predictions as a JSON array with one record per input row. For a tidymodels regression model the column is named .pred (the value shown is illustrative):

[
  {".pred": 23.5}
]
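The shape of the request body matters, so it is worth checking what jsonlite produces before sending anything. The sketch below compares serializing a one-row data frame with serializing a plain list. Only jsonlite is needed, and the variable names are made up.

```r
library(jsonlite)

new_car <- data.frame(wt = 2.5, hp = 120, cyl = 6)

# A data frame becomes an array of records, one object per row
as_records <- toJSON(new_car)
as_records
# [{"wt":2.5,"hp":120,"cyl":6}]

# A plain list becomes a single object whose values are arrays
as_columns <- toJSON(list(wt = 2.5, hp = 120, cyl = 6))
as_columns
# {"wt":[2.5],"hp":[120],"cyl":[6]}
```

The array-of-records form maps directly onto the rows of a data frame, which makes it the safer choice for a prediction payload.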

Handling batch predictions

For bulk prediction workloads, sending many individual requests introduces unnecessary network overhead and serialization cost. Instead, you can send a data frame with multiple rows in a single request, and the API will return predictions for all of them at once:

# Create a data frame with multiple predictions
batch_request <- data.frame(
  wt = c(2.0, 2.5, 3.0),
  hp = c(100, 120, 150),
  cyl = c(4, 6, 8)
)

# Send batch request
response <- httr::POST(
  "http://localhost:8080/predict",
  body = jsonlite::toJSON(batch_request),
  content_type_json()
)

# Parse response
predictions <- jsonlite::fromJSON(httr::content(response, as = "text"))
predictions

The API automatically handles the data frame and returns predictions for all input rows. For large-scale batch processing, this approach is far more efficient than looping through individual prediction calls and avoids the latency of round-trip HTTP overhead per row.
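From R, vetiver also provides a client-side helper so you do not have to assemble the HTTP call yourself. The sketch below assumes a vetiver API is listening at the URL used above. The score_batch() wrapper is a hypothetical name, introduced only so that nothing is sent until you call it.

```r
library(vetiver)

# Assumption: a vetiver API is running at this address
endpoint <- vetiver_endpoint("http://localhost:8080/predict")

batch <- data.frame(
  wt = c(2.0, 2.5, 3.0),
  hp = c(100, 120, 150),
  cyl = c(4, 6, 8)
)

# predict() on an endpoint sends all rows in one POST request and
# returns the predictions as a tibble. It needs a live API, so it is
# wrapped in a function here instead of being run immediately.
score_batch <- function(endpoint, new_data) {
  predict(endpoint, new_data)
}

# With the API running: score_batch(endpoint, batch)
```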

Deploying to Posit Connect

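For production deployment, Posit Connect provides the smoothest integration with vetiver. One route is to publish the directory that holds your Plumber file with rsconnect:

library(rsconnect)

# Deploy the Plumber API to Posit Connect.
# `api` is the directory containing plumber.R (or entrypoint.R), not a file path
rsconnect::deployAPI(
  api = ".",
  account = "your-account-name",
  appName = "mpg_predictor",
  appTitle = "MPG Prediction API"
)

Replace "." with the directory containing your Plumber file. The model must be pinned to a board the server can reach, such as a Posit Connect board, because a local folder board does not exist on the server. vetiver also offers vetiver_deploy_rsconnect(board, name), which generates the Plumber file and deploys it in one step. Posit Connect handles hosting, scaling, and authentication out of the box. Once published, the API is available at a stable URL with automatic TLS termination and access control.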

Here’s the full pipeline from training to a locally running API:

library(tidymodels)
library(vetiver)
library(pins)
library(plumber)
library(recipes)

# 1. Train model with preprocessing
data(mtcars)
lm_wf <- workflow() %>%
  add_recipe(recipe(mpg ~ wt + hp + cyl, data = mtcars) %>%
    step_log(hp)) %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  fit(data = mtcars)

# 2. Create vetiver model
v <- vetiver_model(lm_wf, "mpg_predictor")

# 3. Create model card from template
rmarkdown::draft(
  "model-card.Rmd",
  template = "vetiver_model_card",
  package = "vetiver"
)

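# 4. Version with pins (board_folder() is unversioned unless asked)
board <- board_folder("models/", versioned = TRUE)
board %>% vetiver_pin_write(v)

# 5. Serve as a local API (named `api` to avoid masking plumber's pr())
api <- pr() %>% vetiver_api(v)
pr_run(api, host = "0.0.0.0", port = 8080)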

Deployment considerations

For production deployment, consider these aspects:

API Hosting: You can deploy Plumber APIs on various platforms, including Posit Connect or a Docker container (vetiver_write_docker() generates a Dockerfile for a pinned model). Posit Connect provides the smoothest integration with vetiver through the rsconnect package. For more on Plumber APIs, see our guide to building Plumber APIs.

Model Monitoring: Track prediction drift by logging requests and responses. The vetiver API does not log requests for you, but you can add custom logging within your Plumber endpoints and feed the logged data to vetiver’s metric functions, such as vetiver_compute_metrics().

Authentication: Plumber supports authentication via filters, which run before a request reaches an endpoint. For production APIs, ensure appropriate access controls are in place.

Scaling: For high-traffic applications, consider running multiple R processes behind a load balancer. Plumber APIs are single-threaded by default.
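As one possible shape for the authentication point, the sketch below adds a bearer-token filter to a Plumber router. The filter name, the API_TOKEN environment variable, and the error payload are all assumptions made for this example. Platforms such as Posit Connect provide their own access control, in which case a filter like this is unnecessary.

```r
library(plumber)

# Hypothetical filter: reject requests whose Authorization header does not
# carry the bearer token stored in the API_TOKEN environment variable
require_token <- function(req, res) {
  token <- Sys.getenv("API_TOKEN")
  expected <- paste("Bearer", token)
  if (!nzchar(token) || !identical(req$HTTP_AUTHORIZATION, expected)) {
    res$status <- 401
    return(list(error = "Unauthorized"))
  }
  forward()  # hand the request on to the next filter or endpoint
}

# Filters run in the order they are added, before any endpoint
api <- pr_filter(pr(), "auth", require_token)
```

A vetiver model can then be mounted on the same router with vetiver_api(api, v). Note that the filter applies to every route on the router, including the documentation pages.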

The MLOps workflow with vetiver

vetiver connects the model development and model serving steps. After fitting a model with workflows::workflow() and parsnip, wrapping it in vetiver::vetiver_model(fitted_workflow, "model-name") creates a versioned, deployable model object. The model includes its input data prototype, which vetiver uses to validate prediction requests.

vetiver_pin_write(board, v_model) stores the model in a pins board, a local folder, S3 bucket, or Posit Connect. vetiver_pin_read(board, "model-name") retrieves the latest version. Pinning creates a versioned history of model artifacts, enabling rollback to any previous version.

Deploying as a Plumber API

vetiver_write_plumber(board, "model-name", file = "plumber.R") generates a Plumber API definition that reads the pinned model from the board and serves its /predict endpoint. The resulting API validates input against the model’s data prototype. Run it locally with plumber::pr("plumber.R") |> plumber::pr_run(), or publish it to Posit Connect.
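A minimal sketch of generating the Plumber file, assuming the vetiver_write_plumber(board, name, file = ...) signature. The temporary board, the pin name "mpg_demo", and the output path are made up so the example leaves no files in your project.

```r
library(vetiver)
library(pins)

board_tmp <- board_temp(versioned = TRUE)
v_demo <- vetiver_model(lm(mpg ~ wt + hp, data = mtcars), "mpg_demo")
vetiver_pin_write(board_tmp, v_demo)

# Write the API definition to a temporary location
plumber_file <- file.path(tempdir(), "plumber.R")
vetiver_write_plumber(board_tmp, "mpg_demo", file = plumber_file)

# Inspect the generated file: it reads the pin and calls vetiver_api()
generated <- readLines(plumber_file)
cat(generated, sep = "\n")
```

Because the generated file points at the board it was created from, a board that the deployment target can reach is needed for anything beyond local testing.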

The /predict endpoint accepts POST requests with JSON-formatted input data. The response contains predictions in the same format as predict(model, new_data). This standardized interface allows client code in R, Python, or any language to call the model API.

Model monitoring

vetiver::vetiver_compute_metrics() computes performance metrics on new data, aggregated over time periods such as days or weeks, so recent values can be compared with earlier ones. vetiver::vetiver_pin_metrics() stores those metrics in a pin, and vetiver::vetiver_plot_metrics() visualizes metric drift over time. A regular monitoring job, say a weekly computation of metrics on recent predictions, can alert you when model performance degrades and trigger retraining.

The vetiver monitoring workflow integrates with pins for storing metric history, making it straightforward to build a dashboard that shows model health over time using ggplot2 or Shiny.
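The metric computation can be sketched with made-up monitoring data. Here each mtcars row is treated as if it had been scored on a consecutive day. The dates and the obs_date column are invented for illustration, and the slider and yardstick packages are assumed to be installed because the function relies on them.

```r
library(vetiver)

fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)

# Hypothetical monitoring log: one scored observation per day
scored <- mtcars
scored$obs_date <- as.Date("2024-01-04") + seq_len(nrow(scored)) - 1
scored$.pred <- unname(predict(fit, scored))

# Aggregate the default regression metrics (rmse, rsq, mae) by week
weekly <- vetiver_compute_metrics(
  scored,
  obs_date,   # date column
  "week",     # aggregation period
  mpg,        # observed outcome
  .pred       # model prediction
)
weekly
```

The result is a long data frame with one row per period and metric, which can be stored with pins and passed to vetiver_plot_metrics() for plotting.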

Model cards

vetiver integrates with model cards: structured documentation that describes a model’s intended use, performance metrics, and limitations. The vetiver_model_card R Markdown template, created with rmarkdown::draft(), gives you a starting document with sections to fill in. Completing a model card before deployment forces explicit documentation of what the model does and does not do, which is valuable for auditing and communication with non-technical stakeholders.

Monitoring and drift

Deployed models degrade as data distributions shift. vetiver supports monitoring workflows: log predictions and actuals over time, compute metrics on recent data, and alert when performance drops below a threshold. The pins package stores monitoring data alongside the model artifact. Combine vetiver monitoring with a scheduled Quarto report to produce automatic performance summaries without manual analysis each time.

Version management

Each vetiver_pin_write() to a versioned board creates a new version of the model artifact. vetiver_pin_read() retrieves the latest version by default. Pass the version argument to read an older version for rollback. The version history gives you a reproducible audit trail: you can retrieve the exact model that made predictions on a given date and reproduce those predictions with the same inputs.

The ML deployment problem

Deploying a machine learning model means making it available to produce predictions for new data, either on demand through an API or as part of a batch pipeline. The common problem is version management: the model was trained in a notebook or script, and now it must be served in a different environment with the same package versions, the same preprocessing pipeline, and predictable behavior. Without infrastructure, this is error-prone and manual.

vetiver provides a versioning and deployment framework for ML models in R. It bundles a fitted model with its metadata (a description, the packages required for prediction, and a prototype of the input data) into a versioned artifact that can be stored, retrieved, and deployed consistently. The same model artifact that produced predictions in development is what runs in production.

Model versioning with pins

vetiver stores model artifacts using the pins package. A pin is a versioned, named data object stored in a board — local, cloud storage, or Posit Connect. Saving a vetiver model to a board creates a new version automatically. Retrieving it by name and optional version gives deterministic access to any previously saved model. The versioning lets you roll back to a previous model if a new version performs poorly.

The model metadata stored with the artifact includes the R packages needed for prediction and a prototype of the input data for validation. When the model is loaded in a new environment, this metadata tells you which packages must be available. If the model was pinned with check_renv = TRUE, an renv lockfile is stored as well, and renv can reconstruct the original package environment from it.

Serving models as APIs

vetiver_api creates a plumber API from a vetiver model object. The API has a /predict endpoint that accepts new data and returns predictions. Deploying this API on Posit Connect, a Docker container, or a cloud platform makes predictions available to any application that can send HTTP requests. The plumber integration means the API has auto-generated OpenAPI documentation describing the input schema and output format.

Input validation at the prediction endpoint checks that incoming data has the correct columns and types before running the model. Sending malformed data to an ML model often produces silently wrong predictions rather than an error, making input validation critical for production APIs. The example dataset stored in the model artifact defines the expected input schema for the validation check.
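The idea behind prototype-based validation can be illustrated in base R. The helper below is a hypothetical sketch written for this article, not vetiver's implementation: it checks incoming data against a zero-row prototype and fails loudly instead of predicting on malformed input.

```r
# Hypothetical helper: compare incoming data with a zero-row prototype
check_against_prototype <- function(new_data, prototype) {
  missing_cols <- setdiff(names(prototype), names(new_data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  for (col in names(prototype)) {
    if (is.numeric(prototype[[col]]) && !is.numeric(new_data[[col]])) {
      stop("Column '", col, "' must be numeric", call. = FALSE)
    }
  }
  # Keep only the expected columns, in the expected order
  new_data[names(prototype)]
}

prototype <- mtcars[0, c("wt", "hp", "cyl")]  # zero rows, three numeric columns

# Extra columns are dropped and column order is normalized
ok <- check_against_prototype(
  data.frame(cyl = 6, wt = 2.5, hp = 120, extra = 1),
  prototype
)

# A request with missing columns produces an error instead of a prediction
bad <- tryCatch(
  check_against_prototype(data.frame(wt = 2.5), prototype),
  error = function(e) conditionMessage(e)
)
bad
```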

Summary

Vetiver makes it straightforward to serve tidymodels predictions over HTTP. The key functions are:

  • vetiver_model() wraps your trained model
  • vetiver_pin_write() and vetiver_pin_read() handle version control
  • The vetiver_model_card template generates documentation
  • pr() %>% vetiver_api(v) creates a deployable API

The workflow integrates with tidymodels for training, recipes for preprocessing, and pins for versioning, giving you the tools to put models into production without building custom infrastructure from scratch.