Working with REST APIs in R using httr2

Working with REST APIs in R means sending HTTP requests, parsing JSON responses, and building pipelines that reliably consume data from web services. The httr2 package is the modern successor to httr, providing a pipe-friendly interface for making HTTP requests. This guide covers everything from simple GET requests to OAuth 2.0 authentication flows.

Why httr2?

The httr2 package offers several advantages over its predecessor:

  • Pipe-friendly syntax: Build requests step-by-step using the pipe operator
  • Better error handling: Granular control over how errors are managed
  • Improved retry logic: Built-in support for handling transient failures
  • Request throttling: Native rate limiting support
  • Modern authentication: Full OAuth 2.0 support

Installation

Install httr2 from CRAN:

install.packages("httr2")
library(httr2)

The examples in this guide use the native pipe operator (|>), which requires R 4.1 or later. The package works independently of the tidyverse but integrates smoothly with it.

Making your first request

The fundamental workflow involves creating a request, modifying it, and executing it. request() creates an unexecuted request object, which you then configure and dispatch:

library(httr2)

# Create a request object
req <- request("https://httpbin.org/get")

# Execute the request
resp <- req |> req_perform()

# Check the response status
resp |> resp_status()
# [1] 200

# Extract the response body
resp |> resp_body_string()

Each function in the chain returns the modified object, enabling readable request construction. The request object carries all the configuration: URL, headers, method, and body, until req_perform() executes it and returns a response object that you can inspect or parse.

HTTP methods

GET requests

GET requests retrieve data. Use req_url_query() to add query parameters, which are automatically URL-encoded and appended to the request URL. This is the standard way to filter, paginate, or search API endpoints without sending a request body.

# Simple GET request
resp <- request("https://httpbin.org/get") |>
  req_perform()

# GET with query parameters
resp <- request("https://httpbin.org/get") |>
  req_url_query(search = "R programming", page = 1, limit = 10) |>
  req_perform()

# Parse JSON response
data <- resp |> resp_body_json()

Parameters accumulate: each req_url_query() call appends to the existing query string, so you can add them in one call or across several. The response body is frequently JSON; resp_body_json() parses it into an R list that you navigate with standard $ and [[ indexing.
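As a small offline sketch of how parameters accumulate, the request below is built in two steps and never sent. Reading the url element of the request object is an inspection shortcut assumed here for illustration.

```r
library(httr2)

# Build the request in two steps; nothing is sent over the network
req <- request("https://httpbin.org/get") |>
  req_url_query(search = "R programming") |>
  req_url_query(page = 2, limit = 10)

# The stored URL now carries all three parameters, with the space encoded
print(req$url)
```

Because the request is only an object until req_perform() is called, this kind of check is a cheap way to confirm the query string before contacting the API.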

POST requests

POST requests send data to create or update resources. The body is most often JSON, which req_body_json() serializes from an R list and automatically sets the Content-Type header. The server’s response typically includes the created resource or a confirmation object.

# POST with JSON body
resp <- request("https://httpbin.org/post") |>
  req_method("POST") |>
  req_body_json(list(
    username = "datafan",
    action = "create",
    priority = 1
  )) |>
  req_perform()

# Parse the response
result <- resp |> resp_body_json()

Other HTTP verbs

Beyond GET and POST, httr2 supports the full range of HTTP methods through req_method(). PUT replaces an entire resource, PATCH updates only specified fields for efficiency, and DELETE removes resources. The body and header chains work identically across all methods, so once you know the pattern for one verb, you know them all. The following examples demonstrate the consistent request-building pattern for each verb.

# PUT - complete resource replacement (sends all fields)
resp <- request("https://api.example.com/users/123") |>
  req_method("PUT") |>
  req_body_json(list(name = "Updated Name", active = TRUE)) |>
  req_perform()

# PATCH - partial update (sends only changed fields)
resp <- request("https://api.example.com/users/123") |>
  req_method("PATCH") |>
  req_body_json(list(name = "New Name")) |>
  req_perform()

# DELETE - remove a resource (no body needed)
resp <- request("https://api.example.com/users/123") |>
  req_method("DELETE") |>
  req_perform()

Working with headers

Custom headers are essential for authentication and content negotiation. Headers pass metadata alongside the request — tokens for authorization, Media Type specifications for content negotiation, and User-Agent strings for identifying your application to the server. httr2’s req_headers() adds multiple headers in a single call, each specified as a named argument.

# Add multiple headers
req <- request("https://api.github.com/user") |>
  req_headers(
    Authorization = paste("Bearer", Sys.getenv("GITHUB_TOKEN")),
    Accept = "application/vnd.github.v3+json",
    `User-Agent` = "MyRApp/1.0",
    `Accept-Language` = "en-US"
  )

resp <- req |> req_perform()

Common header types include Authorization tokens, Accept for content negotiation, User-Agent for application identification, and custom API keys. Many APIs require specific header formats, so always consult the API documentation for required headers.

Authentication methods

Bearer tokens

Bearer tokens are the standard for modern REST APIs. The token is sent in the Authorization header prefixed with “Bearer”, and req_auth_bearer_token() handles this formatting automatically. For security, tokens should come from environment variables rather than being hardcoded in the script.

# Simple bearer token (placeholder value, for illustration only)
req <- request("https://api.example.com/data") |>
  req_auth_bearer_token("your-access-token")

# Using environment variable for security
req <- request("https://api.example.com/data") |>
  req_auth_bearer_token(Sys.getenv("API_TOKEN"))

Never hardcode tokens in your scripts. Use environment variables or secure credential storage. If your token is leaked in a commit, assume it is compromised and rotate it immediately.
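One way to make the environment-variable approach safer is to fail early when the variable is missing, rather than sending an empty token. The helper below is a minimal sketch; get_secret() is a hypothetical name, not part of httr2, and the Sys.setenv() call exists only so the example runs on its own.

```r
library(httr2)

# Hypothetical helper: read a secret from an environment variable and
# stop with a clear message when it is missing or empty
get_secret <- function(var) {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  value
}

# Demo only: set the variable in-session so the sketch runs end to end.
# In real use, define it in ~/.Renviron or the deployment environment.
Sys.setenv(API_TOKEN = "demo-token")

req <- request("https://api.example.com/data") |>
  req_auth_bearer_token(get_secret("API_TOKEN"))
```

Stopping at request-building time tends to produce a clearer message than the 401 response an empty token would eventually trigger.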

Basic authentication

Basic auth encodes credentials in Base64 and sends them in the Authorization header. While simple, this method exposes the username and password on every request unless the connection uses HTTPS. req_auth_basic() handles Base64 encoding and header formatting automatically, accepting the username and password as separate arguments.

req <- request("https://api.example.com/login") |>
  req_auth_basic("username", Sys.getenv("API_PASSWORD"))

OAuth 2.0

httr2 provides comprehensive OAuth 2.0 support through the oauth_client() constructor and oauth_flow_auth_code() function. OAuth 2.0 is the industry standard for delegated authorization — users grant your application access to their data on a third-party service without sharing their password. httr2 handles the browser-based authorization flow, token exchange, and automatic token refresh.

# Define OAuth client
oauth <- oauth_client(
  id = Sys.getenv("OAUTH_CLIENT_ID"),
  token_url = "https://oauth.provider.com/token",
  secret = Sys.getenv("OAUTH_CLIENT_SECRET")
)

# Authorization code flow
token <- oauth |>
  oauth_flow_auth_code(
    auth_url = "https://oauth.provider.com/authorize",
    redirect_uri = "http://localhost:1410"
  )

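# Use token in requests
resp <- request("https://api.provider.com/data") |>
  req_auth_bearer_token(token$access_token) |>
  req_perform()

A token obtained this way is a fixed value, so refreshing it is up to you. To have httr2 cache the token and refresh it automatically when the provider supports refresh tokens, attach the flow to the request with req_oauth_auth_code() instead.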

Error handling

Custom error messages

req_error() controls what happens when the server responds with an HTTP error status (4xx or 5xx). By default, httr2 throws an R error, but you can customize it. The is_error argument takes a function that receives the response and returns TRUE when it should be treated as an error. The body argument takes a function that receives the response and returns extra text, such as the status code or details from the response body, to append to the error message.

# Add custom detail to the error raised on HTTP failure
resp <- request("https://httpbin.org/status/404") |>
  req_error(body = \(resp) paste("HTTP error:", resp_status(resp))) |>
  req_perform()
# Error: HTTP 404 Not Found, with "HTTP error: 404" added to the message

Graceful error handling

Not every HTTP error should crash your script. For bulk API requests or data collection pipelines where individual failures are expected, you can suppress httr2’s automatic error conversion with req_error(is_error = \(resp) FALSE), so that 4xx and 5xx responses come back as ordinary response objects. Wrapping req_perform() in purrr::possibly() (see purrr::safely() for the related safe-function wrapper) additionally returns NULL when the request fails outright, for example on a connection error. This lets you continue processing other requests while logging the failures for later investigation.

library(purrr)

# Create a safe version of req_perform that returns NULL on connection failures
safe_perform <- possibly(req_perform, otherwise = NULL)

# Attempt request without throwing on HTTP error statuses
resp <- request("https://httpbin.org/status/500") |>
  req_error(is_error = \(resp) FALSE) |>
  safe_perform()

if (is.null(resp)) {
  cat("Request failed before a response arrived - handling gracefully\n")
} else if (resp_is_error(resp)) {
  cat("Server returned an error status:", resp_status(resp), "\n")
} else {
  cat("Request succeeded:", resp_status(resp), "\n")
}

Checking response status

After performing a request, resp_status() returns the HTTP status code as an integer. The helper resp_is_error() returns TRUE for 4xx and 5xx statuses, making conditional response handling more readable than comparing status codes directly. Because req_perform() throws on those statuses by default, this kind of branching only applies once error conversion is switched off with req_error(). Always check the status before parsing the body, because error responses from APIs are often structured differently from success responses.

# Perform without converting HTTP errors into R errors
resp <- request("https://httpbin.org/get") |>
  req_error(is_error = \(resp) FALSE) |>
  req_perform()

if (!resp_is_error(resp)) {
  data <- resp |> resp_body_json()
} else {
  warning("Unexpected status: ", resp |> resp_status())
}

Parsing responses

JSON responses

JSON is the most common API response format. httr2 provides resp_body_json() to parse the response body into an R list, which you can navigate with standard $ and [[ indexing. For responses that contain arrays of objects, passing simplifyVector = TRUE asks the underlying jsonlite parser to turn those arrays into data frames, which as_tibble() then converts into a tibble suitable for dplyr and ggplot2 workflows.

# Parse JSON to R list
resp <- request("https://httpbin.org/json") |> req_perform()
data <- resp |> resp_body_json()

# Access nested data
data$slideshow$author

# Convert an array of objects (here, the slides) to a tibble
library(tibble)
simplified <- resp |> resp_body_json(simplifyVector = TRUE)
df <- as_tibble(simplified$slideshow$slides)

XML responses

Some older or enterprise APIs return XML instead of JSON. The xml2 package integrates with httr2 via resp_body_xml(), which returns an XML document object. You can then extract data with XPath queries using xml_find_all() and xml_find_first(), or navigate the document tree with the package’s other functions.

library(xml2)

resp <- request("https://httpbin.org/xml") |> req_perform()
doc <- resp |> resp_body_xml()

# XPath queries
titles <- doc |> xml_find_all(".//title")
xml_text(titles)

Binary files

For downloading files, including images, PDFs, and binary datasets, pass the path argument to req_perform() to write the response body directly to disk without loading it into memory. This is more efficient than reading the full response body and then writing it, especially for large files. Adding req_progress() to the request shows a progress bar for downloads that take more than a few seconds.

# Download binary file
request("https://httpbin.org/bytes/1024") |>
  req_perform(path = "download.bin")

# Download with progress bar
request("https://httpbin.org/bytes/1024000") |>
  req_progress() |>
  req_perform(path = "large_download.bin")

Rate limiting

Protect yourself from API rate limits by inserting controlled delays between requests. req_throttle() accepts a rate in requests per second and introduces the necessary pauses automatically. When processing collections of items through an API, combining throttling with pagination ensures you stay within rate limits while fetching all available data.

# Throttle requests to 1 per second
req <- request("https://api.example.com/items") |>
  req_throttle(1)

# Paginate through results with throttling
all_items <- list()

for (page in 1:10) {
  resp <- request("https://api.example.com/items") |>
    req_url_query(page = page, per_page = 100) |>
    req_throttle(1) |>
    req_perform()
  
  items <- resp |> resp_body_json()
  all_items <- c(all_items, items)
  
  cat("Fetched page", page, "-", length(items), "items\n")
}

Retry logic

Handle transient failures automatically with req_retry(). Network timeouts, server overload, and temporary DNS issues are common when calling external APIs. httr2’s retry system re-attempts requests that match user-defined transient-error conditions, with configurable exponential backoff to avoid overwhelming the recovering server. The is_transient argument is a function that inspects the response and returns TRUE when a retry is warranted.

# Retry on server errors (5xx)
resp <- request("https://httpbin.org/delay/2") |>
  req_retry(
    max_tries = 3,
    max_seconds = 15,
    is_transient = \(resp) resp_status(resp) >= 500
  ) |>
  req_perform()

# Custom transient detection
resp <- request("https://api.example.com/data") |>
  req_retry(
    max_tries = 5,
    backoff = \(n) 2^n,  # Exponential backoff: 2, 4, 8, 16 seconds between the 5 tries
    is_transient = \(resp) {
      status <- resp_status(resp)
      status >= 500 ||
        status == 429 ||  # Rate limited
        # Only read the body when there is one; it is stored as raw bytes
        (resp_has_body(resp) &&
          grepl("timeout", resp_body_string(resp), ignore.case = TRUE))
    }
  ) |>
  req_perform()

Practical example: GitHub API

The GitHub API is a well-documented REST API that demonstrates common patterns: custom headers, pagination through query parameters, and rate limiting. The following example fetches a user’s profile and repositories from public endpoints, which work without credentials at a lower rate limit. For authenticated access, add req_auth_bearer_token() with a personal access token stored in an environment variable. GitHub’s API returns paginated results, so the repository fetcher loops through pages until a page returns fewer items than the requested page size or max_pages is reached.

library(httr2)
library(tibble)

# Fetch user information
github_user <- function(username) {
  request(paste0("https://api.github.com/users/", username)) |>
    req_headers(
      Accept = "application/vnd.github.v3+json",
      `User-Agent` = "R-httr2-demo/1.0"
    ) |>
    req_perform() |>
    resp_body_json()
}

# Fetch user repositories
github_repos <- function(username, max_pages = 3) {
  all_repos <- list()
  
  for (page in 1:max_pages) {
    resp <- request(paste0("https://api.github.com/users/", username, "/repos")) |>
      req_headers(
        Accept = "application/vnd.github.v3+json",
        `User-Agent` = "R-httr2-demo/1.0"
      ) |>
      req_url_query(page = page, per_page = 100, sort = "updated") |>
      req_throttle(1) |>
      req_perform()
    
    repos <- resp |> resp_body_json()
    all_repos <- c(all_repos, repos)
    
    if (length(repos) < 100) break
  }
  
  # Pull one field from every repo, using NA where the API returned null
  # (e.g. no description or language) so each column is an atomic vector
  field <- \(name) sapply(all_repos, \(x) if (is.null(x[[name]])) NA else x[[name]])

  tibble(
    name = field("name"),
    description = field("description"),
    stars = field("stargazers_count"),
    forks = field("forks_count"),
    language = field("language"),
    updated = field("updated_at")
  )
}

# Example usage
user <- github_user("r-lib")
cat("Organization:", user$login, "\n")
cat("Public repos:", user$public_repos, "\n\n")

With the user profile fetched, you can retrieve and process their repositories. The github_repos() function paginates through the API and collects repository metadata into a tibble, for example github_repos("r-lib").

Authentication patterns

Most APIs require authentication. httr2 handles common schemes cleanly. For bearer tokens, req_headers(Authorization = paste("Bearer", token)) or the shorthand req_auth_bearer_token(token) adds the header. For Basic auth, req_auth_basic(username, password) base64-encodes the credentials. For OAuth 2.0, httr2::oauth_client() and req_oauth_auth_code() handle the full authorization code flow with automatic token refresh.

Store secrets in environment variables, not in code: Sys.getenv("MY_API_KEY"). For packages, use the keyring package to store credentials in the system keychain.

Pagination

Many REST APIs return paginated results. httr2::req_perform_iterative() handles pagination automatically given a function that extracts the next page URL or cursor from each response. For cursor-based pagination: define next_req = function(resp, req) { cursor <- resp_body_json(resp)$next_cursor; if (is.null(cursor)) NULL else req |> req_url_query(cursor = cursor) }. For offset-based pagination, increment the offset by the page size.
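The cursor-based callback described above can be written out as follows. This is a sketch under assumptions: the endpoint, the next_cursor field, and the items array are hypothetical, and fetch_all_items() is wrapped in a function so that nothing is sent until it is called.

```r
library(httr2)

# Callback for cursor-based pagination: return the next request,
# or NULL when the response carries no cursor (which stops the iteration)
next_page <- function(resp, req) {
  cursor <- resp_body_json(resp)$next_cursor
  if (is.null(cursor)) {
    return(NULL)
  }
  req |> req_url_query(cursor = cursor)
}

# Hypothetical endpoint; max_reqs caps the number of pages fetched
fetch_all_items <- function(max_reqs = 10) {
  resps <- request("https://api.example.com/items") |>
    req_perform_iterative(next_req = next_page, max_reqs = max_reqs)

  # Combine the (assumed) items array from every page into one list
  resps_data(resps, \(resp) resp_body_json(resp)$items)
}
```

Keeping the callback as a named function makes it possible to try it against a mock response, for example one built with response_json(), before pointing it at a live API.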

Debugging requests

req_dry_run(req) prints the exact HTTP request that would be sent (method, URL, headers, and body) without actually sending it to the API. This is invaluable for debugging authentication issues or verifying that parameters are encoded correctly. httr2::last_response() retrieves the most recent response for inspection after an error. resp_raw(resp) prints a reconstruction of the raw HTTP response, headers and body included, for low-level debugging of unusual content types.
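A short sketch of inspecting a request before sending it. The endpoint and token are placeholders, and the guard around req_dry_run() is an assumption: depending on the httr2 version it may rely on the httpuv package, and it is most useful in an interactive session.

```r
library(httr2)

# Hypothetical token and endpoint, used only to show what gets printed
req <- request("https://api.example.com/items") |>
  req_url_query(q = "a b", page = 1) |>
  req_auth_bearer_token("not-a-real-token")

# Printing the request shows the method, URL, and headers,
# with the Authorization value redacted
print(req)

# Show the full HTTP message without contacting the remote server
if (interactive() && requireNamespace("httpuv", quietly = TRUE)) {
  req_dry_run(req)
}
```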

Building requests

httr2 provides a pipeable API for HTTP requests. request(url) creates a request object. Chain modifications with req_url_path_append(), req_url_query(), req_headers(), req_body_json(), and req_auth_bearer_token(). req_perform() sends the request and returns a response.

req_url_query(req, page = 1, per_page = 100) appends query parameters. Multiple calls accumulate parameters. req_url_path_append(req, "/items") appends a path segment without URL-encoding issues.

req_body_json(req, list(name = "Alice", age = 30)) sets a JSON request body and automatically sets the Content-Type: application/json header. req_body_form(req, field = "value") sends URL-encoded form data.

req_headers(req, "X-Custom-Header" = "value") adds arbitrary headers. req_auth_basic(req, username, password) sets HTTP Basic authentication. req_auth_bearer_token(req, token) sets a Bearer token.
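Put together, the building blocks above might look like the sketch below. The API root, path segments, and header name are placeholders, and the request is only constructed, not sent.

```r
library(httr2)

# Hypothetical API root and resource path
req <- request("https://api.example.com") |>
  req_url_path_append("v1", "items") |>
  req_headers(`X-Custom-Header` = "value") |>
  req_body_json(list(name = "Alice", age = 30))

# The path segments were joined onto the base URL
print(req$url)
```

Adding req_perform() at the end of the chain would send the request; supplying a body switches the default method to POST.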

Response handling

resp_status(resp) returns the HTTP status code. resp_status_desc(resp) returns the description. resp_check_status(resp) throws an error for 4xx and 5xx responses; call this early in your response handling pipeline.

resp_body_json(resp) parses the response body as JSON. resp_body_string(resp) returns the raw text. resp_body_raw(resp) returns the raw bytes. resp_header(resp, "Content-Type") retrieves a specific response header. resp_headers(resp) returns all headers as a list.
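These accessors can be tried without a network connection by building a mock response. The sketch below assumes httr2 1.0 or later, where response_json() is available; a real response would come from req_perform().

```r
library(httr2)

# Build a mock 404 response with a small JSON body
resp <- response_json(
  status_code = 404,
  body = list(message = "Not Found")
)

resp_status(resp)             # numeric status code
resp_status_desc(resp)        # text description of the status
resp_is_error(resp)           # TRUE for 4xx and 5xx statuses
resp_body_json(resp)$message  # field from the parsed body

# resp_check_status() turns the error status into a classed R error
outcome <- tryCatch(
  {
    resp_check_status(resp)
    "ok"
  },
  httr2_http_404 = function(e) "not found"
)
```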

For paginated APIs, req_perform_iterative() handles pagination automatically. Define the iteration logic as a function that builds the next request from the previous response, and httr2 calls it repeatedly until it returns NULL or the max_reqs limit is reached.

Rate limiting and retry

req_throttle(req, rate = 10 / 60) limits to 10 requests per minute by sleeping between requests. The rate is requests per second, so 10 requests per minute is 10/60.

req_retry(req, max_tries = 3, is_transient = httr2::resp_is_error) retries failed requests. The is_transient function determines which errors warrant a retry; by default, 429 (Too Many Requests) and 503 (Service Unavailable) are transient. backoff = \(n) 2^n sets exponential backoff, waiting 2 seconds after the first failure, 4 after the second, and so on.

req_cache(req, path = tempdir()) caches GET responses to disk. Subsequent identical requests return the cached response without making a network call, respecting the HTTP Cache-Control and Expires headers. For development and testing, caching avoids hitting rate limits while iterating on parsing logic.
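A minimal caching sketch, assuming httpbin's /cache/60 endpoint (which sends a Cache-Control header) as a stand-in for a real API. The cache directory name and fetch_twice() are illustrative, and the requests are wrapped in a function so nothing is sent until it is called.

```r
library(httr2)

# A dedicated cache directory under the session's temp folder (an assumption;
# pick a persistent path if the cache should survive restarts)
cache_dir <- file.path(tempdir(), "httr2-cache")

req <- request("https://httpbin.org/cache/60") |>
  req_cache(path = cache_dir)

# The second call can be served from disk while the cached response is fresh
fetch_twice <- function() {
  first <- req_perform(req)
  second <- req_perform(req)
  list(first = resp_status(first), second = resp_status(second))
}
```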

OAuth authentication

oauth_client(id, token_url, secret, ...) defines an OAuth client. req_oauth_auth_code(req, client, auth_url, scope = "read") adds OAuth 2.0 authorization code flow authentication, opening a browser for the user to grant access, then exchanging the code for a token.

req_oauth_client_credentials(req, client) uses client credentials flow (machine-to-machine), appropriate for server-side scripts without user interaction.

httr2 stores and refreshes tokens automatically. oauth_token_cached(client, ...) retrieves a cached token or starts the OAuth flow if no valid token exists.
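For the machine-to-machine case, a client credentials setup might be sketched as follows. The provider URLs and the "read" scope are placeholders, and the fallback values passed to Sys.getenv() exist only so the example can be built without real credentials.

```r
library(httr2)

# Hypothetical provider; the ID and secret come from environment variables,
# with demo fallbacks so the sketch can be constructed offline
client <- oauth_client(
  id = Sys.getenv("OAUTH_CLIENT_ID", "demo-id"),
  token_url = "https://oauth.provider.com/token",
  secret = Sys.getenv("OAUTH_CLIENT_SECRET", "demo-secret")
)

# Attach the flow to the request; httr2 fetches a token when the request
# is performed, so building the request does not contact the provider
req <- request("https://api.provider.com/data") |>
  req_oauth_client_credentials(client, scope = "read")
```

Attaching the flow to the request, rather than fetching a token by hand, is what lets httr2 manage the token on your behalf.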

Error handling patterns

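safely_perform <- function(req) {
  tryCatch(
    req_perform(req),
    # Not found: return a placeholder instead of stopping
    httr2_http_404 = function(e) {
      list(status = 404, body = NULL)
    },
    # Rate limited: wait as instructed by the server, then try once more
    httr2_http_429 = function(e) {
      wait <- suppressWarnings(as.numeric(resp_header(e$resp, "Retry-After")))
      # Fall back to 1 second if the header is missing or not a number of seconds
      if (length(wait) == 0 || is.na(wait)) wait <- 1
      Sys.sleep(wait)
      req_perform(req)
    }
  )
}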

httr2 throws specific error classes for HTTP errors: httr2_http_404, httr2_http_429, httr2_http_500, and httr2_failure for network errors. Catching specific classes allows different handling per error type.

purrr::safely() wraps a function to return list(result = ..., error = ...) instead of throwing. map(urls, safely(function(url) req_perform(request(url)))) returns results and errors for each URL without stopping on the first failure.

httr2’s design

httr2 redesigns the R HTTP client interface around request objects. Where httr’s GET and POST functions immediately send a request, httr2 separates request construction from execution. You build a request with req_url, req_method, req_headers, req_body_json, and related functions, then send it with req_perform. This separation makes it easy to create request templates, add middleware, test request construction without sending, and handle retries at the pipeline level.

The redesign addresses several httr limitations. Error handling is more explicit: req_error controls how HTTP errors become R errors. Rate limiting and retry logic are built in. OAuth2 flows have better support. For new code, httr2 is the recommended choice; httr remains supported but is not actively enhanced.
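The separation of construction from execution is what makes request templates possible. A minimal sketch, with a placeholder base URL and user agent string:

```r
library(httr2)

# A reusable template: shared settings live in one place
api_base <- request("https://api.example.com") |>
  req_user_agent("my-r-client/0.1") |>
  req_retry(max_tries = 3)

# Endpoint-specific requests derive from the template without modifying it
users_req <- api_base |> req_url_path_append("users")
items_req <- api_base |>
  req_url_path_append("items") |>
  req_url_query(page = 1)
```

Each derived request is a separate object, so api_base can be reused for any number of endpoints and sent only when req_perform() is called on one of them.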

Request pipelines and retry

req_retry wraps a request with automatic retry logic. You specify the maximum number of attempts and the conditions under which to retry: which HTTP status codes indicate a transient error, and how long to wait after a rate-limit response before retrying. The retry logic handles exponential backoff automatically. Adding retry handling to a request that calls an unreliable API is a single function call in httr2 rather than wrapping the entire request in a loop.

req_throttle limits request rate to stay within API rate limits. Specifying a maximum of n requests per unit time causes httr2 to introduce delays between requests automatically. Combining throttling with retry handles both proactive rate limiting (stay under the limit) and reactive rate limiting (handle 429 responses when you accidentally exceed it).
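Combining the two might look like the sketch below. The endpoint, the list of retryable status codes, and the is_retryable() helper are illustrative choices, not httr2 defaults.

```r
library(httr2)

# Treat rate-limit and server-side failures as worth retrying
is_retryable <- function(resp) {
  resp_status(resp) %in% c(429, 500, 502, 503, 504)
}

# Hypothetical endpoint: at most 10 requests per minute (proactive),
# and up to 4 attempts per request when a retryable status comes back (reactive)
req <- request("https://api.example.com/items") |>
  req_throttle(rate = 10 / 60) |>
  req_retry(max_tries = 4, is_transient = is_retryable)
```

Because the predicate is an ordinary function of a response, it can be checked against mock responses before being used in a pipeline.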

Authentication

httr2 provides req_auth_bearer_token for Bearer token authentication and req_auth_basic for HTTP Basic authentication. Both add the appropriate Authorization header to the request. For OAuth2, httr2’s oauth_client and oauth_flow_auth_code implement the full authorization code flow with token storage and automatic refresh.

Credentials should always come from environment variables or a secrets manager, never from hardcoded strings in source files. Using Sys.getenv to read credentials inside the request-building code means the credentials are never in the source and the code works in different environments by setting environment variables appropriately. Passing cache_disk = TRUE to the req_oauth_*() functions caches OAuth tokens on disk between sessions, avoiding repeated authorization prompts at the cost of keeping access credentials on disk.

Best practices

  1. Never hardcode credentials - Use environment variables or config files
  2. Handle errors gracefully - Don’t let API failures crash your scripts
  3. Implement retries - Network failures happen; be resilient
  4. Respect rate limits - Use req_throttle() to avoid getting blocked
  5. Log requests - Use req_verbose() or req_dry_run() for debugging
  6. Parse incrementally - For large responses, process in chunks
