Building a REST Client in R
In the previous tutorials of this series, you learned how to make HTTP requests with httr2 and work with APIs. This tutorial goes deeper into building a production-ready REST client, a reusable abstraction that handles authentication, retries, pagination, and error handling gracefully.
By the end, you’ll have a client that’s reliable enough for real-world data pipelines.
What you’ll learn
This tutorial covers the building blocks of a reusable REST client in R: a shared request template, error handling, retries with backoff, pagination, token refresh, and timeouts. By the end, you will know how to combine them with httr2 in real data analysis workflows.
Why build a REST client?
When you’re just getting started with an API, direct function calls work fine:
resp <- request("https://api.example.com/data") |>
  req_perform()
But as your usage grows, you’ll encounter challenges:
- Authentication tokens expire and need refreshing
- Rate limits require exponential backoff
- Large datasets come in pages
- Network failures happen, and your code should handle them
- Different endpoints need different configurations
A well-structured REST client encapsulates all this complexity behind a clean interface. Instead of repeating authentication logic and error handling in every function call, you build it once and reuse it.
Designing your client
Let’s build a client for a hypothetical JSON API. The patterns apply to any REST API.
Step 1: create a client function
The core of your client is a function that initializes a request with defaults:
library(httr2)
create_client <- function(base_url, api_key = NULL) {
  req <- request(base_url)
  if (!is.null(api_key)) {
    req <- req |> req_headers("Authorization" = paste("Bearer", api_key))
  }
  req
}
This creates a request template you can modify for each call. You start with a base configuration and add specifics as needed.
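As a quick check of the template idea, the sketch below derives two requests from one base request and inspects their URLs. The endpoint names (v1/users, v1/orders) and the key value are placeholders rather than part of any real API, and nothing is sent because req_perform() is never called.

```r
library(httr2)

# Same constructor as above, repeated so the example is self-contained
create_client <- function(base_url, api_key = NULL) {
  req <- request(base_url)
  if (!is.null(api_key)) {
    req <- req |> req_headers("Authorization" = paste("Bearer", api_key))
  }
  req
}

client <- create_client("https://api.example.com", api_key = "placeholder-key")

# Each call returns a modified copy; the template itself is unchanged
users_req <- client |> req_url_path_append("v1", "users")
orders_req <- client |>
  req_url_path_append("v1", "orders") |>
  req_url_query(status = "open")

users_req$url
orders_req$url
client$url
```

Because the template is never modified in place, it is safe to share one client object across many endpoint functions.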
Step 2: add error handling
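httr2 lets you control error handling with req_error():
safe_request <- function(req) {
  req |>
    # Treat any 4xx or 5xx status as an R error (this is also httr2's default)
    req_error(is_error = \(resp) resp_status(resp) >= 400) |>
    req_perform()
}
The is_error predicate receives the response and returns TRUE when that response should be turned into an R error with a meaningful message. By default, httr2 already throws errors for both 4xx and 5xx responses, so this version only makes the rule explicit. Change the predicate when an API needs a different rule; for example, is_error = \(resp) FALSE lets you inspect failed responses yourself. Avoid a predicate that always returns TRUE, because it would turn successful responses into errors as well.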
You can also customize error handling for specific status codes:
handle_not_found <- function(req) {
  req |>
    # body adds extra lines to the error message; returning NULL adds nothing
    req_error(body = \(resp) if (resp_status(resp) == 404) "Resource not found") |>
    req_perform()
}
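When a missing resource should not stop a pipeline, one option is to catch the 404 condition by class and return NULL instead. The sketch below is a minimal illustration under stated assumptions: get_or_null() is a hypothetical helper, the URLs and response bodies are made up, and httr2's own mocking helpers stand in for the network.

```r
library(httr2)

# Return NULL for a missing resource, but let every other error propagate
get_or_null <- function(req) {
  tryCatch(
    req |> req_perform() |> resp_body_json(),
    httr2_http_404 = function(cnd) NULL
  )
}

# Offline demonstration: a mock answers instead of the network
mock_api <- function(req) {
  if (grepl("/users/1$", req$url)) {
    response_json(body = list(id = 1, name = "Ada"))
  } else {
    response_json(status_code = 404, body = list(message = "no such user"))
  }
}

found <- with_mocked_responses(
  mock_api,
  get_or_null(request("https://api.example.com/users/1"))
)
absent <- with_mocked_responses(
  mock_api,
  get_or_null(request("https://api.example.com/users/999"))
)

found$name
is.null(absent)
```

Catching only httr2_http_404 keeps other failures, such as a 401 or a 500, visible as errors.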
Step 3: implement automatic retries
Network failures happen. Use req_retry() for resilience:
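robust_request <- function(req, max_retries = 3) {
  req |>
    req_retry(
      max_tries = max_retries,
      backoff = \(attempt) 0.5 * 2^(attempt - 1), # Exponential backoff
      is_transient = \(resp) resp_status(resp) >= 500
    ) |>
    req_perform()
}
The backoff function receives the number of failed attempts so far and returns the number of seconds to wait. This one starts at 0.5 seconds and doubles with each retry: 0.5s, 1s, 2s, 4s. Note that max_tries counts the first attempt too, so max_tries = 3 allows at most two retries. The is_transient function tells httr2 which responses should trigger a retry: here, any 5xx server error. Connection-level failures are a separate case; recent httr2 versions offer a retry_on_failure argument for those.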
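To see what a backoff formula actually produces, it helps to evaluate it for the first few attempt numbers. The sketch below compares a doubling schedule with one based on exp(); both function names are illustrative and not part of httr2.

```r
# Doubling backoff: 0.5 seconds after the first failure, then twice as long each time
doubling_backoff <- function(attempt) 0.5 * 2^(attempt - 1)

# exp() grows by a factor of about 2.72 per attempt, so it is steeper than doubling
exp_backoff <- function(attempt) 0.5 * exp(attempt)

attempts <- 1:4
schedule <- data.frame(
  attempt = attempts,
  doubling = doubling_backoff(attempts),
  exponential_e = round(exp_backoff(attempts), 2)
)
schedule
```

Checking the schedule this way catches formulas that wait much longer than intended before they reach production.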
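You can also retry on rate limiting (429) with a longer wait. The backoff function only sees the attempt number, so use the after argument for this: it receives the response and returns the number of seconds to wait, or NA to fall back to backoff:
rate_limited_request <- function(req) {
  req |>
    req_retry(
      max_tries = 5,
      is_transient = \(resp) resp_status(resp) == 429 || resp_status(resp) >= 500,
      after = \(resp) if (resp_status(resp) == 429) 60 else NA,
      backoff = \(attempt) 0.5 * 2^(attempt - 1)
    ) |>
    req_perform()
}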
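A fixed wait can be replaced by the delay the server asks for. The sketch below is one possible after function, assuming the API sends Retry-After as a plain number of seconds; wait_seconds() and its 60-second fallback are illustrative choices. httr2's default behavior already reads Retry-After, so a custom function like this is mainly useful when you want a different fallback.

```r
library(httr2)

# Seconds to wait before retrying: use Retry-After on a 429 when it is a plain
# number of seconds, otherwise a default; NA defers to the backoff function
wait_seconds <- function(resp, default_wait = 60) {
  if (resp_status(resp) != 429) {
    return(NA)
  }
  retry_after <- resp_header(resp, "Retry-After")
  seconds <- suppressWarnings(as.numeric(retry_after))
  if (length(seconds) == 1 && !is.na(seconds)) seconds else default_wait
}

# Offline check with hand-built responses
with_header <- wait_seconds(response(status_code = 429, headers = list(`Retry-After` = "7")))
without_header <- wait_seconds(response(status_code = 429))
server_error <- wait_seconds(response(status_code = 503))

# Plugging it into a request (nothing is sent here)
req <- request("https://api.example.com") |>
  req_retry(
    max_tries = 5,
    is_transient = \(resp) resp_status(resp) %in% c(429, 503),
    after = wait_seconds
  )
```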
Step 4: handle pagination
Many APIs return paginated results. Here’s a pattern for collecting all pages:
fetch_all_pages <- function(client, endpoint) {
  all_results <- list()
  page <- 1
  has_more <- TRUE
  while (has_more) {
    resp <- client |>
      req_url_path(endpoint) |>
      req_url_query(page = page, per_page = 100) |>
      robust_request()
    data <- resp_body_json(resp)
    all_results <- c(all_results, data$items)
    has_more <- !is.null(data$next_page)
    page <- page + 1
  }
  all_results
}
Different APIs use different pagination schemes. Common patterns include:
- Offset-based: ?page=2&per_page=50
- Cursor-based: ?cursor=abc123
- Link headers: check the Link header for the next relation
Adapt the pattern to match your API’s response format.
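For a cursor-based API, the loop follows the cursor returned by each page instead of incrementing a counter. The sketch below keeps the HTTP call behind a fetch_page function so the loop can be tried offline; the items and next_cursor field names, and the max_pages safety limit, are assumptions for illustration.

```r
# Collect all items from a cursor-paginated endpoint.
# fetch_page is a function(cursor) that returns a list with `items` and
# `next_cursor` (NULL on the last page); both field names are assumptions.
collect_cursor_pages <- function(fetch_page, max_pages = 100) {
  all_items <- list()
  cursor <- NULL
  for (i in seq_len(max_pages)) {
    page <- fetch_page(cursor)
    all_items <- c(all_items, page$items)
    cursor <- page$next_cursor
    if (is.null(cursor)) break
  }
  all_items
}

# Stand-in for an API: three pages keyed by cursor
fake_pages <- list(
  start = list(items = list("a", "b"), next_cursor = "c2"),
  c2 = list(items = list("c", "d"), next_cursor = "c3"),
  c3 = list(items = list("e"), next_cursor = NULL)
)
fake_fetch <- function(cursor) {
  fake_pages[[if (is.null(cursor)) "start" else cursor]]
}

items <- collect_cursor_pages(fake_fetch)
length(items)
```

With httr2, fetch_page would typically wrap a request that passes the cursor as a query parameter and parses the JSON body. The max_pages cap guards against an API that never stops returning a cursor.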
Step 5: token refreshing
OAuth tokens expire. Build automatic refresh into your client:
create_oauth_client <- function(base_url, client_id, client_secret) {
  token <- NULL
  expires_at <- NULL

  # Fetch a token and store it, with its expiry time, in the enclosing environment
  refresh_token <- function() {
    token_resp <- request(base_url) |>
      req_url_path("oauth/token") |>
      req_method("POST") |>
      req_body_form(
        grant_type = "client_credentials",
        client_id = client_id,
        client_secret = client_secret
      ) |>
      req_perform() |>
      resp_body_json()
    token <<- token_resp$access_token
    expires_at <<- Sys.time() + token_resp$expires_in
  }

  # Initial token fetch
  refresh_token()

  # Return a function that handles automatic refresh
  function(endpoint) {
    if (Sys.time() > expires_at) {
      # Token expired: refresh it
      refresh_token()
    }
    request(base_url) |>
      req_url_path(endpoint) |>
      req_headers("Authorization" = paste("Bearer", token)) |>
      robust_request()
  }
}
The <<- operator updates the token and expiry time in the parent environment. Each call checks if the token is still valid before making a request.
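The caching behavior can be checked without a real token endpoint. The sketch below isolates the same closure technique in a hypothetical make_token_cache() helper, with the token fetch and the clock passed in as functions so expiry can be simulated; none of these names come from httr2.

```r
# Cache a token in a closure and refetch it only after it expires.
# fetch_token is a function returning list(access_token, expires_in);
# now is injectable so the expiry logic can be exercised without waiting.
make_token_cache <- function(fetch_token, now = Sys.time) {
  token <- NULL
  expires_at <- NULL
  function() {
    if (is.null(token) || now() >= expires_at) {
      fresh <- fetch_token()
      # <<- assigns into the enclosing environment, so the values persist
      token <<- fresh$access_token
      expires_at <<- now() + fresh$expires_in
    }
    token
  }
}

# Fake token endpoint and a clock we control
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  list(access_token = paste0("token-", calls), expires_in = 3600)
}
clock <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
get_token <- make_token_cache(fake_fetch, now = function() clock)

first <- get_token()    # first call fetches a token
second <- get_token()   # still valid, served from the cache
clock <- clock + 3601   # move past the expiry time
third <- get_token()    # expired, so a new token is fetched
```

For real client-credentials flows, httr2 also ships req_oauth_client_credentials(), which handles token caching for you.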
Adding timeouts
Production code should set timeouts to avoid hanging requests:
timed_request <- function(req, timeout = 30) {
  req |>
    req_timeout(timeout) |>
    req_perform()
}
Combine this with your retry logic for a reliable pipeline:
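production_request <- function(req) {
  req |>
    req_timeout(30) |>
    # Retry transient responses (429 and 503 by default) with doubling backoff
    req_retry(max_tries = 3, backoff = \(attempt) 0.5 * 2^(attempt - 1)) |>
    # Turn any remaining 4xx or 5xx response into an R error
    req_error(is_error = \(resp) resp_status(resp) >= 400) |>
    req_perform()
}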
Putting it all together
Here’s a complete example combining all patterns:
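library(httr2)
# Initialize the OAuth client; api_call() performs a request and returns the response
api_call <- create_oauth_client(
  "https://api.example.com",
  Sys.getenv("CLIENT_ID"),
  Sys.getenv("CLIENT_SECRET")
)
# fetch_all_pages() expects a request object, not the api_call() function, so
# paginated calls use the request template from Step 1. This assumes the API
# also accepts an API key, read here from a hypothetical API_KEY variable.
client <- create_client("https://api.example.com", Sys.getenv("API_KEY"))
# Fetch paginated data with automatic retries
fetch_users <- function() {
  fetch_all_pages(client, "v1/users")
}
# Fetch a single resource
get_user <- function(user_id) {
  api_call(paste0("v1/users/", user_id)) |>
    resp_body_json()
}
# Get data
users <- fetch_users()
user <- get_user(12345)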
Common pitfalls
- Forgetting to handle 404: missing resources shouldn’t crash your pipeline
- No retry on 429: rate limits are transient; retry after the suggested delay
- Hardcoding URLs: use environment variables for base URLs to support staging and production
- Ignoring response encoding: some APIs return gzipped responses; httr2 handles this automatically
Constructing requests
httr2::request("https://api.example.com") creates a request object. req_url_path_append(req, "users", user_id) builds the URL path. req_url_query(req, page = 1, per_page = 100) adds query parameters. req_headers(req, "Accept" = "application/json") adds headers. req_body_json(req, list(key = "value")) adds a JSON body. All of these req_* functions return modified request objects; nothing is sent until you call req_perform().
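Chained together, these builders look like the sketch below. The user_id value and the request body are placeholders, and the example stops before req_perform(), so it only builds and inspects the request objects.

```r
library(httr2)

user_id <- "42"  # hypothetical identifier

req <- request("https://api.example.com") |>
  req_url_path_append("users", user_id) |>
  req_url_query(page = 1, per_page = 100) |>
  req_headers("Accept" = "application/json")

# Printing shows the method, URL, and headers that would be sent
req
req$url

# A request with a JSON body is sent as POST unless another method is set
create_req <- request("https://api.example.com") |>
  req_url_path_append("users") |>
  req_body_json(list(name = "Ada"))
```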
Error handling
resp_status(resp) returns the HTTP status code. resp_is_error(resp) checks for 4xx/5xx. req_error(req, body = function(resp) resp_body_json(resp)$message) customizes what error message appears when a request fails. req_retry(req, max_tries = 3) retries on transient errors (by default 429 Too Many Requests and 503 Service Unavailable). Combine: req |> req_retry(max_tries = 3) |> req_error(body = parse_error_fn) |> req_perform().
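The combined chain can be tried offline with a mocked response. In the sketch below, parse_error_fn assumes the API reports failures in a message field of a JSON body, and the URL and error text are invented for the demonstration.

```r
library(httr2)

# Pull a human-readable message out of a JSON error body
# (the `message` field is an assumption about the API's error format)
parse_error_fn <- function(resp) {
  resp_body_json(resp)$message
}

req <- request("https://api.example.com/orders") |>
  req_retry(max_tries = 3) |>
  req_error(body = parse_error_fn)

# Offline demonstration with a mocked 400 response
msg <- with_mocked_responses(
  function(req) {
    response_json(status_code = 400, body = list(message = "quantity must be positive"))
  },
  tryCatch(
    req_perform(req),
    httr2_http_400 = function(cnd) conditionMessage(cnd)
  )
)
msg
```

The API's own explanation now appears alongside the HTTP status in the error message, which usually makes logs easier to act on.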
Rate limiting
APIs enforce rate limits to prevent abuse. req_throttle(req, rate = 60/60) limits to 60 requests per minute (rate is expressed in requests per second). For APIs that return rate limit headers (X-RateLimit-Remaining, Retry-After), req_retry() respects Retry-After automatically. For APIs without standard headers, add explicit Sys.sleep() calls between requests or implement a token bucket algorithm.
REST API design patterns
REST APIs use standard HTTP methods: GET (retrieve), POST (create), PUT/PATCH (update), DELETE (remove). Resources are identified by URLs. Responses use HTTP status codes to communicate outcomes.
A well-designed R API client wraps these HTTP calls into domain-specific functions. get_user(id) calls GET /users/{id}. create_order(data) calls POST /orders. This encapsulation isolates HTTP details from business logic, making tests easier to write and the API easier to use.
httr2 is the modern package for HTTP clients in R. request(base_url) |> req_url_path_append("users") |> req_url_query(page = 1) |> req_auth_bearer_token(token) |> req_perform() constructs and executes a request. Break the chain into named functions for reuse.
Client architecture
A clean client uses a constructor to set shared configuration (base URL, authentication) and methods for each endpoint:
new_api_client <- function(base_url, api_key) {
  env <- new.env()
  env$base_request <- httr2::request(base_url) |>
    httr2::req_headers("X-API-Key" = api_key)
  env$get_items <- function(page = 1) {
    env$base_request |>
      httr2::req_url_path_append("items") |>
      httr2::req_url_query(page = page) |>
      httr2::req_perform() |>
      httr2::resp_body_json()
  }
  env
}
This pattern, a closure-based object with shared base request, scales to many endpoints without repetition. Authentication, base URL, and common headers are set once.
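To illustrate how the pattern scales, the sketch below routes two endpoint methods through one shared helper and exercises them against a mock that echoes the requested URL. The get_page helper, the orders endpoint, and the key value are hypothetical additions for this example.

```r
library(httr2)

new_api_client <- function(base_url, api_key) {
  env <- new.env()
  env$base_request <- request(base_url) |>
    req_headers("X-API-Key" = api_key)

  # Shared helper: every endpoint method goes through this one function
  env$get_page <- function(path, page = 1) {
    env$base_request |>
      req_url_path_append(path) |>
      req_url_query(page = page) |>
      req_perform() |>
      resp_body_json()
  }
  env$get_items <- function(page = 1) env$get_page("items", page)
  env$get_orders <- function(page = 1) env$get_page("orders", page)
  env
}

client <- new_api_client("https://api.example.com", api_key = "placeholder-key")

# Offline demonstration: the mock echoes the requested URL back as JSON
echo <- function(req) response_json(body = list(url = req$url))
items <- with_mocked_responses(echo, client$get_items(page = 2))
orders <- with_mocked_responses(echo, client$get_orders())
items$url
orders$url
```

Adding an endpoint is then a one-line method, and changes to authentication or headers happen in a single place.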
Pagination
Most APIs paginate large collections. Three common patterns: offset pagination (?page=1&per_page=100), cursor pagination (?cursor=abc123&limit=100), and link-header pagination (the response includes a Link: <url>; rel="next" header).
For offset pagination:
fetch_all <- function(client) {
  page <- 1
  all_results <- list()
  repeat {
    response <- client$get_items(page = page)
    # An empty page signals the end of the collection
    if (length(response$data) == 0) break
    all_results <- c(all_results, response$data)
    page <- page + 1
  }
  # Requires dplyr; assumes each record is a flat named list
  dplyr::bind_rows(all_results)
}
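httr2::req_perform_iterative() handles pagination automatically when you provide a next_req function that builds the request for the next page from the previous response. Helpers such as iterate_with_offset(), iterate_with_cursor(), and iterate_with_link_url() cover the common schemes.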
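A minimal sketch of offset pagination with httr2's iteration helpers follows. It runs against a mocked API rather than a real one; the page parameter, the data field, and the page contents are assumptions made for the demonstration.

```r
library(httr2)

# Stand-in for a paginated API: pages 1 and 2 have data, page 3 is empty
mock_api <- function(req) {
  page <- as.integer(url_parse(req$url)$query$page)
  rows <- if (page <= 2) {
    list(list(id = page * 10), list(id = page * 10 + 1))
  } else {
    list()
  }
  response_json(body = list(data = rows))
}

# The first request is sent as written, so set the starting page explicitly
req <- request("https://api.example.com/items") |>
  req_url_query(page = 1)

resps <- with_mocked_responses(
  mock_api,
  req_perform_iterative(
    req,
    next_req = iterate_with_offset(
      "page",
      resp_complete = \(resp) length(resp_body_json(resp)$data) == 0
    ),
    max_reqs = 10,
    progress = FALSE
  )
)

# One response per page, including the final empty one
ids <- unlist(lapply(resps, \(resp) {
  lapply(resp_body_json(resp)$data, \(row) row$id)
}))
ids
```

The max_reqs argument caps the number of requests, which protects against an API that never signals the last page.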
Authentication patterns
API Key: add as header (X-API-Key: key) or query parameter (?api_key=key). Store in environment variable, never in code: Sys.getenv("API_KEY").
Bearer tokens: req_auth_bearer_token(token) sets Authorization: Bearer token. For OAuth 2.0 flows: req_oauth_auth_code() opens a browser for authorization and handles token storage and refresh.
HTTP Basic: req_auth_basic(username, password) for APIs using basic authentication. Password should come from Sys.getenv() or keyring::key_get(), not hardcoded.
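Sys.getenv() returns an empty string when a variable is unset, so a missing credential can go unnoticed until the API rejects the request. The sketch below is a small hypothetical helper, required_env(), that fails early instead; the variable name used in the demonstration is a throwaway.

```r
# Read a required secret from the environment and fail early if it is missing
required_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Demonstration with a throwaway variable name
Sys.setenv(EXAMPLE_API_KEY = "placeholder-key")
key <- required_env("EXAMPLE_API_KEY")
Sys.unsetenv("EXAMPLE_API_KEY")

missing_msg <- tryCatch(required_env("EXAMPLE_API_KEY"), error = conditionMessage)
missing_msg
```

The returned value can then be passed to req_auth_bearer_token() or req_headers() when building the base request.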
Handling rate limits and errors
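req_retry(req, max_tries = 3) retries transient failures (429 and 503 by default). Pass is_transient to widen this, but avoid treating every 4xx/5xx response as transient with is_transient = resp_is_error, since errors such as 401 or 404 will not fix themselves on retry. req_throttle(req, rate = 10 / 60) limits to 10 requests per minute. Combine both for a reliable client.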
resp_check_status(resp) throws an error for 4xx/5xx responses. Handle specific status codes with class-specific catchers: tryCatch(req_perform(req), httr2_http_429 = function(e) { wait_and_retry() }).
For APIs that return errors in the body (not HTTP status): body <- resp_body_json(resp); if (!is.null(body$error)) stop(body$error$message).
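The body check can be wrapped in a small function and exercised with hand-built responses. In the sketch below, check_body_error() is a hypothetical helper and the error$message structure is an assumption about the API's format.

```r
library(httr2)

# Stop when a 200 response still reports a failure in its body
check_body_error <- function(resp) {
  body <- resp_body_json(resp)
  if (!is.null(body$error)) {
    stop("API error: ", body$error$message, call. = FALSE)
  }
  body
}

ok <- check_body_error(response_json(body = list(result = "done")))
failed <- tryCatch(
  check_body_error(
    response_json(body = list(error = list(message = "quota exceeded")))
  ),
  error = conditionMessage
)
ok$result
failed
```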
Best practices
- Store credentials in environment variables, not in your code. Use Sys.getenv() to retrieve them at runtime.
- Log failures with timestamps for debugging. Wrap requests in tryCatch() to capture details:
try_fetch <- function(req) {
  tryCatch(
    robust_request(req),
    error = function(e) {
      # Prefix the message with a timestamp so failures can be traced later
      message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " Request failed: ", conditionMessage(e))
      NULL
    }
  )
}
- Set timeouts with req_timeout() to avoid hanging requests on slow or unresponsive APIs.
- Test with mocked responses using the httptest2 package. Mock the API responses during testing to avoid hitting rate limits and ensure consistent test results.
- Version your client as the API changes. Keep client code in a separate package or module to manage breaking changes cleanly.
Rate limiting and retries
APIs enforce rate limits to prevent abuse. When a request returns HTTP 429 (Too Many Requests), the response typically includes a Retry-After header specifying how many seconds to wait. httr2’s req_retry() honors this header automatically, and req_retry(req, max_tries = 3, backoff = \(attempt) 2^attempt) makes up to three attempts in total, falling back to exponential backoff when no Retry-After value is available. For APIs without proper Retry-After headers, add Sys.sleep() calls between requests to respect rate limits manually. Structure batch request loops to check the elapsed time and pause only as needed rather than sleeping a fixed amount after every request.
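One way to pause only as needed is to record when each request started and sleep for the remainder of the interval. The sketch below shows the idea with a hypothetical run_paced() helper; do_request stands in for whatever function performs the actual HTTP call.

```r
# Keep at least min_interval seconds between the starts of consecutive
# requests, sleeping only for the time that has not already elapsed.
run_paced <- function(items, do_request, min_interval = 1) {
  results <- vector("list", length(items))
  last_start <- NULL
  for (i in seq_along(items)) {
    if (!is.null(last_start)) {
      elapsed <- as.numeric(difftime(Sys.time(), last_start, units = "secs"))
      if (elapsed < min_interval) Sys.sleep(min_interval - elapsed)
    }
    last_start <- Sys.time()
    # Wrapping in list() keeps the slot even if do_request returns NULL
    results[i] <- list(do_request(items[[i]]))
  }
  results
}

# Demonstration with a fake request function and a short interval
started <- Sys.time()
out <- run_paced(1:3, function(x) x * 2, min_interval = 0.2)
total <- as.numeric(difftime(Sys.time(), started, units = "secs"))
```

Slow requests already use up part of the interval, so this approach adds less idle time than a fixed Sys.sleep() after every call.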
Next steps
Now that you understand how to build a REST client in R, explore these related topics to deepen your knowledge and apply these techniques in more complex scenarios.
See also
- Working with APIs using httr2 — Making your first API requests with httr2
- Shiny for Python Developers — The first tutorial in this series
- R HTTP Requests with httr — Legacy httr package patterns
- String Processing with stringr — Handle API response strings