
Making HTTP Requests with httr2 in R

The httr2 package is the modern way to make HTTP requests in R. It provides a pipe-friendly interface for working with HTTP verbs, handling authentication, and processing API responses. Whether you’re consuming REST APIs, downloading datasets, or integrating with web services, httr2 gives you the tools to do it reliably.

Installation and setup

Install httr2 from CRAN:

install.packages("httr2")
library(httr2)

The package belongs to the wider tidyverse ecosystem but is installed and loaded on its own. The examples in this guide use the native pipe operator (|>), which requires R 4.1 or later; the magrittr pipe (%>%) works as well.

Making your first request

The core function is request(). Create a request object, then modify it with the req_*() functions:

library(httr2)

# Create a request
req <- request("https://httpbin.org/get")

# Execute the request
resp <- req |> req_perform()

# Check the status
resp |> resp_status()
# [1] 200

# Get the body as text
resp |> resp_body_string()

Each req_*() function returns a modified copy of the request object, which is what makes the pipe chain work. You can build up complex requests step by step, adding headers, query parameters, and authentication tokens before dispatching with req_perform().
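Because each step returns a new object rather than changing the original, a base request can be defined once and reused. The sketch below illustrates this without sending anything over the network. The /get and /post paths are httpbin endpoints chosen for illustration, and reading req$url directly is a convenience for inspection here, not a documented accessor.

```r
library(httr2)

# A shared starting point; nothing is sent until req_perform() is called
base_req <- request("https://httpbin.org")

# Each call returns a modified copy, so base_req itself is left untouched
get_req  <- base_req |> req_url_path_append("get")
post_req <- base_req |> req_url_path_append("post")

# Inspect the URLs (field access used here only for illustration)
base_url <- base_req$url
get_url  <- get_req$url
post_url <- post_req$url
```

Printing base_url after the two derived requests are built shows that the original request still points at the bare host.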

HTTP methods

GET requests

GET requests retrieve data. Pass query parameters with req_url_query(), which appends name-value pairs to the URL as percent-encoded query parameters:

request("https://httpbin.org/get") |>
  req_url_query(name = "Alice", age = 30) |>
  req_perform() |>
  resp_body_json()

You can chain multiple query parameters. They get encoded automatically, so special characters, spaces, and non-ASCII text are safely transmitted. The response from resp_body_json() is a parsed R list ready for further processing.
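To see the encoding without contacting a server, you can build the request and look at the URL it holds. This is a minimal sketch: the parameter names q and note are made up for illustration, and req$url is read directly only for inspection.

```r
library(httr2)

# Values containing a space, an ampersand, and an equals sign
req <- request("https://httpbin.org/get") |>
  req_url_query(q = "R & httr2", note = "a=b")

# The stored URL is already percent-encoded (field access for illustration)
encoded_url <- req$url
print(encoded_url)
```

The ampersand inside the value is escaped, so the server does not mistake it for a separator between parameters.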

POST requests

POST requests send data to create or update resources. Use req_body_json() for JSON payloads, which automatically sets the Content-Type header and serializes R lists to JSON:

request("https://httpbin.org/post") |>
  req_method("POST") |>
  req_body_json(list(message = "Hello API", priority = "high")) |>
  req_perform() |>
  resp_body_json()

For form-encoded submissions, use req_body_form() which sends the data as application/x-www-form-urlencoded. This is the standard encoding for HTML form submissions and is commonly required by older APIs and web login endpoints:

request("https://httpbin.org/post") |>
  req_body_form(name = "Alice", submit = "Register") |>
  req_perform()

Other HTTP verbs

Beyond GET and POST, httr2 supports PUT for replacing resources, DELETE for removing them, and PATCH for partial updates. All three use req_method() to set the HTTP verb and share the same request-building pattern:

# PUT - complete replacement
request("https://api.example.com/resource/1") |>
  req_method("PUT") |>
  req_body_json(list(updated = TRUE, data = "new content")) |>
  req_perform()

# DELETE - remove a resource
request("https://api.example.com/resource/1") |>
  req_method("DELETE") |>
  req_perform()

# PATCH - partial update
request("https://api.example.com/resource/1") |>
  req_method("PATCH") |>
  req_body_json(list(status = "active")) |>
  req_perform()

Handling headers

Add custom headers with req_headers(). This is essential for API authentication and content negotiation:

request("https://api.github.com/user") |>
  req_headers(
    Authorization = paste("Bearer", Sys.getenv("GITHUB_TOKEN")),
    Accept = "application/vnd.github.v3+json",
    `User-Agent` = "MyRApp/1.0"
  ) |>
  req_perform() |>
  resp_body_json()

Common headers include Authorization tokens, Accept types for content negotiation, and User-Agent strings to identify your application.

Authentication

Most APIs require some form of authentication: bearer tokens, Basic auth, or OAuth 2.0. httr2 provides dedicated helpers for each scheme, handling the Authorization header formatting automatically so you don’t need to construct it manually.

Bearer tokens

Bearer token authentication sends an access token in the Authorization header. Modern REST APIs, especially those using OAuth 2.0, rely on bearer tokens for stateless authentication. req_auth_bearer_token() handles the header formatting, prepending “Bearer” to the token value:

request("https://api.example.com/protected") |>
  req_auth_bearer_token("your-token-here") |>
  req_perform()

Basic authentication

Basic auth encodes a username and password as a Base64 string and sends it in the Authorization header. req_auth_basic() handles the encoding and header construction, accepting the username and password as separate arguments. While simple, this method should only be used over HTTPS to prevent credential exposure.

request("https://api.example.com/login") |>
  req_auth_basic("username", "password") |>
  req_perform()

Base64 is an encoding, not encryption: anyone who intercepts the header can recover the credentials, which is why Basic auth is a poor fit for sensitive applications without HTTPS.
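To make the encoding concrete, the sketch below builds the header value by hand with base64_encode() from the openssl package (an assumption for illustration; any Base64 encoder would do). It shows what the Basic scheme puts on the wire, and is not a description of how httr2 implements req_auth_basic() internally.

```r
library(openssl)

username <- "username"
password <- "password"

# Basic auth joins the credentials with a colon and Base64-encodes the result
credentials <- paste0(username, ":", password)
header_value <- paste("Basic", base64_encode(charToRaw(credentials)))

# Decoding recovers the original credentials, so the header offers no secrecy
decoded <- rawToChar(base64_decode(sub("^Basic ", "", header_value)))
```

Since decoding is a one-liner, the only real protection for these credentials is the encrypted HTTPS connection.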

OAuth 2.0

httr2 supports the OAuth 2.0 authorization code flow:

# Create OAuth client configuration
oauth <- oauth_client(
  id = "your-client-id",
  token_url = "https://provider.com/oauth/token",
  secret = "your-client-secret"
)

# Run the flow interactively to obtain a token (opens a browser)
token <- oauth |>
  oauth_flow_auth_code(auth_url = "https://provider.com/authorize")

# Use token in subsequent requests
request("https://api.example.com/data") |>
  req_auth_bearer_token(token$access_token) |>
  req_perform()

A token passed manually like this is not refreshed when it expires. To have httr2 cache the token and refresh it automatically when the provider supports it, attach the flow to the request with req_oauth_auth_code() instead of calling oauth_flow_auth_code() yourself.

Error handling

Use req_error() to control how errors are handled:

# 4xx and 5xx responses raise an R error by default;
# the body callback adds detail to the error message
request("https://httpbin.org/status/404") |>
  req_error(body = \(resp) paste("HTTP error:", resp_status(resp))) |>
  req_perform()

# Handle errors gracefully: turn off automatic errors,
# then inspect the response yourself
resp <- request("https://httpbin.org/status/500") |>
  req_error(is_error = \(resp) FALSE) |>
  req_perform()

if (resp_is_error(resp)) {
  cat("Request failed:", resp_status(resp), "\n")
} else {
  cat("Request succeeded:", resp_status(resp), "\n")
}
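When a failed request should not abort the script, another option is to catch the condition that req_perform() signals. The sketch below assumes httr2 1.0.0 or later, where HTTP errors carry classes such as httr2_http_404 and with_mocked_responses() is available. The mock stands in for a real server so the example runs offline, and fetch_or_null() is a hypothetical helper name.

```r
library(httr2)

# Hypothetical helper: return the response, or NULL if the request fails
fetch_or_null <- function(req) {
  tryCatch(
    req_perform(req),
    # Most specific class first: a 404 gets its own message
    httr2_http_404 = function(cnd) {
      message("Resource not found")
      NULL
    },
    # Any other HTTP error status
    httr2_http = function(cnd) {
      message("HTTP error: ", conditionMessage(cnd))
      NULL
    }
  )
}

# Simulate a server that always answers 404, so no network is needed
result <- with_mocked_responses(
  function(req) response(status_code = 404),
  fetch_or_null(request("https://api.example.com/missing"))
)
```

Matching on condition classes keeps the handling for "not found" separate from other failures without parsing error messages.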

Working with responses

Parse JSON

JSON is the most common API response format. resp_body_json() parses the response body into an R list, which you can navigate with the $ operator for nested access:

resp <- request("https://httpbin.org/json") |> 
  req_perform()

# Parse as R list
data <- resp |> resp_body_json()

# Access nested elements
data$slideshow$title
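Nested lists are flexible but awkward for tabular data. As a sketch, the example below builds a fake response with response() so that it runs offline (the JSON payload is invented for illustration) and passes simplifyVector = TRUE, which resp_body_json() forwards to jsonlite, to get a data frame from an array of records.

```r
library(httr2)

# Invented payload standing in for a real API response
payload <- '{"items": [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]}'

resp <- response(
  status_code = 200,
  headers = list(`Content-Type` = "application/json"),
  body = charToRaw(payload)
)

# Default: nested lists, navigated with $ and [[ ]]
as_list <- resp_body_json(resp)
first_name <- as_list$items[[1]]$name

# simplifyVector = TRUE turns an array of records into a data frame
as_df <- resp_body_json(resp, simplifyVector = TRUE)$items
```

The list form suits irregular or deeply nested responses; the simplified form is convenient when every record has the same fields.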

Parse XML

For APIs that return XML instead of JSON, the xml2 package integrates with httr2 via resp_body_xml(). After parsing, use xml_find_all() with XPath expressions to extract specific elements from the document tree:

resp <- request("https://httpbin.org/xml") |>
  req_perform()

# Parse with xml2
library(xml2)
doc <- resp |> resp_body_xml()
xml_find_all(doc, ".//title")

Download files

Binary downloads pass a path to req_perform(), which writes the response body directly to a file and avoids the memory overhead of loading the full content into R. For large downloads, add req_progress() to display a progress bar:

# Download binary file
request("https://httpbin.org/bytes/1024") |>
  req_perform(path = "download.bin")

# Download with progress
request("https://httpbin.org/bytes/1024000") |>
  req_progress() |>
  req_perform(path = "large.bin")

Rate limiting

Many APIs impose rate limits to prevent abuse. Use req_throttle() to space out requests, inserting a controlled delay between each HTTP call. This prevents your application from being blocked or receiving 429 (Too Many Requests) errors:

# Allow at most 1 request per second
req <- request("https://api.example.com/items") |>
  req_throttle(rate = 1)

# Iterate through pages, reusing the throttled request
for (page in 1:10) {
  resp <- req |>
    req_url_query(page = page) |>
    req_perform()

  # Process response
  items <- resp |> resp_body_json()
  cat("Page", page, "-", length(items), "items\n")
}

Retry logic

Handle transient failures with req_retry():

request("https://httpbin.org/delay/2") |>
  req_retry(
    max_tries = 3,
    max_seconds = 10,
    is_transient = \(resp) resp_status(resp) >= 500
  ) |>
  req_perform()

The is_transient function determines which responses warrant a retry. If you do not supply one, httr2 treats 429 and 503 responses as transient.

Practical example: GitHub API

The GitHub API is a well-documented public REST API that illustrates the core httr2 patterns: authenticating with headers or tokens, requesting specific endpoints, and parsing JSON responses into tidy tibbles for analysis.

# Fetch user information from GitHub
github_user <- function(username) {
  request(paste0("https://api.github.com/users/", username)) |>
    req_headers(
      Accept = "application/vnd.github.v3+json",
      `User-Agent` = "R-httr2-demo"
    ) |>
    req_perform() |>
    resp_body_json()
}

# Get repository information
github_repos <- function(username, limit = 30) {
  response <- request(paste0("https://api.github.com/users/", username, "/repos")) |>
    req_headers(
      Accept = "application/vnd.github.v3+json",
      `User-Agent` = "R-httr2-demo"
    ) |>
    req_url_query(per_page = min(limit, 100)) |>
    req_perform() |>
    resp_body_json()

  # Flatten the list of repositories into a tibble
  tibble::tibble(
    name = vapply(response, function(x) x$name, character(1)),
    stars = vapply(response, function(x) x$stargazers_count, numeric(1)),
    url = vapply(response, function(x) x$html_url, character(1))
  )
}

# Usage
user <- github_user("hadley")
cat("Name:", user$name, "\n")
cat("Public repos:", user$public_repos, "\n")
cat("Followers:", user$followers, "\n")

repos <- github_repos("hadley", 10)
print(head(repos, 5))

Handling responses

In the older httr package, which you will still meet in existing code, content(resp) parses the response body based on the Content-Type header. content(resp, as = "text") returns the body as a character string. content(resp, as = "parsed") attempts automatic parsing: JSON becomes a list, and HTML becomes an XML document. content(resp, as = "raw") returns a raw byte vector for binary responses like images or files.

http_status(resp)$message returns a human-readable status description. stop_for_status(resp) throws an error for 4xx and 5xx responses, which is a convenient way to ensure only successful responses proceed.

Rate limiting and retries

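httr has no built-in throttling, and its retry support is limited to the RETRY() function. For reliable API clients, use the httr2 package instead, which provides req_retry() and req_throttle(). In httr, you can also implement retries manually with tryCatch() and Sys.sleep(). For rate limiting, track request timestamps and sleep until enough time has elapsed: Sys.sleep(max(0, 1/rate - elapsed_time)), where rate is the allowed number of requests per second and elapsed_time is the number of seconds since the previous request.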

Working with cookies

httr::cookies(resp) returns the cookies set by a response as a data frame. When using httr::GET() with a handle (httr::handle()), cookies persist across requests automatically, essential for authenticated scraping sessions. add_headers("Cookie" = cookie_string) manually adds a cookie header when you need to supply cookies from an external source rather than from a previous response.

Streaming responses

For large response bodies that should not be loaded into memory at once, httr::GET(url, write_disk("output.bin")) streams the response directly to a file. Passing write_stream(function(x) { ... }) to GET() instead processes the response in chunks as they arrive, which is useful for Server-Sent Events or large line-delimited JSON files.

GET and POST requests

httr::GET(url) sends an HTTP GET request. content(response, as = "text") extracts the response body as a string. content(response, as = "parsed") automatically parses JSON, XML, or HTML based on the Content-Type header. status_code(response) returns the HTTP status code.

httr::POST(url, body = list(key = "value"), encode = "json") sends a POST request with a JSON body. encode = "form" sends URL-encoded form data. encode = "multipart" handles file uploads. encode = "raw" sends the body as-is.

httr::add_headers(Authorization = paste("Bearer", token)) builds a header configuration; pass its result as an additional argument to GET() or POST(). authenticate(user, password) sets HTTP Basic authentication and is passed to the verb function in the same way.

Query parameters

GET(url, query = list(page = 1, per_page = 100, filter = "active")) appends query parameters to the URL, properly URL-encoding the values. This is safer than manually constructing query strings with paste().

modify_url(url, query = list(page = 2)) creates a new URL with updated query parameters. parse_url(url) parses a URL into components; build_url(parsed) reconstructs it. These are useful for pagination where you increment a page parameter.
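As a sketch of the pagination use case, the example below derives the URLs for several pages from one starting URL with modify_url() and then reads the page number back with parse_url(). The endpoint and parameter names are placeholders, and nothing is sent over the network.

```r
library(httr)

# Placeholder starting URL with two query parameters
base_url <- "https://api.example.com/items?per_page=100&page=1"

# Build the URL for each page by overriding only the page parameter
page_urls <- vapply(
  1:3,
  function(p) modify_url(base_url, query = list(page = p)),
  character(1)
)

# parse_url() exposes the components, including the query as a named list
parsed <- parse_url(page_urls[3])
page_value <- parsed$query$page
```

Parameters that are not overridden, such as per_page here, are carried over from the original URL.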

Sessions and cookies

httr::set_cookies(session_id = "abc123") sends cookies in a request. cookies(response) returns cookies set by the server. For complex cookie scenarios, handle_setopt() from the underlying curl package provides lower-level libcurl options.

For sites requiring login, create a session that persists cookies:

library(httr)

# A handle keeps cookies and connection state for one host
session <- handle("https://example.com")
login_resp <- POST("https://example.com/login",
                   body = list(username = "user", password = "pass"),
                   handle = session)
data_resp <- GET("https://example.com/data", handle = session)

The handle carries cookies across requests, maintaining the authenticated session.

Response processing

http_type(response) returns the content MIME type. Check this before parsing: if (http_type(response) == "application/json") { jsonlite::fromJSON(content(response, as = "text")) }.

stop_for_status(response) throws an error if the status code is 4xx or 5xx. warn_for_status() issues a warning instead. message_for_status() prints an informative message. Use stop_for_status() in production code where request failures should halt execution.

httr2 is the modern replacement for httr with a cleaner pipe-based API, better OAuth support, and explicit retry handling. New code should use httr2; httr receives maintenance-only updates.

Handling rate limits

APIs return HTTP 429 (Too Many Requests) when you exceed rate limits. Inspect the Retry-After header for the suggested wait time: as.numeric(headers(response)[["retry-after"]]). Sleep for that duration and retry.
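The Retry-After value can be either a number of seconds or an HTTP date, so a plain as.numeric() call does not cover every server. A minimal base R sketch of a more defensive parser follows. retry_after_seconds() is a hypothetical helper, the default of 1 second is an arbitrary choice, and the date branch assumes an English locale for day and month names.

```r
# Hypothetical helper: convert a Retry-After header value into seconds to wait
retry_after_seconds <- function(value, now = Sys.time(), default = 1) {
  # Missing or empty header: fall back to the default wait
  if (is.null(value) || is.na(value) || !nzchar(value)) {
    return(default)
  }
  # Form 1: a non-negative integer number of seconds
  if (grepl("^[0-9]+$", value)) {
    return(as.numeric(value))
  }
  # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT"
  when <- as.POSIXct(strptime(value, "%a, %d %b %Y %H:%M:%S GMT", tz = "GMT"))
  if (is.na(when)) {
    return(default)
  }
  # Never return a negative wait if the date is already in the past
  max(0, as.numeric(difftime(when, now, units = "secs")))
}

wait_seconds <- retry_after_seconds("30")
wait_missing <- retry_after_seconds(NULL)
wait_garbage <- retry_after_seconds("soon")
```

The result can be passed straight to Sys.sleep() before retrying the request.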

purrr::insistently(httr::GET, rate = purrr::rate_delay(5)) wraps a function to retry on error with a 5-second delay. Combine with safely() to collect both results and errors across many requests without stopping on the first failure.

For systematic API clients, implement exponential backoff: wait 1 second after the first failure, 2 after the second, 4 after the third. Sys.sleep(2^(attempt - 1)) inside a retry loop achieves this when attempt counts failures starting from 1.
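One way to write such a retry loop in base R is sketched below. retry_with_backoff() is a hypothetical helper, not part of httr or httr2, and the sleep argument exists only so the demonstration can record the waits instead of actually pausing.

```r
# Hypothetical helper: call f() until it succeeds or max_tries is reached,
# sleeping base * 2^(attempt - 1) seconds after each failure
retry_with_backoff <- function(f, max_tries = 4, base = 1, sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt == max_tries) {
      stop(result)  # out of attempts: re-signal the last error
    }
    sleep(base * 2^(attempt - 1))  # 1, 2, 4, ... seconds when base = 1
  }
}

# Demonstration with a stand-in function that fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "ok"
}

# Record the requested waits instead of sleeping
waits <- numeric(0)
outcome <- retry_with_backoff(flaky, sleep = function(s) waits <<- c(waits, s))
```

In real use, f would wrap the HTTP call, for example function() httr::stop_for_status(httr::GET(url)), and the default sleep = Sys.sleep would be left in place.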

HTTP clients in R

R has several HTTP client options. The curl package is the lowest-level option, wrapping libcurl directly. httr was built on top of curl to provide a more ergonomic interface for common patterns such as authentication, cookie management, and simple retries. httr2 is the modern successor with a request-pipeline design that separates request construction from execution, making it easier to build reusable request logic. For new code, httr2 is preferred, but httr is still widely used in existing code and continues to receive maintenance updates.

The distinction between the two versions matters when reading documentation and Stack Overflow answers. httr and httr2 have different APIs: httr uses GET, POST, and similar verb functions that immediately send the request, while httr2 builds a request object with req_url, req_method, req_headers, and related functions, then sends it with req_perform. Code written for one does not work with the other.

Response handling

Every HTTP response has a status code, headers, and a body. The status code indicates success or failure: 2xx codes indicate success (200 is the most common), 4xx codes indicate client errors, and 5xx codes indicate server errors. In httr, calling stop_for_status() on a response raises an error if the status code indicates failure. Calling warn_for_status() raises a warning instead. Adding this check immediately after every request is good practice because it surfaces failures with a useful message rather than allowing them to propagate silently.

Response bodies arrive as raw bytes. For JSON APIs, content(response, as = "parsed") parses a body whose Content-Type is application/json and returns a list. For text responses, content(response, as = "text") decodes the bytes to a string using the encoding specified in the response headers. For binary formats like images or compressed data, retrieve the bytes with content(response, as = "raw") and pass them to a suitable parser yourself.

Authentication patterns

Different APIs use different authentication schemes. HTTP Basic authentication attaches a username and password to every request as a Base64-encoded header. Bearer token authentication sends a token in the Authorization header. API key authentication may use a header, a query parameter, or a POST body field; it varies by API. httr provides authenticate() for Basic auth and add_headers() for custom headers. Always pass credentials through environment variables rather than hardcoding them in scripts.

For APIs that use OAuth2, the httr package has oauth2.0_token for the full OAuth flow and token management. The flow opens a browser for the authorization step, then stores the token for reuse. Token refresh is handled automatically. OAuth2 integration adds complexity but is required for APIs like Google, GitHub, and Slack that mandate it for user-level access.
