Working with JSON in R
JSON (JavaScript Object Notation) is a standard format for data interchange across web APIs, configuration files, and data pipelines. R provides robust tools for working with JSON through the jsonlite package, which is installed alongside the tidyverse as a non-core package (so it still needs its own library() call) and can also be used independently.
Installing jsonlite
The jsonlite package is available on CRAN and can be installed with:
install.packages("jsonlite")
library(jsonlite)
Reading JSON into R
The primary function for reading JSON is fromJSON(). It accepts URLs, file paths, or character strings:
# From a URL (JSONPlaceholder API example)
users <- fromJSON("https://jsonplaceholder.typicode.com/users")
# From a local file
data <- fromJSON("data.json")
# From a character string
json_str <- '{"name": "Alice", "age": 30}'
person <- fromJSON(json_str)
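With its default settings, fromJSON() simplifies the result where it can: an array of primitives becomes an atomic vector, an array of objects with matching keys becomes a data frame, and a single object becomes a named list.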
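The conversion rules are easiest to see on small inline strings. The following is a minimal sketch; the JSON values are made up for illustration, and simplifyVector = FALSE is shown as the way to opt out of simplification.

```r
library(jsonlite)

# Array of primitives -> atomic vector
scores <- fromJSON('[95, 87, 72]')

# Array of objects with the same keys -> data frame
people <- fromJSON('[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]')

# Single object -> named list
person <- fromJSON('{"name": "Alice", "age": 30}')

# simplifyVector = FALSE keeps everything as nested lists
raw_people <- fromJSON('[{"name": "Alice"}, {"name": "Bob"}]', simplifyVector = FALSE)

str(scores)
str(people)
str(person)
```

Inspecting a result with str() before indexing into it is usually the quickest way to find out which of these shapes you received.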
Writing R objects to JSON
Use toJSON() to convert R objects to JSON format:
# Simple example
df <- data.frame(
name = c("Alice", "Bob"),
score = c(95, 87)
)
json_output <- toJSON(df, pretty = TRUE)
cat(json_output)
The pretty = TRUE argument formats the output with indentation for readability. Use auto_unbox = TRUE to write length-one vectors as JSON scalars (for example "Alice") instead of single-element arrays (["Alice"]).
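To make the effect of auto_unbox concrete, here is a small sketch that serializes the same list twice. The config list is a made-up example, not something from a real API.

```r
library(jsonlite)

# Hypothetical list with one length-one field and one longer vector
config <- list(name = "Alice", tags = c("a", "b"))

# Default: every atomic vector is written as an array, even length-one values
boxed <- toJSON(config)

# auto_unbox = TRUE: length-one vectors are written as scalars
unboxed <- toJSON(config, auto_unbox = TRUE)

cat(boxed, "\n")    # {"name":["Alice"],"tags":["a","b"]}
cat(unboxed, "\n")  # {"name":"Alice","tags":["a","b"]}
```

Many APIs expect scalars for single values, so auto_unbox = TRUE is a common choice when building request bodies.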
Writing to files
# Write JSON to a file
write_json(df, "output.json", pretty = TRUE)
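A file written this way can be read back with read_json(). The sketch below round-trips a small data frame through a temporary file; tempfile() is used only so the example does not touch your working directory.

```r
library(jsonlite)

df <- data.frame(name = c("Alice", "Bob"), score = c(95, 87))

# Write to a temporary path (an assumption of this sketch, not a requirement)
path <- tempfile(fileext = ".json")
write_json(df, path, pretty = TRUE)

# read_json() returns nested lists by default; simplifyVector = TRUE
# asks for the same data frame simplification that fromJSON() applies
df_back <- read_json(path, simplifyVector = TRUE)
print(df_back)

unlink(path)
```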
Working with Nested JSON
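Real-world JSON often has nested structures. When an array of records is read into a data frame, nested objects become data frame columns; the flatten() function expands them into ordinary columns:
# Example: nested API response (an array of records under "person")
nested <- fromJSON('{
"person": [{
"name": "Alice",
"address": {"city": "Boston", "zip": "02101"}
}]
}')
# Flatten so that address.city and address.zip become columns
# (namespace-qualified because purrr also exports a flatten())
flat <- jsonlite::flatten(nested$person)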
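Note that jsonlite's flatten() operates on data frames, not on arbitrary lists. As a rough sketch with made-up records, the column names before and after flattening look like this:

```r
library(jsonlite)

# Illustrative records; each has a nested "address" object
records <- fromJSON('[
  {"name": "Alice", "address": {"city": "Boston", "zip": "02101"}},
  {"name": "Bob",   "address": {"city": "Denver", "zip": "80202"}}
]')

names(records)  # "name" "address": address is itself a data frame

# Namespace-qualify the call because purrr also exports a flatten()
flat <- jsonlite::flatten(records)
names(flat)     # "name" "address.city" "address.zip"
```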
For complex nested structures, work with lists directly:
# Access nested elements; simplifyVector = FALSE keeps results as a list of records
data <- fromJSON("complex_api.json", simplifyVector = FALSE)
city <- data$results[[1]]$address$city
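How you index into a parsed response depends on whether simplification is on. The sketch below uses an inline string with hypothetical field names (results, address, city) to compare the two access styles.

```r
library(jsonlite)

# Hypothetical response; the field names are illustrative only
json <- '{"results": [
  {"name": "Alice", "address": {"city": "Boston", "zip": "02101"}},
  {"name": "Bob",   "address": {"city": "Denver", "zip": "80202"}}
]}'

# Default simplification: results is a data frame, address a nested data frame
simplified <- fromJSON(json)
cities_df <- simplified$results$address$city

# simplifyVector = FALSE: results is a plain list of records
as_list <- fromJSON(json, simplifyVector = FALSE)
first_city <- as_list$results[[1]]$address$city
cities_list <- vapply(as_list$results, function(r) r$address$city, character(1))
```

With the default settings, results[[1]] would select the first column of the data frame rather than the first record, which is why list-style indexing pairs with simplifyVector = FALSE.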
Handling Dates and Times
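JSON does not have a native date type, so dates and times travel as strings or numbers. R's POSIXct class stores date-times with timezone information, and toJSON() can write such values as ISO 8601 strings: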
# Dates become ISO 8601 strings
df <- data.frame(
event = c("start", "end"),
timestamp = as.POSIXct(c("2024-01-15 09:00", "2024-01-15 17:00"))
)
toJSON(df, POSIXt = "ISO8601")
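Reading the JSON back does not restore the POSIXct class: the timestamps come back as character strings and have to be converted explicitly. A minimal sketch, assuming UTC timestamps and the format string shown in the code:

```r
library(jsonlite)

df <- data.frame(
  event = c("start", "end"),
  timestamp = as.POSIXct(c("2024-01-15 09:00", "2024-01-15 17:00"), tz = "UTC")
)

json <- toJSON(df, POSIXt = "ISO8601")

# fromJSON() returns the timestamps as character strings
back <- fromJSON(json)
is.character(back$timestamp)

# Convert explicitly; the format and time zone are assumptions of this sketch
back$timestamp <- as.POSIXct(back$timestamp, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
```

Setting tz explicitly on both sides avoids surprises when the code runs on machines with different local time zones.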
Error Handling
When working with external APIs, handle potential errors. The safely function from purrr wraps any function to return a list with result and error components:
library(purrr)
safe_read <- safely(fromJSON, otherwise = NULL)
result <- safe_read("https://api.example.com/data")
if (is.null(result$error)) {
data <- result$result
} else {
message("Failed to fetch: ", result$error$message)
}
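If you would rather not depend on purrr, base R's tryCatch() can produce the same result/error shape. This is a sketch only: read_json_safely() is a hypothetical helper name, and a malformed string stands in for a failed request.

```r
library(jsonlite)

# read_json_safely() is a hypothetical helper, not part of jsonlite
read_json_safely <- function(src) {
  tryCatch(
    list(result = fromJSON(src), error = NULL),
    error = function(e) list(result = NULL, error = e)
  )
}

ok <- read_json_safely('{"status": "ok"}')
bad <- read_json_safely('{"status": ')  # malformed JSON triggers the error branch

if (is.null(bad$error)) {
  data <- bad$result
} else {
  message("Failed to parse: ", conditionMessage(bad$error))
}
```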
Practical Example: Weather API
Here is a complete example fetching weather data from an API:
library(jsonlite)
# Fetch weather data (example API)
url <- "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&current_weather=true"
weather <- fromJSON(url)
# Extract current temperature
current_temp <- weather$current_weather$temperature
message("Current temperature: ", current_temp, " degrees C")
Comparing jsonlite to alternatives
Different packages serve different purposes when working with JSON in R:
| Package | Use Case |
|---|---|
| jsonlite | General JSON handling, API consumption |
| httr2 | HTTP requests with JSON support |
| tidyjson | Tidyverse-style JSON manipulation |
| rapidjsonr | RapidJSON C++ headers for packages that parse JSON in compiled code |
Performance Tips
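For large JSON files, consider these optimizations. Use streaming for files that do not fit in memory. Note that stream_in() expects newline-delimited JSON (NDJSON), with one complete record per line, rather than a single large JSON array:
# Stream NDJSON from a file, one page of records at a time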
con <- file("large_data.json", "r")
stream_in(con, function(df) {
# Process in chunks
print(nrow(df))
})
close(con)
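The companion function stream_out() writes a data frame in the line-delimited layout that stream_in() reads. The following sketch round-trips ten made-up rows through a temporary file and reads them back in pages of four; the page size and the counters are illustrative choices.

```r
library(jsonlite)

# Write a small data frame as NDJSON (one JSON record per line)
path <- tempfile(fileext = ".ndjson")
stream_out(data.frame(id = 1:10, value = (1:10) * 2), file(path), verbose = FALSE)

# Read it back in pages of 4 rows; the handler runs once per page
rows_seen <- 0
page_sizes <- integer(0)
stream_in(file(path), handler = function(df) {
  rows_seen <<- rows_seen + nrow(df)
  page_sizes <<- c(page_sizes, nrow(df))
}, pagesize = 4, verbose = FALSE)

unlink(path)
```

Because the handler only ever sees one page at a time, memory use is bounded by pagesize rather than by the size of the file.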
Prettify only when needed, as it adds overhead:
# Fast serialization (no prettifying)
compact_json <- toJSON(df, pretty = FALSE)
Common Pitfalls
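Watch for these common issues when working with JSON in R. First, factor columns are written as their string labels by default, so the factor type and its level order are lost on a round trip. Converting to character first makes that explicit: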
df$category <- as.character(df$category)
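Second, NA handling varies by context: when a data frame is written row by row (the default), fields whose value is NA are omitted from the record rather than written as null, which can cause issues with APIs that expect every key. Use na = "null" to write explicit nulls, or na = "string" to write the string "NA":
toJSON(df, na = "string")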
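The three behaviours can be compared side by side on a two-row data frame. This is a small sketch with made-up data; print each result to see the differences.

```r
library(jsonlite)

df <- data.frame(id = 1:2, score = c(95, NA))

default_json <- toJSON(df)                 # NA field is left out of the record
null_json    <- toJSON(df, na = "null")    # NA written as null
string_json  <- toJSON(df, na = "string")  # NA written as the string "NA"

print(default_json)
print(null_json)
print(string_json)
```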
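Third, less common column types (list columns, nested data frames, dates, raw vectors) each have their own serialization rules, and toJSON() rounds numeric values to four decimal places by default (digits = 4). Inspect the output before relying on it.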
See Also
- Reading Excel Files with readxl and writexl — Importing spreadsheet data into R
- Fast Data Manipulation with data.table — High-performance data handling in R
- Working with Parquet Files using Arrow — Columnar data formats for R