Maps with ggplot2
Why Maps?
Sometimes the clearest way to communicate a pattern is to show it on a map. A choropleth map---regions shaded by a metric---makes geographic disparities immediately visible. A bubble map reveals the spatial distribution of point-level data. ggplot2 handles both, and with the sf package you can plot any geographic data: country boundaries, census tracts, river networks, or custom GeoJSON shapes.
There are two main approaches. The older path uses map_data() from the maps package with geom_polygon(). The modern standard uses the sf package and geom_sf(). Both are worth knowing; sf is what you’ll reach for with shapefiles and real-world data, while map_data() is still useful for quick built-in maps of the US or world.
Getting Map Data
The map_data() function converts built-in map datasets from the maps package into a data frame ggplot2 can work with.
library(ggplot2)
library(maps)
world <- map_data("world")
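head(world, 2)
# Column order is long, lat, group, order, region, subregion.
# Illustrative rows only; your actual values will differ:
#   long   lat group order region subregion
# 1 -125 37.50     1     1 Canada      <NA>
# 2 -125 38.00     1     2 Canada      <NA>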
The columns are straightforward: long and lat are coordinates, group identifies which coordinates belong to the same polygon, and region names the territory. Available maps include "world", "usa", "state", and "county" (US only).
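To see why the group column matters, it can help to count how many separate polygons each region is made of. The snippet below is a small sketch that uses only the maps data; world_df and polys_per_region are names introduced here for illustration.

```r
library(ggplot2)
library(maps)

# Data frame of polygon vertices for the built-in world map
world_df <- map_data("world")

# Column names as returned by map_data()
names(world_df)

# Count the distinct polygons (groups) that make up each region
polys_per_region <- tapply(
  world_df$group, world_df$region,
  function(g) length(unique(g))
)

# Regions built from the most separate polygons (islands, exclaves)
head(sort(polys_per_region, decreasing = TRUE))
```

A region with many groups is drawn as many closed shapes, which is why the group aesthetic has to be passed to geom_polygon().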
For world maps at multiple resolutions, the rnaturalearth package provides vector data from the Natural Earth project:
install.packages(c("rnaturalearth", "rnaturalearthdata", "sf"))
library(rnaturalearth)
library(sf)
world <- ne_countries(scale = "medium", returnclass = "sf")
This returns an sf object (more on that below).
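Before plotting, a quick look at what came back can save confusion later. This is a minimal sketch, assuming rnaturalearth and rnaturalearthdata are installed; world_sf is a name chosen here so the object does not overwrite the map_data() data frame.

```r
library(rnaturalearth)
library(sf)

# Country polygons as an sf object (a data frame with a geometry column)
world_sf <- ne_countries(scale = "medium", returnclass = "sf")

# Confirm the class and the coordinate reference system
class(world_sf)
st_crs(world_sf)

# Peek at a few attribute columns without the geometry
head(st_drop_geometry(world_sf)[, c("name", "continent", "pop_est")])
```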
Basic World and Country Maps
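With map_data(), you draw maps using geom_polygon(). The critical piece is group = group: it tells ggplot2 which vertices belong to the same polygon, so separate landmasses are not joined by stray lines.
world_df <- map_data("world") # plain data frame, kept separate from the sf object stored in world
ggplot(world_df, aes(x = long, y = lat, group = group)) +
geom_polygon(fill = "lightblue", color = "white", linewidth = 0.2) + # linewidth replaces size for line widths in ggplot2 >= 3.4.0
theme_void()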
The result is a bare world map. theme_void() removes the axes, gridlines, and background, none of which a map usually needs.
ggplot2 also provides a convenience function borders() that wraps up the same logic:
ggplot() +
borders("world", fill = "lightgray", color = "white") +
theme_void()
For US state maps, map_data("state") gives you the state boundaries. Add a projection for a cleaner look:
us_states <- map_data("state")
ggplot(us_states, aes(x = long, y = lat, group = group)) +
geom_polygon(fill = "steelblue", color = "white") +
coord_map("albers", lat0 = 39, lat1 = 45) +
theme_void()
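coord_map("albers", ...) applies an Albers equal-area projection, a common choice for US thematic maps because it preserves relative area. The lat0 and lat1 arguments set the two standard parallels. coord_map() relies on the mapproj package, so install it first if it is missing.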
Choropleth Maps
A choropleth shades regions by a data value. The trick is joining your metric to the map data on a shared key, then mapping fill to that metric.
library(dplyr)
# Get state map data
us_states <- map_data("state")
# Your metric: 1975 population estimates in thousands, from the built-in state.x77 matrix
pop_data <- data.frame(
region = tolower(state.name),
population = state.x77[, "Population"]
)
# Join by region name
us_map <- left_join(us_states, pop_data, by = "region")
# Plot choropleth
ggplot(us_map, aes(x = long, y = lat, group = group, fill = population)) +
geom_polygon(color = "white", linewidth = 0.3) + # linewidth replaces size for line widths in ggplot2 >= 3.4.0
scale_fill_viridis_c(option = "plasma", name = "Population (thousands)") +
coord_map("albers", lat0 = 39, lat1 = 45) +
theme_void() +
labs(title = "US State Population")
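Watch the region name matching. map_data("state") uses lowercase full names like "new york" and "north carolina", so call tolower() on your keys before joining. Run anti_join(us_states, pop_data, by = "region") to list map rows with no matching metric; here that flags "district of columbia", which is in the map but not in state.name. Alaska and Hawaii go the other way: they are in state.name but not in the "state" map, so left_join() drops them silently.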
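The join check can be run in both directions on one row per region, which keeps the output short. This is a sketch using the same built-in data as the choropleth; map_only and data_only are names introduced here.

```r
library(ggplot2)
library(maps)
library(dplyr)

us_states <- map_data("state")
pop_data <- data.frame(
  region = tolower(state.name),
  population = unname(state.x77[, "Population"])
)

# One row per region, so the diagnostics are not repeated for every vertex
map_regions <- distinct(us_states, region)

# Regions drawn on the map that have no metric
map_only <- anti_join(map_regions, pop_data, by = "region")

# Regions with a metric that the map cannot draw
data_only <- anti_join(pop_data, map_regions, by = "region")

map_only$region
data_only$region
```

Regions in map_only will be filled with the scale's NA color; regions in data_only never appear on the map at all.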
For a discrete color scale with categorical data, use scale_fill_brewer() or scale_fill_viridis_d() instead.
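As a sketch of the discrete case, the continuous population metric can be binned with cut() and drawn with scale_fill_brewer(). The break points and labels below are arbitrary choices for illustration, and coord_quickmap() is used here only because it needs no extra package.

```r
library(ggplot2)
library(maps)
library(dplyr)

us_states <- map_data("state")
pop_data <- data.frame(
  region = tolower(state.name),
  population = unname(state.x77[, "Population"]) # thousands, 1975 estimates
)

# Join the metric, then bin it into three ordered classes
us_map <- left_join(us_states, pop_data, by = "region") %>%
  mutate(pop_class = cut(
    population,
    breaks = c(0, 1000, 5000, Inf),
    labels = c("Up to 1M", "1M to 5M", "Over 5M")
  ))

p_classes <- ggplot(us_map, aes(x = long, y = lat, group = group, fill = pop_class)) +
  geom_polygon(color = "white", linewidth = 0.3) +
  scale_fill_brewer(palette = "Blues", name = "Population (1975)", na.value = "grey80") +
  coord_quickmap() +
  theme_void()

p_classes
```

Regions without a match in the metric table fall into the na.value color rather than disappearing.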
Point Maps and Bubble Maps
Point maps place a dot at each lat/lon coordinate. If you size or color those dots by a variable, you get a bubble map.
# Example cities data frame
cities <- data.frame(
city = c("New York", "Los Angeles", "Chicago"),
lon = c(-74, -118, -87),
lat = c(40.7, 34, 41.9),
population = c(8.4, 4.0, 2.7)
)
ggplot() + # no global data needed: borders() and geom_point() each bring their own
borders("world", fill = "lightgray", color = "white") +
geom_point(
data = cities,
aes(x = lon, y = lat, size = population),
color = "steelblue", alpha = 0.7
) +
scale_size_continuous(name = "Population (M)") +
coord_map("mercator") +
theme_void()
The borders() call draws the world outline, then geom_point() layers the city locations on top. Sizing by population turns this from a point map into a bubble map.
The sf Package
The sf package implements simple features, an open standard for geographic vector data. It has become the standard for working with shapefiles, GeoJSON, and any real geographic data in R. The key advantage: geometries are stored as a column in a data frame, so sf objects work directly with dplyr and ggplot2’s geom_sf().
Loading Data
library(sf)
# From a GeoJSON URL
map_sf <- read_sf("https://raw.githubusercontent.com/R-CoderDotCom/data/main/shapefile_spain/spain.geojson")
# From a local shapefile directory (contains .shp, .shx, .dbf, .prj)
map_sf <- read_sf("path/to/shapefile/")
read_sf() auto-detects the format. The result looks like a tibble but has a geometry column.
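Point data that lives in an ordinary data frame can also be turned into an sf object and written back out. The following is a minimal sketch, assuming your GDAL build includes the GeoJSON driver (the usual case); the cities values and the temporary file path are made up for illustration.

```r
library(sf)

# Hypothetical point data with lon/lat columns
cities <- data.frame(
  city = c("New York", "Los Angeles", "Chicago"),
  lon = c(-74, -118, -87),
  lat = c(40.7, 34, 41.9)
)

# Convert to sf: the coordinate columns become a geometry column in WGS 84
cities_sf <- st_as_sf(cities, coords = c("lon", "lat"), crs = 4326)

# Write to a temporary GeoJSON file and read it back
path <- tempfile(fileext = ".geojson")
write_sf(cities_sf, path)
cities_back <- read_sf(path)

cities_back
```

The round trip shows the format detection at work: read_sf() picks the driver from the file itself.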
geom_sf()
geom_sf() is the main geom for sf objects. It reads the geometry column automatically and draws the appropriate geometry type without you needing to specify x and y aesthetics.
# Basic
ggplot(map_sf) +
geom_sf()
# Choropleth with sf
ggplot(map_sf) +
geom_sf(aes(fill = unemp_rate), color = "white") +
scale_fill_viridis_c(option = "cividis", name = "Unemployment") +
theme_void()
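For a version that runs without downloading anything, the sf package ships a North Carolina counties shapefile. This sketch swaps that dataset in for map_sf and fills by its BIR74 column (births, 1974 to 1978); p_nc is a name introduced here.

```r
library(ggplot2)
library(sf)

# Example shapefile bundled with the sf package
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# geom_sf() finds the geometry column on its own; only fill is mapped
p_nc <- ggplot(nc) +
  geom_sf(aes(fill = BIR74), color = "white", linewidth = 0.2) +
  scale_fill_viridis_c(option = "cividis", name = "Births, 1974-78") +
  theme_void()

p_nc
```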
Labels
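For text labels on sf polygons, geom_sf_text() and geom_sf_label() handle the geometry extraction automatically. The column names unemp_rate and region_name are placeholders; substitute the columns in your own data.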
ggplot(map_sf) +
geom_sf(aes(fill = unemp_rate), color = "white") +
geom_sf_text(aes(label = region_name), size = 2, color = "white") +
theme_void()
For non-overlapping labels, ggrepel works too:
library(ggrepel)
ggplot(map_sf) +
geom_sf(aes(fill = unemp_rate), color = "white") +
geom_text_repel(
aes(label = region_name, geometry = geometry),
stat = "sf_coordinates", size = 2
) +
theme_void()
Coordinate Reference Systems and Projections
Every geographic dataset has a coordinate reference system (CRS) that defines how coordinates map to real-world locations. The sf package tracks this automatically.
library(sf)
# Check the CRS
st_crs(map_sf)
# Transform to a different CRS
map_projected <- st_transform(map_sf, crs = "EPSG:32630") # WGS 84 / UTM zone 30N, which covers most of mainland Spain
Common EPSG codes:
- EPSG:4326, WGS 84 (standard lat/lon, what GPS uses)
- EPSG:3857, Web Mercator (Google Maps, OpenStreetMap)
- EPSG:5070, NAD83 / Conus Albers (good for US choropleths)
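A short way to see what a transformation changes is to compare the CRS and bounding box before and after. This sketch uses the North Carolina shapefile bundled with sf rather than map_sf, so it runs offline; nc_albers is a name introduced here.

```r
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Source CRS: geographic coordinates in degrees
st_crs(nc)$epsg
st_is_longlat(nc)

# Reproject to NAD83 / Conus Albers (EPSG:5070), a projected CRS in meters
nc_albers <- st_transform(nc, crs = 5070)
st_crs(nc_albers)$epsg
st_is_longlat(nc_albers)

# Bounding boxes: degrees before, meters after
st_bbox(nc)
st_bbox(nc_albers)
```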
coord_sf()
coord_sf() sets the projection and CRS for your plot. With sf objects, ggplot2 uses this to transform the geometries.
ggplot(world) +
geom_sf() +
coord_sf(crs = "EPSG:3857") # Mercator
If you’re mixing sf layers with non-sf layers (e.g., adding raw lat/lon points), tell coord_sf() what CRS to assume for those raw coordinates:
ggplot(world) +
geom_sf() +
geom_point(aes(x = lon, y = lat, size = value), data = my_points) +
coord_sf(default_crs = sf::st_crs(4326)) # treat x/y as WGS84 lat/lon
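Without default_crs, ggplot2 interprets x/y values from non-sf layers in the plot's own CRS. That happens to work when the map itself is in plain lon/lat, but once the plot uses a projected CRS (meters), raw lat/lon values land in the wrong place. Here my_points stands for your own data frame with lon, lat, and value columns.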
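The effect is easiest to see on a plot that really is projected. In this sketch the North Carolina shapefile bundled with sf is drawn in EPSG:5070 while two hypothetical points are supplied as raw lon/lat; the point coordinates and values are approximate and made up for illustration.

```r
library(ggplot2)
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Hypothetical raw points in longitude/latitude
my_points <- data.frame(
  name = c("Raleigh", "Charlotte"),
  lon = c(-78.64, -80.84),
  lat = c(35.78, 35.23),
  value = c(1, 2)
)

p_mixed <- ggplot(nc) +
  geom_sf(fill = "grey90", color = "white") +
  geom_point(
    data = my_points,
    aes(x = lon, y = lat, size = value),
    color = "steelblue"
  ) +
  # Draw in Albers meters, but read plain x/y values as WGS 84 lon/lat
  coord_sf(crs = st_crs(5070), default_crs = st_crs(4326)) +
  theme_void()

p_mixed
```

Dropping the default_crs argument from this plot would make ggplot2 read -78.64 and 35.78 as meters in the Albers system, far from the state outline.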
For non-sf maps using coord_map(), the projection name goes directly in the function call:
coord_map("albers", lat0 = 39, lat1 = 45) # US Albers
coord_map("mercator")
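coord_map("mollweide") # Mollweide equal-area; mapproj has no Robinson projection, so use coord_sf(crs = "+proj=robin") with sf data instead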
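coord_map() is limited to the projections that mapproj implements, and Robinson is not among them. With sf data the projection comes from PROJ instead, so a Robinson world map can be sketched as follows. Converting the maps outline with st_as_sf() is one convenient way to get world polygons without extra downloads; world_sf and p_robin are names introduced here.

```r
library(ggplot2)
library(sf)
library(maps)

# Convert the built-in world outline to an sf object
world_sf <- st_as_sf(maps::map("world", plot = FALSE, fill = TRUE))

# Reproject on the fly with a PROJ string for the Robinson projection
p_robin <- ggplot(world_sf) +
  geom_sf(fill = "lightgray", color = "white", linewidth = 0.1) +
  coord_sf(crs = "+proj=robin") +
  theme_void()

p_robin
```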
Labels and Annotations
Beyond polygon labels, you often want to annotate specific locations. The annotate() function works for points and text without needing sf:
ggplot() + # no global data needed: borders() and annotate() supply their own
borders("world", fill = "lightgray", color = "white") +
annotate("point", x = -122, y = 37, color = "red", size = 3) +
annotate("text", x = -122, y = 37.8, label = "San Francisco", hjust = 0, size = 3) +
coord_map("mercator") +
theme_void()
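For map furniture, the ggspatial package provides annotation_map_tile() for base map tiles, plus annotation_scale() and annotation_north_arrow(). An inset (a smaller locator map next to a zoomed region) is a separate task: build a second ggplot and place it over the first with a layout package such as patchwork or cowplot.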
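One way to build an inset is to draw the zoomed map and a small locator map separately, then overlay them with patchwork's inset_element(). This is a sketch under stated assumptions: it uses the North Carolina shapefile bundled with sf, the four counties are an arbitrary focus area, and the inset position is a guess that you would tune by eye.

```r
library(ggplot2)
library(sf)
library(patchwork)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Zoomed region: an arbitrary handful of counties
focus <- nc[nc$NAME %in% c("Wake", "Durham", "Orange", "Chatham"), ]

main_map <- ggplot(focus) +
  geom_sf(fill = "steelblue", color = "white") +
  theme_void()

# Locator map: the whole state with the focus area's bounding box outlined
locator <- ggplot() +
  geom_sf(data = nc, fill = "grey90", color = "white", linewidth = 0.1) +
  geom_sf(data = st_as_sfc(st_bbox(focus)), fill = NA, color = "red") +
  theme_void()

# Place the locator in the lower-right corner of the main map
combined <- main_map +
  inset_element(locator, left = 0.6, bottom = 0, right = 1, top = 0.3)

combined
```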
Polishing Your Map
Maps reward careful polishing. A few patterns that work:
library(ggplot2)
library(sf)
library(rnaturalearth)
library(dplyr)
world <- ne_countries(scale = "medium", returnclass = "sf")
ggplot(world) +
geom_sf(aes(fill = pop_est), color = "white", linewidth = 0.1) + # linewidth replaces size for line widths in ggplot2 >= 3.4.0
scale_fill_viridis_c(
option = "plasma",
name = "Population",
labels = scales::comma
) +
coord_sf(crs = "EPSG:4326") +
theme_void() +
theme(
legend.position = "bottom",
plot.title = element_text(hjust = 0.5, size = 14)
) +
labs(
title = "World Population Estimate",
caption = "Source: Natural Earth"
)
theme_void() gives you a clean slate. The scales::comma formatter makes large population numbers readable. Moving the legend to the bottom keeps it out of the way.
See Also
- Introduction to ggplot2: covers the grammar of graphics foundations that map on to every geom_* function
- Facets and Themes in ggplot2: split maps into panels by region, or fine-tune every visual element
- Customizing ggplot2 Charts: color scales, annotations, and the options that make a map publication-ready