Geocoding Addresses and Building Maps in R with Leaflet

Geocoding addresses turns human-readable locations into spatial coordinates you can plot, analyze, and combine with other geographic data. This tutorial builds on the spatial data foundations from earlier tutorials in this series. You will learn how to convert addresses to coordinates using geocoding services and create both static and interactive maps in R with ggplot2 and Leaflet.

Prerequisites

Install the required packages:

install.packages(c("sf", "tidyverse", "ggplot2", "leaflet", "tmap", "ggspatial", "tigris", "tidygeocoder", "spData"))
library(sf)
library(tidyverse)
library(ggplot2)
library(leaflet)
library(tmap)

Geocoding addresses

Geocoding converts human-readable addresses into spatial coordinates. Several R packages provide access to geocoding services.

Using tidygeocoder

The tidygeocoder package provides a clean, pipe-friendly interface to multiple geocoding services without requiring separate API wrappers. You pass a data frame with an address column, choose a geocoding method (Nominatim is free and requires no API key), and the package appends latitude and longitude columns to your data. This is the fastest way to get from a list of addresses to mappable coordinates:

library(tidygeocoder)

# Create address data
addresses <- tibble(
  name = c("Statue of Liberty", "Tower of London", "Sydney Opera House"),
  address = c(
    "Liberty Island, New York, NY 10004, USA",
    "Tower of London, London, UK",
    "Bennelong Point, Sydney NSW 2000, Australia"
  )
)

# Geocode using Nominatim (free, OpenStreetMap-based)
geocoded <- addresses %>%
  geocode(address, method = "osm", lat = latitude, long = longitude)

print(geocoded)
# # A tibble: 3 × 4
#   name                   address                           latitude longitude
# 1 Statue of Liberty      Liberty Island, New York, NY 1…    40.7    -74.0  
# 2 Tower of London        Tower of London, London, UK        51.5    -0.076
# 3 Sydney Opera House     Bennelong Point, Sydney NSW 2…   -33.9    151.

Converting to sf objects

Once you have coordinates in a data frame, convert the result into an sf object for spatial operations. The st_as_sf() function takes longitude and latitude columns and turns them into a geometry column with point features. Setting crs = 4326 declares the coordinates are in WGS84 (the standard for GPS and web mapping), which ensures all subsequent spatial operations interpret the coordinates correctly:

geocoded_sf <- geocoded %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

print(geocoded_sf)
# Simple feature collection with 3 features and 2 fields
# Geometry type: POINT
# Dimension: XY
# Bounding box:  xmin: -74.0  ymin: -33.9  xmax: 151.  ymax: 51.5
# CRS: 4326
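
Geocoders sometimes fail to match an address and return NA coordinates, and st_as_sf() stops with an error when the coordinate columns contain missing values. The following is a minimal sketch of one way to handle this. The data frame geocoded_example, its hand-entered coordinates, and the unmatched third row are illustrative assumptions, not output from a geocoder:

```r
library(sf)
library(dplyr)

# Illustrative input: the third row stands in for an address that failed to geocode
geocoded_example <- tibble(
  name = c("Statue of Liberty", "Tower of London", "Unmatched address"),
  latitude = c(40.6892, 51.5081, NA),
  longitude = c(-74.0445, -0.0759, NA)
)

# Set failed rows aside for manual review
failed <- geocoded_example %>%
  filter(is.na(latitude) | is.na(longitude))

# Convert only the rows that have coordinates; remove = FALSE keeps the
# original longitude/latitude columns alongside the new geometry column
matched_sf <- geocoded_example %>%
  filter(!is.na(latitude), !is.na(longitude)) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326, remove = FALSE)

print(nrow(failed))
print(matched_sf)
```

Keeping the failed rows in their own object makes it easy to clean those addresses and retry them later.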

Rate limiting and best practices

When geocoding at scale, respect the rate limits of the geocoding service you are using. The public Nominatim server, the free OpenStreetMap-based service, allows at most one request per second, and exceeding that can get your IP address temporarily blocked. tidygeocoder already spaces out single-address queries (see its min_time argument), so an explicit Sys.sleep(1) delay matters mainly when you call a service from your own loop. For services that support it, prefer the batch method built into tidygeocoder:

# For bulk geocoding, add delays between requests
geocode_slowly <- function(addresses) {
  results <- vector("list", length(addresses))
  for (i in seq_along(addresses)) {
    results[[i]] <- geo(address = addresses[i], method = "osm")
    Sys.sleep(1)  # Respect Nominatim's 1-second limit
  }
  bind_rows(results)
}

# Alternative: use batch geocoding services for large datasets
# Services like Google, ArcGIS, or US Census geocoder offer batch APIs

Static maps with ggplot2

The ggplot2 package, combined with sf, creates publication-quality static maps using the familiar grammar-of-graphics syntax. The key geom is geom_sf(), which renders spatial features stored in sf objects. Supplementary annotation functions like annotation_scale() and annotation_north_arrow() from the ggspatial package add professional cartographic elements that make a map self-contained and ready for publication.

Basic sf visualization

The following example loads US state boundary data from the spData package, filters to a region, and plots each state as a filled polygon with a label. The geom_sf_label() function places text directly on the map geometry:

library(ggspatial)

# Load example data
library(spData)
data(us_states)

# Filter to a region for a cleaner map
us_ne <- us_states %>%
  filter(NAME %in% c("New York", "New Jersey", "Pennsylvania", 
                     "Connecticut", "Massachusetts", "Vermont", 
                     "New Hampshire", "Maine"))

# Create base map
ggplot(data = us_ne) +
  geom_sf(aes(fill = NAME)) +
  geom_sf_label(aes(label = NAME), size = 2.5) +
  annotation_scale(location = "bl", width_hint = 0.5) +
  annotation_north_arrow(location = "br", which_north = "true") +
  labs(title = "Northeastern United States",
       fill = "State") +
  theme_minimal() +
  theme(legend.position = "bottom")

Adding points to maps

Overlay geocoded point locations on top of a polygon base map by adding a second geom_sf() layer. Each geom_sf() call can reference a different data source, which makes layering straightforward: the base map polygons go in one sf object, and your geocoded points go in another. Use coord_sf() to zoom the viewport to a specific bounding box without removing data:

# Create a map showing our geocoded locations
# First, get US states background
# (spData's us_states already covers only the contiguous US, so this filter is a safeguard)
us_main <- us_states %>% filter(!(NAME %in% c("Alaska", "Hawaii")))

# Add geocoded points
geocoded_sf_usa <- geocoded %>%
  filter(longitude > -130, longitude < -50) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

ggplot() +
  geom_sf(data = us_main, fill = "gray95", color = "gray80") +
  geom_sf(data = geocoded_sf_usa, color = "red", size = 3, shape = 19) +
  geom_sf_label(data = geocoded_sf_usa, 
                aes(label = name), 
                nudge_y = 2, size = 3) +
  coord_sf(xlim = c(-85, -65), ylim = c(35, 50)) +
  theme_minimal() +
  labs(title = "Major Landmarks in the Northeastern US")

Choropleth maps

Choropleth maps color geographic regions by a data variable, so regional differences show up at a glance. The fill aesthetic on geom_sf() maps a column to color, and scale_fill_viridis_c() provides a perceptually uniform color gradient. Computing the fill variable, such as population density from total population and area, often requires an intermediate mutate() step before plotting:

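# Calculate population density (people per sq km); st_area() returns square meters
# Note: the 2015 population column in spData's us_states is named total_pop_15
us_main <- us_main %>%
  mutate(pop_density = total_pop_15 / as.numeric(st_area(geometry)) * 1e6)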

# Create choropleth
ggplot(data = us_main) +
  geom_sf(aes(fill = pop_density), color = "white", linewidth = 0.1) +
  scale_fill_viridis_c(name = "Pop. per sq km",
                      trans = "log10",
                      labels = scales::comma) +
  labs(title = "Population Density by State, 2015",
       subtitle = "United States") +
  theme_minimal() +
  theme(legend.position = "right")

Interactive maps with leaflet

Leaflet creates interactive web maps that support panning, zooming, and clicking on markers. Unlike static ggplot2 maps, leaflet maps render as HTML widgets that can be embedded in Quarto documents, Shiny apps, or standalone web pages. The basic pipeline chains together leaflet(), addTiles(), and marker layers using the pipe operator:

library(leaflet)

# Basic interactive map
leaflet() %>%
  addTiles() %>%
  addMarkers(lng = -74.0445, lat = 40.6892,
             popup = "Statue of Liberty") %>%
  addMarkers(lng = -0.0759, lat = 51.5081,
             popup = "Tower of London") %>%
  addMarkers(lng = 151.2153, lat = -33.8568,
             popup = "Sydney Opera House")

Using sf with leaflet

Pass an sf object directly to leaflet() and leaflet extracts the coordinates from the geometry column automatically. This approach skips the manual extraction of longitude/latitude columns and works seamlessly with the sf objects you create from geocoded results. Use addCircleMarkers() instead of addMarkers() for cleaner visual styling with customizable radius, color, and opacity:

# leaflet reads point coordinates straight from the sf geometry column
leaflet(geocoded_sf) %>%
  addTiles() %>%
  addCircleMarkers(radius = 8, color = "red", fillOpacity = 0.7,
                   popup = ~name)

Custom base maps

The default OpenStreetMap tiles work for most purposes, but leaflet supports dozens of tile providers with different visual styles. addProviderTiles() accepts a provider name from the built-in providers list; CartoDB.Positron gives a clean light background suitable for data overlays, while Esri.WorldImagery provides satellite imagery. Changing the tile layer does not affect your data layers:

# Using different tile providers
leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addMarkers(lng = -74.0445, lat = 40.6892,
             popup = "Statue of Liberty")

# Satellite imagery
leaflet() %>%
  addProviderTiles(providers$Esri.WorldImagery) %>%
  addMarkers(lng = -74.0445, lat = 40.6892,
             popup = "Statue of Liberty")

Choropleth maps with leaflet

For polygon-based thematic maps, use addPolygons() with a fillColor mapped to a data variable. The colorNumeric(), colorBin(), and colorFactor() functions create color palettes that integrate with leaflet’s rendering. Add a color legend with addLegend() so readers can interpret the fill values:

# Create a color palette over the density values computed earlier
pal <- colorNumeric("viridis", domain = us_main$pop_density)

# Choropleth with polygons (us_main carries the pop_density column)
leaflet(us_main) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~pal(pop_density),
    weight = 1,
    opacity = 1,
    color = "white",
    fillOpacity = 0.7
  ) %>%
  addLegend(
    pal = pal,
    values = ~pop_density,
    title = "Population Density"
  )
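
The palette helpers return ordinary R functions that map values to hex colors, so you can inspect their output before adding them to a map. The following is a small sketch with made-up density values; the YlOrRd palette and the bin edges are illustrative choices:

```r
library(leaflet)

density_values <- c(5, 40, 120, 450)  # made-up values for illustration

# Continuous palette: each value gets its own interpolated color
pal_numeric <- colorNumeric("YlOrRd", domain = density_values)

# Binned palette: values are grouped into the intervals defined by `bins`
pal_binned <- colorBin("YlOrRd", domain = density_values,
                       bins = c(0, 10, 100, 500))

numeric_colors <- pal_numeric(density_values)
binned_colors <- pal_binned(density_values)

print(numeric_colors)
print(binned_colors)
```

With these bin edges, 120 and 450 fall into the same interval and therefore share a color, which is the behavior you want when a few extreme values would otherwise dominate a continuous scale.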

For large polygon datasets, rmapshaper::ms_simplify() reduces vertex count and speeds up rendering without noticeably changing the visual appearance.

Interactive maps with tmap

The tmap package offers a middle ground between static and interactive mapping. Its key feature is a mode toggle: tmap_mode("plot") produces static output for printed reports and PDFs, while tmap_mode("view") produces interactive leaflet-based maps for exploration. The same map-building code works in both modes, which means you write the map once and switch the output format based on your audience without rewriting anything:

library(tmap)

# Switch to interactive mode
tmap_mode("view")

# Create interactive choropleth
tm_shape(us_main) +
  tm_polygons("pop_density", 
              title = "Population Density",
              style = "log10",
              palette = "viridis") +
  tm_layout(title = "US Population Density (2015)")

Common geocoding challenges

Geocoding is powerful but not foolproof. Addresses are messy — they contain abbreviations, typos, incomplete information, and local naming conventions that vary between countries. The code examples below address the three most common issues you will encounter in real projects.

Handling ambiguous addresses

Place names that exist in multiple locations are ambiguous. “Springfield” alone could match dozens of cities across the United States, and because tidygeocoder returns only the top match by default (limit = 1), you may get a different Springfield than the one you meant. The fix is to add disambiguating context: include the state, postal code, or country in the address string to narrow the search to a specific location:

# A bare place name is ambiguous
geo("Springfield", method = "osm")
# Returns one best guess, which may not be the city you intended

# Be more specific
geo("Springfield, Illinois", method = "osm")  # Springfield, IL
geo("Springfield, Massachusetts", method = "osm")  # Springfield, MA

Handling non-standard addresses

Real-world addresses arrive in inconsistent formats: apartment unit prefixes, extra whitespace, special characters, and missing components all cause geocoding failures. A preprocessing function that normalizes common variations (replacing # with Number, collapsing multiple spaces, trimming whitespace) can noticeably improve the match rate. For addresses that still fail, request several candidate matches with the limit argument and choose among them:

# Clean addresses before geocoding
clean_address <- function(address) {
  address %>%
    str_replace_all("#", "Number ") %>%
    str_replace_all("\\s+", " ") %>%
    str_trim()
}

# Request up to five candidate matches when the top result fails or looks wrong
geo("123 Main St New York NY", method = "osm", limit = 5)
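
To see what the cleaning step does, it helps to run it on a couple of messy strings before sending anything to a geocoder. The sketch below repeats the clean_address() definition so it runs on its own; the two sample addresses are made up for illustration:

```r
library(stringr)

# Same normalization steps as clean_address() above, written without pipes
clean_address <- function(address) {
  address <- str_replace_all(address, "#", "Number ")
  address <- str_replace_all(address, "\\s+", " ")
  str_trim(address)
}

raw <- c("  12  Elm St   #4 ", "350 Fifth Ave,  New York")
cleaned <- clean_address(raw)

print(cleaned)
# [1] "12 Elm St Number 4"       "350 Fifth Ave, New York"
```

Because the function is vectorized, it can be applied to a whole address column inside mutate() before calling geocode().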

US census geocoding

For US addresses specifically, the Census Bureau’s geocoding service is free and does not require an API key. tidygeocoder reaches it with method = "census", and the tigris package offers call_geolocator(), which returns the census block identifier for an address rather than its coordinates. The Census Geocoder works best with complete street addresses. For production batch jobs, the Census Bureau also offers a file-based batch geocoding API for large address lists:

library(tidygeocoder)

# Batch geocoding with Census API
us_addresses <- data.frame(
  street = c("1600 Pennsylvania Ave NW", "350 Fifth Ave"),
  city = c("Washington", "New York"),
  state = c("DC", "NY")
)

# Pass the address components separately; lat and long columns are appended
us_geocoded <- us_addresses %>%
  geocode(street = street, city = city, state = state, method = "census")

# Note: For production use, consider the batch geocoding API
# https://geocoding.geo.census.gov/geocoder/locations/addressbatch

Geocoding

Geocoding converts addresses or place names to geographic coordinates. The tidygeocoder package provides a unified interface to multiple geocoding services. geo(address = "Eiffel Tower, Paris", method = "osm") uses OpenStreetMap’s Nominatim service (free, no API key required). geo(address = vec, method = "census") uses the US Census geocoder for US addresses. Commercial services (Google Maps, HERE) require API keys but often handle messy address data better.

Batch geocoding a data frame: geocode(df, address = col_name, method = "osm") adds lat and long columns to the data frame. Add delays between requests when geocoding large batches to respect rate limits.

Building maps with ggplot2

ggplot2::geom_sf() renders sf objects as a ggplot layer. Combine a basemap polygon layer with a point layer: ggplot() + geom_sf(data = countries) + geom_sf(data = points, aes(color = category)). coord_sf(crs = st_crs(3857)) reprojects the plot to Web Mercator. scale_fill_distiller() applies diverging or sequential color scales to polygon fill values.
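
A minimal sketch of that layering pattern, using a hand-built square polygon and two points so it runs without downloading any boundary data. The objects square and pts and their coordinates are illustrative assumptions:

```r
library(sf)
library(ggplot2)

# Illustrative base layer: one square polygon in WGS84
square <- st_sf(
  region = "demo",
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0)))),
    crs = 4326
  )
)

# Illustrative point layer with a category column
pts <- st_as_sf(
  data.frame(category = c("a", "b"), lon = c(2, 8), lat = c(3, 7)),
  coords = c("lon", "lat"), crs = 4326
)

# Each geom_sf() call draws its own data; coord_sf() reprojects all layers together
p <- ggplot() +
  geom_sf(data = square, fill = "gray95", color = "gray60") +
  geom_sf(data = pts, aes(color = category), size = 3) +
  coord_sf(crs = st_crs(3857)) +
  theme_minimal()

print(p)
```

Both layers are stored in WGS84; coord_sf() handles the reprojection at draw time, so the data objects themselves are left unchanged.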

Map projections

Geographic coordinate systems (longitude/latitude) are not suitable for planar distance measurements or area calculations: the length of one degree of longitude shrinks from about 111 km at the equator to 0 km at the poles. Reproject to a local projected CRS before computing distances: st_transform(sf_obj, crs = 32632) for UTM zone 32N (central Europe). st_distance(point1, point2) returns the distance in the units of the CRS (meters for most projected CRSs, including UTM).
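
The shrinking length of a degree of longitude can be checked with a line of arithmetic. This sketch assumes a spherical Earth, so the values are approximations; the helper name km_per_degree_lon is made up for illustration:

```r
# Approximate east-west length of one degree of longitude on a spherical Earth.
# 111.32 km is the equatorial length of one degree (about 40,075 km / 360).
km_per_degree_lon <- function(lat_deg) {
  111.32 * cos(lat_deg * pi / 180)
}

lats <- c(0, 30, 60, 90)
lengths_km <- km_per_degree_lon(lats)

print(data.frame(latitude = lats, km_per_degree = round(lengths_km, 2)))
```

At 60 degrees latitude a degree of longitude is only half as long as at the equator, which is why treating degrees as a uniform distance unit distorts results away from the equator.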

Geocoding with tidygeocoder

The tidygeocoder package provides a tidy interface to multiple geocoding services. geocode(df, address = address_col) sends the address column to a geocoding API and appends latitude and longitude columns. The method argument selects the service: "osm" (OpenStreetMap Nominatim, free), "census" (US Census Bureau, free for US addresses), "google" (requires API key), or "here" (requires API key).

tidygeocoder paces single-address queries to respect each service’s rate limit automatically (the min_time argument adjusts the spacing). For OpenStreetMap, add progress_bar = TRUE to monitor progress on large inputs. Results come back as a tibble with lat and long columns that you can pipe directly into sf::st_as_sf(coords = c("long", "lat"), crs = 4326) to create an sf object.

Reverse geocoding takes coordinates and returns addresses: reverse_geocode(df, lat = lat_col, long = lng_col, method = "osm") appends an address column. This is useful for annotating GPS tracks with place names or for displaying user-readable locations in dashboards.

Building interactive maps with leaflet

The leaflet package creates interactive maps that render as HTML in R Markdown, Quarto, or Shiny. The pipeline is: leaflet() initializes a map; addTiles() adds an OpenStreetMap base layer; addMarkers() or addCircleMarkers() adds point layers; addPolygons() adds area layers.

Popups display information when a marker is clicked. Pass HTML strings to the popup argument: addMarkers(lng = ~lon, lat = ~lat, popup = ~paste("<b>", name, "</b><br>", description)). The ~ syntax refers to columns in the data bound with leaflet(data).
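
A short sketch of the popup pattern with a small data frame. The landmarks data and its description text are illustrative; the column names follow the inline example above:

```r
library(leaflet)

# Illustrative data bound to the map with leaflet(data)
landmarks <- data.frame(
  name = c("Statue of Liberty", "Tower of London"),
  description = c("Liberty Island, New York", "London, UK"),
  lon = c(-74.0445, -0.0759),
  lat = c(40.6892, 51.5081)
)

# The ~ formulas are evaluated against the columns of `landmarks`
popup_map <- leaflet(landmarks) %>%
  addTiles() %>%
  addMarkers(
    lng = ~lon, lat = ~lat,
    popup = ~paste0("<b>", name, "</b><br>", description)
  )

popup_map
```

If popup text comes from user input or scraped data, consider escaping it (for example with htmltools::htmlEscape()) before pasting it into HTML.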

For choropleth maps, use addPolygons() with a fillColor mapped to a data variable. The colorNumeric(), colorBin(), and colorFactor() functions create color palettes compatible with leaflet. Add a legend with addLegend(). For large polygon datasets, rmapshaper::ms_simplify() reduces vertex count and speeds up rendering.

Coordinate reference systems

All spatial work requires consistent coordinate reference systems. GPS coordinates and most web mapping use WGS84 (EPSG:4326) with decimal degrees. For distance calculations and area measurements, you need a projected CRS in meters, such as a UTM zone or a national grid.

sf::st_crs(x) returns the CRS of an sf object. sf::st_transform(x, crs = 3857) reprojects to Web Mercator (used by tiled web maps). sf::st_transform(x, crs = 32633) reprojects to UTM zone 33N. Use crsuggest::suggest_crs(x) to find appropriate projected CRS options for your data’s geographic extent.

For distance calculations between WGS84 points, sf::st_distance() gives geodesic distances (along the Earth’s surface) when the CRS is geographic, and Euclidean distances when projected. Always verify the CRS before computing distances, because mixing geographic and projected coordinates gives nonsensical results.
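
The difference between the two distance modes can be seen with two points. This sketch uses approximate city-center coordinates for Frankfurt and Munich (an illustrative choice) and UTM zone 32N, which covers both cities:

```r
library(sf)

# Frankfurt and Munich as WGS84 points (approximate coordinates)
cities <- st_sfc(
  st_point(c(8.68, 50.11)),
  st_point(c(11.58, 48.14)),
  crs = 4326
)

# Geographic CRS: st_distance() computes a geodesic distance
print(st_is_longlat(cities))
d_geodesic <- st_distance(cities[1], cities[2])

# Projected CRS: st_distance() computes a planar (Euclidean) distance
cities_utm <- st_transform(cities, 32632)
print(st_is_longlat(cities_utm))
d_projected <- st_distance(cities_utm[1], cities_utm[2])

print(d_geodesic)
print(d_projected)
```

Within a single UTM zone the two results should agree closely; the gap grows as points move away from the zone's central meridian or span multiple zones.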

Mapping packages compared

ggplot2 with geom_sf() produces publication-quality static maps. coord_sf() controls the map extent and projection. Layers follow the usual ggplot grammar: multiple geom_sf() calls with different data stack spatial layers, and scale_fill_viridis_c() applies a color scale to a fill aesthetic.

tmap is built specifically for thematic mapping and has a mode toggle: tmap_mode("plot") for static output, tmap_mode("view") for interactive leaflet-based output. The same map code works in both modes, making it convenient for reports (static) and exploratory analysis (interactive).

mapview is the fastest option for exploratory work: mapview(sf_object) produces an interactive map with one function call, automatically choosing a sensible color mapping and adding a popup with all attribute columns. For production maps, switch to leaflet for full control or tmap for a cleaner API.

What geocoding does

Geocoding converts human-readable location descriptions (addresses, place names, postal codes) into geographic coordinates that can be plotted on a map or used in spatial analysis. The reverse process, converting coordinates back to human-readable addresses, is called reverse geocoding. Both operations depend on services that maintain databases of locations and their coordinates. With the hosted services used in this tutorial, no geocoding happens locally; every address-to-coordinate conversion requires an API call to a geocoding provider.

The quality of geocoding results varies by geography. Major cities in North America and Europe have highly accurate address-level geocoding. Rural areas, developing countries, and locations with inconsistent address systems may return imprecise coordinates or fail entirely. For analysis that depends on precise locations, validate geocoding results visually by plotting them on a map and spot-checking a sample of unusual or ambiguous addresses.

Rate limits and caching

Geocoding APIs enforce rate limits to prevent abuse. Free tiers typically allow a few thousand requests per day. Batch geocoding large address lists requires either a paid plan or breaking the work into smaller batches spread over time. Where the provider’s terms allow it, store geocoded coordinates in a database column or a CSV file so you never geocode the same address twice. tidygeocoder returns ordinary tibbles, so it is straightforward to save results and check that local cache before making an API call.
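
One way to implement the store-and-reuse pattern is a small lookup helper that geocodes only addresses missing from the cache. The following is a sketch under stated assumptions: geocode_cached() and fake_geocoder() are hypothetical names, and the stand-in geocoder returns NA coordinates so the example runs offline.

```r
library(dplyr)

# Hypothetical helper: geocode only addresses that are not in the cache yet.
# `geocoder` is any function that takes a character vector and returns a
# tibble with columns address, lat, long (the shape tidygeocoder::geo() returns).
geocode_cached <- function(addresses, cache, geocoder) {
  new_addresses <- setdiff(unique(addresses), cache$address)
  if (length(new_addresses) > 0) {
    cache <- bind_rows(cache, geocoder(new_addresses))
  }
  cache
}

# Stand-in geocoder so the sketch runs offline. For real use, swap in:
#   function(x) tidygeocoder::geo(address = x, method = "osm")
fake_geocoder <- function(x) {
  tibble(address = x, lat = NA_real_, long = NA_real_)
}

# Cache with one previously geocoded address
cache <- tibble(
  address = "Tower of London, London, UK",
  lat = 51.5081,
  long = -0.0759
)

wanted <- c("Tower of London, London, UK",
            "Bennelong Point, Sydney NSW 2000, Australia")

# Only the Sydney address is sent to the geocoder
cache <- geocode_cached(wanted, cache, fake_geocoder)
print(cache)

# Persist between sessions, if the provider's terms allow it:
# write.csv(cache, "geocode_cache.csv", row.names = FALSE)
```

Passing the geocoder in as an argument keeps the caching logic testable without network access and makes it easy to change providers later.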

Always check the terms of service of the geocoding provider you use. Many free geocoding services prohibit storing results permanently or using them for commercial purposes. Google Maps Platform, Mapbox, and HERE all have usage policies that govern what you can do with geocoded data. Using the wrong provider for commercial work can create licensing problems.

Mapping geocoded results

Once you have coordinates, plotting them is straightforward. For static maps, ggplot2 with geom_point on top of a map background from the maps or rnaturalearth package produces publication-quality output. For interactive maps, the leaflet package renders in HTML with zoomable backgrounds and clickable markers. The choice between static and interactive output depends on the audience: print and PDF reports need static maps, while web-based reports benefit from interactive ones.

Clustering markers prevents visual clutter when mapping many points in a small area. leaflet’s addMarkers with clusterOptions groups nearby markers at low zoom levels and expands them as the user zooms in. For static maps, density visualization with geom_density_2d or hexagonal binning with geom_hex shows geographic concentration without overlapping individual points.
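
A minimal sketch of marker clustering, using simulated points so it runs without any geocoding. The point cloud around central London and the object names are illustrative assumptions:

```r
library(leaflet)

# Simulated points around central London, for illustration only
set.seed(42)
sim_points <- data.frame(
  lng = rnorm(200, mean = -0.1276, sd = 0.05),
  lat = rnorm(200, mean = 51.5072, sd = 0.03)
)

# markerClusterOptions() turns on clustering with default settings
cluster_map <- leaflet(sim_points) %>%
  addTiles() %>%
  addMarkers(lng = ~lng, lat = ~lat,
             clusterOptions = markerClusterOptions())

cluster_map
```

Zooming in splits each cluster bubble into smaller clusters and eventually into individual markers.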

Summary

You have learned how to:

  • Convert addresses to coordinates using geocoding services
  • Transform geocoded results into sf objects for spatial analysis
  • Create publication-quality static maps with ggplot2
  • Build interactive maps with leaflet for web delivery
  • Use tmap for flexible static and interactive mapping

These skills form the foundation for location-based analysis and visualization in R.

Next steps

Now that you understand geocoding and mapping in R, explore these related topics to deepen your knowledge and apply these techniques in more complex scenarios.

See also