ggplot2::geom_point()
geom_point() draws one point for each observation. It is the go-to geom for scatter plots, which show the relationship between two continuous variables or how one variable is distributed across the values of another.
Syntax
geom_point(mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
Most arguments are shared across all geoms. The ones you’ll use most:
| Argument | What it does |
|---|---|
| mapping | Aesthetic mappings from aes() |
| data | Data frame for this layer (usually inherited from ggplot()) |
| stat | Statistical transformation, almost always "identity" |
| position | Position adjustment: "identity" by default, or "jitter" to spread overlapping points |
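As a minimal sketch of the data argument, the example below draws all cars in grey and then overlays a second point layer that uses its own data frame. The efficient subset and the mpg > 30 cut-off are assumptions chosen only for illustration.

```r
library(ggplot2)

# Hypothetical subset for illustration: cars with more than 30 mpg
efficient <- subset(mtcars, mpg > 30)

p_layers <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # First layer inherits mtcars from ggplot()
  geom_point(colour = "grey60") +
  # Second layer supplies its own data but still inherits the x and y mappings
  geom_point(data = efficient, colour = "red", size = 3)

p_layers
```

Because the second layer only overrides data, it reuses the x and y mappings declared in ggplot().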
Aesthetics
geom_point() requires the position aesthetics x and y and understands several visual ones. The most common:
| Aesthetic | What it controls |
|---|---|
| x | Position on x-axis |
| y | Position on y-axis |
| colour | Point colour (continuous or categorical) |
| size | Point size in mm |
| shape | Point shape (0–25) |
| alpha | Transparency (0–1) |
| fill | Fill colour for shapes 21–25 |
| stroke | Border width for shapes 21–25 |
Map an aesthetic to a variable inside aes() to encode data in the plot. Set an aesthetic to a constant outside aes() to style every point the same way:
library(ggplot2)
# Mapped: colour varies with a variable (cyl is numeric, so the scale is continuous)
ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) + geom_point()
# Set: all points are the same size
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(size = 3)
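A common slip is to put a constant inside aes(). The sketch below contrasts the two forms; the object names p_set and p_mapped are hypothetical and exist only so the results can be compared.

```r
library(ggplot2)

# Set: the constant sits outside aes(), so every point is drawn in blue
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "blue")

# Mapped by mistake: "blue" is treated as a one-level categorical variable,
# so the points take the first colour of the default palette and a legend appears
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = "blue"))

# Inspect the colours ggplot2 actually computed for each layer
unique(layer_data(p_set)$colour)
unique(layer_data(p_mapped)$colour)
```

layer_data() returns the data a layer is drawn from after scales are applied, which makes it a handy way to check whether a value was set or mapped.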
Basic Scatter Plot
library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
A simple two-variable scatter. The x-axis shows car weight, the y-axis shows miles per gallon. You can immediately see the negative relationship.
Mapping Additional Variables
The power of geom_point() is encoding additional variables through aesthetics:
# Colour by a third variable
ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point()
# Size and colour by different variables
ggplot(mtcars, aes(x = wt, y = mpg, size = hp, colour = factor(cyl))) +
  geom_point()
Continuous variables mapped to colour get a gradient scale; categorical variables get a discrete palette. For colourblind-safe palettes, use scale_colour_viridis_c() for continuous variables and scale_colour_viridis_d() for discrete ones.
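The following sketch applies a viridis scale to each kind of variable. The choice of hp and factor(cyl) follows the examples above; the object names are hypothetical.

```r
library(ggplot2)

# Continuous variable (hp): the _c variant gives a viridis gradient
p_cont <- ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point() +
  scale_colour_viridis_c()

# Discrete variable (factor(cyl)): the _d variant gives one colour per level
p_disc <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  scale_colour_viridis_d()

p_cont
p_disc
```

Using the _d variant on a continuous variable (or the reverse) raises an error, so the suffix has to match the variable type.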
Shape Aesthetic
The shape aesthetic accepts integers 0–25. Different shapes have different capabilities:
# Use shape 21 (filled circle) to allow both colour AND fill
ggplot(mtcars, aes(x = wt, y = mpg, fill = factor(cyl), colour = factor(gear))) +
  geom_point(shape = 21, size = 3)
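Shapes 21–25 have both a border, controlled by colour and stroke, and an interior, controlled by fill. Shapes 0–14 are hollow outlines and shapes 15–20 are solid; both groups take their colour from colour and ignore fill.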
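To see the shape codes side by side, a small reference chart can be drawn with geom_point() itself. This is only a sketch: the grid layout, the tomato fill, and the shapes data frame are arbitrary choices made for illustration.

```r
library(ggplot2)

# One row per shape code, arranged on a 7-column grid (layout is arbitrary)
shapes <- data.frame(
  code = 0:25,
  col  = (0:25) %% 7,
  row  = -((0:25) %/% 7)
)

p_shapes <- ggplot(shapes, aes(x = col, y = row)) +
  # fill is only visible on shapes 21-25; the others ignore it
  geom_point(aes(shape = code), size = 5, colour = "black", fill = "tomato") +
  # Print each code just below its shape
  geom_text(aes(label = code), nudge_y = -0.35, size = 3) +
  # Use the values in code directly as shape numbers
  scale_shape_identity() +
  theme_void()

p_shapes
```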
Overplotting and Jitter
Overplotting happens when many points share the same coordinates, a common problem with discrete or categorical data. A common fix is position_jitter():
# Default: points at exact coordinates
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point()
# Jittered: small random offset spreads the points
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = position_jitter(width = 0.3, height = 0))
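width sets the amount of jitter in the x direction and height in the y direction. The offset is applied in both the positive and negative directions, so the total spread is twice the value given. For a categorical x-axis, horizontal jitter with height = 0 spreads points without changing their y position.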
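Jitter is random, so the same code produces a slightly different plot on every run. A minimal sketch of one way to make it reproducible is to pass a seed to position_jitter(); the seed value 42 is arbitrary.

```r
library(ggplot2)

# A fixed seed makes the random offsets identical on every render
jit <- position_jitter(width = 0.3, height = 0, seed = 42)

p_jit <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = jit)

# Building the layer twice yields the same jittered x positions
d1 <- layer_data(p_jit)
d2 <- layer_data(p_jit)
identical(d1$x, d2$x)

p_jit
```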
geom_jitter() is a shorthand for geom_point(position = "jitter"):
# Equivalent to the jitter call above
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.3, height = 0)
Alpha for Large Data
When many points overlap, use alpha to reveal density. The faithful dataset is small (272 rows), but its two clusters show the effect:
ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point(alpha = 0.3)
alpha = 0.3 makes each point 30% opaque. Where points overlap, the density shows through as darker regions. Common values for large datasets: 0.1 to 0.5.
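For a dataset that really does have tens of thousands of points, the diamonds data bundled with ggplot2 makes the effect clearer. This is a sketch; the alpha value of 0.05 is an assumption picked by eye and will need tuning for other data.

```r
library(ggplot2)

# diamonds ships with ggplot2 and has roughly 54,000 rows
nrow(diamonds)

# A very low alpha lets dense regions build up into darker areas
p_alpha <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05)

p_alpha
```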
Size Legend and Scaling
When size is mapped to a continuous variable, ggplot2 creates a size legend showing the mapping:
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point()
Control the legend appearance with scale_size_continuous():
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point() +
  scale_size_continuous(range = c(1, 6))
range sets the minimum and maximum point sizes in mm.
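The same scale function also accepts a legend title and explicit breaks. The sketch below assumes a title of "Horsepower" and breaks at 100, 200, and 300, both chosen for illustration.

```r
library(ggplot2)

p_size <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point() +
  scale_size_continuous(
    name   = "Horsepower",        # legend title
    range  = c(1, 6),             # smallest and largest point size
    breaks = c(100, 200, 300)     # values shown as legend keys
  )

# The computed sizes stay within the requested range
range(layer_data(p_size)$size)

p_size
```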
Combining with Smoothers
A common pattern is a point layer plus a smoothing layer:
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth()
geom_smooth() adds a smoothed trend line with a confidence band. method = "lm" forces a linear model. With the default method, ggplot2 picks "loess" (a local polynomial fit) for fewer than 1,000 observations.
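As a sketch of the linear variant, the example below asks for method = "lm" and then compares the plotted line with a model fitted directly by lm(). Writing formula = y ~ x explicitly is optional; it is included here to avoid the message ggplot2 prints when it picks the formula itself.

```r
library(ggplot2)

p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Straight-line fit without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# The smoothed layer holds the fitted values that are drawn as the line
line_data <- layer_data(p_lm, 2)

# Compare with a linear model fitted outside ggplot2
fit <- lm(mpg ~ wt, data = mtcars)
expected <- unname(predict(fit, data.frame(wt = line_data$x)))
all.equal(line_data$y, expected)

p_lm
```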
Colour by Group with Legend Merging
When multiple layers map the same aesthetic to the same variable, ggplot2 merges them into a single legend:
ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  geom_smooth(se = FALSE, method = "lm")
Each key in the colour legend shows both the point and the line drawn by geom_smooth().
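Where the colour mapping is declared also decides how many trend lines are drawn. In this sketch, moving colour from ggplot() into geom_point() keeps the coloured points but fits one overall line; the black line colour is an arbitrary choice.

```r
library(ggplot2)

# colour declared globally: geom_smooth() inherits it and fits one line per group
p_grouped <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# colour declared only in geom_point(): geom_smooth() sees no grouping
p_one_line <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl)), size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

# Number of separate lines in each smoothing layer
length(unique(layer_data(p_grouped, 2)$group))
length(unique(layer_data(p_one_line, 2)$group))
```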
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| mapping | aes() mapping | NULL | Aesthetic mappings from aes() |
| data | data.frame | NULL | Layer-specific data; NULL inherits the plot data |
| stat | string | "identity" | Leave data as-is |
| position | string/position | "identity" | Position adjustment, e.g. "identity" or "jitter" |
| na.rm | logical | FALSE | If TRUE, remove missing values silently; if FALSE, remove them with a warning |
| show.legend | logical/NA | NA | Show legend for this layer; NA includes it only if aesthetics are mapped |
| inherit.aes | logical | TRUE | Inherit aesthetics from ggplot() |
Additional arguments passed through ... are handed to layer(). These are often constant aesthetics, such as colour = "red" or size = 3.
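The sketch below shows inherit.aes and show.legend working together on a layer that uses different column names from the main data. The reference_car data frame and its weight and efficiency columns are hypothetical.

```r
library(ggplot2)

# Hypothetical one-row data frame with column names that differ from mtcars
reference_car <- data.frame(weight = 5, efficiency = 30)

p_ref <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  geom_point(
    data = reference_car,
    aes(x = weight, y = efficiency),
    inherit.aes = FALSE,   # do not look for cyl in reference_car
    show.legend = FALSE,   # keep this layer out of the legend
    shape = 4, size = 4
  )

p_ref
```

Without inherit.aes = FALSE, the second layer would inherit colour = factor(cyl) and fail because reference_car has no cyl column.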
See Also
- /tutorials/r-data-visualization/ggplot2-basics/ — first steps with ggplot2
- /tutorials/r-data-visualization/introduction-to-ggplot2/ — layered grammar of graphics concept
- /reference/tidyverse/ggplot2_aes/ — the aesthetic mapping system geom_point uses