ggplot2::geom_point()

geom_point() draws one point per observation. It’s the go-to geom for scatter plots, which show the relationship between two continuous variables. It also works with one discrete and one continuous variable, although overlapping points then need extra care.

Syntax

geom_point(mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)

Most arguments are shared across all geoms. The ones you’ll use most:

| Argument | What it does |
| --- | --- |
| mapping | Aesthetic mappings from aes() |
| data | Data frame for this layer (usually inherited from ggplot()) |
| stat | Statistical transformation, almost always "identity" |
| position | Position adjustment: "identity" by default, or "jitter" to spread overlapping points |
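The data argument is easiest to understand with a layer that overrides the inherited data frame. The following is a minimal sketch: the efficient subset and the 30 mpg cutoff are assumptions chosen for illustration only.

```r
library(ggplot2)

# Hypothetical subset for illustration: cars above 30 mpg
efficient <- subset(mtcars, mpg > 30)

p_highlight <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +                   # inherits mtcars from ggplot()
  geom_point(data = efficient,     # this layer uses its own data frame
             colour = "red", size = 3)

p_highlight
```

The second layer still inherits the x and y mappings from ggplot(); only its data is replaced.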

Aesthetics

geom_point() requires the x and y position aesthetics and understands several visual ones. The most common:

| Aesthetic | What it controls |
| --- | --- |
| x | Position on x-axis |
| y | Position on y-axis |
| colour | Point colour (continuous or categorical) |
| size | Point size in mm |
| shape | Point shape (0–25) |
| alpha | Transparency (0–1) |
| fill | Fill colour for shapes 21–25 |
| stroke | Border width for shapes 21–25 |

Map an aesthetic to a variable inside aes() to encode data in the plot. Set an aesthetic to a constant outside aes() to style every point the same way:

library(ggplot2)

# Mapped: colour varies with a variable
ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) + geom_point()

# Set: all points are the same size
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(size = 3)
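A common slip is to put a constant inside aes(). The sketch below contrasts the two cases; the object names p_set and p_mapped are illustrative only.

```r
library(ggplot2)

# Set: colour is outside aes(), so every point is drawn blue
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "blue")

# Mapped by mistake: "blue" is treated as a one-level variable,
# so the default palette chooses the colour and a legend appears
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = "blue"))
```

If the points come out in an unexpected colour and a one-entry legend appears, a constant inside aes() is the likely cause.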

Basic Scatter Plot

library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

A simple two-variable scatter plot. The x-axis shows car weight (wt, in thousands of pounds) and the y-axis shows miles per gallon. The negative relationship is immediately visible: heavier cars travel fewer miles per gallon.

Mapping Additional Variables

geom_point() can encode additional variables through aesthetics such as colour and size:

# Colour by a third variable
ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point()

# Size and colour by different variables
ggplot(mtcars, aes(x = wt, y = mpg, size = hp, colour = factor(cyl))) +
  geom_point()

Continuous variables mapped to colour get a gradient scale. Categorical variables get a discrete colour palette. For colourblind-safe palettes, use scale_colour_viridis_d() with discrete variables and scale_colour_viridis_c() with continuous ones.
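As a brief sketch of the viridis scales (object names are illustrative), the discrete and continuous variants attach to the plot like any other scale:

```r
library(ggplot2)

# Discrete variable: factor(cyl) has three levels, so three colours
p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  scale_colour_viridis_d()

# Continuous variable: hp is numeric, so it gets a gradient
p_continuous <- ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point() +
  scale_colour_viridis_c()
```

Using the _d variant on a continuous variable (or the _c variant on a factor) raises an error, so the suffix has to match the variable type.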

Shape Aesthetic

The shape aesthetic accepts integers 0–25. Different shapes have different capabilities:

# Use shape 21 (filled circle) to allow both colour AND fill
ggplot(mtcars, aes(x = wt, y = mpg, fill = factor(cyl), colour = factor(gear))) +
  geom_point(shape = 21, size = 3)

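Shapes 21–25 have both a border and an interior: colour sets the border colour, stroke sets the border width, and fill sets the interior colour. Shapes 0–20 ignore fill. Shapes 0–14 are outlines drawn in colour, and shapes 15–20 are solid shapes drawn in colour.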
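To see how the shapes differ, it can help to draw all of them at once. This is a small sketch, assuming a hypothetical helper data frame named shapes laid out on a grid; the red fill only shows up on shapes 21–25.

```r
library(ggplot2)

# Hypothetical helper data: one row per shape code, arranged in rows of six
shapes <- data.frame(
  shape = 0:25,
  x = (0:25) %% 6,
  y = -((0:25) %/% 6)
)

p_shapes <- ggplot(shapes, aes(x = x, y = y)) +
  geom_point(aes(shape = shape), size = 5, fill = "red") +
  geom_text(aes(label = shape), nudge_y = -0.35, size = 3) +
  scale_shape_identity() +   # use the numbers in `shape` as shape codes directly
  theme_void()

p_shapes
```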

Overplotting and Jitter

Overplotting happens when many points share the same coordinates, a common problem with discrete or categorical data. One fix is position_jitter():

# Default: points at exact coordinates
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point()

# Jittered: small random offset spreads the points
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = position_jitter(width = 0.3, height = 0))

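width is the maximum jitter in the x direction and height the maximum in the y direction. The offset is applied in both directions, so width = 0.3 moves each point up to 0.3 units left or right. For a categorical x-axis, horizontal jitter (width) with height = 0 spreads points without changing their y position.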

geom_jitter() is a shorthand for geom_point(position = "jitter"):

# Equivalent to the jitter call above
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.3, height = 0)
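Jitter is random, so the same code gives a slightly different plot on each run. A minimal sketch of making it reproducible with the seed argument of position_jitter() follows; the seed value 42 is arbitrary.

```r
library(ggplot2)

# Same seed, same offsets: the two plots are identical
p_a <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = position_jitter(width = 0.3, height = 0, seed = 42))

p_b <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = position_jitter(width = 0.3, height = 0, seed = 42))

# Compare the jittered x positions computed for each plot
identical(layer_data(p_a)$x, layer_data(p_b)$x)
```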

Alpha for Large Data

When many points overlap, as they do with thousands of observations, use alpha to reveal density. The effect is visible even on the small faithful data set:

ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point(alpha = 0.3)

alpha = 0.3 makes each point 30% opaque. Where points overlap, the density shows through as darker regions. Common values for large datasets: 0.1 to 0.5.
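For a larger example, the diamonds data set that ships with ggplot2 has roughly 54,000 rows. The sketch below uses a very low alpha; the value 0.05 is a starting point to tune by eye, not a recommendation.

```r
library(ggplot2)

# With tens of thousands of points, a low alpha keeps dense regions readable
p_dense <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05)

p_dense
```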

Size Legend and Scaling

When size is mapped to a continuous variable, ggplot2 creates a size legend showing the mapping:

ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point()

Control the range of point sizes, and with it the legend, with scale_size_continuous():

ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point() +
  scale_size_continuous(range = c(1, 6))

range sets the sizes, in mm, given to the smallest and largest values of the mapped variable. The scale maps values to point area rather than radius.
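The same scale function also controls the legend title and which values appear as keys. A short sketch, where the title and the break values 100, 200, and 300 are illustrative choices within the range of hp:

```r
library(ggplot2)

p_size <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.6) +
  scale_size_continuous(
    name = "Horsepower",        # legend title
    range = c(1, 6),            # smallest and largest point size
    breaks = c(100, 200, 300)   # values shown as legend keys
  )

p_size
```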

Combining with Smoothers

A common pattern is points + a smoothing layer:

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth()

geom_smooth() adds a smoothed trend line with a confidence band. With the default method, ggplot2 picks "loess" (a local polynomial fit) for fewer than 1,000 observations and a GAM otherwise. method = "lm" forces a linear model.
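As a sketch of the linear variant, the call below spells out method and formula explicitly, which also avoids the message ggplot2 prints when it has to choose them itself:

```r
library(ggplot2)

p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)  # straight line plus confidence band

p_lm
```

Setting se = FALSE drops the band and leaves only the fitted line.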

Colour by Group with Legend Merging

When multiple layers map the same aesthetic to the same variable, ggplot2 merges their legends into one:

ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  geom_smooth(se = FALSE, method = "lm")

Both layers inherit colour = factor(cyl) from ggplot(), so each key in the single colour legend shows the point and the line from geom_smooth() together.
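Where the mapping is declared decides which layers it applies to. The sketch below moves colour into geom_point() only, so the points stay coloured by group while geom_smooth() fits a single line to all cars (the object name is illustrative):

```r
library(ggplot2)

p_one_trend <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl)), size = 2) +        # colour applies to points only
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)  # one line for the whole data set

p_one_trend
```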

Parameters

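| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mapping | aesthetic | NULL | Aesthetic mappings from aes() |
| data | data.frame | NULL | Layer-specific data |
| stat | string | "identity" | Leave data as-is |
| position | string/position | "identity" | Position adjustment, for example "identity" or "jitter" |
| na.rm | logical | FALSE | If TRUE, remove missing values silently; if FALSE, remove them with a warning |
| show.legend | logical/NA | NA | Show legend for this layer; NA includes it if any aesthetics are mapped |
| inherit.aes | logical | TRUE | Inherit aesthetics from ggplot() |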

Additional parameters passed through ... go to layer(). These are often aesthetics set to a constant, such as colour = "red" or size = 3.
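A short sketch of na.rm and show.legend in use. The cars_na data frame is a hypothetical copy of mtcars with two values blanked out, made up purely for illustration.

```r
library(ggplot2)

# Hypothetical data: a copy of mtcars with two missing mpg values
cars_na <- mtcars
cars_na$mpg[c(1, 2)] <- NA

p_na <- ggplot(cars_na, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(
    na.rm = TRUE,         # drop the incomplete rows without a warning
    show.legend = FALSE,  # keep this layer out of the legend
    size = 3              # constant aesthetic passed through ...
  )

p_na
```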

See Also