Posterior Predictive Checks
Posterior predictive checks (PPCs) are a fundamental tool in Bayesian analysis. They let you ask: “If I repeatedly drew new data from my model, would it look like the data I actually observed?” If the answer is yes, your model is consistent with the data. If no, something is wrong.
This tutorial teaches you how to perform meaningful posterior predictive checks using brms and bayesplot, with practical examples you can apply to your own models.
The Intuition Behind PPCs
The core idea is straightforward. After fitting a Bayesian model, you have a full posterior distribution—not just point estimates. Each draw from this posterior represents a plausible set of parameters that could have generated your data.
A posterior predictive check works like this:
- Draw a parameter vector from the posterior
- Use those parameters to simulate new data
- Compare the simulated data to your actual observations
If the real data looks like a typical outcome from the model, you’re in good shape. If the real data looks unusual compared to what the model typically produces, your model is missing something important.
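The three steps can be sketched by hand in base R. This is a toy illustration, not brms output: the "posterior" draws for mu and sigma below are stand-ins generated from the standard conjugate results for a normal model with a flat prior.

```r
set.seed(1)
y <- rnorm(50, mean = 20, sd = 4)  # "observed" data

# Step 1: draw parameter vectors from a (stand-in) posterior
n_draws <- 1000
post_sigma <- sqrt((50 - 1) * var(y) / rchisq(n_draws, df = 50 - 1))
post_mu <- rnorm(n_draws, mean(y), post_sigma / sqrt(50))

# Step 2: simulate one replicated dataset per posterior draw
yrep <- sapply(seq_len(n_draws), function(i) rnorm(50, post_mu[i], post_sigma[i]))

# Step 3: compare a summary of the real data to the simulated ones
sim_means <- colMeans(yrep)
mean(sim_means >= mean(y))  # near 0.5 when the model reproduces the mean
```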
Setting Up Your Environment
# Required packages
install.packages(c("brms", "bayesplot", "ggplot2", "tidybayes"))
library(brms)
library(bayesplot)
library(ggplot2)
library(tidybayes)
For this tutorial, we’ll use the familiar mtcars dataset, fitting a model to predict fuel efficiency:
# Fit a simple model for demonstration
model <- brm(
  mpg ~ wt + cyl + hp,
  data = mtcars,
  family = gaussian(),
  chains = 4,
  seed = 123,
  refresh = 0
)
Basic Posterior Predictive Checks
The pp_check Function
brms provides a pp_check() method (built on bayesplot's ppc_* functions) for quick visual checks:
# Most basic PPC: compare observed y to simulated y's
pp_check(model)
By default this is a density overlay: the dark curve is the density of your observed outcome (y), and the light curves are densities of datasets simulated from the posterior (yrep). If the model fits well, the observed curve should look like just another one of the simulated curves.
Types of PPC Plots
bayesplot offers several PPC plot styles:
# Histogram overlay (ndraws sets how many simulated datasets to show)
pp_check(model, type = "hist", ndraws = 11)
# Scatter: observed vs predicted, one panel per draw
pp_check(model, type = "scatter", ndraws = 6)
# Scatter: observed vs average prediction across draws
pp_check(model, type = "scatter_avg")
# Rootogram (count outcomes only; not applicable to continuous mpg)
pp_check(model, type = "rootogram")
Interpreting the Plots
The key question is: do your observations look plausible under the model?
- Good fit: Observations fall in the bulk of the simulated distribution
- Poor fit: Observations cluster in the tails or outside the simulated range
For example, if you see extreme outliers in your data that the model never reproduces, that’s a clear signal your model needs improvement.
Test Statistics as PPCs
Beyond visual checks, you can use test statistics to quantify model fit.
Defining Test Statistics
A test statistic (or discrepancy measure) summarizes some aspect of your data. Common choices include:
# Example test statistics
mean(mtcars$mpg) # Sample mean
sd(mtcars$mpg) # Sample standard deviation
max(mtcars$mpg) # Maximum value
sum(mtcars$mpg < 15) # Count of low mpg cars
Comparing Observed to Simulated
# Get posterior predictive draws (one row per draw, one column per observation);
# note that predict() returns a summary, not the draws themselves
yrep <- posterior_predict(model)
# Calculate the test statistic for each simulated dataset
test_stat <- apply(yrep, 1, mean)  # mean of each simulated dataset
# Observed test statistic
obs_stat <- mean(mtcars$mpg)
# Visualize
ggplot(data.frame(test_stat = test_stat), aes(x = test_stat)) +
  geom_histogram(bins = 30, fill = "steelblue", alpha = 0.7) +
  geom_vline(xintercept = obs_stat, color = "red", linewidth = 2) +
  labs(
    title = "Posterior Predictive Check: Sample Mean",
    x = "Simulated Mean MPG",
    y = "Frequency"
  )
The red line shows where your observed data falls. If it’s in the tails of the distribution, your model systematically mispredicts the mean.
Formal Test Statistics
For a more formal approach, calculate the proportion of simulated datasets with test statistics more extreme than observed:
# Bayesian p-value
p_value <- mean(test_stat >= obs_stat)
# Two-sided p-value
p_value_two_sided <- mean(abs(test_stat - mean(test_stat)) >=
abs(obs_stat - mean(test_stat)))
# Ideal: p-value around 0.5
# Suspicious: p-value near 0 or 1
cat("One-sided p-value:", p_value, "\n")
cat("Two-sided p-value:", p_value_two_sided, "\n")
A p-value near 0.5 suggests good fit. Extreme p-values (near 0 or 1) indicate problems.
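bayesplot automates this entire workflow through the "stat" plot types, which compute a chosen test statistic for every simulated dataset and mark the observed value for you:

```r
# Built-in test-statistic PPCs (bayesplot's ppc_stat, via pp_check)
pp_check(model, type = "stat", stat = "mean")
pp_check(model, type = "stat", stat = "max")
# Two statistics at once, as a 2-D scatter
pp_check(model, type = "stat_2d", stat = c("mean", "sd"))
```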
Categorical and Count Data
Binary Outcomes
For binary data, use proportion-based checks. The Gaussian mtcars model above does not produce binary predictions, so the snippet below assumes a separate bernoulli fit to a 0/1 outcome (here, whether mpg exceeds 20):
# Example: proportion of "successes"
prop_success <- function(y) mean(y == 1)
# Assumes a separate fit, e.g.:
#   mtcars$efficient <- as.integer(mtcars$mpg > 20)
#   binary_model <- brm(efficient ~ wt + hp, data = mtcars, family = bernoulli())
yrep_binary <- posterior_predict(binary_model)  # draws are already 0/1
test_props <- apply(yrep_binary, 1, prop_success)
obs_prop <- prop_success(as.integer(mtcars$mpg > 20))
ggplot(data.frame(test_props = test_props), aes(x = test_props)) +
  geom_histogram(bins = 30, fill = "darkgreen", alpha = 0.7) +
  geom_vline(xintercept = obs_prop, color = "red", linewidth = 2)
Count Data
For counts, compare distributions:
# For count outcomes (e.g. family = poisson()), check distribution shape
pp_check(model, type = "bars")
pp_check(model, type = "freqpoly")
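The Gaussian mtcars model will not produce sensible bar or rootogram checks. As a sketch, assume a hypothetical count model for the number of carburetors (carb):

```r
# Hypothetical count model: number of carburetors
count_model <- brm(
  carb ~ wt + hp,
  data = mtcars,
  family = poisson(),
  chains = 4, seed = 123, refresh = 0
)
pp_check(count_model, type = "bars", ndraws = 500)
pp_check(count_model, type = "rootogram")
```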
Group-Level PPCs
For hierarchical models, check each group separately:
# For hierarchical models, use the *_grouped plot types, passing the
# name of a grouping variable from the model's data (e.g. "school_id")
pp_check(model, type = "scatter_avg_grouped", group = "school_id")
pp_check(model, type = "intervals_grouped", group = "school_id")
Predictive Calibration
Calibration Plots
Calibration checks whether predicted probabilities match observed frequencies:
# bayesplot has no type = "calibration"; the standard calibration
# check is the leave-one-out probability integral transform (LOO-PIT)
pp_check(model, type = "loo_pit_overlay")
pp_check(model, type = "loo_pit_qq")
If the LOO-PIT values are approximately uniform, the model's uncertainty matches the true outcome rates.
Interval Coverage
Check whether your uncertainty intervals contain the truth at the expected rate:
# Posterior predictive draws (draws x observations matrix);
# brms' predict() has no interval = "prediction" argument
yrep_pred <- posterior_predict(model)
# Check 50%, 80%, 95% interval coverage
# Check 50%, 80%, 95% interval coverage
coverage <- function(yrep, y, prob) {
  lower <- apply(yrep, 2, quantile, probs = (1 - prob) / 2)
  upper <- apply(yrep, 2, quantile, probs = (1 + prob) / 2)
  mean(y >= lower & y <= upper)
}
coverage_50 <- coverage(yrep_pred, mtcars$mpg, 0.50)
coverage_80 <- coverage(yrep_pred, mtcars$mpg, 0.80)
coverage_95 <- coverage(yrep_pred, mtcars$mpg, 0.95)
cat("50% interval covers:", coverage_50, "of observations\n")
cat("80% interval covers:", coverage_80, "of observations\n")
cat("95% interval covers:", coverage_95, "of observations\n")
If your 95% intervals only cover 80% of observations, your uncertainty is underestimated.
Custom PPCs with tidybayes
For more control, use tidybayes:
# Generate predictions (limit draws to keep the plot readable)
predictions <- add_predicted_draws(mtcars, model, ndraws = 50)
# Custom PPC: distribution comparison
ggplot(predictions, aes(x = .prediction, group = .draw)) +
  geom_density(alpha = 0.1, color = "gray") +
  geom_density(aes(x = mpg), data = mtcars, inherit.aes = FALSE,
               color = "red", linewidth = 2) +
labs(title = "Posterior Predictive Distribution vs Observed")
Common Patterns and What They Mean
Right-Skewed Residuals
If your observed data consistently exceeds predictions for high values:
pp_check(model, type = "scatter")
This suggests you need a different likelihood (perhaps log-normal) or a transformation.
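As a sketch (reusing the fitting options from earlier), refitting with a log-normal likelihood and re-running the same overlay check often resolves a heavy right tail, since mpg is strictly positive:

```r
# Refit with a right-skewed, positive-support likelihood
model_ln <- brm(
  mpg ~ wt + cyl + hp,
  data = mtcars,
  family = lognormal(),
  chains = 4, seed = 123, refresh = 0
)
pp_check(model_ln)  # does the right tail look better now?
```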
Systematic Bias
If all predictions are too low:
# Your model consistently underpredicts
# Consider adding predictors or changing the model family
Heteroscedasticity
If prediction accuracy varies with the predictor:
# Residuals vary across the range of x
# Consider adding a variance model or using a robust likelihood
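Both the diagnosis and the fix can be sketched with the mtcars model from above; the sigma ~ wt distributional formula is standard brms syntax for letting the residual scale vary with a predictor:

```r
# Diagnose: plot average prediction errors against a predictor
pp_check(model, type = "error_scatter_avg_vs_x", x = "wt")

# Respond: a distributional model where sigma varies with wt
model_het <- brm(
  bf(mpg ~ wt + cyl + hp, sigma ~ wt),
  data = mtcars,
  chains = 4, seed = 123, refresh = 0
)
pp_check(model_het)
```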
LOO-Based Diagnostics in brms
brms integrates with the loo package for cross-validated diagnostics. Note that plot() on a loo object shows Pareto-k diagnostics, not the LOO-PIT; the LOO-PIT plot itself runs through pp_check:
library(loo)
# Pareto-k diagnostics: flags observations the model finds hard to predict
loo_model <- loo(model)
plot(loo_model)
# LOO-PIT (leave-one-out probability integral transform)
# Good calibration: approximately uniform PIT values
pp_check(model, type = "loo_pit_overlay")
Summary
Posterior predictive checks are essential for Bayesian model validation:
| Check Type | What It Tests | When to Use |
|---|---|---|
| pp_check visual | Overall distribution fit | Always start here |
| Test statistics | Specific aspects of data | When you care about particular patterns |
| Group-level | Hierarchical structure | For multilevel models |
| Calibration | Uncertainty accuracy | For prediction intervals |
| LOO-PIT | Predictive calibration | Cross-validated calibration check |
See Also
- dplyr::filter — Basic data filtering for understanding examples
- Prior Selection in Bayesian Models — Choosing appropriate priors that lead to valid PPCs
- Getting Started with brms — Fitting Bayesian models that you’ll validate with PPCs
- Linear Regression in R — Understanding regression fundamentals before going Bayesian
Next Steps
After mastering PPCs:
- Explore leave-one-out cross-validation (LOO) for model comparison
- Learn about posterior-predictive simulation for model expansion
- Apply PPCs to your specific domain problems