Posterior Predictive Checks
Posterior predictive checks (PPCs) are a fundamental tool in Bayesian analysis. They let you ask: “If I repeatedly drew new data from my model, would it look like the data I actually observed?” If the answer is yes, your model is consistent with the data. If no, something is wrong.
This tutorial teaches you how to perform meaningful posterior predictive checks using brms and bayesplot, with practical examples you can apply to your own models.
The Intuition Behind PPCs
The core idea is straightforward. After fitting a Bayesian model, you have a full posterior distribution—not just point estimates. Each draw from this posterior represents a plausible set of parameters that could have generated your data.
A posterior predictive check works like this:
- Draw a parameter vector from the posterior
- Use those parameters to simulate new data
- Compare the simulated data to your actual observations
If the real data looks like a typical outcome from the model, you’re in good shape. If the real data looks unusual compared to what the model typically produces, your model is missing something important.
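The three steps can be sketched by hand in base R. This is a toy illustration, not brms output: the "posterior" draws for mu and sigma below are stand-ins generated from the standard conjugate results for a normal model with a flat prior.

```r
set.seed(1)
y <- rnorm(50, mean = 20, sd = 4)  # "observed" data

# Step 1: draw parameter vectors from a (stand-in) posterior
n_draws <- 1000
post_sigma <- sqrt((50 - 1) * var(y) / rchisq(n_draws, df = 50 - 1))
post_mu <- rnorm(n_draws, mean(y), post_sigma / sqrt(50))

# Step 2: simulate one replicated dataset per posterior draw
yrep <- sapply(seq_len(n_draws), function(i) rnorm(50, post_mu[i], post_sigma[i]))

# Step 3: compare a summary of the real data to the simulated ones
sim_means <- colMeans(yrep)
mean(sim_means >= mean(y))  # near 0.5 when the model reproduces the mean
```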
Setting Up Your Environment
# Required packages
install.packages(c("brms", "bayesplot", "ggplot2", "tidybayes"))
library(brms)
library(bayesplot)
library(ggplot2)
library(tidybayes)
For this tutorial, we’ll use the familiar mtcars dataset, fitting a model to predict fuel efficiency:
# Fit a simple model for demonstration
model <- brm(
  mpg ~ wt + cyl + hp,
  data = mtcars,
  family = gaussian(),
  chains = 4,
  seed = 123,
  refresh = 0
)
Basic Posterior Predictive Checks
The pp_check Function
brms provides a pp_check() method (built on bayesplot's ppc_* functions) for quick visual checks:
# Most basic PPC: compare observed y to simulated y's
pp_check(model)
By default this is a density overlay: the dark curve is the density of your observed outcome (y), and the light curves are densities of datasets simulated from the posterior (yrep). If the model fits well, the observed curve should look like just another one of the simulated curves.
Types of PPC Plots
bayesplot offers several PPC plot styles:
# Histogram overlay (ndraws sets how many simulated datasets to show)
pp_check(model, type = "hist", ndraws = 11)
# Scatter: observed vs predicted, one panel per draw
pp_check(model, type = "scatter", ndraws = 6)
# Scatter: observed vs average prediction across draws
pp_check(model, type = "scatter_avg")
# Rootogram (count outcomes only; not applicable to continuous mpg)
pp_check(model, type = "rootogram")
Interpreting the Plots
The key question is: do your observations look plausible under the model?
- Good fit: Observations fall in the bulk of the simulated distribution
- Poor fit: Observations cluster in the tails or outside the simulated range
For example, if you see extreme outliers in your data that the model never reproduces, that’s a clear signal your model needs improvement.
Test Statistics as PPCs
Beyond visual checks, you can use test statistics to quantify model fit.
Defining Test Statistics
A test statistic (or discrepancy measure) summarizes some aspect of your data. Common choices include:
# Example test statistics
mean(mtcars$mpg) # Sample mean
sd(mtcars$mpg) # Sample standard deviation
max(mtcars$mpg) # Maximum value
sum(mtcars$mpg < 15) # Count of low mpg cars
Comparing Observed to Simulated
# Get posterior predictive draws (one row per draw, one column per observation);
# note that predict() returns a summary, not the draws themselves
yrep <- posterior_predict(model)
# Calculate the test statistic for each simulated dataset
test_stat <- apply(yrep, 1, mean)  # mean of each simulated dataset
# Observed test statistic
obs_stat <- mean(mtcars$mpg)
# Visualize
ggplot(data.frame(test_stat = test_stat), aes(x = test_stat)) +
  geom_histogram(bins = 30, fill = "steelblue", alpha = 0.7) +
  geom_vline(xintercept = obs_stat, color = "red", linewidth = 2) +
  labs(
    title = "Posterior Predictive Check: Sample Mean",
    x = "Simulated Mean MPG",
    y = "Frequency"
  )
The red line shows where your observed data falls. If it’s in the tails of the distribution, your model systematically mispredicts the mean.
Formal Test Statistics
For a more formal approach, calculate the proportion of simulated datasets with test statistics more extreme than observed:
# Bayesian p-value
p_value <- mean(test_stat >= obs_stat)
# Two-sided p-value
p_value_two_sided <- mean(abs(test_stat - mean(test_stat)) >=
abs(obs_stat - mean(test_stat)))
# Ideal: p-value around 0.5
# Suspicious: p-value near 0 or 1
cat("One-sided p-value:", p_value, "\n")
cat("Two-sided p-value:", p_value_two_sided, "\n")
A p-value near 0.5 suggests good fit. Extreme p-values (near 0 or 1) indicate problems.
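bayesplot automates this entire workflow through the "stat" plot types, which compute a chosen test statistic for every simulated dataset and mark the observed value for you:

```r
# Built-in test-statistic PPCs (bayesplot's ppc_stat, via pp_check)
pp_check(model, type = "stat", stat = "mean")
pp_check(model, type = "stat", stat = "max")
# Two statistics at once, as a 2-D scatter
pp_check(model, type = "stat_2d", stat = c("mean", "sd"))
```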
Categorical and Count Data
Binary Outcomes
For binary data, use proportion-based checks. The Gaussian mtcars model above does not produce binary predictions, so the snippet below assumes a separate bernoulli fit to a 0/1 outcome (here, whether mpg exceeds 20):
# Example: proportion of "successes"
prop_success <- function(y) mean(y == 1)
# Assumes a separate fit, e.g.:
#   mtcars$efficient <- as.integer(mtcars$mpg > 20)
#   binary_model <- brm(efficient ~ wt + hp, data = mtcars, family = bernoulli())
yrep_binary <- posterior_predict(binary_model)  # draws are already 0/1
test_props <- apply(yrep_binary, 1, prop_success)
obs_prop <- prop_success(as.integer(mtcars$mpg > 20))
ggplot(data.frame(test_props = test_props), aes(x = test_props)) +
  geom_histogram(bins = 30, fill = "darkgreen", alpha = 0.7) +
  geom_vline(xintercept = obs_prop, color = "red", linewidth = 2)
Count Data
For counts, compare distributions:
# For count outcomes (e.g. family = poisson()), check distribution shape
pp_check(model, type = "bars")
pp_check(model, type = "freqpoly")
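The Gaussian mtcars model will not produce sensible bar or rootogram checks. As a sketch, assume a hypothetical count model for the number of carburetors (carb):

```r
# Hypothetical count model: number of carburetors
count_model <- brm(
  carb ~ wt + hp,
  data = mtcars,
  family = poisson(),
  chains = 4, seed = 123, refresh = 0
)
pp_check(count_model, type = "bars", ndraws = 500)
pp_check(count_model, type = "rootogram")
```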
Group-Level PPCs
For hierarchical models, check each group separately:
# For hierarchical models, use the *_grouped plot types, passing the
# name of a grouping variable from the model's data (e.g. "school_id")
pp_check(model, type = "scatter_avg_grouped", group = "school_id")
pp_check(model, type = "intervals_grouped", group = "school_id")
Predictive Calibration
Calibration Plots
Calibration checks whether predicted probabilities match observed frequencies:
# bayesplot has no type = "calibration"; the standard calibration
# check is the leave-one-out probability integral transform (LOO-PIT)
pp_check(model, type = "loo_pit_overlay")
pp_check(model, type = "loo_pit_qq")
If the LOO-PIT values are approximately uniform, the model's uncertainty matches the true outcome rates.
Interval Coverage
Check whether your uncertainty intervals contain the truth at the expected rate:
# Posterior predictive draws (draws x observations matrix);
# brms' predict() has no interval = "prediction" argument
yrep_pred <- posterior_predict(model)
# Check 50%, 80%, 95% interval coverage
# Check 50%, 80%, 95% interval coverage
coverage <- function(yrep, y, prob) {
  lower <- apply(yrep, 2, quantile, probs = (1 - prob) / 2)
  upper <- apply(yrep, 2, quantile, probs = (1 + prob) / 2)
  mean(y >= lower & y <= upper)
}
coverage_50 <- coverage(yrep_pred, mtcars$mpg, 0.50)
coverage_80 <- coverage(yrep_pred, mtcars$mpg, 0.80)
coverage_95 <- coverage(yrep_pred, mtcars$mpg, 0.95)
cat("50% interval covers:", coverage_50, "of observations\n")
cat("80% interval covers:", coverage_80, "of observations\n")
cat("95% interval covers:", coverage_95, "of observations\n")
If your 95% intervals only cover 80% of observations, your uncertainty is underestimated.
Custom PPCs with tidybayes
For more control, use tidybayes:
# Generate predictions (limit draws to keep the plot readable)
predictions <- add_predicted_draws(mtcars, model, ndraws = 50)
# Custom PPC: distribution comparison
ggplot(predictions, aes(x = .prediction, group = .draw)) +
  geom_density(alpha = 0.1, color = "gray") +
  geom_density(aes(x = mpg), data = mtcars, inherit.aes = FALSE,
               color = "red", linewidth = 2) +
labs(title = "Posterior Predictive Distribution vs Observed")
Common Patterns and What They Mean
Right-Skewed Residuals
If your observed data consistently exceeds predictions for high values:
pp_check(model, type = "scatter")
This suggests you need a different likelihood (perhaps log-normal) or a transformation.
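As a sketch (reusing the fitting options from earlier), refitting with a log-normal likelihood and re-running the same overlay check often resolves a heavy right tail, since mpg is strictly positive:

```r
# Refit with a right-skewed, positive-support likelihood
model_ln <- brm(
  mpg ~ wt + cyl + hp,
  data = mtcars,
  family = lognormal(),
  chains = 4, seed = 123, refresh = 0
)
pp_check(model_ln)  # does the right tail look better now?
```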
Systematic Bias
If all predictions are too low:
# Your model consistently underpredicts
# Consider adding predictors or changing the model family
Heteroscedasticity
If prediction accuracy varies with the predictor:
# Residuals vary across the range of x
# Consider adding a variance model or using a robust likelihood
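Both the diagnosis and the fix can be sketched with the mtcars model from above; the sigma ~ wt distributional formula is standard brms syntax for letting the residual scale vary with a predictor:

```r
# Diagnose: plot average prediction errors against a predictor
pp_check(model, type = "error_scatter_avg_vs_x", x = "wt")

# Respond: a distributional model where sigma varies with wt
model_het <- brm(
  bf(mpg ~ wt + cyl + hp, sigma ~ wt),
  data = mtcars,
  chains = 4, seed = 123, refresh = 0
)
pp_check(model_het)
```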
LOO-Based Diagnostics in brms
brms integrates with the loo package for cross-validated diagnostics. Note that plot() on a loo object shows Pareto-k diagnostics, not the LOO-PIT; the LOO-PIT plot itself runs through pp_check:
library(loo)
# Pareto-k diagnostics: flags observations the model finds hard to predict
loo_model <- loo(model)
plot(loo_model)
# LOO-PIT (leave-one-out probability integral transform)
# Good calibration: approximately uniform PIT values
pp_check(model, type = "loo_pit_overlay")
Summary
Posterior predictive checks are essential for Bayesian model validation:
| Check Type | What It Tests | When to Use |
|---|---|---|
| pp_check visual | Overall distribution fit | Always start here |
| Test statistics | Specific aspects of data | When you care about particular patterns |
| Group-level | Hierarchical structure | For multilevel models |
| Calibration | Uncertainty accuracy | For prediction intervals |
| LOO-PIT | Predictive calibration | Cross-validated calibration check |
See Also
- dplyr::filter — Basic data filtering for understanding examples
- Prior Selection in Bayesian Models — Choosing appropriate priors that lead to valid PPCs
- Getting Started with brms — Fitting Bayesian models that you’ll validate with PPCs
- Linear Regression in R — Understanding regression fundamentals before going Bayesian
Next Steps
After mastering PPCs:
- Explore leave-one-out cross-validation (LOO) for model comparison
- Learn about posterior-predictive simulation for model expansion
- Apply PPCs to your specific domain problems