Getting Started with brms

March 16, 2026 · 9 min read ·Updated May 29, 2026 ·intermediate

brmsbayesianregressionstanmcmcstatistics

The brms package provides a powerful interface for fitting Bayesian regression models in R. Built on top of Stan, a state-of-the-art platform for Bayesian inference, brms allows you to specify complex statistical models using familiar R formula syntax. This tutorial walks you through fitting your first Bayesian models with brms.

What you’ll learn

This tutorial covers the key concepts and practical techniques for working with Getting Started with brms. By the end, you will know how to apply the core functions in real data analysis workflows.

What is brms?

brms stands for Bayesian Regression Models using Stan. It translates R formula notation into Stan code, handles the compilation and sampling, and returns results in a format that integrates smoothly with tidyverse workflows.

Key advantages of using brms:

Familiar syntax: Use the same ~ formula notation as lm() and glm()
Flexible models: Handle multilevel/hierarchical models, ordinal outcomes, censored data, and more
Automatic MCMC: Stan handles Markov Chain Monte Carlo sampling under the hood
Tidy output: Results play nicely with dplyr, tidybayes, and ggplot2

Installation

First, install brms and its dependencies:

# Install brms
install.packages("brms")

# Install Stan if not already installed
install.packages("rstan")

# Recommended packages for visualization
install.packages("bayesplot")
install.packages("tidybayes")

Loading the packages:

library(brms)
library(bayesplot)
library(tidybayes)
library(ggplot2)
library(dplyr)

Your first Bayesian model

Let’s start with a simple linear regression. We’ll use the mtcars dataset to predict miles per gallon (mpg) from weight (wt) and number of cylinders (cyl):

# Fit a Bayesian linear regression
model1 <- brm(
  mpg ~ wt + cyl,
  data = mtcars,
  family = gaussian(),
  chains = 4,
  iter = 2000,
  seed = 123
)

Breaking down the arguments:

mpg ~ wt + cyl: Formula specifying mpg as the outcome, with wt and cyl as predictors
data: The data frame
family: The probability distribution for the outcome (gaussian for continuous outcomes)
chains: Number of Markov chains to run (4 is standard)
iter: Total iterations per chain (including warmup)
seed: For reproducibility

Examining model results

The summary() function displays the core results:

summary(model1)

Output includes:

Population-level effects: Coefficients for your predictors
Sigma: The residual standard deviation
Rhat: Convergence diagnostic (should be close to 1.0)
ESS: Effective sample size (higher is better)

For a more tidy output:

fixef(model1)

This shows the posterior estimates for each coefficient:

        Estimate Est.Error  Q2.5 Q97.5
Intercept  39.69      1.96 35.80 43.63
wt         -5.65      0.58 -6.78 -4.52
cyl        -1.51      0.78 -3.04 -0.02

Interpretation:

Estimate: Posterior mean (the most likely value)
Est.Error: Standard deviation of the posterior
Q2.5, Q97.5: 95% credible interval bounds

Interpreting credible intervals

The 95% credible interval [-6.78, -4.52] for wt means there’s a 95% probability that the true coefficient falls in this range, given the data and prior. This is fundamentally different from a confidence interval—it’s a direct statement about the parameter, not about hypothetical repeated samples.

# Plot posterior distributions
mcmc_plot(model1, type = "areas", pars = "^b_")

Visualizing posterior distributions

bayesplot provides excellent visualizations:

# Posterior density plots
mcmc_areas(model1, pars = c("b_wt", "b_cyl"), 
           prob = 0.95) +
  labs(title = "Posterior Distributions of Coefficients")

Priors in brms

brms uses weakly informative priors by default. You can specify custom priors using the prior argument:

# Specify custom priors
model2 <- brm(
  mpg ~ wt + cyl,
  data = mtcars,
  family = gaussian(),
  prior = c(
    prior(normal(0, 10), class = Intercept),
    prior(normal(-3, 1), class = b, coef = "wt"),
    prior(normal(0, 2), class = b, coef = "cyl"),
    prior(exponential(1), class = sigma)
  ),
  chains = 4,
  seed = 123
)

Prior specifications:

class = Intercept: Prior for the intercept
class = b: Prior for all slope coefficients
class = b, coef = "wt": Prior for specific coefficient
class = sigma: Prior for residual standard deviation

Bayesian logistic regression

brms supports many families. Here’s logistic regression for a binary outcome:

# Create a binary outcome: high vs low efficiency
mtcars$high_eff <- ifelse(mtcars$mpg > median(mtcars$mpg), 1, 0)

# Fit Bayesian logistic regression
model3 <- brm(
  high_eff ~ wt + disp,
  data = mtcars,
  family = bernoulli(link = "logit"),
  chains = 4,
  seed = 123
)

summary(model3)

The interpretation changes slightly—coefficients are on the log-odds scale.

Multilevel/Hierarchical models

One of brms’ strengths is handling multilevel data. Here’s a simple random-intercepts model:

# Simulate multilevel data
set.seed(123)
n_schools <- 8
n_students <- 30

school_data <- data.frame(
  school = rep(1:n_schools, each = n_students),
  student = 1:n_students
)
school_data$school_effect <- rnorm(n_schools, 0, 2)[school_data$school]
school_data$ability <- rnorm(n_students, 0, 1)
school_data$score <- 50 + school_data$school_effect + school_data$ability + rnorm(nrow(school_data), 0, 3)

# Fit multilevel model
model_school <- brm(
  score ~ ability + (1 | school),
  data = school_data,
  chains = 4,
  seed = 123
)

The (1 | school) syntax adds a random intercept for each school.

Posterior predictions

Generate predictions from your model:

# Get posterior predictive samples
pp <- posterior_predict(model1, newdata = data.frame(wt = c(2, 3, 4), cyl = c(4, 6, 8)))

# pp is a matrix: rows = iterations, columns = new observations
dim(pp)
# [1] 4000    3

# Summarize predictions
apply(pp, 2, mean)
apply(pp, 2, quantile, probs = c(0.025, 0.975))

Checking convergence

Always verify that your MCMC chains converged properly:

# Trace plots
mcmc_trace(model1, pars = c("b_wt", "b_cyl"))

# Rhat values (should be < 1.05)
rhat(model1)

# Effective sample sizes
neff_ratio(model1)

If Rhat > 1.1 or effective sample sizes are low, you may need more iterations or better priors.

Model specification

brms uses R’s formula syntax extended with random effects and non-linear terms. bf(y ~ x + (1|group)) adds a random intercept per group. bf(y ~ x, sigma ~ x) allows the variance to vary with x. Non-linear models: bf(y ~ a * exp(-b * x), a ~ 1, b ~ 1, nl = TRUE). The formula interface handles a wide range of model structures without writing Stan code directly.

Running the model

brm(formula, data, family = gaussian(), prior = priors, chains = 4, iter = 2000, warmup = 1000) fits the model. chains = 4 runs 4 MCMC chains in parallel; cores = 4 enables parallel execution. iter = 2000 with warmup = 1000 gives 1000 post-warmup samples per chain (4000 total). brm() compiles a Stan model on the first run, subsequent runs reuse the compiled code and are faster.

Prior specification

get_prior(formula, data) shows the default priors brms would use. Set priors with set_prior("normal(0, 1)", class = "b") for regression coefficients, set_prior("exponential(1)", class = "sigma") for the standard deviation. set_prior("lkj(1)", class = "cor") sets the prior on correlation matrices. Prior predictive checks with sample_prior = "only" verify priors generate plausible data before fitting.

Posterior analysis

summary(fit) shows estimates, standard errors, and convergence diagnostics. plot(fit) produces trace plots and density plots per parameter. mcmc_plot(fit, type = "intervals") from bayesplot visualizes credible intervals. posterior_summary(fit) returns a data frame of all parameter summaries. as_draws_df(fit) converts the posterior to a data frame for custom analysis.

Why brms

brms (Bayesian Regression Models using Stan) translates R formula syntax into Stan programs, compiles them, and runs MCMC sampling. You write brm(y ~ x + (1 | group), data = df) and brms handles the Stan code, posterior sampling, and result extraction.

Without brms, fitting a Bayesian hierarchical model requires writing Stan code directly, hundreds of lines for a moderately complex model. brms generates efficient Stan code from a compact formula specification and provides a consistent interface for diagnostics, posterior extraction, and prediction.

brms supports: linear regression, generalized linear models (binomial, Poisson, negative binomial, beta, gamma), ordinal and count regression, zero-inflated models, hierarchical (mixed) models, multivariate models, distributional regression (modeling variance, shape, and other parameters alongside the mean), and non-linear models.

A first brms model

library(brms)

fit <- brm(
  y ~ x1 + x2 + (1 + x1 | group),
  data = df,
  family = gaussian(),
  prior = c(
    prior(normal(0, 10), class = b),
    prior(normal(0, 10), class = Intercept),
    prior(exponential(1), class = sigma)
  ),
  chains = 4,
  iter = 2000,
  warmup = 1000,
  cores = 4
)

chains = 4 runs 4 independent Markov chains. iter = 2000 is the number of iterations per chain. warmup = 1000 is the adaptation phase (discarded). cores = 4 runs chains in parallel. The effective number of posterior samples is (iter - warmup) * chains = 4000.

Convergence diagnostics

Before interpreting results, check that the MCMC chains converged. summary(fit) shows the Rhat statistic for each parameter. Rhat near 1.0 indicates convergence; values above 1.05 suggest chains have not converged and longer sampling or better parameterization is needed.

plot(fit) shows trace plots (chains over iterations, should look like white noise) and density plots (posterior distributions). pp_check(fit) overlays the posterior predictive distribution with observed data, the most direct way to assess model fit.

bayesplot::mcmc_trace(as.array(fit)) shows trace plots. bayesplot::mcmc_rank_overlay(as.array(fit)) shows rank plots, which are better than trace plots for diagnosing mixing. Ideal rank plots are uniform across ranks.

Extracting results

fixef(fit) returns fixed effect estimates with credible intervals. ranef(fit) returns group-level deviations. posterior_summary(fit) shows all parameters. as_draws_df(fit) returns posterior samples as a tidy data frame.

conditional_effects(fit, effects = "x1") plots the posterior predictive mean and credible interval for each predictor, marginalizing over other predictors. This is the Bayesian analog of the partial dependence plot.

predict(fit, newdata = new_df, summary = FALSE) returns a matrix of posterior predictive samples, one row per posterior sample, one column per new observation. posterior_predict(fit, newdata) is equivalent. Use these for uncertainty quantification rather than just point predictions.

Model comparison

loo(fit1) computes Leave-One-Out cross-validation using Pareto-Smoothed Importance Sampling (PSIS-LOO). loo_compare(loo(fit1), loo(fit2)) compares models. The model with lower LOO-IC is preferred; differences larger than twice the standard error are meaningful.

bayes_factor(fit1, fit2) computes the Bayes factor, the ratio of marginal likelihoods. Requires save_pars = save_pars(all = TRUE) in the brm() call. Bayes factors greater than 10 provide strong evidence for one model over another.

Summary

You’ve learned the fundamentals of fitting Bayesian regression models with brms:

Concept	Description
Formula syntax	Familiar R notation with `~` for regression
Families	gaussian, bernoulli, poisson, and many more
Priors	Default weakly informative or custom-specified
Credible intervals	Direct probability statements about parameters
MCMC diagnostics	Check Rhat, ESS, and trace plots

The Bayesian approach gives you richer information than frequentist methods—you get full posterior distributions rather than single point estimates, allowing direct probability statements about your parameters.

Next steps

Continue building your Bayesian skills:

Explore more families (poisson, ordinal, zero-inflated)
Add random effects for nested and crossed designs
Learn about posterior predictive model checking with bayesplot
Master prior specification for your specific domains