# Introduction to Bayesian Thinking
Bayesian statistics offers a fundamentally different way of thinking about uncertainty. Instead of viewing probability as the long-run frequency of events, Bayesian thinking treats probability as a measure of belief that gets updated as new evidence arrives. This tutorial introduces you to the core concepts that form the foundation of Bayesian analysis.
## What Is Bayesian Thinking?
At its heart, Bayesian thinking is about learning from evidence. You start with an initial belief (called a prior), you observe some data (the likelihood), and you update your belief to get a new, improved belief (the posterior).
This process mirrors how we naturally reason. If you believe it’s unlikely to rain today but you see dark clouds forming, you update your belief toward “it’s probably going to rain.” Bayesian statistics formalizes this intuitive process.
The mathematical heart of Bayesian inference is Bayes’ theorem:
$$P(\theta | data) = \frac{P(data | \theta) \times P(\theta)}{P(data)}$$
In words:
- Posterior = (Likelihood × Prior) / Evidence
- The posterior is your updated belief after seeing the data
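To make the theorem concrete, the update can be carried out numerically over a small grid of candidate parameter values. Here is a minimal sketch (the three candidate bias values, and the 8-heads-in-10-flips data, are illustrative choices):

```r
# Bayes' theorem on a discrete grid: three candidate values for a coin's bias
theta <- c(0.25, 0.50, 0.75)

# Prior: equal belief in each candidate value
prior <- c(1/3, 1/3, 1/3)

# Likelihood of observing 8 heads in 10 flips under each candidate
likelihood <- dbinom(8, size = 10, prob = theta)

# Posterior = (likelihood * prior) / evidence
posterior <- likelihood * prior / sum(likelihood * prior)
round(posterior, 3)
# [1] 0.001 0.135 0.864
```

Notice that the denominator `sum(likelihood * prior)` is exactly the evidence P(data): it normalizes the posterior so the beliefs sum to 1.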
## Frequentist vs Bayesian: A Simple Example
Imagine you’re testing a new coin to see if it’s fair. A frequentist approach would flip the coin many times and ask: “What’s the probability of getting this result if the coin is fair?” A Bayesian would instead ask: “Given the results I observed, what do I now believe about the coin’s bias?”
Let’s see this in practice with R:
```r
# Suppose we flip a coin 10 times and get 8 heads
# Is the coin fair?

# Frequentist approach: p-value calculation
# Probability of 8+ heads out of 10 if p = 0.5
pbinom(7, size = 10, prob = 0.5, lower.tail = FALSE)
# [1] 0.0546875

# Not significant at alpha = 0.05
# But what do we actually believe about the coin?
```
The frequentist answer is binary: reject or don’t reject the null hypothesis. The Bayesian answer is more informative: here’s a distribution of plausible values for the coin’s bias.
## Setting Up Your First Bayesian Analysis
You’ll need some key packages:
```r
# Install Bayesian packages
install.packages("rstanarm")   # Stan-based Bayesian regression
install.packages("bayesplot")  # Visualization
install.packages("tidybayes")  # Tidy workflow

# Load them
library(rstanarm)
library(bayesplot)
library(tidybayes)
```
## A Simple Coin-Flipping Example
Let’s implement a complete Bayesian analysis from scratch:
```r
# Observed data: 8 heads out of 10 flips
n_heads <- 8
n_flips <- 10

# Prior: before seeing data, we think the coin is probably fair.
# We represent this with a Beta distribution.
# Beta(2, 2) is a reasonable prior: centered at 0.5, with some uncertainty
prior_alpha <- 2
prior_beta <- 2

# Likelihood: binomial - what we expect to see for different values of p
# We're computing: P(data | p) ∝ p^heads * (1 - p)^tails

# Posterior: Beta(prior_alpha + heads, prior_beta + tails)
posterior_alpha <- prior_alpha + n_heads
posterior_beta <- prior_beta + (n_flips - n_heads)

# What does our posterior look like?
posterior_mean <- posterior_alpha / (posterior_alpha + posterior_beta)
posterior_mean
# [1] 0.7142857

# We now believe the coin lands heads about 71% of the time
```
The posterior Beta(10, 4) captures our updated belief. Its mean is 10/(10 + 4) ≈ 0.71. This is a weighted average of the prior mean (0.5) and the observed proportion of heads (0.8), with more weight on the data because the 10 flips outnumber the prior's 4 "pseudo-observations."
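This weighted-average structure can be verified directly. A small sketch using the same numbers as above:

```r
# The Beta posterior mean decomposes as a weighted average of the
# prior mean and the sample proportion (Beta(2, 2) prior, 8/10 heads)
prior_alpha <- 2
prior_beta <- 2
n_heads <- 8
n_flips <- 10

prior_mean <- prior_alpha / (prior_alpha + prior_beta)        # 0.5
data_mean  <- n_heads / n_flips                               # 0.8
w_data     <- n_flips / (n_flips + prior_alpha + prior_beta)  # 10/14

# Weighted average reproduces the posterior mean exactly
w_data * data_mean + (1 - w_data) * prior_mean
# [1] 0.7142857
```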
## Visualizing the Posterior
Let’s see what the prior and posterior look like:
```r
library(ggplot2)

# Create a sequence of values for p (coin bias)
p_values <- seq(0, 1, length.out = 100)

# Calculate prior and posterior densities
prior_density <- dbeta(p_values, prior_alpha, prior_beta)
posterior_density <- dbeta(p_values, posterior_alpha, posterior_beta)

# Plot both
plot_data <- data.frame(
  p = p_values,
  prior = prior_density,
  posterior = posterior_density
)

ggplot(plot_data) +
  geom_line(aes(p, prior), linetype = "dashed", color = "blue") +
  geom_line(aes(p, posterior), color = "red") +
  geom_vline(xintercept = 0.5, color = "gray", linetype = "dotted") +
  labs(
    x = "Probability of Heads (p)",
    y = "Density",
    title = "Prior (dashed) vs Posterior (solid)"
  ) +
  theme_minimal()
```
The posterior is more concentrated (less uncertain) than the prior because we’ve observed data. The peak is around 0.7, suggesting the coin is likely biased toward heads.
## Credible Intervals
One of Bayesian statistics’ strengths is the credible interval—a range of plausible values for the parameter. Unlike confidence intervals, credible intervals have a straightforward interpretation:
```r
# 95% credible interval from the posterior
qbeta(0.025, posterior_alpha, posterior_beta)  # Lower bound, approx 0.46
qbeta(0.975, posterior_alpha, posterior_beta)  # Upper bound, approx 0.91

# Interpretation: there's a 95% probability that the true p
# lies between roughly 0.46 and 0.91, given our data and prior.
```
This is intuitively meaningful: we’re 95% confident the true coin bias falls in this range. No frequentist hand-waving about repeated sampling.
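The Beta quantiles above have a closed form via `qbeta()`, but the same interval can be approximated by Monte Carlo sampling, which is how more complex Bayesian models (anything fit by MCMC) report intervals. A quick sketch:

```r
set.seed(42)  # for reproducibility

# Approximate the 95% credible interval by drawing from the
# Beta(10, 4) posterior and taking empirical quantiles
samples <- rbeta(100000, 10, 4)
quantile(samples, c(0.025, 0.975))
# Close to the exact qbeta() bounds
```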
## What If You Have Stronger Prior Beliefs?
The prior represents what you believed before seeing the data. Different priors lead to different posteriors:
```r
# Strong prior: Beta(100, 100) - very confident the coin is fair
strong_prior <- c(100, 100)
strong_posterior <- strong_prior + c(n_heads, n_flips - n_heads)
strong_posterior[1] / sum(strong_posterior)
# [1] 0.5142857

# Weak prior: Beta(1, 1) - no prior preference (uniform)
weak_prior <- c(1, 1)
weak_posterior <- weak_prior + c(n_heads, n_flips - n_heads)
weak_posterior[1] / sum(weak_posterior)
# [1] 0.75
```
With a strong prior, the data has less influence on the posterior. With a weak (uninformative) prior, the posterior is almost entirely driven by the data.
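As a sketch of how the data eventually dominates, suppose the 80%-heads pattern continued for 1000 flips (an illustrative extrapolation); even the strong Beta(100, 100) prior is then overwhelmed:

```r
# A strong Beta(100, 100) prior, updated with 10 flips vs 1000 flips
# (both with 80% heads)
strong_prior <- c(100, 100)

small_post <- strong_prior + c(8, 2)      # 10 flips: posterior barely moves
large_post <- strong_prior + c(800, 200)  # 1000 flips: data dominates

small_post[1] / sum(small_post)
# [1] 0.5142857
large_post[1] / sum(large_post)
# [1] 0.75
```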
## Making Decisions with the Posterior
The posterior gives you everything you need for decision-making:
```r
# What's the probability the coin is biased toward heads (p > 0.5)?
pbeta(0.5, posterior_alpha, posterior_beta, lower.tail = FALSE)
# [1] 0.9538574

# That's about a 95% chance the coin favors heads!
# Would you bet on heads?
```
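One concrete decision rule is to average the payoff over the posterior. For a hypothetical $1 even-money bet on heads (the stakes are illustrative), the probability that the next flip lands heads is the posterior mean of p:

```r
# Posterior predictive for one flip: P(next flip is heads) = E[p],
# the posterior mean of the Beta(10, 4) posterior
posterior_alpha <- 10
posterior_beta <- 4
p_next_heads <- posterior_alpha / (posterior_alpha + posterior_beta)

# Expected profit of a hypothetical $1 even-money bet on heads
p_next_heads * 1 + (1 - p_next_heads) * (-1)
# [1] 0.4285714
```

A positive expected profit suggests taking the bet; under a different prior or payoff structure the calculation changes, but the recipe stays the same.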
## A More Realistic Example: Estimating a Rate
Suppose you’re observing website visitors and want to estimate the conversion rate. You’ve seen 15 conversions out of 500 visitors:
```r
# Observed data
conversions <- 15
visitors <- 500

# Weakly informative prior: Beta(2, 20) puts the expected rate
# near 10%; Beta(1, 1) would be uniform (no prior preference)
prior <- c(2, 20)

# Posterior
posterior <- prior + c(conversions, visitors - conversions)

# Posterior mean (estimated conversion rate)
posterior[1] / sum(posterior)
# [1] 0.03256705  # About 3.3%

# 95% credible interval
c(
  qbeta(0.025, posterior[1], posterior[2]),
  qbeta(0.975, posterior[1], posterior[2])
)
# Approximately 0.019 and 0.050
```
The conversion rate is likely between 1.9% and 5%, which is practical information for business decisions.
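The posterior also supports forward-looking questions. A sketch of a posterior predictive simulation, asking how many conversions the next 1000 visitors might produce (the 1000-visitor horizon is an illustrative choice):

```r
set.seed(1)  # for reproducibility

# Draw plausible rates from the Beta(17, 505) posterior computed above,
# then simulate conversions among the next 1000 visitors for each draw
p_draws <- rbeta(10000, 17, 505)
pred <- rbinom(10000, size = 1000, prob = p_draws)

# Central 95% of the predictive distribution of conversion counts
quantile(pred, c(0.025, 0.975))
```

Unlike plugging in a single point estimate, this propagates our uncertainty about the rate into the forecast, so the predictive interval is wider than a plain binomial one.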
## Why Learn Bayesian Methods?
Bayesian statistics offers several advantages:
- Natural interpretation: Credible intervals mean exactly what you think they mean
- Incorporates prior knowledge: Use existing information in your analysis
- Handles complex models: MCMC makes many problems tractable
- Answers the questions you actually ask: “What’s the probability my hypothesis is true?”
## Summary
You’ve learned the core concepts of Bayesian thinking:
| Concept | Description |
|---|---|
| Prior | Your belief before seeing data |
| Likelihood | How probable the data is for different parameter values |
| Posterior | Updated belief after incorporating the data |
| Credible interval | Range of plausible values for the parameter |
The Bayesian workflow is straightforward: specify your prior, collect data, compute the posterior, and make decisions. In the next tutorial, you’ll learn how to use brms to fit Bayesian regression models in R.
## See Also
- Linear Regression in R — Frequentist foundation for understanding regression concepts
- Logistic Regression in R — Binary outcome modeling, conceptually related to Bayesian classification
- Hypothesis Testing in R — Frequentist approach to statistical inference
## Next Steps
Continue your Bayesian journey with the next tutorials in this series:
- Getting Started with brms — Fit your first Bayesian regression model
- Prior Selection — Learn how to encode your prior knowledge
- Posterior Predictive Checks — Validate your Bayesian model’s fit
You’ll soon see how Bayesian methods provide a flexible framework for answering complex statistical questions.