# Introduction to Bayesian Thinking
Bayesian statistics offers a fundamentally different way of thinking about uncertainty. Instead of viewing probability as the long-run frequency of events, Bayesian thinking treats probability as a measure of belief that gets updated as new evidence arrives. This tutorial introduces you to the core concepts that form the foundation of Bayesian analysis.
## What Is Bayesian Thinking?
At its heart, Bayesian thinking is about learning from evidence. You start with an initial belief (called a prior), you observe some data (the likelihood), and you update your belief to get a new, improved belief (the posterior).
This process mirrors how we naturally reason. If you believe it’s unlikely to rain today but you see dark clouds forming, you update your belief toward “it’s probably going to rain.” Bayesian statistics formalizes this intuitive process.
The mathematical heart of Bayesian inference is Bayes’ theorem:
$$P(\theta | data) = \frac{P(data | \theta) \times P(\theta)}{P(data)}$$
In words:
- Posterior = (Likelihood × Prior) / Evidence
- The posterior is your updated belief after seeing the data
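To make the theorem concrete, the update can be carried out numerically over a small grid of candidate parameter values. Here is a minimal sketch (the three candidate bias values, and the 8-heads-in-10-flips data, are illustrative choices):

```r
# Bayes' theorem on a discrete grid: three candidate values for a coin's bias
theta <- c(0.25, 0.50, 0.75)

# Prior: equal belief in each candidate value
prior <- c(1/3, 1/3, 1/3)

# Likelihood of observing 8 heads in 10 flips under each candidate
likelihood <- dbinom(8, size = 10, prob = theta)

# Posterior = (likelihood * prior) / evidence
posterior <- likelihood * prior / sum(likelihood * prior)
round(posterior, 3)
# [1] 0.001 0.135 0.864
```

Notice that the denominator `sum(likelihood * prior)` is exactly the evidence P(data): it normalizes the posterior so the beliefs sum to 1.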
## Frequentist vs Bayesian: A Simple Example
Imagine you’re testing a new coin to see if it’s fair. A frequentist approach would flip the coin many times and ask: “What’s the probability of getting this result if the coin is fair?” A Bayesian would instead ask: “Given the results I observed, what do I now believe about the coin’s bias?”
Let’s see this in practice with R:
```r
# Suppose we flip a coin 10 times and get 8 heads
# Is the coin fair?

# Frequentist approach: p-value calculation
# Probability of 8+ heads out of 10 if p = 0.5
pbinom(7, size = 10, prob = 0.5, lower.tail = FALSE)
# [1] 0.0546875

# Not significant at alpha = 0.05
# But what do we actually believe about the coin?
```
The frequentist answer is binary: reject or don’t reject the null hypothesis. The Bayesian answer is more informative: here’s a distribution of plausible values for the coin’s bias.
## Setting Up Your First Bayesian Analysis
You’ll need some key packages:
```r
# Install Bayesian packages
install.packages("rstanarm")   # Stan-based Bayesian regression
install.packages("bayesplot")  # Visualization
install.packages("tidybayes")  # Tidy workflow

# Load them
library(rstanarm)
library(bayesplot)
library(tidybayes)
```
## A Simple Coin-Flipping Example
Let’s implement a complete Bayesian analysis from scratch:
```r
# Observed data: 8 heads out of 10 flips
n_heads <- 8
n_flips <- 10

# Prior: before seeing data, we think the coin is probably fair.
# We represent this with a Beta distribution.
# Beta(2, 2) is a reasonable prior: centered at 0.5, with some uncertainty
prior_alpha <- 2
prior_beta <- 2

# Likelihood: binomial - what we expect to see for different values of p
# We're computing: P(data | p) ∝ p^heads * (1 - p)^tails

# Posterior: Beta(prior_alpha + heads, prior_beta + tails)
posterior_alpha <- prior_alpha + n_heads
posterior_beta <- prior_beta + (n_flips - n_heads)

# What does our posterior look like?
posterior_mean <- posterior_alpha / (posterior_alpha + posterior_beta)
posterior_mean
# [1] 0.7142857

# We now believe the coin lands heads about 71% of the time
```
The posterior Beta(10, 4) captures our updated belief. Its mean is 10/(10 + 4) ≈ 0.71. This is a weighted average of the prior mean (0.5) and the observed proportion of heads (0.8), with more weight on the data because the 10 flips outnumber the prior's 4 "pseudo-observations."
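This weighted-average structure can be verified directly. A small sketch using the same numbers as above:

```r
# The Beta posterior mean decomposes as a weighted average of the
# prior mean and the sample proportion (Beta(2, 2) prior, 8/10 heads)
prior_alpha <- 2
prior_beta <- 2
n_heads <- 8
n_flips <- 10

prior_mean <- prior_alpha / (prior_alpha + prior_beta)        # 0.5
data_mean  <- n_heads / n_flips                               # 0.8
w_data     <- n_flips / (n_flips + prior_alpha + prior_beta)  # 10/14

# Weighted average reproduces the posterior mean exactly
w_data * data_mean + (1 - w_data) * prior_mean
# [1] 0.7142857
```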
## Visualizing the Posterior
Let’s see what the prior and posterior look like:
```r
library(ggplot2)

# Create a sequence of values for p (coin bias)
p_values <- seq(0, 1, length.out = 100)

# Calculate prior and posterior densities
prior_density <- dbeta(p_values, prior_alpha, prior_beta)
posterior_density <- dbeta(p_values, posterior_alpha, posterior_beta)

# Plot both
plot_data <- data.frame(
  p = p_values,
  prior = prior_density,
  posterior = posterior_density
)

ggplot(plot_data) +
  geom_line(aes(p, prior), linetype = "dashed", color = "blue") +
  geom_line(aes(p, posterior), color = "red") +
  geom_vline(xintercept = 0.5, color = "gray", linetype = "dotted") +
  labs(
    x = "Probability of Heads (p)",
    y = "Density",
    title = "Prior (dashed) vs Posterior (solid)"
  ) +
  theme_minimal()
```
The posterior is more concentrated (less uncertain) than the prior because we’ve observed data. The peak is around 0.7, suggesting the coin is likely biased toward heads.
## Credible Intervals
One of Bayesian statistics’ strengths is the credible interval—a range of plausible values for the parameter. Unlike confidence intervals, credible intervals have a straightforward interpretation:
```r
# 95% credible interval from the posterior
qbeta(0.025, posterior_alpha, posterior_beta)  # Lower bound, approx 0.46
qbeta(0.975, posterior_alpha, posterior_beta)  # Upper bound, approx 0.91

# Interpretation: there's a 95% probability that the true p
# lies between roughly 0.46 and 0.91, given our data and prior.
```
This is intuitively meaningful: we’re 95% confident the true coin bias falls in this range. No frequentist hand-waving about repeated sampling.
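The Beta quantiles above have a closed form via `qbeta()`, but the same interval can be approximated by Monte Carlo sampling, which is how more complex Bayesian models (anything fit by MCMC) report intervals. A quick sketch:

```r
set.seed(42)  # for reproducibility

# Approximate the 95% credible interval by drawing from the
# Beta(10, 4) posterior and taking empirical quantiles
samples <- rbeta(100000, 10, 4)
quantile(samples, c(0.025, 0.975))
# Close to the exact qbeta() bounds
```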
## What If You Have Stronger Prior Beliefs?
The prior represents what you believed before seeing the data. Different priors lead to different posteriors:
```r
# Strong prior: Beta(100, 100) - very confident the coin is fair
strong_prior <- c(100, 100)
strong_posterior <- strong_prior + c(n_heads, n_flips - n_heads)
strong_posterior[1] / sum(strong_posterior)
# [1] 0.5142857

# Weak prior: Beta(1, 1) - no prior preference (uniform)
weak_prior <- c(1, 1)
weak_posterior <- weak_prior + c(n_heads, n_flips - n_heads)
weak_posterior[1] / sum(weak_posterior)
# [1] 0.75
```
With a strong prior, the data has less influence on the posterior. With a weak (uninformative) prior, the posterior is almost entirely driven by the data.
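As a sketch of how the data eventually dominates, suppose the 80%-heads pattern continued for 1000 flips (an illustrative extrapolation); even the strong Beta(100, 100) prior is then overwhelmed:

```r
# A strong Beta(100, 100) prior, updated with 10 flips vs 1000 flips
# (both with 80% heads)
strong_prior <- c(100, 100)

small_post <- strong_prior + c(8, 2)      # 10 flips: posterior barely moves
large_post <- strong_prior + c(800, 200)  # 1000 flips: data dominates

small_post[1] / sum(small_post)
# [1] 0.5142857
large_post[1] / sum(large_post)
# [1] 0.75
```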
## Making Decisions with the Posterior
The posterior gives you everything you need for decision-making:
```r
# What's the probability the coin is biased toward heads (p > 0.5)?
pbeta(0.5, posterior_alpha, posterior_beta, lower.tail = FALSE)
# [1] 0.9538574

# That's about a 95% chance the coin favors heads!
# Would you bet on heads?
```
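One concrete decision rule is to average the payoff over the posterior. For a hypothetical $1 even-money bet on heads (the stakes are illustrative), the probability that the next flip lands heads is the posterior mean of p:

```r
# Posterior predictive for one flip: P(next flip is heads) = E[p],
# the posterior mean of the Beta(10, 4) posterior
posterior_alpha <- 10
posterior_beta <- 4
p_next_heads <- posterior_alpha / (posterior_alpha + posterior_beta)

# Expected profit of a hypothetical $1 even-money bet on heads
p_next_heads * 1 + (1 - p_next_heads) * (-1)
# [1] 0.4285714
```

A positive expected profit suggests taking the bet; under a different prior or payoff structure the calculation changes, but the recipe stays the same.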
## A More Realistic Example: Estimating a Rate
Suppose you’re observing website visitors and want to estimate the conversion rate. You’ve seen 15 conversions out of 500 visitors:
```r
# Observed data
conversions <- 15
visitors <- 500

# Weakly informative prior: Beta(2, 20) puts the expected rate
# near 10%; Beta(1, 1) would be uniform (no prior preference)
prior <- c(2, 20)

# Posterior
posterior <- prior + c(conversions, visitors - conversions)

# Posterior mean (estimated conversion rate)
posterior[1] / sum(posterior)
# [1] 0.03256705  # About 3.3%

# 95% credible interval
c(
  qbeta(0.025, posterior[1], posterior[2]),
  qbeta(0.975, posterior[1], posterior[2])
)
# Approximately 0.019 and 0.050
```
The conversion rate is likely between 1.9% and 5%, which is practical information for business decisions.
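The posterior also supports forward-looking questions. A sketch of a posterior predictive simulation, asking how many conversions the next 1000 visitors might produce (the 1000-visitor horizon is an illustrative choice):

```r
set.seed(1)  # for reproducibility

# Draw plausible rates from the Beta(17, 505) posterior computed above,
# then simulate conversions among the next 1000 visitors for each draw
p_draws <- rbeta(10000, 17, 505)
pred <- rbinom(10000, size = 1000, prob = p_draws)

# Central 95% of the predictive distribution of conversion counts
quantile(pred, c(0.025, 0.975))
```

Unlike plugging in a single point estimate, this propagates our uncertainty about the rate into the forecast, so the predictive interval is wider than a plain binomial one.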
## Why Learn Bayesian Methods?
Bayesian statistics offers several advantages:
- Natural interpretation: Credible intervals mean exactly what you think they mean
- Incorporates prior knowledge: Use existing information in your analysis
- Handles complex models: MCMC makes many problems tractable
- Answers the questions you actually ask: “What’s the probability my hypothesis is true?”
## Summary
You’ve learned the core concepts of Bayesian thinking:
| Concept | Description |
|---|---|
| Prior | Your belief before seeing data |
| Likelihood | How probable the data is for different parameter values |
| Posterior | Updated belief after incorporating the data |
| Credible interval | Range of plausible values for the parameter |
The Bayesian workflow is straightforward: specify your prior, collect data, compute the posterior, and make decisions. In the next tutorial, you’ll learn how to use brms to fit Bayesian regression models in R.
## See Also
- Linear Regression in R — Frequentist foundation for understanding regression concepts
- Logistic Regression in R — Binary outcome modeling, conceptually related to Bayesian classification
- Hypothesis Testing in R — Frequentist approach to statistical inference
## Next Steps
Continue your Bayesian journey with the next tutorials in this series:
- Getting Started with brms — Fit your first Bayesian regression model
- Prior Selection — Learn how to encode your prior knowledge
- Posterior Predictive Checks — Validate your Bayesian model’s fit
You’ll soon see how Bayesian methods provide a flexible framework for answering complex statistical questions.