ANOVA in R

· 5 min read · Updated March 7, 2026 · intermediate
anova statistics hypothesis-testing f-test r

Analysis of Variance (ANOVA) is a statistical method for comparing means across three or more groups. In this tutorial, you’ll learn how to perform one-way ANOVA, two-way ANOVA, check the underlying assumptions, and run post-hoc tests to identify which specific groups differ.

One-Way ANOVA

One-way ANOVA compares the means of three or more groups that are defined by a single categorical variable. For example, you might compare test scores across three different teaching methods.

The aov() Function

R provides the aov() function for ANOVA. Here’s a practical example using plant growth data:

# Load the data (built-in dataset)
data(PlantGrowth)

# Examine the structure
str(PlantGrowth)
# 'data.frame':	30 obs. of  2 variables:
#  $ weight: num  4.17 5.58 5.18 ...
#  $ group : Factor w/ 3 levels "ctrl","trt1","trt2"

# View the groups
unique(PlantGrowth$group)
# [1] ctrl trt1 trt2
# Levels: ctrl trt1 trt2

# Run one-way ANOVA
model <- aov(weight ~ group, data = PlantGrowth)

# View the results
summary(model)
#             Df Sum Sq Mean Sq F value Pr(>F)
# group        2  3.766  1.8832   4.846  0.0159 *
# Residuals   27 10.492  0.3886

The key output is the p-value (Pr(>F)). With p = 0.0159, we reject the null hypothesis that all group means are equal. At least one group differs significantly from the others.

Interpreting the Output

The ANOVA table contains several components:

  • Df (Degrees of Freedom): k - 1 for the factor (k = number of groups) and n - k for the residuals (n = total number of observations)
  • Sum Sq: Sum of squares between groups and within groups (residuals)
  • Mean Sq: Sum of squares divided by degrees of freedom
  • F value: The test statistic (ratio of between-group to within-group variance)
  • Pr(>F): The p-value

A significant p-value tells you something is different, but it doesn’t tell you which groups differ. That’s where post-hoc tests come in.
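As a quick check on the arithmetic, you can rebuild the F statistic from the table yourself: each mean square is a sum of squares divided by its degrees of freedom, and F is their ratio. A minimal sketch using the same PlantGrowth model:

```r
# Rebuild the F statistic from the ANOVA table by hand
data(PlantGrowth)
model <- aov(weight ~ group, data = PlantGrowth)

tab <- summary(model)[[1]]  # the table above; row 1 = group, row 2 = residuals
ms_between <- tab[1, "Sum Sq"] / tab[1, "Df"]  # 3.766 / 2   = 1.8832
ms_within  <- tab[2, "Sum Sq"] / tab[2, "Df"]  # 10.492 / 27 = 0.3886
f_by_hand  <- ms_between / ms_within

round(f_by_hand, 3)  # 4.846, matching the F value column
```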

Checking ANOVA Assumptions

ANOVA makes several assumptions that you should verify before trusting the results:

1. Independence

This comes from your study design. If measurements in one group affect another, ANOVA isn’t appropriate. Random sampling and assignment help ensure independence.

2. Normality

The residuals (or each group’s data) should be approximately normally distributed:

# Check normality with Shapiro-Wilk test
shapiro.test(residuals(model))
# 	Shapiro-Wilk normality test
# data:  residuals(model)
# W = 0.9304, p-value = 0.05685

# Or test each group separately
by(PlantGrowth$weight, PlantGrowth$group, shapiro.test)

For large samples (n > 30 per group), ANOVA is robust to minor departures from normality due to the Central Limit Theorem.
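A visual check complements the formal test: on a normal Q-Q plot, the residuals should fall close to a straight line. A short sketch using the model fitted above:

```r
# Q-Q plot of the residuals as a visual normality check
data(PlantGrowth)
model <- aov(weight ~ group, data = PlantGrowth)

res <- residuals(model)
qqnorm(res)  # points near the reference line suggest approximate normality
qqline(res)

# plot(model, 2) draws the same diagnostic directly from the fitted model
```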

3. Homogeneity of Variances

Variances should be equal across groups. Use Levene’s test:

# Install and load car package if needed
# install.packages("car")
library(car)

leveneTest(weight ~ group, data = PlantGrowth)
# Levene's Test for Homogeneity of Variance (center = median)
#        Df F value Pr(>F)
# group   2  0.6931  0.5097

A non-significant p-value (like 0.51) indicates equal variances. If this assumption fails, consider Welch’s ANOVA or the Kruskal-Wallis test.
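Both alternatives are available in base R; a minimal sketch on the same data:

```r
# Welch's ANOVA: compares means without assuming equal variances
data(PlantGrowth)
welch <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = FALSE)
welch$p.value

# Kruskal-Wallis: rank-based, so it drops the normality assumption as well
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw$p.value
```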

4. No Significant Outliers

Outliers can distort results. Check with:

# Identify outliers
boxplot(weight ~ group, data = PlantGrowth)

# Or get numeric values
boxplot.stats(PlantGrowth$weight)$out

Post-Hoc Tests

When ANOVA is significant, you need post-hoc tests to determine which specific pairs of groups differ. The most common is Tukey’s HSD (Honestly Significant Difference).

TukeyHSD()

# Run Tukey HSD
tukey <- TukeyHSD(model)

# View results
tukey
#   Tukey multiple comparisons of means
#     95% family-wise confidence level
#
# Fit: aov(formula = weight ~ group, data = PlantGrowth)
#
# $group
#              diff        lwr       upr     p adj
# trt1-ctrl  -0.371 -1.0622161 0.3202161 0.3908711
# trt2-ctrl   0.494 -0.1972161 1.1852161 0.1979960
# trt2-trt1   0.865  0.1737839 1.5562161 0.0120064

# Plot the results
plot(tukey)

The p adj column shows adjusted p-values (adjusted to control the family-wise error rate). Here, only trt2 vs trt1 is significant (p = 0.012).

Two-Way ANOVA

Two-way ANOVA extends the concept to two categorical independent variables. This lets you test the effect of each factor while controlling for the other, plus check for interactions.

Running Two-Way ANOVA

# Create a dataset with two factors
# Using mtcars: transmission (am) and cylinders (cyl)
data(mtcars)

# Convert to factors
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
mtcars$cyl <- factor(mtcars$cyl)

# Two-way ANOVA (without interaction)
model_two <- aov(mpg ~ am + cyl, data = mtcars)
summary(model_two)
#             Df Sum Sq Mean Sq F value   Pr(>F)
# am           1  405.2   405.2   42.89 4.3e-07 ***
# cyl          2  456.4   228.2   24.16 8.0e-07 ***
# Residuals   28  264.5     9.4

# Two-way ANOVA (with interaction)
model_int <- aov(mpg ~ am * cyl, data = mtcars)
summary(model_int)

The formula am * cyl includes both main effects and the interaction. Use + for main effects only, and * for main effects plus interaction.
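You can confirm that equivalence directly: the two formulas below fit identical models.

```r
# mpg ~ am * cyl is shorthand for mpg ~ am + cyl + am:cyl
data(mtcars)
mtcars$am  <- factor(mtcars$am, labels = c("Automatic", "Manual"))
mtcars$cyl <- factor(mtcars$cyl)

m_star <- aov(mpg ~ am * cyl, data = mtcars)
m_long <- aov(mpg ~ am + cyl + am:cyl, data = mtcars)

all.equal(coef(m_star), coef(m_long))  # TRUE: same fitted model
```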

Interpreting Interaction Effects

When you have a significant interaction, the effect of one factor depends on the level of the other. Visualize this:

# Interaction plot
interaction.plot(
  x.factor = mtcars$cyl,
  trace.factor = mtcars$am,
  response = mtcars$mpg,
  xlab = "Cylinders",
  ylab = "Miles per Gallon",
  trace.label = "Transmission"
)

Parallel lines mean no interaction. Non-parallel lines indicate an interaction effect.

Common Pitfalls

Confusing Statistical and Practical Significance

A significant p-value doesn’t always mean the difference matters in practice. Always check effect sizes (eta-squared or omega-squared) alongside p-values.
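Eta-squared is easy to compute by hand from the ANOVA table as SS_effect / SS_total; a sketch with the PlantGrowth model from earlier:

```r
# Eta-squared: proportion of total variance explained by the factor
data(PlantGrowth)
model <- aov(weight ~ group, data = PlantGrowth)

tab <- summary(model)[[1]]  # row 1 = group, row 2 = residuals
eta_sq <- tab[1, "Sum Sq"] / sum(tab[, "Sum Sq"])
round(eta_sq, 3)  # 0.264: group explains about 26% of the variance in weight
```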

Ignoring Assumptions

Running ANOVA without checking assumptions can lead to false conclusions. The test is reasonably robust to mild violations, but extreme violations require alternative approaches.

Multiple Comparisons Without Adjustment

Running multiple t-tests increases the chance of a false positive. Always use adjusted p-values (like Tukey’s HSD provides) or control the false discovery rate.
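Base R’s pairwise.t.test() applies the adjustment for you, via the p.adjust.method argument: "holm" controls the family-wise error rate, while "BH" (Benjamini-Hochberg) controls the false discovery rate.

```r
# Pairwise comparisons with built-in p-value adjustment
data(PlantGrowth)

# Holm adjustment: controls the family-wise error rate
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "holm")

# Benjamini-Hochberg: controls the false discovery rate instead
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "BH")
```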

Summary

  • Use aov() for one-way and two-way ANOVA in R
  • Check assumptions: independence, normality, homogeneity of variance, no outliers
  • Use TukeyHSD() for post-hoc pairwise comparisons
  • For two-way ANOVA, test for interaction effects with * in the formula
  • Visualize interactions with interaction.plot()

Next Steps

Explore Linear Regression in R to extend these concepts to continuous predictors, or Chi-Square Test for comparing categorical variables.