
How to Calculate Correlation Between Columns in R

Correlation between two columns measures the strength and direction of the relationship between two variables. Base R’s cor() function supports the Pearson (linear), Spearman (rank-based), and Kendall (concordant versus discordant pairs) methods. Pearson is the standard choice for continuous data with a linear relationship; Spearman suits ordinal variables, non-normal data, and relationships that are monotonic but not linear; Kendall is often preferred for small samples with many tied values. Passing a data frame whose columns are all numeric to cor() returns a correlation matrix for every pair of columns at once, which is a quick way to scan for pairwise relationships in exploratory analysis. cor() stops with an error if any column is non-numeric, so drop or convert those columns first.

df <- data.frame(
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90)
)

# Pearson (default)
cor(df$height, df$weight)                       # 0.972
cor(df$height, df$weight, method = "spearman")  # 0.976
cor(df$height, df$weight, method = "kendall")   # 0.929
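To scan several columns at once, cor() can be given a data frame instead of two vectors. The sketch below is only an illustration: the data frame name people and its age and group columns are made up for this example, and the non-numeric group column is filtered out first because cor() would otherwise stop with an error.

```r
# Hypothetical data: height and weight from the example above,
# plus invented age and group columns for illustration
people <- data.frame(
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90),
  age    = c(23, 35, 31, 44, 29, 52, 38, 41),
  group  = c("a", "b", "a", "b", "a", "b", "a", "b")
)

# Keep only the numeric columns; cor() rejects character or factor columns
numeric_cols <- people[sapply(people, is.numeric)]

# Pearson correlation matrix for every pair of numeric columns
cor_matrix <- cor(numeric_cols)
round(cor_matrix, 2)
```

The result is a symmetric matrix with 1 on the diagonal; the method argument works here exactly as it does for two vectors.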

By default, cor() returns NA when either column contains a missing value. To handle missing values, set the use argument: "complete.obs" drops every row that has a missing value in any column, while "pairwise.complete.obs" computes each coefficient from all rows in which both columns of that pair are present.
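A small sketch of how the two options behave, assuming a hypothetical data frame df_na in which two values have been replaced with NA and an invented age column has been added for illustration:

```r
# Hypothetical data: NA values and the age column are invented for this example
df_na <- data.frame(
  height = c(150, 160, NA, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, NA, 70, 55, 90),
  age    = c(23, 35, 31, 44, 29, 52, 38, 41)
)

# Default (use = "everything"): any NA in the inputs makes the result NA
default_result <- cor(df_na$height, df_na$weight)

# Only rows where both height and weight are present are used
complete_result <- cor(df_na$height, df_na$weight, use = "complete.obs")

# For a matrix the two options can give different entries:
# "complete.obs" uses the same fully complete rows for every entry,
# "pairwise.complete.obs" uses all rows that are complete for each pair
m_complete <- cor(df_na, use = "complete.obs")
m_pairwise <- cor(df_na, use = "pairwise.complete.obs")
```

With pairwise deletion, different entries of the matrix may be based on different numbers of rows, which is worth keeping in mind when comparing coefficients.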

For hypothesis testing, cor.test(x, y) returns a p-value alongside the coefficient, plus a confidence interval when the Pearson method is used. This matters when you need to determine whether a correlation is statistically significant.

cor.test(df$height, df$weight)
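The printed output can also be read programmatically. As a minimal sketch, the object returned by cor.test() is a list whose estimate, p.value, and conf.int components hold the individual results (the variable name result is just an illustration):

```r
df <- data.frame(
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90)
)

# Pearson test by default
result <- cor.test(df$height, df$weight)

result$estimate  # correlation coefficient
result$p.value   # p-value for the null hypothesis of zero correlation
result$conf.int  # 95% confidence interval (Pearson only)

# Rank-based tests take the same method argument as cor()
spearman_result <- cor.test(df$height, df$weight, method = "spearman")
spearman_result$estimate
```

A different confidence level can be requested with the conf.level argument, for example conf.level = 0.99.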
