How to Calculate Correlation Between Columns in R
Calculate the correlation between two columns to measure the strength and direction of their relationship. Base R’s cor() function supports three methods: Pearson (linear), Spearman (rank-based), and Kendall (based on concordant and discordant pairs). Pearson is the standard choice for continuous data with a linear relationship; Spearman suits ordinal variables and non-normal data whose relationship is monotonic rather than strictly linear; Kendall is often preferred for small samples that have many tied values. Passing an entire data frame to cor() returns a correlation matrix for all columns at once, provided every column is numeric (cor() stops with an error otherwise), which makes it a quick way to scan for pairwise relationships in exploratory analysis.
df <- data.frame(
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90)
)
# Pearson (default)
cor(df$height, df$weight) # 0.972
# Spearman (rank-based)
cor(df$height, df$weight, method = "spearman") # 0.976
# Kendall (concordant vs. discordant pairs)
cor(df$height, df$weight, method = "kendall") # 0.929
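The same function produces a full correlation matrix when it is given a data frame. The sketch below is illustrative only: df2, its id column, and its age values are invented for this example. Because cor() requires numeric input, the non-numeric column is dropped before the call.

```r
# Hypothetical data: 'id' and 'age' are made up for illustration
df2 <- data.frame(
  id     = c("a", "b", "c", "d", "e", "f", "g", "h"),
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90),
  age    = c(23, 35, 41, 29, 52, 38, 27, 45)
)

# Keep only numeric columns; cor() errors on character or factor input
num_cols <- sapply(df2, is.numeric)
cor_mat <- cor(df2[, num_cols])

# Round for readability; the matrix is symmetric with 1s on the diagonal
round(cor_mat, 3)
```

Individual coefficients can then be read by name, for example cor_mat["height", "weight"].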
By default, cor() returns NA for any pair of columns that contains a missing value. To handle missing values, set the use argument: "complete.obs" drops every row that has a missing value in any column, while "pairwise.complete.obs" computes each coefficient from all rows where both columns of that pair are present.
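A minimal sketch of how the use options differ, run on a small made-up data frame (df_na and the positions of its NA values are assumptions chosen for illustration):

```r
# Hypothetical data with one NA in each column
df_na <- data.frame(
  x = c(1, 2, NA, 4, 5, 6),
  y = c(2.1, 3.9, 6.2, NA, 9.8, 12.1),
  z = c(6, 5, 4, 3, 2, NA)
)

# Default (use = "everything"): coefficients between columns with NAs are NA
m_default <- cor(df_na)

# "complete.obs": only rows with no NA in any column are used (rows 1, 2, 5 here)
m_complete <- cor(df_na, use = "complete.obs")

# "pairwise.complete.obs": each pair uses every row where both values are present
m_pairwise <- cor(df_na, use = "pairwise.complete.obs")

m_default
m_complete
m_pairwise
```

With the pairwise option, each entry of the matrix can be based on a different subset of rows, so the coefficients are not always directly comparable with one another.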
For hypothesis testing, cor.test(x, y) returns the coefficient together with a p-value and, for the Pearson method, a confidence interval. Use it when you need to determine whether a correlation is statistically significant.
cor.test(df$height, df$weight)
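The printed output of cor.test() is a summary; the individual values can also be pulled out of the returned object. The sketch below repeats the example data so that it runs on its own. The names res and res_s are arbitrary.

```r
df <- data.frame(
  height = c(150, 160, 170, 180, 175, 165, 155, 185),
  weight = c(50, 60, 65, 80, 75, 70, 55, 90)
)

# Pearson test (the default method)
res <- cor.test(df$height, df$weight)

res$estimate  # correlation coefficient, named "cor"
res$p.value   # p-value for the null hypothesis of zero correlation
res$conf.int  # 95% confidence interval (available for Pearson only)

# Rank-based tests take the same method argument as cor()
res_s <- cor.test(df$height, df$weight, method = "spearman")
res_s$estimate  # named "rho"; no confidence interval is returned
```

A different confidence level can be requested with the conf.level argument, for example conf.level = 0.99.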
See also
- var(), Variance calculation
- sd(), Standard deviation
- Descriptive Statistics in R, Summary statistics guide