rguides

How to Merge Data Frames with Different Key Names in R

When you merge data frames in R, the columns you join on often have different names. The key is using named character vectors in dplyr’s by parameter or by.x/by.y in base R’s merge(). Both approaches let you merge data across mismatched column names without renaming columns beforehand.

library(dplyr)

employees <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Alice", "Bob", "Carol", "David"),
  dept_id = c(10, 20, 10, 30)
)

departments <- data.frame(
  department_id = c(10, 20, 30),
  department = c("Engineering", "Sales", "Marketing")
)

result <- employees |>
  left_join(departments, by = c("dept_id" = "department_id"))

The syntax c("dept_id" = "department_id") tells dplyr to match dept_id from the left table to department_id in the right table. All dplyr join types (inner_join, full_join, right_join) accept this same syntax. For multiple keys with different names, add more pairs to the named vector.

In base R, merge() handles mismatched names with the by.x and by.y parameters. Use all.x = TRUE for a left join or all = TRUE for a full outer join.

result <- merge(
  employees,
  departments,
  by.x = "dept_id",
  by.y = "department_id",
  all.x = TRUE
)

Make sure key columns share the same data type before joining—convert with as.integer() or as.character() if types differ. The dplyr approach is generally more readable inside a pipe chain, while base R merge() works without any package dependencies.

See also