How to merge data frames with different key names in R

· 3 min read · Updated March 14, 2026 · beginner
r joining dplyr data.table merge

When merging data frames in R, the columns you want to join on often have different names. This guide shows you how to handle mismatched column names in dplyr, base R, and data.table.

Sample Data

Create two data frames with different key column names:

# First data frame: employees have a dept_id
employees <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Alice", "Bob", "Carol", "David"),
  dept_id = c(10, 20, 10, 30)
)

# Second data frame: departments use department_id
departments <- data.frame(
  department_id = c(10, 20, 30),
  department = c("Engineering", "Sales", "Marketing")
)

The join key is dept_id in employees but department_id in departments.

With dplyr

Use the by parameter with a named character vector:

library(dplyr)

result <- employees %>%
  left_join(departments, by = c("dept_id" = "department_id"))

The syntax c(“dept_id” = “department_id”) tells dplyr to match dept_id from the left table to department_id in the right table.

Output

#   id  name dept_id  department
# 1  1 Alice      10 Engineering
# 2  2   Bob      20      Sales
# 3  3 Carol      10 Engineering
# 4  4 David      30   Marketing

All dplyr Join Types

All join functions accept the same by syntax:

# Inner join
inner <- employees %>%
  inner_join(departments, by = c("dept_id" = "department_id"))

# Full outer join
full <- employees %>%
  full_join(departments, by = c("dept_id" = "department_id"))

# Right join
right <- employees %>%
  right_join(departments, by = c("dept_id" = "department_id"))

Multiple Keys with Different Names

When joining on multiple columns that have different names:

# Another example with two keys
orders <- data.frame(
  order_id = c(101, 102, 103),
  customer_id = c(1, 2, 1),
  amount = c(100, 200, 150)
)

customers <- data.frame(
  cust_id = c(1, 2, 3),
  name = c("Alice", "Bob", "Carol")
)

# Join on both customer_id = cust_id
result <- orders %>%
  left_join(customers, by = c("customer_id" = "cust_id"))

With Base R

The merge() function handles different column names with the by.x and by.y parameters:

# Left join equivalent
result <- merge(
  employees,
  departments,
  by.x = "dept_id",
  by.y = "department_id",
  all.x = TRUE
)

# Inner join
result <- merge(
  employees,
  departments,
  by.x = "dept_id",
  by.y = "department_id"
)

# Full outer join
result <- merge(
  employees,
  departments,
  by.x = "dept_id",
  by.y = "department_id",
  all = TRUE
)

Base R Merge Options

ParameterValueEffect
all.xTRUELeft join
all.yTRUERight join
allTRUEFull outer join
allFALSE (default)Inner join

With data.table

Use the on parameter with column names:

library(data.table)

employees_dt <- as.data.table(employees)
departments_dt <- as.data.table(departments)

# Left join
result <- employees_dt[departments_dt, on = c("dept_id" = "department_id")]

# Inner join
result <- merge(
  employees_dt,
  departments_dt,
  by.x = "dept_id",
  by.y = "department_id"
)

The on parameter accepts named character vectors similar to dplyr.

Common Problems

Column Name Conflicts

When both tables have columns with the same name (other than the join key):

# Both have a name column - use suffix to differentiate
result <- employees %>%
  left_join(
    departments,
    by = c("dept_id" = "department_id"),
    suffix = c("_emp", "_dept")
  )

Missing Matches

Check for unmatched rows after joining:

# Find rows in employees with no matching department
unmatched <- employees %>%
  anti_join(departments, by = c("dept_id" = "department_id"))

# This tells you which dept_id values do not exist in departments

Type Mismatch

Make sure key columns are the same type:

# If dept_id is character but department_id is numeric
employees$dept_id <- as.integer(employees$dept_id)

# Or convert in the join
result <- employees %>%
  left_join(
    departments,
    by = c("dept_id" = "department_id")
  )

See Also