dplyr::if_else

if_else(condition, true, false, missing = NULL, ..., ptype = NULL, size = NULL)

if_else() is dplyr’s type-strict vectorized if-else. It evaluates a logical condition element-wise and returns corresponding values from a true branch and a false branch. The key difference from base R’s ifelse() is that if_else() enforces type consistency, handles NA values in the condition explicitly, and preserves types like factors that ifelse() would destroy.

Basic syntax

library(dplyr)

x <- c(-5:5, NA)

if_else(x < 0, "negative", "positive")
#>  [1] "negative" "negative" "negative" "negative" "negative" "positive"
#>  [7] "positive" "positive" "positive" "positive" "positive" NA

The condition must be a logical vector. The true and false arguments are recycled to match the size of condition, so you can supply scalar values that get applied to every element. When the condition is NA, the output for that position is also NA by default.
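The recycling rule described above can be sketched as follows. The vector `x` and the result names are illustrative, not from the original; scalar branches are recycled, while vector branches are picked position by position.

```r
library(dplyr)

x <- c(3, -1, 2, NA)

# Scalar `true` and `false` are recycled to the length of the condition
scalar_result <- if_else(x > 0, "pos", "non-pos")

# Vector `true` and `false` are matched to the condition element by element;
# this is a hand-rolled absolute value
vector_result <- if_else(x > 0, x, -x)

scalar_result
vector_result
```

In both calls the last element is NA, because the condition is NA at that position and no missing value was supplied.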

Handling NAs with the missing argument

Base ifelse() passes NA values through from the condition without any way to override this behavior. if_else() provides the missing argument to control exactly what value should fill positions where the condition is NA. This is valuable in data cleaning pipelines where missing conditions need a specific sentinel value rather than propagating as NA.

The first example below shows the default behavior where NA in the condition produces NA in the output. The second demonstrates how missing = "unknown" replaces those missing-condition positions with a meaningful label:

x <- c(-5:5, NA)

# Without missing — NA stays NA
if_else(x < 0, "negative", "positive")
#>  [1] "negative" "negative" "negative" "negative" "negative" "positive"
#>  [7] "positive" "positive" "positive" "positive" "positive" NA

# With missing — explicit label for NA condition values
if_else(x < 0, "negative", "positive", missing = "unknown")
#>  [1] "negative" "negative" "negative" "negative" "negative" "positive"
#>  [7] "positive" "positive" "positive" "positive" "positive" "unknown"

Type strictness

if_else() requires true and false to be coercible to a common type, and it raises an error rather than silently coercing incompatible types. This strictness catches bugs early that would otherwise produce confusing output downstream. The first example shows correct usage with matching numeric types, while the second shows the error you get when mixing a number with a string (the exact wording of the message varies by dplyr version):

# This works: both sides are numeric
if_else(c(TRUE, FALSE), 1, 0)
#> [1] 1 0

# This throws an error: double and character have no common type
if_else(c(TRUE, FALSE), 1, "zero")
#> Error in `if_else()`:
#> ! Can't combine `true` <double> and `false` <character>.

This is a feature, not a bug. Type coercion in ifelse() often produces surprising results that go undetected until they corrupt downstream analysis.
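To make the contrast concrete, here is a minimal sketch that runs the same mixed-type call through both functions. It assumes dplyr 1.1.0 or later, where compatible types such as integer and double are cast to a common type; older versions were stricter and may reject the last call. The variable names are illustrative.

```r
library(dplyr)

cond <- c(TRUE, FALSE)

# Base R silently turns the number into a string
base_result <- ifelse(cond, 1, "zero")
typeof(base_result)

# if_else() refuses the same mix; capture the error so the script keeps running
strict_result <- tryCatch(
  if_else(cond, 1, "zero"),
  error = function(e) "error raised"
)
strict_result

# Compatible types are cast to their common type (integer + double -> double).
# Assumption: dplyr >= 1.1.0.
common_result <- if_else(cond, 1L, 2.5)
typeof(common_result)
```

The base R result is the character vector "1", "zero", which is rarely what the author of such a call intended.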

Factor preservation

A major practical advantage of if_else() over ifelse() is that it preserves factor levels. Base ifelse() strips factor attributes and returns the underlying integer codes, which silently turns categorical labels into meaningless numbers. With if_else(), the factor type and its levels survive the conditional intact:

x <- factor(c("a", "b", "c", "a"))

# Base ifelse — factors become integers
ifelse(x == "a", x, NA)
#> [1]  1 NA NA  1   # integers, not "a"

# dplyr if_else — factors are preserved
if_else(x == "a", x, NA)
#> [1] a    <NA> <NA> a
#> Levels: a b c
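The same attribute stripping affects other classed vectors. As a sketch, and using Date as an example of my own choosing rather than one from the original text, base ifelse() returns the underlying day counts while if_else() keeps the Date class:

```r
library(dplyr)

d <- as.Date(c("2024-01-15", "2023-06-30", "2024-03-01"))
cutoff <- as.Date("2024-01-01")

# Base ifelse() drops the Date class and returns plain numbers
base_dates <- ifelse(d >= cutoff, d, cutoff)
class(base_dates)

# if_else() keeps the Date class; dates before the cutoff are raised to it
dplyr_dates <- if_else(d >= cutoff, d, cutoff)
class(dplyr_dates)
dplyr_dates
```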

Inside mutate()

if_else() is most commonly used inside mutate() to conditionally create or transform columns. The expression is evaluated element-wise over the column, and mutate() binds the result to a new column in the data frame. if_else() can also be faster than base ifelse() on large columns, though the difference depends on the dplyr version and the data, so benchmark if speed matters:

starwars |>
  mutate(
    height_category = if_else(height < 100, "short", "tall"),
    .keep = "used"
  )
#> # A tibble: 87 × 2
#>   height height_category
#>    <int> <chr>
#> 1    172 tall
#> 2    167 tall
#> 3     96 short
#> 4    202 tall
#> ...
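Rows where the tested column is NA get NA in the new column unless missing is set. The sketch below uses a small hypothetical tibble (`heights`, with made-up names and values) rather than starwars so the result is easy to check by eye:

```r
library(dplyr)

# Hypothetical data: one height is unknown
heights <- tibble(
  name   = c("A", "B", "C"),
  height = c(172L, 96L, NA)
)

labelled <- heights |>
  mutate(
    # `missing` labels rows whose condition is NA instead of leaving them NA
    height_category = if_else(height < 100, "short", "tall", missing = "unknown")
  )

labelled
```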

Specifying output type with ptype

The ptype argument declares the output type up front, overriding the common type that if_else() would otherwise infer from true, false, and missing. This is useful in programmatic contexts where you need to guarantee a consistent return type regardless of the input values. Passing ptype = character() guarantees a character result, and inputs that cannot be cast to character raise an error:

# Declare character output explicitly instead of relying on inference
if_else(c(TRUE, FALSE), "yes", "no", ptype = character())
#> [1] "yes" "no"
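The effect of ptype is easier to see when it changes the result. In this sketch, which assumes dplyr 1.1.0 or later (the version that introduced ptype), two integer branches are returned as double because the prototype says so:

```r
library(dplyr)

cond <- c(TRUE, FALSE)

# Without ptype, the common type of the two integer branches is integer
inferred <- if_else(cond, 1L, 2L)
typeof(inferred)

# With ptype = double(), both branches are cast to double
forced <- if_else(cond, 1L, 2L, ptype = double())
typeof(forced)
```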

if_else vs ifelse

                      ifelse()                    if_else()
Type coercion         Silent, often surprising    Strict, throws error
NA in condition       Passes through as NA        Controlled with missing
Factor preservation   Drops to integer codes      Preserves levels
Speed                 Often slower                Often faster (benchmark to confirm)
Location              Base R                      dplyr

For new dplyr-based code, if_else() is almost always the better choice. Reserve ifelse() for quick scripts where type strictness doesn’t matter.

For multiple conditions, case_when() handles more branches cleanly, but if_else() is the right tool for a simple binary choice.
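As a rough illustration of where the trade-off lies, the sketch below writes a three-way classification both ways. The vector `x` and the labels are illustrative.

```r
library(dplyr)

x <- c(-2, 0, 3, NA)

# Three outcomes with if_else() require nesting
nested <- if_else(x < 0, "negative", if_else(x == 0, "zero", "positive"))

# case_when() lists each branch on its own line; unmatched rows become NA
flat <- case_when(
  x < 0  ~ "negative",
  x == 0 ~ "zero",
  x > 0  ~ "positive"
)

identical(nested, flat)
```

Both versions give the same result here; the case_when() form tends to stay readable as more branches are added.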

if_else() is stricter than base R’s ifelse() in two ways: it requires true and false to have compatible types, and it handles NA values in condition through the optional missing argument. In base ifelse(), mixing types silently coerces to a common type, which can produce unexpected results. if_else() raises an error when the types are incompatible, making the type requirement explicit and bugs visible earlier.

The missing argument controls what value is returned when condition is NA. Without it, NA in condition produces NA in the output regardless of true and false. This is often the right behavior, but when a specific default is needed for NA rows (e.g., in a scoring function where NA should mean “not applicable” = 0), set missing = 0 rather than adding a separate replace_na() step after.
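A minimal sketch of the scoring case follows, with hypothetical inputs (`passed`, `values`, `cond`). It also shows one reason to prefer missing over a blanket NA replacement afterwards: missing only fills positions where the condition itself is NA, so an NA that comes from the true or false values is left alone.

```r
library(dplyr)

# Hypothetical scoring input: NA means "not applicable"
passed <- c(TRUE, FALSE, NA)

# Score 1 for a pass, 0 for a fail, and 0 where the condition is NA
scores <- if_else(passed, 1, 0, missing = 0)
scores

# `missing` targets NA conditions only; an NA inside `true` survives
values <- c(10, NA, 30)
cond   <- c(TRUE, TRUE, NA)
mixed  <- if_else(cond, values, 0, missing = -1)
mixed
```

Replacing every NA after the fact would also overwrite the second element of `mixed`, which may or may not be what you want.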

See also

  • base R ifelse, the base R alternative
  • dplyr::case_when, multi-branch conditional, useful when you have more than two cases
  • dplyr::mutate, for creating and transforming columns, where if_else() is most often used