replace_na

replace_na(data, replace, ...)

Description

replace_na() replaces missing values (NA) with a specified replacement value. It operates on both vectors and data frames, and always returns an object of the same type as the input. The original object is not modified in place.

replace_na() is part of the tidyr package. Since tidyr 1.2.0 it has relied on the vctrs framework to cast the replacement to the type of data, so the type of data never changes when NAs are replaced.

Parameters

  • data — A vector or data frame containing NA values to replace.
  • replace — A scalar replacement value for vectors. For data frames, a named list with one named element per column to replace. Column names in the list must match column names in the data frame exactly.
  • ... — Additional arguments passed to methods. Currently unused; kept for S3 generic compatibility.

Return Value

Returns an object of the same type as data. If data is a vector, a vector is returned. If data is a data frame, a data frame is returned.

Supported Types

replace_na() works with any vector type that the vctrs framework supports, including:

Type                       Notes
Numeric (integer, double)  Replacement is cast to the target type; a cast that would lose precision (such as 0.5 to integer) raises an error
Character                  Replacement must already be a character value; numbers are not converted to strings
Factor                     Replacement must be an existing level; attempting to add a new level raises an error
Logical                    Replacement is cast to logical
List / list-cols           Use list(value) as the replacement; NULL is the list-col equivalent of NA
Date / POSIXct             Replacement is cast to the target date/time type

For data frames, replace must be a named list. Column names in the list must exactly match the columns you want to replace.
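
A few of the rows above can be sketched as follows. This is a minimal illustration, not an exhaustive test of every type; the variable names (flags, dates, pets) are made up for the example.

```r
library(tidyr)

# Logical: the replacement is a logical scalar
flags <- c(TRUE, NA, FALSE)
flags_filled <- replace_na(flags, FALSE)

# Date: the replacement is itself a Date, so no conversion is needed
dates <- as.Date(c("2024-01-01", NA, "2024-03-01"))
dates_filled <- replace_na(dates, as.Date("1970-01-01"))

# Factor: "cat" is already a level, so the replacement is accepted
pets <- factor(c("cat", NA, "dog"))
pets_filled <- replace_na(pets, "cat")
```

In each case the result keeps the class of the input: flags_filled is logical, dates_filled is a Date vector, and pets_filled is a factor with the same two levels.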

Examples

Replace NAs in a vector

library(tidyr)

x <- c(10, 20, NA, 40, NA)
replace_na(x, 0)
#> [1] 10 20  0 40  0

Replace NAs in a data frame

library(tidyr)
library(tibble)

df <- tibble(
  age  = c(25, NA, 35, NA),
  name = c("Alice", "Bob", NA, "Diana")
)

replace_na(df, list(age = 0, name = "Unknown"))
#> # A tibble: 4 × 2
#>     age name
#>   <dbl> <chr>
#> 1    25 Alice
#> 2     0 Bob
#> 3    35 Unknown
#> 4     0 Diana
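
When only one column needs filling, the data frame form and the vector form should give the same result. The sketch below reuses the df from the example above and compares the two; by_list and by_vector are names chosen for illustration.

```r
library(tidyr)
library(tibble)

df <- tibble(
  age  = c(25, NA, 35, NA),
  name = c("Alice", "Bob", NA, "Diana")
)

# Data frame form: only `age` is named, so `name` keeps its NA
by_list <- replace_na(df, list(age = 0))

# Vector form: apply replace_na() to the single column directly
by_vector <- df
by_vector$age <- replace_na(by_vector$age, 0)
```

The vector form is convenient inside column-wise pipelines, while the list form handles several columns in one call.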

Replace NULLs in a list-column

library(tidyr)
library(tibble)

df_list <- tibble(id = 1:3, items = list(c("a", "b"), NULL, "c"))
replace_na(df_list, list(items = list(character(0))))
#> # A tibble: 3 × 2
#>      id items
#>   <int> <list>
#> 1     1 <chr [2]>
#> 2     2 <chr [0]>
#> 3     3 <chr [1]>

Type casting of replacement values

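The replacement value is cast to the type of the target column using vctrs casting rules. A cast that would lose information raises an error instead of silently coercing the value:

replace_na(c(1L, 2L, NA), 0)     # 0 cast to integer without loss → 1 2 0
replace_na(c(1L, 2L, NA), 0.5)   # error: 0.5 cannot be cast to integer without losing precision
replace_na(c("a", NA), 1)        # error: a double cannot be cast to character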
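
One way to check how the installed tidyr version handles a particular replacement is to wrap the call in tryCatch(). The helper raises_error() below is hypothetical, written only for this sketch; it reports whether a call signalled an error.

```r
library(tidyr)

# Hypothetical helper: TRUE if evaluating `expr` raises an error, FALSE otherwise
raises_error <- function(expr) {
  tryCatch({
    force(expr)  # evaluate the lazily passed call inside tryCatch
    FALSE
  }, error = function(e) TRUE)
}

# Whole-number double replacement in an integer vector
whole_number <- raises_error(replace_na(c(1L, 2L, NA), 0))

# Fractional replacement in an integer vector
fractional <- raises_error(replace_na(c(1L, 2L, NA), 0.5))

# Numeric replacement in a character vector
number_in_chr <- raises_error(replace_na(c("a", NA), 1))

# Converting the replacement explicitly avoids relying on the cast
explicit <- replace_na(c("a", NA), as.character(1))
```

Assuming tidyr 1.2.0 or later, where casting goes through vctrs, whole_number should be FALSE and the other two flags TRUE. Converting the replacement yourself, as in the last line, makes the intended type explicit.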

Common Gotchas

  • replace must be a named list for data frames, even for a single column. Use replace_na(df, list(x = 0)), not replace_na(df, x = 0). Passing a bare named argument like x = 0 is not the same as providing a named list.

  • Replacement must be length 1 (a scalar). replace_na(c(1, NA, 2), c(0, 99)) raises an error. Pass a single value instead.

  • Factors require an existing level. replace_na(factor(c("cat", NA, "dog")), "mouse") errors because "mouse" is not an existing level. Use forcats::fct_expand() to add a new level first, or use base R subsetting to work around this.

  • No NAs present — silently returns input unchanged. If data contains no missing values, the function returns data as-is, with no warning or error.

  • Columns not named in replace are left untouched. If a data frame column does not appear in the named list, its NA values are not replaced.

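  • Type casting is strict, not silent. Because replace is cast to the target type with vctrs rules, a replacement of 0.5 in an integer column raises an error rather than becoming 0, and a replacement of 1 in a character column raises an error rather than becoming "1". Convert the replacement (or the column) explicitly before calling replace_na().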

  • replace_na() does not have a matrix method. Passing a matrix falls through to the default vector method, which treats the matrix as a flattened vector. Use apply() or similar to replace NAs in specific columns of a matrix.
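
Two of the workarounds mentioned in the list above can be sketched in a few lines. The factor part uses base R levels<- as a stand-in for forcats::fct_expand(), and the matrix part applies the vector method one column at a time; the object names are illustrative only.

```r
library(tidyr)

# Factor: add the new level first, then replace
pets <- factor(c("cat", NA, "dog"))
levels(pets) <- c(levels(pets), "mouse")  # base R stand-in for forcats::fct_expand()
pets_filled <- replace_na(pets, "mouse")

# Matrix: call the vector method on each column via apply()
m <- matrix(c(1, NA, 3, NA, 5, 6), nrow = 3,
            dimnames = list(NULL, c("a", "b")))
m_all <- apply(m, 2, replace_na, replace = 0)

# Matrix: fill a single column and leave the others untouched
m_first <- m
m_first[, "a"] <- replace_na(m_first[, "a"], 0)
```

Each column passed by apply() is an ordinary numeric vector, so the scalar-replacement and casting rules described earlier apply to it unchanged.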

See Also