replace_na
replace_na(data, replace, ...)
Description
replace_na() replaces missing values (NA) with a specified replacement value. It operates on both vectors and data frames, and always returns an object of the same type as the input. The original object is not modified in place.
replace_na() is part of the tidyr package. Since tidyr 1.2.0 it relies on the vctrs framework: the replacement is cast to the type of data before it is applied, so the type of data never changes.
Parameters
- data: A vector or data frame containing NA values to replace.
- replace: A scalar replacement value for vectors. For data frames, a named list with one named element per column to replace. Column names in the list must match column names in the data frame exactly.
- ...: Additional arguments passed to methods. Currently unused; kept for S3 generic compatibility.
Return Value
Returns an object of the same type as data. If data is a vector, a vector is returned. If data is a data frame, a data frame is returned.
Supported Types
replace_na() works with any vector type that the vctrs framework supports, including:
| Type | Notes |
|---|---|
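| Numeric (integer, double) | Replacement is cast to the target type; a lossy cast (such as 0.5 into an integer vector) raises an error |
| Character | Replacement must be a character value; numeric replacements are not converted to strings |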
| Factor | Replacement must be an existing level; attempting to add a new level raises an error |
| Logical | Replacement is cast to logical |
| List / list-cols | Use list(value) as the replacement; NULL is the list-col equivalent of NA |
| Date / POSIXct | Replacement is cast to the target date/time type |
For data frames, replace must be a named list. Column names in the list must exactly match the columns you want to replace.
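The rows of the table above can be sketched with a few small vectors. This is a minimal illustration, not an exhaustive test of every supported type; the variable names (flags, dates, pets) are made up for the example.

```r
library(tidyr)

# Logical: the replacement is itself a logical scalar.
flags <- c(TRUE, NA, FALSE)
flags_filled <- replace_na(flags, FALSE)

# Date: supply a Date replacement so it already matches the target type.
dates <- as.Date(c("2024-01-01", NA, "2024-03-01"))
dates_filled <- replace_na(dates, as.Date("1970-01-01"))

# Factor: "dog" is already a level, so the replacement is accepted.
pets <- factor(c("cat", NA, "dog"))
pets_filled <- replace_na(pets, "dog")

# The type of each result matches the type of its input.
class(flags_filled)
class(dates_filled)
levels(pets_filled)
```

In each case the replacement is chosen to match the type of the input, which avoids relying on any implicit conversion.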
Examples
Replace NAs in a vector
library(tidyr)
x <- c(10, 20, NA, 40, NA)
replace_na(x, 0)
#> [1] 10 20 0 40 0
Replace NAs in a data frame
library(tidyr)
library(tibble)
df <- tibble(
age = c(25, NA, 35, NA),
name = c("Alice", "Bob", NA, "Diana")
)
replace_na(df, list(age = 0, name = "Unknown"))
#> # A tibble: 4 × 2
#> age name
#> <dbl> <chr>
#> 1 25 Alice
#> 2 0 Bob
#> 3 35 Unknown
#> 4 0 Diana
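As a variation on the data frame example, the sketch below names only one column in replace. It assumes the same df as above and uses a hypothetical result name, age_only, to show that unlisted columns keep their NA values and that the input is not modified.

```r
library(tidyr)
library(tibble)

df <- tibble(
  age = c(25, NA, 35, NA),
  name = c("Alice", "Bob", NA, "Diana")
)

# Only `age` is listed, so NA values in `name` are left as they are.
age_only <- replace_na(df, list(age = 0))

anyNA(age_only$age)   # FALSE: NAs in age were replaced
anyNA(age_only$name)  # TRUE: name was not listed in `replace`
anyNA(df$age)         # TRUE: the original df is unchanged
```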
Replace NULLs in a list-column
library(tidyr)
library(tibble)
df_list <- tibble(id = 1:3, items = list(c("a", "b"), NULL, "c"))
replace_na(df_list, list(items = list(character(0))))
#> # A tibble: 3 × 2
#> id items
#> <int> <list>
#> 1 1 <chr [2]>
#> 2 2 <chr [0]>
#> 3 3 <chr [1]>
Type casting of replacement values
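The replacement value is cast to the type of the target vector or column using vctrs rules. In tidyr 1.2.0 and later, a lossless cast succeeds, while a lossy or incompatible cast raises an error instead of silently converting the value:
replace_na(c(1L, 2L, NA), 0) # 0 is cast to integer 0L
replace_na(c(1L, 2L, NA), 0.5) # error: 0.5 cannot be cast to integer without loss of precision
replace_na(c("a", NA), 1) # error: a double cannot be cast to character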
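Because casting behavior depends on the installed tidyr version, it can help to probe it directly. The sketch below wraps replace_na() in a hypothetical helper, try_replace() (not part of tidyr), that returns either the result or the error message. With tidyr 1.2.0 or later the last two calls are expected to report a casting error; older versions may coerce instead.

```r
library(tidyr)

# Hypothetical helper (not part of tidyr): returns the result of replace_na(),
# or the error message as a string if the call fails.
try_replace <- function(x, value) {
  tryCatch(
    replace_na(x, value),
    error = function(e) paste("error:", conditionMessage(e))
  )
}

ok <- try_replace(c(1L, 2L, NA), 0L)          # integer replacement, integer data
fractional <- try_replace(c(1L, 2L, NA), 0.5) # fractional double, integer data
mismatch <- try_replace(c("a", NA), 1)        # double replacement, character data

print(ok)
print(fractional)
print(mismatch)
```

Supplying a replacement that already has the target type (0L for integers, "1" for character data) sidesteps the question entirely.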
Common Gotchas
- replace must be a named list for data frames, even for a single column. Use replace_na(df, list(x = 0)), not replace_na(df, x = 0). Passing a bare named argument like x = 0 is not the same as providing a named list.
- Replacement must be length 1 (a scalar). replace_na(c(1, NA, 2), c(0, 99)) raises an error. Pass a single value instead.
- Factors require an existing level. replace_na(factor(c("cat", NA, "dog")), "mouse") errors because "mouse" is not an existing level. Use forcats::fct_expand() to add the new level first, or add it with base R (for example by extending levels()) before replacing.
- No NAs present: the input is returned unchanged. If data contains no missing values, the function returns data as-is, with no warning or error.
- Columns not named in replace are left untouched. If a data frame column does not appear in the named list, its NA values are not replaced.
- Incompatible replacements raise an error rather than being converted. Because replace is cast to the target type using vctrs rules (tidyr 1.2.0 and later), a replacement of 0.5 for an integer column or 1 for a character column fails instead of silently becoming 0 or "1". Supply a replacement of the matching type, such as 0L or "1".
- replace_na() does not have a matrix method. Passing a matrix falls through to the default vector method rather than being handled column by column. Use base R indexing, apply(), or similar to replace NAs in specific columns of a matrix.
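Two of the gotchas above have simple base R workarounds, sketched here under the assumption that forcats is not available. The object names (pets, m, m2) are illustrative only.

```r
library(tidyr)

# Factor: add the new level with base R first, then replace.
pets <- factor(c("cat", NA, "dog"))
levels(pets) <- c(levels(pets), "mouse")
pets_filled <- replace_na(pets, "mouse")

# Matrix: base R logical indexing replaces NAs element-wise.
m <- matrix(c(1, NA, 3, NA), nrow = 2)
m[is.na(m)] <- 0

# Matrix, one column only: restrict the replacement to column 2.
m2 <- matrix(c(1, NA, 3, NA), nrow = 2)
m2[is.na(m2[, 2]), 2] <- 0
```

After this runs, m contains no NA values, while m2 still has an NA in column 1 because only column 2 was targeted.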
See Also
- dplyr::filter(): subset rows of a data frame by condition
- dplyr::mutate(): add or transform columns in a data frame