is.nan()
is.nan(x)
The is.nan() function tests whether elements in an object are NaN (Not a Number). NaN represents an undefined or unrepresentable value, such as the result of 0/0.
Syntax
is.nan(x)
is.nan() is a primitive test that inspects the stored value of each element, not its printed form or class attribute. Its default method handles atomic vectors and returns a logical vector of the same length marking which elements are NaN. Because NaN is a special floating-point value defined by the IEEE 754 standard, only double (and complex) elements can return TRUE; integer, logical, character, and factor inputs always return FALSE regardless of their content. Lists and data frames are not handled by the default method and raise an error, so apply is.nan() to their elements or columns individually.
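A short sketch of how is.nan() treats different input types. The data frame df and its column names are made up for illustration. In base R, passing a list or data frame directly raises an error, so the sketch tests the columns one at a time with sapply().

```r
# Atomic vectors: only double NaN elements are flagged
is.nan(c(1, NaN, NA))       # FALSE  TRUE FALSE
is.nan(c("NaN", "abc"))     # FALSE FALSE (the string "NaN" is not a NaN)
is.nan(c(1L, NA_integer_))  # FALSE FALSE (integers cannot hold NaN)

# Hypothetical data frame, used only for illustration
df <- data.frame(a = c(1, NaN), b = c("x", "NaN"))

# is.nan(df) raises an error, so capture it instead of stopping
direct <- tryCatch(is.nan(df), error = function(e) "error")

# Column-by-column test returns a logical matrix with one column per variable
by_col <- sapply(df, is.nan)
```

Only row 2 of column a is TRUE in by_col; the character column b is FALSE throughout.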
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| x | R object (atomic vector) | none (required) | Object to test for NaN values |
Examples
Basic usage
# Division resulting in NaN
x <- 0 / 0
is.nan(x)
# [1] TRUE
# Square root of negative number
y <- sqrt(-1)
is.nan(y)
# [1] TRUE
Before comparing NaN with other special values, it helps to understand what produces a NaN in the first place. R generates NaN from undefined mathematical operations: zero divided by zero, the square root of a negative number on the real line, or infinity minus infinity. These are computation failures, not missing data — they signal that a formula produced a result with no meaningful numerical interpretation. Recognizing this source helps you decide whether to treat NaN as a bug to fix or as an expected outcome to filter out downstream.
NaN vs NA vs NULL
# NaN is a special value
x1 <- NaN
is.nan(x1) # [1] TRUE
is.na(x1) # [1] TRUE (NaN is also considered NA)
is.null(x1) # [1] FALSE
# NA is missing
x2 <- NA
is.nan(x2) # [1] FALSE
is.na(x2) # [1] TRUE
is.null(x2) # [1] FALSE
# NULL is absence of value
x3 <- NULL
is.nan(x3) # logical(0) (NULL has no elements to test)
is.na(x3) # logical(0)
is.null(x3) # [1] TRUE
The key relationship to internalize is that NaN is a subset of NA in R: is.na(NaN) returns TRUE, but is.nan(NA) returns FALSE. This means a call to is.na() catches both missing values and undefined numerical results, while is.nan() is selective — it only fires for genuine computation failures. When writing data-cleaning pipelines, most code uses is.na() because it covers both cases. Reserve is.nan() for numerical debugging, where distinguishing a math error from a user-supplied blank lets you trace the problem to its source.
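To make the relationship concrete, here is a minimal sketch that separates the two cases in a single vector. The variable names are illustrative only.

```r
x <- c(1, NA, NaN, 4)

any_missing  <- is.na(x)               # FALSE  TRUE  TRUE FALSE (NA and NaN)
undefined    <- is.nan(x)              # FALSE FALSE  TRUE FALSE (NaN only)
true_missing <- is.na(x) & !is.nan(x)  # FALSE  TRUE FALSE FALSE (NA only)
```

Combining the two tests as in true_missing is one way to isolate genuinely missing values when NaN should be handled separately.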
Operations producing NaN
# Various NaN-producing operations
is.nan(0 / 0)
# [1] TRUE
is.nan(sqrt(-1))
# [1] TRUE
is.nan(log(-1))
# [1] TRUE
is.nan(Inf - Inf)
# [1] TRUE
The operations shown above are the standard NaN generators in R: 0/0, sqrt(-1), log(-1), and Inf - Inf. Of these, sqrt(-1) and log(-1) also emit a "NaNs produced" warning, while 0/0 and Inf - Inf return NaN silently. In practice, NaN values often appear inside larger vectors after a computation runs across mixed data, for example when log() is applied to a column that contains a stray negative number (a zero yields -Inf, not NaN). Running sum(is.nan(result)) after any vectorized numerical operation gives you a quick count of how many positions produced undefined output.
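The counting check can be sketched as follows. The values vector stands in for a hypothetical data column, and suppressWarnings() is used here only to keep the example output quiet.

```r
# Hypothetical column with two stray negative numbers
values <- c(10, 1, -3, 5, -0.5)

# log() of a negative number yields NaN (normally with a warning)
result <- suppressWarnings(log(values))

n_nan   <- sum(is.nan(result))    # 2
nan_pos <- which(is.nan(result))  # 3 5
```

which() complements the count by reporting the positions, which helps trace the offending inputs.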
Common patterns
Handling NaN in computations
# Replace NaN with NA or a default value
x <- c(1, 2, NaN, 4, NaN)
x[is.nan(x)] <- NA
x
# [1] 1 2 NA 4 NA
# Using ifelse
x <- c(1, 2, NaN, 4)
ifelse(is.nan(x), 0, x)
# [1] 1 2 0 4
The x[is.nan(x)] <- NA pattern normalizes NaN values into standard missing-value tokens, so downstream code only has to handle one kind of missingness, for example through mean(x, na.rm = TRUE). The ifelse(is.nan(x), 0, x) approach replaces undefined results with a domain-appropriate default, such as zero for counts or the column mean for continuous measurements. Choosing between replacement and removal depends on context: for summary statistics, removal is usually safe; for row-level predictions, imputation may be necessary to avoid dropping observations.
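As a sketch of the mean-replacement variant mentioned above (the vector and variable names are illustrative, and whether mean imputation is appropriate depends on your data):

```r
x <- c(2, 4, NaN, 6)

# na.rm = TRUE drops NaN as well as NA, so the mean uses 2, 4, and 6
col_mean <- mean(x, na.rm = TRUE)  # 4

# Work on a copy so the original vector stays available for comparison
x_imputed <- x
x_imputed[is.nan(x_imputed)] <- col_mean
x_imputed
# [1] 2 4 4 6
```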
Cleaning numeric data
# Check for any NaN in data
data <- c(1, 2, NaN, 4, 5)
any(is.nan(data))
# [1] TRUE
# Remove NaN
data[!is.nan(data)]
# [1] 1 2 4 5
The any(is.nan(data)) check is a lightweight diagnostic you can drop into any data-validation script. It returns a single logical value indicating whether undefined results exist anywhere in the vector, which is more direct than counting positions with sum() when you only need a yes/no answer. The data[!is.nan(data)] subset removes all NaN entries in one line, which is useful for cleaning a numeric column before passing it to a modeling function that cannot handle NaN; note that Inf and NA entries are left in place. Both patterns are idiomatic in R and appear frequently in exploratory data analysis.
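If the downstream function needs strictly finite input, base R's is.finite() is one alternative to consider. The sketch below contrasts the two filters on an illustrative vector.

```r
data <- c(1, NaN, Inf, NA, 5)

# Removes NaN only; Inf and NA remain
only_nan_removed <- data[!is.nan(data)]  # 1 Inf NA 5

# Keeps only ordinary finite numbers; drops NaN, NA, Inf, and -Inf
finite_only <- data[is.finite(data)]     # 1 5
```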
Numerical analysis
# Checking for undefined results
calculate <- function(x) {
  if (x < 0) {
    return(NaN)
  }
  sqrt(x)
}
sapply(c(4, 0, -4), calculate)
# [1] 2 0 NaN
The calculate() function above demonstrates defensive programming: it checks for a negative input and returns NaN explicitly rather than letting sqrt() produce a warning. This pattern is useful in package development, where you want to signal an invalid computation without halting execution with stop(). Because the if () condition expects a single value, the sapply() call applies the guarded function element by element, producing a clean result where the third element is NaN while the first two are correct. The output can then be screened with is.nan(), so the check becomes part of the function's contract rather than just post-hoc cleanup.
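A vectorized variant of the same idea might look like the sketch below. The name safe_sqrt() is hypothetical, and the choice to pass NA inputs through as NA is an assumption made for this example.

```r
# Hypothetical helper: square root that returns NaN for negative inputs
# without triggering sqrt()'s warning
safe_sqrt <- function(x) {
  out <- rep(NaN, length(x))
  valid <- !is.na(x) & x >= 0
  out[valid] <- sqrt(x[valid])
  out[is.na(x) & !is.nan(x)] <- NA  # keep true missing values as NA
  out
}

res <- safe_sqrt(c(4, 0, -4, NA))
bad <- is.nan(res)  # FALSE FALSE  TRUE FALSE

# Report undefined results without stopping execution
if (any(bad)) message(sum(bad), " input(s) had no real square root")
```

Here is.nan() flags only the negative input, while the NA input stays distinguishable as missing data.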
NaN vs NA and when is.nan() is needed
NaN (Not a Number) and NA (Not Available) are distinct in R. NaN results from undefined mathematical operations: 0/0, Inf - Inf, sqrt(-1) (on a real number). NA represents missing data inserted by the user or by data import.
is.na() returns TRUE for both NA and NaN values. is.nan() returns TRUE only for NaN, not for NA. Use is.nan() when you specifically need to identify undefined numerical results rather than missing data, for example to diagnose numerical instability in a computation.
In practice, most data cleaning code uses is.na() because it catches both. Reserve is.nan() for numerical debugging, where distinguishing a computation-produced NaN from user-supplied NA helps identify the source of a problem.
is.nan() returns a logical vector of the same length as the input. It always returns FALSE for non-numeric atomic inputs (characters, logicals) and for NA; this is the key distinction from is.na(). In data validation workflows, running any(is.nan(result)) after a numerical computation checks for undefined results without flagging legitimate missing values. Replace NaN values with NA using result[is.nan(result)] <- NA to normalize them before downstream processing.
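The same normalization can be applied across the numeric columns of a data frame. This is a minimal sketch; the data frame and its column names are invented for illustration.

```r
# Hypothetical results table with one NaN in a numeric column
df <- data.frame(score = c(0.5, NaN, 0.9), label = c("a", "b", "c"))

# Identify numeric columns, then convert NaN to NA in each of them
numeric_cols <- vapply(df, is.numeric, logical(1))
df[numeric_cols] <- lapply(df[numeric_cols], function(col) {
  col[is.nan(col)] <- NA
  col
})

any(is.nan(df$score))  # FALSE
is.na(df$score)        # FALSE  TRUE FALSE
```

Restricting the loop to numeric columns avoids touching character or factor columns, which cannot contain NaN.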
Vectorized operations like division and square root can silently produce NaN at specific points. 0/0 gives NaN, sqrt(-1) gives NaN with a warning, Inf - Inf gives NaN. After complex numerical computations, running sum(is.nan(result)) gives a count of problematic outputs, useful both during development and in production error-checking code. If the count is non-zero, trace back through the computation to find exactly where undefined mathematical operations are occurring and producing bad output.
NaN and NA behave alike in arithmetic: both propagate, so NaN + 1 is NaN and NA + 1 is NA. This makes them behaviorally similar in most contexts. The key distinction is semantic: NA means a value is unknown or missing (e.g., a survey question left blank), while NaN means a computation produced an undefined mathematical result (e.g., 0/0 or log(-1)). When reading data from external sources, NA is the appropriate marker for missing values. NaN should only appear as the result of a calculation, never as an input placeholder for missing data.
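The propagation behaviour can be checked directly. The sketch below uses illustrative variable names and sticks to results that is.na() and is.nan() report consistently.

```r
nan_result <- NaN + 1
na_result  <- NA_real_ + 1

is.nan(nan_result)  # TRUE
is.na(nan_result)   # TRUE
is.na(na_result)    # TRUE

# Summaries propagate NaN too, unless na.rm = TRUE is set
total_raw   <- sum(c(1, NaN, 3))                # NaN
total_clean <- sum(c(1, NaN, 3), na.rm = TRUE)  # 4
```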
When cleaning data, check for both with is.na(x) | is.nan(x) — or simply is.na(x), since is.na(NaN) returns TRUE. The combination is.nan(x) & !is.na(x) never holds because NaN is a subset of NA. Use is.nan() only when you specifically need to identify computation failures rather than general missingness, for example when diagnosing why a model returned unexpected results.