How to Check NA Values in R Data Frames
Check for NA values before running any analysis. Missing data turns up in nearly every real dataset and silently propagates through calculations if you do not address it: by default, sum() or mean() of a vector that contains an NA returns NA. is.na() returns a logical vector with one TRUE or FALSE per element, and wrapping it in sum() counts the TRUE results, which is the number of missing values.
x <- c(1, 2, NA, 4, 5)
is.na(x)
# [1] FALSE FALSE TRUE FALSE FALSE
sum(is.na(x))
# [1] 1
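For data frames, colSums(is.na(df)) gives per-column counts, so you can quickly spot which variables have the most missing data, and sum(is.na(df)) gives the total. When you need the proportion rather than the raw count, use mean(is.na(x)) for a vector or colMeans(is.na(df)) for each column. The counts for a small data frame: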
df <- data.frame(
  name = c("Alice", "Bob", NA, "Diana"),
  age = c(25, NA, 35, 40)
)
colSums(is.na(df))  # missing values per column
# name  age
#    1    1
sum(is.na(df))      # total missing values
# [1] 2
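The proportion of missing values can be sketched the same way. This minimal example redefines the x and df used above so it runs on its own; colMeans() is used here as an assumed per-column counterpart to mean(is.na(x)), and the names prop_x and prop_cols are illustrative only.

```r
x <- c(1, 2, NA, 4, 5)
df <- data.frame(
  name = c("Alice", "Bob", NA, "Diana"),
  age = c(25, NA, 35, 40)
)

# Proportion of missing values in a vector: TRUE counts as 1, FALSE as 0
prop_x <- mean(is.na(x))
prop_x
# [1] 0.2

# Proportion of missing values in each column
prop_cols <- colMeans(is.na(df))
prop_cols
# name  age
# 0.25 0.25
```

Proportions are easier to compare than counts when the columns or datasets you are inspecting differ in length.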
anyNA(x) is faster than any(is.na(x)) when you only need to know whether any NA exists at all: it can stop scanning at the first NA and does not build the intermediate logical vector.
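A minimal sketch of anyNA() as a yes/no check, assuming the same x as above plus a hypothetical vector y with no missing values:

```r
x <- c(1, 2, NA, 4, 5)
y <- c(1, 2, 3)  # hypothetical vector with no missing values

anyNA(x)
# [1] TRUE
anyNA(y)
# [1] FALSE

# Both forms give the same answer; anyNA() just skips the full logical vector
identical(anyNA(x), any(is.na(x)))
# [1] TRUE

# A simple guard before an analysis step
if (anyNA(x)) {
  message("x contains missing values")
}
```

The speed difference is unlikely to matter on a five-element vector; it becomes relevant mainly for long vectors checked repeatedly.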
To find or remove rows with any missing value, use complete.cases(), which returns a logical vector that is TRUE for each row with no NA:
df[complete.cases(df), ] # keep only complete rows
sum(!complete.cases(df)) # count rows with at least one NA
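Subsetting with complete.cases(df) keeps the same rows as na.omit(df), but the logical vector gives you more control: you can count or inspect the incomplete rows, or check only selected columns. Once you have mapped the missingness, decide whether to impute the gaps, drop them, or encode them as a separate category based on the analysis context.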
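The extra control that complete.cases() offers can be sketched as follows. The example reuses the df from above; the variable names kept, omitted, and age_known are illustrative, and restricting the check to the age column is just one assumed use case.

```r
df <- data.frame(
  name = c("Alice", "Bob", NA, "Diana"),
  age = c(25, NA, 35, 40)
)

# Same rows as na.omit(df): rows 1 and 4 are complete
kept <- df[complete.cases(df), ]
omitted <- na.omit(df)
identical(rownames(kept), rownames(omitted))
# [1] TRUE

# More control: require only 'age' to be present, so the row with a
# missing name (row 3) is kept
age_known <- df[complete.cases(df["age"]), ]
nrow(age_known)
# [1] 3

# Positions of the rows that have at least one NA
which(!complete.cases(df))
# [1] 2 3
```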