Non-Standard Evaluation (NSE) in R
Non-Standard Evaluation (NSE) is one of those R concepts that trips up newcomers but proves essential once it clicks. If you’ve ever used dplyr::filter() or ggplot2::aes(), you’ve benefited from NSE, even if you didn’t know it.
What is Non-Standard Evaluation?
In most programming languages, when you pass an argument to a function, it’s evaluated first, then the value is passed in. Standard evaluation works like this:
# Standard evaluation - the argument 1 + 4 is evaluated to 5 first
my_func(1 + 4) # my_func (a placeholder function) receives 5
But R allows functions to capture the unevaluated expression itself. This is NSE. When you write:
filter(df, age > 30)
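The filter() function doesn’t just receive the value of age > 30; it can capture the expression age > 30 itself, because R passes arguments as promise objects that keep the unevaluated code. It can then decide how to evaluate that expression, for example by looking up age among the columns of df. This is what makes the tidyverse’s bare column names work.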
How NSE Works in R
R has several mechanisms for working with unevaluated expressions:
- quote() - Captures an expression without evaluating it
- substitute() - Captures the expression supplied for a function argument, as the caller wrote it
- eval() - Evaluates an expression in a specified environment
Here’s a quick demo:
# quote() captures the expression
expr <- quote(1 + 2)
expr
# 1 + 2
# eval() evaluates it
eval(expr)
# [1] 3
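The key difference between quote() and substitute(): quote() returns its argument exactly as written, while substitute(), called inside a function, also replaces any symbol bound in that function’s environment. For a function argument, the replacement is the expression the caller supplied, which is crucial for NSE functions.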
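The difference is easiest to see inside a function. The following is a minimal sketch; f_quote and f_subst are hypothetical names used only for illustration:

```r
# quote() returns its argument literally, so it only ever sees the symbol x
f_quote <- function(x) quote(x)

# substitute() looks x up in the function's environment and returns the
# expression the caller supplied for it
f_subst <- function(x) substitute(x)

f_quote(a + b)
# x
f_subst(a + b)
# a + b

# eval() can also take a list (or data frame) as the evaluation context
eval(quote(y * 2), list(y = 21))
# [1] 42
```

Note that neither call fails even though a and b do not exist: the expression is captured, never evaluated.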
Capturing Expressions with substitute()
substitute() is the workhorse of base R NSE. Called inside a function, it returns the expression the caller wrote for an argument rather than its value:
capture_expr <- function(x) {
  substitute(x)
}
capture_expr(1 + 2)
# 1 + 2
capture_expr(mean(x, na.rm = TRUE))
# mean(x, na.rm = TRUE)
This is exactly what base R functions like subset() use:
# This works because subset() uses NSE
subset(mtcars, cyl == 4)
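To make the mechanism concrete, here is a simplified sketch of a subset()-style function. It is not the actual source of subset(); my_subset and min_cyl are hypothetical names chosen for this example:

```r
my_subset <- function(df, condition) {
  # Capture the condition exactly as the caller wrote it
  cond_expr <- substitute(condition)
  # Evaluate it with the data frame's columns in scope, falling back to
  # the caller's environment for any other variables
  keep <- eval(cond_expr, df, parent.frame())
  # Treat NA results as FALSE so they do not produce NA rows
  df[keep & !is.na(keep), , drop = FALSE]
}

min_cyl <- 4
nrow(my_subset(mtcars, cyl == min_cyl))
# [1] 11
```

Here cyl is found among the columns of mtcars, while min_cyl is found in the calling environment. That two-step lookup is the core of data-masking NSE.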
Building NSE Functions
Let’s build a simple NSE function to see how it works. We’ll create a filter_gt() function that filters a data frame where a column is greater than a threshold:
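filter_gt <- function(df, column, threshold) {
  # Capture the unevaluated column name and convert it to a string
  col_name <- as.character(substitute(column))
  # Build the expression, substituting the actual column name.
  # substitute() does not evaluate its first argument, so the template
  # must be written inline rather than passed in through a variable.
  filter_expr <- substitute(
    df[which(df[[col_name]] > threshold), ],
    list(col_name = col_name)
  )
  # Evaluate in this function's frame, where df and threshold are defined
  eval(filter_expr)
}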
# Usage
filter_gt(mtcars, cyl, 6)
This is a simplified example, but it shows the pattern: capture the expression, build a new expression, then evaluate it.
NSE in the Tidyverse
The tidyverse uses a more sophisticated system called tidyeval, built on top of the rlang package. The key functions are:
- enquo() - Capture an argument as a quosure (quoted expression + environment)
- enquos() - Capture multiple arguments
- !! (bang-bang) - Unquote an expression into its surrounding context
- {{ }} (curly-curly) - Capture an argument and unquote it in one step, shorthand for !!enquo()
Here’s a practical example:
library(dplyr)
library(rlang)
filter_above_threshold <- function(df, col, threshold) {
  # Capture the column expression
  col <- enquo(col)
  df %>%
    filter(!!col > threshold)
}
# Usage
mtcars %>% filter_above_threshold(cyl, 6)
The !! operator (called “bang-bang”) inserts the captured expression into the filter() call, so dplyr evaluates cyl > threshold against the columns of df as if you had typed the column name directly.
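The same function can be written more compactly with the {{ }} operator, which combines the enquo() and !! steps. A sketch, assuming a reasonably recent dplyr and rlang; filter_above_threshold2 is a hypothetical name used to avoid clashing with the version above:

```r
library(dplyr)

filter_above_threshold2 <- function(df, col, threshold) {
  # {{ col }} captures the argument and unquotes it in one step
  df %>%
    filter({{ col }} > threshold)
}

# Usage: same call style as before
mtcars %>% filter_above_threshold2(cyl, 6)
```

The explicit enquo() plus !! form is still useful when the captured expression has to be inspected or modified before it is used.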
To apply the same operation to several columns selected by type, use across() with where():
summarise_all_above <- function(df, threshold) {
  df %>%
    summarise(across(where(is.numeric), ~ mean(.x[.x > threshold])))
}
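When the caller should name the columns rather than have them selected by type, one option is to capture them with enquos() and splice them in with rlang's !!! operator. This is a sketch under those assumptions; group_means and its arguments are hypothetical names:

```r
library(dplyr)
library(rlang)

group_means <- function(df, value, ...) {
  # Capture any number of grouping columns as a list of quosures
  groups <- enquos(...)
  df %>%
    # !!! splices the list so each quosure becomes its own argument
    group_by(!!!groups) %>%
    summarise(mean_value = mean({{ value }}, na.rm = TRUE), .groups = "drop")
}

# Usage: mean mpg for each combination of cyl and am
group_means(mtcars, mpg, cyl, am)
```

If the captured arguments are not needed as objects, passing ... straight through to group_by() achieves the same result with less code.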
Best Practices
- Always provide both NSE and standard evaluation versions - Use ... for NSE and an explicit character-vector argument such as .vars for standard evaluation, as dplyr does with select() and the superseded select_at().
- Document your NSE clearly - Users need to know they can pass bare column names.
- Test with non-standard inputs - Pass a symbol, a string, and an expression to see how your function handles each.
- Use tidyeval for new code - The rlang approach is more robust than base R NSE.
- Consider deparse(substitute()) for messages - This gives you the user-facing name:
my_function <- function(x) {
  x_name <- deparse(substitute(x))
  message(paste("Operating on:", x_name))
}
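For the first practice above, a standard-evaluation companion to filter_above_threshold() could take the column name as a string and look it up through the .data pronoun. This is one possible sketch, not the only way to do it; filter_above_threshold_chr is a hypothetical name:

```r
library(dplyr)

filter_above_threshold_chr <- function(df, col_name, threshold) {
  # Standard evaluation: col_name is an ordinary character value
  stopifnot(is.character(col_name), length(col_name) == 1)
  df %>%
    # .data[[col_name]] looks the column up by name in the data frame
    filter(.data[[col_name]] > threshold)
}

# Usage: the column name can now come from a variable
target <- "cyl"
filter_above_threshold_chr(mtcars, target, 6)
```

A string-based version like this is convenient when column names come from user input, configuration, or a loop.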
Common Pitfalls
Forgetting to unquote - If you capture with enquo() but forget !!, you’ll get unexpected results:
# WRONG - will error or behave strangely
filter(df, col > threshold) # col is the captured quosure, not the column it names
# RIGHT - unquote the captured expression
filter(df, !!col > threshold)
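Environment issues - A bare expression from quote() or substitute() does not remember where it was created. If you build an expression in one function and evaluate it in another, its variables are looked up in the wrong place and you might get scoping bugs. Quosures (from enquo()) help by bundling expression + environment.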
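The scoping problem can be reproduced in a few lines. In this sketch all four function names are hypothetical, and rlang's quo() and eval_tidy() stand in for what enquo() does with function arguments:

```r
library(rlang)

make_expr <- function() {
  n <- 10
  quote(n + 1)  # bare expression: no environment attached
}

make_quo <- function() {
  n <- 10
  quo(n + 1)    # quosure: remembers the environment where n is 10
}

run_base <- function(e) {
  n <- 100
  eval(e)       # looks n up here, in run_base()
}

run_tidy <- function(q) {
  n <- 100
  eval_tidy(q)  # looks n up where the quosure was created
}

run_base(make_expr())
# [1] 101
run_tidy(make_quo())
# [1] 11
```

The base version silently picks up the wrong n, which is the kind of bug that is hard to spot in a larger codebase.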
Mixing quoted and unquoted - Be consistent. Either your function takes strings (easy but verbose) or bare expressions (concise but requires understanding NSE).
See Also
- rlang: Metaprogramming with rlang for advanced tidyeval
- dplyr-filter: How dplyr’s filter() uses NSE internally
- purrr-map: Functional iteration with purrr