filter
filter applies linear filtering to a time series. It computes moving averages via convolution or autoregressive filters via recursion. This is the stats package’s filter(), not dplyr’s filter() — they’re completely different functions for different tasks.
Signature
filter(x, filter, method = c("convolution", "recursive"),
sides = 2, circular = FALSE, init)
Arguments:
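- x: a univariate or multivariate time series
- filter: a vector of filter coefficients in reverse time order
- method: "convolution" (moving average) or "recursive" (autoregression)
- sides: for convolution filters only; 1 uses past values only, 2 centers the filter around lag 0
- circular: for convolution filters only; TRUE wraps the filter around the ends of the series
- init: for recursive filters only; the values of the series just before the start, in reverse time order (default: zeros)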
Returns: A time series object.
Convolution Filters (Moving Averages)
The default method is convolution, which computes a weighted moving average. For a 3-point moving average, each weight is 1/3 (rep(1, 3) would give a moving sum instead):
x <- 1:10
filter(x, rep(1/3, 3))
# Time Series:
# Start = 1
# End = 10
# Frequency = 1
# [1] NA 2 3 4 5 6 7 8 9 NA
Notice that the first and last values are NA: the centered filter needs a value on both sides of each point.
The sides Argument
sides = 2 (default) centers the filter around lag 0:
# Centered: current value + one on each side
filter(1:10, rep(1/3, 3), sides = 2)
# [1] NA 2 3 4 5 6 7 8 9 NA
sides = 1 uses only the current and past values (trailing moving average):
# Past values only: more common for real-time processing
filter(1:10, rep(1/3, 3), sides = 1)
# [1] NA NA 2 3 4 5 6 7 8 9
With sides = 1, the first length(filter) - 1 values are NA because there’s not enough history yet.
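To make the two alignments concrete, here is a minimal sketch that compares a trailing average from stats::filter() with a hand-written loop. The variable names (w, trailing, manual, centered) are illustrative only.

```r
x <- c(2, 4, 6, 8, 10, 12)
w <- rep(1 / 3, 3)  # equal weights that sum to 1

# Trailing (one-sided) moving average via stats::filter
trailing <- stats::filter(x, w, sides = 1)

# Hand-rolled equivalent: mean of the current value and the two before it
manual <- rep(NA_real_, length(x))
for (i in 3:length(x)) {
  manual[i] <- mean(x[(i - 2):i])
}

# Centered version for comparison: one NA at each end instead of two at the start
centered <- stats::filter(x, w, sides = 2)

all.equal(as.numeric(trailing), manual)
```

The trailing result lines up with the last point of each window, while the centered result lines up with the middle point.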
Circular Filtering
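By default, values outside the series are treated as missing, which produces the NA values at the ends. For convolution filters, set circular = TRUE to wrap the filter around the ends instead:
x <- 1:10
filter(x, rep(1/3, 3), sides = 1, circular = TRUE)
# Values, rounded: 6.67 4.33 2 3 4 5 6 7 8 9
# First value wraps around to the end of the series: (x[9] + x[10] + x[1]) / 3 = (9 + 10 + 1) / 3 = 6.67
This is useful when the series itself is periodic, such as values recorded over one complete cycle (hours of a day, months of a year).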
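The wrap-around is easiest to see with a centered filter, where the first window borrows from the end of the series and the last window borrows from the start. A small sketch, with illustrative variable names:

```r
x <- 1:10
wrapped <- stats::filter(x, rep(1 / 3, 3), sides = 2, circular = TRUE)

# First output averages x[10], x[1], x[2]; last output averages x[9], x[10], x[1]
first_by_hand <- mean(x[c(10, 1, 2)])
last_by_hand <- mean(x[c(9, 10, 1)])

# No NA values remain, because every window is complete after wrapping
anyNA(wrapped)
```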
Recursive Filters (Autoregression)
The "recursive" method applies autoregressive filtering. There’s an implied coefficient 1 at lag 0:
y[i] = x[i] + f[1]*y[i-1] + f[2]*y[i-2] + ... + f[p]*y[i-p]
# First-order autoregressive filter
# y[i] = x[i] + 0.5*y[i-1]
filter(1:10, 0.5, method = "recursive")
# Exact values: 1 2.5 4.25 6.125 8.0625 10.03125 12.015625 14.0078125 16.00390625 18.001953125
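With init, you can specify the values of the filtered series just before the start, in reverse time order (most recent first). Its length must match the number of filter coefficients:
# Pre-seed the filter: y[0] = 10, y[-1] = 20
filter(1:5, c(0.5, 0.3), method = "recursive", init = c(10, 20))
# Values: 12 11 12.1 13.35 15.305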
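As a check on how init enters the recursion, the following sketch reproduces the call with an explicit loop. It assumes init is read most recent first, so init[1] plays the role of y[0] and init[2] of y[-1]. The helper names are illustrative.

```r
x <- 1:5
f <- c(0.5, 0.3)
init <- c(10, 20)  # assumed order: init[1] = y[0], init[2] = y[-1]

out <- stats::filter(x, f, method = "recursive", init = init)

# Hand-rolled recursion with the history prepended in time order
y <- c(rev(init), rep(NA_real_, length(x)))  # y[-1], y[0], then y[1..5]
for (i in seq_along(x)) {
  k <- i + 2  # position of y[i] inside the padded vector
  y[k] <- x[i] + f[1] * y[k - 1] + f[2] * y[k - 2]
}
manual <- y[-(1:2)]

all.equal(as.numeric(out), manual)
```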
Practical Examples
Smoothing a Noisy Signal
# Simulate noisy data
set.seed(42)
signal <- sin(seq(0, 4 * pi, length.out = 100))
noise <- rnorm(100, sd = 0.3)
x <- signal + noise
# 5-point moving average smooths the noise
smoothed <- filter(x, rep(1, 5) / 5)
Weighted Moving Average
# Symmetric weights that emphasize the center point and sum to 1
weights <- c(0.1, 0.2, 0.4, 0.2, 0.1)
filter(1:20, weights)
# Centered weighted moving average; the first two and last two values are NA
Technical Analysis (Finance)
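Moving averages are the backbone of technical analysis. Use sides = 1 so that each value depends only on current and past prices:
# prices: a numeric vector or time series of closing prices
# 20-day simple moving average (trailing)
sma_20 <- filter(prices, rep(1, 20) / 20, sides = 1)
# 12-day exponential moving average (an input to MACD):
# y[i] = alpha*x[i] + (1 - alpha)*y[i-1], with alpha = 2/13
alpha <- 2 / 13
ema_12 <- filter(alpha * prices, 1 - alpha, method = "recursive", init = prices[1])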
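The textbook exponential moving average follows y[i] = alpha*x[i] + (1 - alpha)*y[i-1]. The sketch below expresses that recursion with a recursive filter and checks it against a plain loop. The prices vector is made up for illustration, and seeding with the first price is one common choice rather than the only one.

```r
prices <- c(100, 102, 101, 105, 107, 106, 110)  # made-up prices for illustration
n <- 3
alpha <- 2 / (n + 1)

# Scale the input by alpha; the recursive coefficient carries (1 - alpha)
ema <- stats::filter(alpha * prices, 1 - alpha,
                     method = "recursive", init = prices[1])

# Hand-rolled recursion, seeded with the first price
manual <- numeric(length(prices))
prev <- prices[1]
for (i in seq_along(prices)) {
  prev <- alpha * prices[i] + (1 - alpha) * prev
  manual[i] <- prev
}

all.equal(as.numeric(ema), manual)
```

Because the recursive method has an implied coefficient of 1 on x[i], the alpha weight has to be applied to the input before filtering.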
Seasonal Adjustment
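Average out a seasonal pattern using a filter that spans one full season. Because 12 is even, the two end weights are halved so that the window stays centered:
# For monthly data with annual seasonality (period = 12)
# Centered 2x12 moving average: 13 terms, half weight on the two end months
ma_12 <- filter(monthly_data, c(0.5, rep(1, 11), 0.5) / 12, sides = 2)
# ma_12 estimates the trend; monthly_data - ma_12 leaves the seasonal pattern plus noise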
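A sketch on the built-in AirPassengers data, assuming the common 2x12 weighting for an even period (half weight on the two end months). It compares the result with the trend component from decompose(), which is expected to use the same weights.

```r
# AirPassengers ships with R: monthly totals, frequency 12
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(AirPassengers, w, sides = 2)

# Subtracting the trend leaves the seasonal pattern plus noise
detrended <- AirPassengers - trend

# Reference: trend component of the classical decomposition
ref <- decompose(AirPassengers)$trend
all.equal(as.numeric(trend), as.numeric(ref))
```

The first and last six months of trend are NA, because the 13-term window does not fit at the ends of the series.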
Filter Coefficients in Reverse Time Order
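The filter coefficients are specified in reverse time order: the first coefficient applies to the most recent value in the window, and later coefficients reach further back in time. This matches the convention for AR and MA coefficients in time series analysis:
# For y[i] = 0.5*x[i] + 0.3*x[i-1] + 0.2*x[i-2]
# List the coefficients from the newest lag to the oldest, and use sides = 1
filter(x, c(0.5, 0.3, 0.2), sides = 1)
This trips people up. A symmetric 3-point average such as rep(1/3, 3) reads the same in both directions, so the order doesn't matter. But for asymmetric filters, remember: the first coefficient goes with the newest value. With the default sides = 2, the same coefficients would apply to x[i+1], x[i], and x[i-1].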
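One way to check which coefficient lands on which lag is to filter a unit impulse with sides = 1 and read the coefficients off the output. A minimal sketch (the names impulse, coefs, and resp are illustrative):

```r
impulse <- c(0, 0, 1, 0, 0, 0)  # a single 1 at position 3
coefs <- c(0.5, 0.3, 0.2)

resp <- stats::filter(impulse, coefs, sides = 1)

# The first two outputs are NA (not enough history).
# From position 3 on, the output replays the coefficients in the order given:
# coefs[1] at the impulse itself, coefs[2] one step later, coefs[3] two steps later.
as.numeric(resp)
```

The coefficient that appears at the same position as the impulse is the one applied to the current value.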
Missing Values
filter allows NA values in the input series:
x <- c(1, 2, NA, 4, 5, 6, 7)
filter(x, rep(1/3, 3))
# [1] NA NA NA NA 5 6 NA
# Every window that contains the NA yields NA; the first and last values are NA because of the series ends
Missing values in the filter itself cause the output to be missing everywhere.
Common Pitfalls
Assuming Centered Filter Length Must Be Odd
With sides = 2, an even-length filter is allowed but asymmetrically positioned — more of the filter extends forward in time than backward:
# Even-length centered filter: more forward than backward
filter(1:10, rep(1, 4), sides = 2)
# Position: uses x[i-1], x[i], x[i+1], x[i+2]
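The window placement can be checked against an explicit loop. This sketch uses a length-4 filter of ones (a moving sum) so the arithmetic is easy to follow; the manual vector is illustrative.

```r
x <- 1:10
out <- stats::filter(x, rep(1, 4), sides = 2)

# Hand-rolled: each window covers x[i-1], x[i], x[i+1], x[i+2]
manual <- rep(NA_real_, length(x))
for (i in 2:(length(x) - 2)) {
  manual[i] <- sum(x[(i - 1):(i + 2)])
}

# One NA at the start, two at the end: the window reaches further forward
which(is.na(as.numeric(out)))
all.equal(as.numeric(out), manual)
```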
Confusing filter() with dplyr::filter()
These are completely different functions:
# stats::filter: time series linear filtering
stats::filter(AirPassengers, rep(1, 12))
# dplyr::filter: row selection from data frames
dplyr::filter(df, age > 18, status == "active")
Loading dplyr masks stats::filter. Use stats::filter() explicitly if both are loaded.
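A namespace-qualified call behaves the same whether or not dplyr is attached. A short sketch using the built-in AirPassengers series:

```r
# Namespace-qualified call: unaffected by whatever dplyr has masked
ma <- stats::filter(AirPassengers, rep(1 / 12, 12), sides = 1)

class(ma)      # "ts"
frequency(ma)  # 12, carried over from AirPassengers
```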
Filter vs convolve
convolve() with type = "filter" uses the FFT and can be faster for long filters on long series, but it doesn’t return a time series and doesn’t handle NA values properly. Use filter() when you need proper time series semantics.
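For comparison, the sketch below computes the same one-sided weighted average with both functions. It assumes that convolve() with type = "filter" applies its weights in the opposite order to stats::filter(), hence the rev() call; the input vector is arbitrary.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)
w <- c(0.5, 0.3, 0.2)

# stats::filter: returns a ts, padded with NA where the window does not fit
via_filter <- stats::filter(x, w, sides = 1)

# convolve: returns a plain numeric vector with only the complete windows
via_convolve <- convolve(x, rev(w), type = "filter")

length(via_convolve)  # length(x) - length(w) + 1
all.equal(as.numeric(via_filter)[-(1:2)], via_convolve)
```

The convolve() result is shorter than the input and carries no time series attributes, so any alignment with the original series has to be handled by hand.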
See Also
- /reference/base-functions/apply/ — apply functions to array margins
- /reference/base-functions/lapply/ — apply a function to each list element, returning a list
- /reference/tidyverse/dplyr_filter/ — row selection from data frames (dplyr)
- /reference/tidyverse/purrr_keep/ — keep elements of a list that satisfy a condition