trimws()
The trimws() function in R removes leading and/or trailing whitespace from character strings. It’s part of base R, so it is available without any additional packages. It is a staple of text cleaning, particularly for user input, file imports, or any case where stray spaces could break string matching or processing.
Syntax
trimws(x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]")
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| x | character vector | Required | A character vector whose strings should be trimmed |
| which | string | "both" | Which side to trim: "both", "left" (or "l"), or "right" (or "r") |
| whitespace | string | "[ \t\r\n]" | Regular expression matching a single whitespace character |
Return value
Returns a character vector of the same length as x, with leading and/or trailing whitespace removed according to the which parameter. NA values in x are preserved as NA.
Examples
Basic usage
# Trim whitespace from both ends
x <- " some text "
trimws(x)
# [1] "some text"
# Trim only from the left
trimws(x, "left")
# [1] "some text "
# Trim only from the right
trimws(x, "right")
# [1] " some text"
# Using shorthand - "l" and "r" work too
trimws(x, "l")
# [1] "some text "
The which argument controls which side of the string is cleaned. The default "both" is the most common choice and handles the typical case of user-submitted form fields with accidental spaces on either end. The "left" option is useful for log files or indented text where only the leading whitespace is unwanted, while "right" suits fixed-width export formats where fields are padded with trailing spaces. The shorthand forms "l" and "r" work because which is resolved by partial matching, which saves a few keystrokes in interactive use.
Working with character vectors
# trimws() is vectorized - processes each element
words <- c(" apple", "banana ", " cherry ", "date")
trimws(words)
# [1] "apple" "banana" "cherry" "date"
# Useful after reading data with extra spaces
dirty_data <- c(" 2024-01-01 ", " 2024-01-02 ", "2024-01-03")
trimws(dirty_data)
# [1] "2024-01-01" "2024-01-02" "2024-01-03"
Because trimws() is vectorized, passing a character vector cleans every element in a single call without an explicit loop. This is especially useful after reading data from CSV or fixed-width files where whitespace padding is common — a single trimws() call on the entire column normalizes all values at once. The function returns a new character vector of the same length, so you can assign the result directly back to the column or pipe it into the next transformation step without intermediate variables.
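As a rough illustration of the column-wise cleanup described above, the sketch below trims every character column of a small data frame. The data frame df and its column names are made up for the example; real imported data will differ.

```r
# Hypothetical data frame standing in for freshly imported data
df <- data.frame(
  id   = 1:3,
  name = c(" Alice ", "Bob  ", "  Charlie"),
  city = c("Paris ", " Lima", "Oslo"),
  stringsAsFactors = FALSE
)

# Identify the character columns, then trim each one in place
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], trimws)

df$name
# [1] "Alice"   "Bob"     "Charlie"
df$city
# [1] "Paris" "Lima"  "Oslo"
```

Non-character columns such as id are left untouched, which avoids accidentally coercing numbers to strings.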
Handling different types of whitespace
# Default handles regular spaces, tabs, newlines
messy <- "\t text with tabs and newlines\n "
trimws(messy)
# [1] "text with tabs and newlines"
# Use whitespace parameter for custom patterns (e.g., non-breaking spaces)
text_with_nbsp <- paste0("\u00a0", "clean me", "\u00a0") # non-breaking space
trimws(text_with_nbsp) # default doesn't remove nbsp
# [1] "\u00a0clean me\u00a0"
# Custom pattern to handle non-breaking spaces
trimws(text_with_nbsp, whitespace = "[\\h\\v]")
# [1] "clean me"
The default whitespace pattern, "[ \t\r\n]", matches only spaces, tabs, carriage returns, and newlines. It does not cover form feeds or Unicode whitespace such as non-breaking spaces (\u00a0), which frequently appear in data copied from web pages or word processors. The custom pattern [\\h\\v] uses the PCRE horizontal and vertical whitespace classes to catch these additional characters. When your data comes from web scraping or PDF extraction, either pass a broader whitespace regex to trimws() or replace non-breaking spaces with gsub() before trimming.
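The gsub() preprocessing route mentioned above could look something like the following. This is a minimal sketch: the scraped vector is invented sample data, and it only handles the non-breaking space, not other Unicode whitespace.

```r
nbsp <- "\u00a0"

# Invented sample values with non-breaking spaces at the edges and inside
scraped <- c(
  paste0(nbsp, "clean me", nbsp),
  paste0(" total", nbsp, "cost "),
  NA
)

# Step 1: turn every non-breaking space into a regular space
normalized <- gsub(nbsp, " ", scraped, fixed = TRUE)

# Step 2: trim with the default whitespace pattern
cleaned <- trimws(normalized)
cleaned
# [1] "clean me"   "total cost" NA
```

Unlike passing a custom whitespace pattern, this also normalizes non-breaking spaces in the middle of a string, which may or may not be what you want.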
Practical applications
# Cleaning user input for comparison
user_input <- " john@example.com "
db_email <- "john@example.com"
# Without trimming - comparison fails
user_input == db_email
# [1] FALSE
# With trimming - comparison succeeds
trimws(user_input) == db_email
# [1] TRUE
# Preparing data for analysis
raw_names <- c(" Alice ", "Bob ", " Charlie")
trimws(raw_names)
# [1] "Alice" "Bob" "Charlie"
# Combining with other string functions
full_name <- " john doe "
gsub(" +", " ", trimws(full_name)) # collapse multiple spaces
# [1] "john doe"
The combination of trimws() and gsub() handles both external and internal whitespace in one pipeline: trimws() strips the edges, and gsub(" +", " ", ...) collapses runs of multiple spaces into a single space. This two-step pattern is the closest base R counterpart to stringr::str_squish(), which performs both operations in a single call (and also collapses tabs and newlines, not just spaces). The email comparison shows the most critical use case: untrimmed user input fails equality checks against database values even when the visible text matches. The bug is invisible in printed output but causes joins and filters to silently miss records.
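If you use the trim-then-collapse pattern often, it can be wrapped in a small helper. The function name squish below is hypothetical, and the sketch only approximates str_squish(): it uses "\\s+" so that tabs and newlines inside the string are collapsed too.

```r
# Hypothetical helper: trim the edges, then collapse internal whitespace runs
squish <- function(x) {
  gsub("\\s+", " ", trimws(x))
}

squish(c("  john   doe  ", "a\t\tb\n", NA))
# [1] "john doe" "a b"      NA
```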
Handling NA values
# NA values are preserved
text_with_na <- c(" hello ", NA, " world ", NA)
trimws(text_with_na)
# [1] "hello" NA "world" NA
# Combine with NA-aware code, e.g. drop missing values after trimming
text <- c(" text1 ", NA, " text2 ")
trimmed <- trimws(text)
trimmed[!is.na(trimmed)]
# [1] "text1" "text2"
Common patterns
- Data Import Cleaning: After reading CSV or text files, use trimws() on character columns to remove accidental leading/trailing spaces that can break joins or comparisons.
- Form Input Validation: Trim user form submissions before validation or storage to ensure consistent data.
- Text Normalization: Combine trimws() with tolower() or gsub() for consistent text normalization before analysis.
- String Matching: Always trim before comparing strings to avoid false negatives from invisible whitespace.
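The normalization and matching patterns above can be combined in one step. The sketch below uses a hypothetical helper, normalize_key, and invented email addresses to show how trimming plus lowercasing makes lookups succeed.

```r
# Hypothetical helper: trim the edges, then lowercase
normalize_key <- function(x) tolower(trimws(x))

# Invented sample data
submitted <- c(" John@Example.com ", "ALICE@example.com\n")
known <- c("john@example.com", "alice@example.com", "bob@example.com")

# Raw values do not match anything
submitted %in% known
# [1] FALSE FALSE

# Normalized values do
normalize_key(submitted) %in% known
# [1] TRUE TRUE
match(normalize_key(submitted), known)
# [1] 1 2
```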
How trimws() handles different whitespace
By default, trimws() removes spaces, tabs (\t), newlines (\n), and carriage returns (\r). This is narrower than \\s in regex: form feeds (\f) and vertical tabs are left alone unless you pass a wider whitespace pattern. It also does not remove non-breaking spaces (\u00a0) that sometimes appear in data copied from web pages or PDF documents. For those, run gsub("\u00a0", " ", x) before trimming, or pass whitespace = "[\\h\\v]".
The which argument controls which side to trim:
- "both" (default) trims leading and trailing whitespace
- "left" trims only the start of the string
- "right" trims only the end
For data imported from CSV or spreadsheet files, right-trimming is the most common need, because trailing spaces are typical of fixed-width export formats. Left-trimming matters for indented text or logs.
trimws() was added in R 3.2.0 as a more explicit and readable alternative to gsub("^\\s+|\\s+$", "", x). The stringr equivalent is str_trim(x, side = "both").
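To make the comparison with the older gsub() idiom concrete, the sketch below runs both on the same invented input. The two agree here because the sample contains only spaces, tabs, and newlines; they can differ on characters such as form feeds, which \\s matches but the trimws() default does not.

```r
x <- c("  padded  ", "\tleft only", "right only \n", NA)

# Older idiom: strip leading or trailing runs with one alternation
via_gsub <- gsub("^\\s+|\\s+$", "", x)

# trimws() states the same intent directly
via_trimws <- trimws(x)

identical(via_gsub, via_trimws)
# [1] TRUE

# One-sided trimming maps onto a single anchored sub() call
identical(sub("^\\s+", "", x), trimws(x, "left"))
# [1] TRUE
```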
See also
- gsub()
- paste()
- nchar()
- stringr::str_squish() combines str_trim() with internal whitespace collapsing in one call. trimws() is the base R equivalent of str_trim(); to also normalize runs of internal whitespace in base R, follow it with gsub("\\s+", " ", x).