
Building CLI Tools in R: A Practical Guide

Building CLI tools in R lets you automate tasks, create reusable scripts, and integrate R into larger pipelines. This guide covers the main approaches: argparse, optparse, Rscript with shebangs, and making your scripts executable.

Installing CLI packages

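Two packages dominate R CLI development: argparse and optparse. Both are available from CRAN. Note that argparse is a wrapper around Python's argparse module and needs a Python installation at run time, while optparse needs only R.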

# Install both packages
install.packages(c("argparse", "optparse"))

Using argparse

The argparse package provides a Python-inspired interface for building CLI tools. Once installed, it handles the common patterns you need in command-line applications: required positional arguments, optional --flag style options, automatic type checking, and built-in --help messages that document your interface. Define a parser, register the arguments your script accepts, call parse_args(), and the package maps command-line input into R list elements you access with the $ operator.

library(argparse)

# Create the parser
parser <- ArgumentParser(description = "Process data files")

# Add positional argument
parser$add_argument("input_file", 
                    help = "Path to input CSV file")

# Add optional arguments
parser$add_argument("--output", 
                    default = "output.csv",
                    help = "Path to output file [default: %(default)s]")

parser$add_argument("--filter",
                    help = "Filter column name")

parser$add_argument("--verbose",
                    action = "store_true",
                    help = "Print extra output")

# Parse the arguments
args <- parser$parse_args()

# Use the arguments
if (args$verbose) {
  cat("Processing:", args$input_file, "\n")
}

# Your processing logic here
data <- read.csv(args$input_file)

if (!is.null(args$filter)) {
  data <- data[data[[args$filter]] > 0, ]
}

write.csv(data, args$output, row.names = FALSE)

With argparse, positional arguments go directly on the command line and optional flags use --flag value syntax. The --verbose flag uses action = "store_true", which means including it sets args$verbose to TRUE without the user typing an explicit value. argparse validates input against the parser definition before your processing logic runs, so invalid arguments produce an error immediately rather than causing cryptic failures downstream. Here is how you invoke the script:

Rscript process_data.R data.csv --output result.csv --verbose

argparse key features

  • Positional arguments: Required inputs specified by position
  • Optional arguments: Flags and options with --prefix
  • Type checking: Automatic conversion to integer, numeric, or logical
  • Choices: Restrict values to a predefined set
  • Automatic help: --help generates usage information

parser$add_argument("--format",
                    choices = c("csv", "tsv", "json"),
                    default = "csv",
                    help = "Output format [default: %(default)s]")
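
The type checking and choices features listed above can be tried without a real command line. The following is a minimal sketch, not part of the original script: it passes an explicit argument vector to parse_args(), and the --n option and the sample values are illustrative assumptions.

```r
library(argparse)

parser <- ArgumentParser(description = "Demonstrate type and choices")

# type = "integer" converts the command-line string to a number
parser$add_argument("--n",
                    type = "integer",
                    default = 1L,
                    help = "Number of rows to keep [default: %(default)s]")

# choices rejects any value outside the listed set
parser$add_argument("--format",
                    choices = c("csv", "tsv", "json"),
                    default = "csv",
                    help = "Output format [default: %(default)s]")

# Pass a character vector instead of reading the real command line
args <- parser$parse_args(c("--n", "3", "--format", "tsv"))

str(args$n)
str(args$format)
```

Supplying a value such as --format xml should make the parser stop with a usage error before any of your own code runs.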

Using optparse

The optparse package offers a similar interface for building CLI tools but uses a functional style rather than methods called on a parser object. Each option is defined with make_option(), and the options are bundled into an OptionParser for parsing. The dest parameter names the field of the output list that an option writes to, which means two different options, such as --verbose and --quiet, can write to the same destination variable. optparse is slightly simpler than argparse and works well when you only need optional flags without positional arguments.

library(optparse)

# Define options
option_list <- list(
  make_option(c("-v", "--verbose"),
              action = "store_true",
              default = TRUE,
              help = "Print extra output [default]"),
  
  make_option(c("-q", "--quiet"),
              action = "store_false",
              dest = "verbose",
              help = "Print little output"),
  
  make_option(c("-c", "--count"),
              type = "integer",
              default = 5,
              help = "Number of samples [default: %default]",
              metavar = "number")
)

# Parse
opt <- parse_args(OptionParser(option_list = option_list))

# Use options
if (opt$verbose) {
  cat("Generating", opt$count, "samples\n")
}

# Generate random samples
samples <- rnorm(opt$count)
cat(samples, "\n")

With optparse, each entry in option_list becomes a recognized flag. The action = "store_false" with dest = "verbose" means including --quiet sets opt$verbose to FALSE, overriding the default of TRUE. Short flags use a single dash and long options use two dashes, following standard Unix conventions. Here are examples of how to run an optparse script:

Rscript generate_samples.R -c 10
Rscript generate_samples.R -q -c 100
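
optparse can also collect positional arguments when asked to. The sketch below assumes a hypothetical FILE argument and feeds parse_args() an explicit argument vector so it can be run interactively; with positional_arguments = TRUE the result is a list with an options element and an args element.

```r
library(optparse)

option_list <- list(
  make_option(c("-c", "--count"),
              type = "integer",
              default = 5L,
              help = "Number of samples [default: %default]")
)

parser <- OptionParser(usage = "%prog [options] FILE",
                       option_list = option_list)

# positional_arguments = TRUE returns list(options = ..., args = ...)
parsed <- parse_args(parser,
                     args = c("-c", "10", "data.csv"),
                     positional_arguments = TRUE)

parsed$options$count  # value of -c / --count
parsed$args           # everything that was not an option
```

In a real script, drop the args parameter so that parse_args() reads the actual command line.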

Shebangs and executable scripts

A shebang line (#!) on the first line of a script tells the operating system which interpreter to invoke. For R, #!/usr/bin/env Rscript searches the PATH for Rscript, making the script portable across systems where R is installed in different directories. With a shebang and execute permission you can invoke the script directly by name instead of typing Rscript script.R. Adding --vanilla keeps the runtime clean by skipping .Rprofile and other startup files, but Linux passes everything after the interpreter path as a single argument, so a shebang with extra options needs env -S (GNU coreutils 8.30 or later; also supported by the env on macOS and FreeBSD) to split them.

#!/usr/bin/env -S Rscript --vanilla

# Hello World CLI in R
args <- commandArgs(trailingOnly = TRUE)

if (length(args) == 0) {
  cat("Usage: hello.R NAME\n")
  quit(status = 1)
}

cat("Hello, ", args[1], "!\n", sep = "")

This example uses commandArgs(trailingOnly = TRUE) to collect the arguments that follow the script name. With trailingOnly = TRUE, R drops the interpreter path and its own options, leaving only the user-supplied arguments. The script checks whether any arguments were provided and exits with a non-zero status if none were, which is standard CLI behaviour.

Save as hello.R and make it executable:

chmod +x hello.R
./hello.R World

The --vanilla flag combines --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ: R skips site and user startup files (.Rprofile, .Renviron) and neither restores nor saves a workspace. That keeps CLI scripts fast and predictable; without it, a script can behave differently across users depending on what their .Rprofile loads or sets.
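
For scripts too small to justify a parsing package, the vector returned by commandArgs(trailingOnly = TRUE) can be split by hand. The helper below is a rough sketch in base R; the name parse_simple_args and the rule that every --name flag takes exactly one value are assumptions made for illustration.

```r
# Split a character vector of arguments into flags and positionals.
# Assumes every "--name" flag is followed by exactly one value.
parse_simple_args <- function(args) {
  flags <- list()
  positional <- character()
  i <- 1
  while (i <= length(args)) {
    if (startsWith(args[i], "--")) {
      if (i == length(args)) {
        stop("Missing value for ", args[i], call. = FALSE)
      }
      flags[[substring(args[i], 3)]] <- args[i + 1]
      i <- i + 2
    } else {
      positional <- c(positional, args[i])
      i <- i + 1
    }
  }
  list(flags = flags, positional = positional)
}

# In a script you would pass commandArgs(trailingOnly = TRUE) instead
parsed <- parse_simple_args(c("World", "--greeting", "Hi"))
cat(parsed$flags$greeting, parsed$positional[1], "\n")
```

Anything beyond this (boolean flags, short options, help text) is usually a sign that argparse or optparse is the better fit.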

Finding Rscript

Use #!/usr/bin/env Rscript for portability across systems. The env command searches the PATH for Rscript rather than hardcoding the install location, which means the same script works on macOS (Homebrew at /opt/homebrew/bin), Linux (/usr/bin), and custom R installations. On systems where env itself is at a non-standard path or when you need to target a specific R version, specify the full path to the Rscript binary:

#!/usr/local/bin/Rscript
# or
#!/opt/R/bin/Rscript
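
If you are unsure which path to hardcode, R itself can report it. This short sketch uses only base R and prints the Rscript that belongs to the running installation next to the first one found on the PATH; the variable names are illustrative.

```r
# Rscript that belongs to the R installation running this code
rscript_name <- if (.Platform$OS.type == "windows") "Rscript.exe" else "Rscript"
rscript_path <- file.path(R.home("bin"), rscript_name)
cat("This R installation:", rscript_path, "\n")

# First Rscript found on the PATH, which is what /usr/bin/env would pick
# (an empty string means none was found)
cat("First on PATH:", Sys.which("Rscript"), "\n")
```

When the two paths differ, a #!/usr/bin/env Rscript shebang will run a different R than the one you are currently using.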

Exit codes

Exit codes are how command-line tools tell the shell whether they succeeded or failed. A status of 0 means success; any non-zero value signals an error. R provides quit(status = n) for explicit control, while calling stop() causes the script to exit with an error status automatically. The shell acts on these exit codes in conditional statements: cmd && success_command runs only on success, while cmd || failure_command runs only on failure. Returning the correct exit code lets your R scripts integrate with shell pipelines, CI systems, and Makefile rules that check $? after execution:

# Success (default)
quit(status = 0)

# Failure
quit(status = 1)

# Error with message
stop("Something went wrong")

Bash checks the exit status of every command and stores it in the special $? variable. The || operator runs the right-hand command only when the left-hand command fails, providing a concise way to handle errors inline. This pattern works for R scripts run with Rscript or as shebang executables:

./script.R || echo "Script failed"
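
One way to keep exit codes consistent is to route the whole script through a single wrapper. The following is a sketch under the assumption that the script's work lives in a function; run_with_status is a hypothetical helper, not an R built-in.

```r
# Run `main` and translate its outcome into an exit status:
# 0 on success, 1 if it raised an error.
run_with_status <- function(main) {
  tryCatch({
    main()
    0L
  }, error = function(e) {
    # Report the problem on stderr so stdout stays clean for pipelines
    message("Error: ", conditionMessage(e))
    1L
  })
}

status_ok  <- run_with_status(function() invisible(NULL))
status_bad <- run_with_status(function() stop("Something went wrong"))

# In a real script, hand the status to the shell as the last step:
# quit(status = status_bad)
```

Keeping quit() out of the helper means it can be tested interactively without ending the R session.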

Complete example: data processor

Here is a complete CLI tool that combines argparse argument parsing with real data processing logic. This script reads a CSV file, groups rows by a specified column, and computes either a sum or a mean aggregation. It handles the case where no grouping is requested by returning the raw data, and it writes results either to stdout or to an output file depending on whether the --output flag is provided. Notice the shebang at the top: it makes the script executable without the Rscript prefix.

#!/usr/bin/env -S Rscript --vanilla

library(argparse)

parser <- ArgumentParser(description = "Filter and summarize CSV data")

parser$add_argument("file",
                    help = "Input CSV file")

parser$add_argument("--group-by",
                    help = "Column to group by")

parser$add_argument("--sum",
                    help = "Column to sum",
                    default = NULL)

parser$add_argument("--mean",
                    help = "Column to average",
                    default = NULL)

parser$add_argument("--output",
                    "-o",
                    default = NULL,
                    help = "Output file (default: stdout)")

args <- parser$parse_args()

The parser defines a single positional argument for the input file path plus four optional flags. The --group-by argument tells the script which column to use for grouping rows. The --sum and --mean flags let the user choose the aggregation method; when both are omitted, the script returns the raw data without aggregation. The --output flag (with short alias -o) determines where results are written, defaulting to stdout when not specified.

After parsing, the arguments are available in the args list. The processing logic reads the input file and applies the requested aggregation:

# Read data
data <- read.csv(args$file)

# Process
if (!is.null(args$group_by)) {
  if (!is.null(args$sum)) {
    result <- aggregate(data[[args$sum]],
                        by = list(group = data[[args$group_by]]),
                        FUN = sum)
    names(result) <- c(args$group_by, args$sum)
  } else if (!is.null(args$mean)) {
    result <- aggregate(data[[args$mean]],
                        by = list(group = data[[args$group_by]]),
                        FUN = mean)
    names(result) <- c(args$group_by, args$mean)
  } else {
    # Grouping requested without --sum or --mean: nothing to aggregate
    result <- data
  }
} else {
  result <- data
}

# Output
if (is.null(args$output)) {
  print(result)
} else {
  write.csv(result, args$output, row.names = FALSE)
  cat("Wrote", args$output, "\n")
}

The aggregate() call groups data by the column specified with --group-by and applies either sum() or mean() to the target column. The if/else if structure gives --sum precedence: if the user supplies both --sum and --mean, only --sum is applied. The outer else branch returns unaggregated data when no grouping column is provided.

After saving the script and making it executable, invoke it from the command line. The positional file argument is required; all flags are optional. Omitting --output prints results to the terminal:

chmod +x process.R
./process.R sales.csv --group-by region --sum revenue --output summary.csv
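
The aggregation step above assumes that the columns named on the command line exist in the file. A hedged sketch of the same logic with an early check is shown here; summarise_by and the sample data are made up for illustration and are not part of the original script.

```r
# Aggregate `value_col` by `group_col`, failing early on unknown columns.
summarise_by <- function(data, group_col, value_col, fun = sum) {
  missing_cols <- setdiff(c(group_col, value_col), names(data))
  if (length(missing_cols) > 0) {
    stop("Column(s) not found: ", paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  result <- aggregate(data[[value_col]],
                      by = list(data[[group_col]]),
                      FUN = fun)
  names(result) <- c(group_col, value_col)
  result
}

# Made-up sample data for illustration
sales <- data.frame(region = c("east", "west", "east"),
                    revenue = c(10, 20, 5))
print(summarise_by(sales, "region", "revenue"))
```

Because stop() makes Rscript exit with a non-zero status, a misspelled column name is reported to the shell as a failure rather than producing an obscure error deeper in the script.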

Choosing between argparse and optparse

Feature                 argparse    optparse
Positional args         Yes         Limited
Automatic help          Yes         Yes
Python argparse style   Yes         No
Simpler syntax          No          Yes

Recommendation: Use argparse for new projects—it handles more cases and produces cleaner help messages.

Theming and styling

The cli package uses ANSI escape codes for colors and styles. cli_text("{.code foo()}") formats its argument as inline code; {.strong text} bolds, {.emph text} italicizes, and {.val value} displays a value. Styles are disabled automatically when output is redirected to a file or R runs in a non-interactive session. num_ansi_colors() reports how many ANSI colors the current output supports, while ansi_has_any() tests whether a given string contains ANSI sequences.
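
A small sketch of these styling helpers follows. It assumes the cli package is installed; num_ansi_colors(), col_red() and ansi_strip() are cli functions brought in here for illustration, and the message text is made up.

```r
library(cli)

# Inline classes from the paragraph above
cli_text("Call {.code foo()} with {.strong care}; the limit is {.val {10}}.")

# Number of ANSI colors cli believes the current output supports
# (1 means no color support)
cat("ANSI colors available:", num_ansi_colors(), "\n")

# Styled strings can be checked for, and stripped of, ANSI sequences
styled <- col_red("error")
cat("Contains ANSI codes:", ansi_has_any(styled), "\n")
plain <- ansi_strip(styled)
```

Run the script once in a terminal and once with output redirected to a file to see the styling switch off.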

Spinners and status

cli_process_start("Connecting") and cli_process_done() bracket a long-running operation with a status-bar message that is marked done (or failed) when the operation ends. cli_status_update() changes the status message mid-process. cli_alert_success(), cli_alert_warning(), and cli_alert_danger() display colored status messages with appropriate icons at completion. Together these give command-line tools informative, professional-looking output.

Verbosity control

Good CLI packages let users control output verbosity. Because cli_inform() emits ordinary message conditions through rlang, callers can silence it with suppressMessages() or with options(rlib_message_verbosity = "quiet"). In your own functions, emit detailed progress with cli_inform() only when a verbose setting is on, and always emit warnings and errors. Package authors can expose this via a verbose = getOption("mypkg.verbose", default = FALSE) argument.
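
The verbose argument pattern can be sketched as follows. load_records is a hypothetical function, and mypkg.verbose is the placeholder option name from the paragraph above; only cli_inform() comes from cli.

```r
library(cli)

# Hypothetical package function; "mypkg.verbose" is a placeholder option name
load_records <- function(n, verbose = getOption("mypkg.verbose", default = FALSE)) {
  if (verbose) cli_inform("Generating {n} record{?s}")
  records <- seq_len(n)
  if (verbose) cli_inform("Done")
  records
}

quiet_result <- load_records(3)                  # no messages
loud_result  <- load_records(3, verbose = TRUE)  # two messages
```

Users can then opt in globally with options(mypkg.verbose = TRUE) or per call with verbose = TRUE.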

Building CLI apps with cli

The cli package provides tools for building rich, user-friendly command-line interfaces in R. It handles text formatting, progress bars, status indicators, and structured output with semantic markup that degrades gracefully in non-interactive environments.

cli_h1(), cli_h2(), cli_h3() print styled headings. cli_alert_info(), cli_alert_success(), cli_alert_warning(), cli_alert_danger() display status messages with colored icons. These automatically detect whether the output is a terminal (and apply ANSI colors) or a file/pipe (and output plain text).

cli_ul() and cli_ol() create unordered and ordered lists. cli_li() adds items. Nest lists with multiple levels: cli_ul(); cli_li("Item"); cli_ul(); cli_li("Nested"); cli_end(); cli_end(). The {.code}, {.file}, {.url}, {.pkg}, {.fun} inline markup styles format different types of content consistently: cli_alert_info("Install {.pkg ggplot2} first").
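
Headings, nested lists and inline markup combine as in the sketch below. The function name show_checklist and the list contents are made up for illustration; the inner list is passed as a character vector, so only the outer list needs an explicit cli_end().

```r
library(cli)

show_checklist <- function() {
  cli_h2("Setup")
  outer <- cli_ul()                      # open the outer list
  cli_li("Read {.file data.csv}")
  cli_ul(c("check the header row",       # nested list, closed automatically
           "check the column types"))
  cli_li("Install {.pkg optparse} with {.fun install.packages}")
  cli_li("Inspect the result with {.code str(data)}")
  cli_end(outer)                         # close the outer list
  invisible(TRUE)
}

show_checklist()
```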

Progress bars

cli_progress_bar(total = n) creates a progress bar. cli_progress_update() increments it by one. cli_progress_done() marks it complete. Within loops:

cli_progress_bar("Processing files", total = length(files))
for (f in files) {
  process(f)
  cli_progress_update()
}
cli_progress_done()

cli_progress_step("Fetching data") shows a status message for a step of unknown duration, with an animated spinner if you pass spinner = TRUE. The step stays active until the next cli_progress_step() or cli_progress_done() call, which makes it a good fit for multi-step operations.

purrr::map() can also drive a cli progress bar. Use cli_progress_along(x) as the iterator: purrr::map(cli_progress_along(files), function(i) process(files[[i]])) shows a progress bar as map iterates. purrr 1.0.0 and later additionally accept a .progress argument.
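
cli_progress_along() works in a plain for loop as well. In this sketch, measure_names and its input are stand-ins for real work; note that a loop this short will likely finish before cli decides a bar is worth showing.

```r
library(cli)

# Stand-in for real per-file work: record the length of each file name
measure_names <- function(files) {
  lengths <- integer(length(files))
  # cli_progress_along() yields the indices of `files` and advances
  # a progress bar on each iteration
  for (i in cli_progress_along(files, name = "Measuring")) {
    lengths[i] <- nchar(files[[i]])
  }
  lengths
}

name_lengths <- measure_names(c("a.csv", "b.csv", "c.csv"))
```

Wrapping the loop in a function lets cli clean up the bar when the function returns, so no explicit cli_progress_done() is needed here.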

Pluralization

cli handles pluralization automatically: cli_alert_info("Found {n} file{?s}"). The {?s} token appends “s” when n != 1 and nothing when n == 1. For irregular plurals: {?is/are} outputs “is” for singular and “are” for plural.

Inline interpolation uses glue-style {expr}. cli_text("Processing {.file {path}}") formats the path variable with file styling. String vectors are collapsed to a comma-separated list automatically: cli_text("Columns: {.code {names(df)}}") outputs “Columns: a, b, and c”, with each name styled as code.
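
Pluralization is easiest to check with cli's pluralize(), which returns the text instead of printing it. The helper name describe_files and the sentence are illustrative assumptions.

```r
library(cli)

describe_files <- function(n) {
  # Both markers take their quantity from the preceding {n}
  as.character(pluralize("{n} file{?s} {?is/are} missing"))
}

one  <- describe_files(1)
many <- describe_files(3)
cat(one, many, sep = "\n")
```

The same markers work unchanged inside cli_text(), cli_alert_info() and the other cli message functions.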

Using cli in packages

For packages, use cli rather than message() and cat() for user-facing output. cli respects options(cli.default_handler) and rlang::local_options(cli.default_handler = ...) for capturing output in tests. withr::local_options(cli.default_handler = function(msg) NULL) silences cli output during testing.

cli::cli_inform() wraps rlang::inform() and adds cli markup, making it the recommended way to emit non-error messages in packages. This gives your package’s messages the same formatting as tidyverse packages.

Condition objects carry structured data: extra named arguments to rlang::inform(), as in rlang::inform("Data has missing values", n_missing = n), are stored as fields on the condition. Handlers can extract the data from the condition object for programmatic use, while the message renders as human-readable text in interactive sessions.
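
A minimal sketch of reading structured data back out of a condition is shown here. The class name mypkg_missing_values and the field name n_missing are hypothetical; the handler stores the field and then muffles the message so nothing is printed.

```r
library(rlang)

n_missing <- 3
captured <- NULL

withCallingHandlers(
  inform("Data has missing values",
         class = "mypkg_missing_values",   # hypothetical class name
         n_missing = n_missing),           # extra field stored on the condition
  mypkg_missing_values = function(cnd) {
    captured <<- cnd$n_missing             # read the structured field
    cnd_muffle(cnd)                        # handled: do not print the message
  }
)
```

Giving the condition its own class lets callers handle exactly this message and leave all others alone.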

Argument parsing for CLI scripts

For R scripts run from the command line (Rscript script.R --flag value), optparse and argparse provide Python-argparse-style argument parsing. argparse::ArgumentParser() creates a parser; add_argument("--output", type = "character", default = "output.csv") adds arguments; parse_args() returns a list.

The docopt package uses a docstring to define the interface: write the help text first, and docopt parses it to generate the parser. This documentation-first approach keeps the help text and the parser in sync automatically.

For interactive use, rstudioapi::showDialog() and rstudioapi::askForPassword() provide GUI dialogs in RStudio, but these do not work in non-interactive scripts.

Why cli over cat and message

The cli package solves the problem that base R’s message and cat functions produce unformatted text output that looks the same regardless of whether the output is a terminal, a log file, or a pipe. cli detects whether the output device supports ANSI color codes and adapts accordingly. In a terminal, output is colored and formatted. In a non-interactive session like CI logs or when output is redirected to a file, cli strips the formatting and produces clean plain text. This automatic adaptation means you write formatting code once and it behaves correctly everywhere.

cli also handles the semantic distinction between different types of messages. Informational messages, warnings, and errors each have their own formatting and output destination. Progress messages are updated in place rather than appending new lines. Headers and bullets create visual hierarchy. These distinctions matter for building tools that communicate clearly to users at different verbosity levels.

Building hierarchical output

cli’s semantic markup uses curly-brace inline classes to control formatting. Wrapping text in curly braces with a class like .pkg, .fn, .path, .val, or .code applies terminal styling appropriate to that class, so that package names, file paths, and values each get their own consistent look. This keeps formatting consistent without manually specifying colors, and the class names document the semantic meaning of each highlighted term.

Lists in cli are built with cli_ul for unordered, cli_ol for ordered, and cli_dl for definition lists. Items are added with cli_li inside the list context. Nesting works by opening a new list context inside an existing one. The visual hierarchy in the output reflects the logical hierarchy of the content, which helps users quickly scan output to find relevant information.

Error and warning messages

cli provides cli_abort, cli_warn, and cli_inform as replacements for stop, warning, and message. The advantage of the cli versions is that they support the same inline markup and multi-line formatting as other cli functions, so error messages can include formatted code, file paths, and bulleted lists of possible fixes. Well-formatted error messages that point to the source of the problem reduce debugging time.

cli_abort generates an rlang error condition, which means it participates in the condition system. Callers can catch and handle cli errors the same way they handle any other R error. For package development, cli_abort is the standard approach to user-facing errors because it produces consistent, styled output that fits the cli aesthetic used throughout the tidyverse and many modern R packages.
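
The sketch below shows an error raised with cli_abort() and caught by class. read_input, the class name mypkg_missing_file and the file name are assumptions made for illustration.

```r
library(cli)

read_input <- function(path) {
  if (!file.exists(path)) {
    cli_abort(c(
      "Cannot find {.file {path}}.",
      "i" = "Check the path and try again."
    ), class = "mypkg_missing_file")  # hypothetical class name
  }
  readLines(path)
}

# Catch only this kind of error; anything else would propagate as usual
err <- tryCatch(
  read_input("no-such-file.csv"),
  mypkg_missing_file = function(e) e
)

cat(conditionMessage(err), "\n")
```

In a CLI script you would typically leave the error uncaught, so that Rscript prints the formatted message and exits with a non-zero status.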

Summary

Building CLI tools in R comes down to a few key decisions:

  • Use argparse or optparse for argument parsing
  • Use shebangs (#!/usr/bin/env Rscript) for executable scripts
  • Make scripts executable with chmod +x
  • Return proper exit codes for shell integration
  • argparse is recommended for new CLI projects
