Modular R Code with box
Modern R projects can quickly become unwieldy as they grow. Functions get scattered across files, naming conflicts emerge, and tracking dependencies becomes a nightmare. The box package provides a clean solution: a modern module system that lets you treat files and folders of R code as independent, nestable modules without the overhead of creating formal R packages.
This guide walks you through everything you need to write modular, maintainable R code with box.
Why Use Modules?
Before diving into box, consider the problem it solves. In traditional R scripts, you either load everything with library(), which attaches every exported name to the search path and invites naming conflicts, or you carefully manage by hand which functions you import. Neither approach scales well for large projects.
The box package brings module-based development to R:
- Local scoping: Imports don’t clutter your global environment
- Explicit dependencies: Every import is visible in your code
- Nested modules: Organize code in folders that mirror your project structure
- No package build required: Use modules directly from files and directories
Think of it like JavaScript’s import statements or Python’s modules—but for R.
Installation
Install box from CRAN the way you would any R package:
install.packages("box")
The package requires R 3.6.0 or later. You can check your R version with R.version.string.
Your First Module
Let’s build a simple module to see how this works in practice. Suppose you have a project with some utility functions for data cleaning. Instead of scattering these across your scripts, put them in a module.
Create a folder called utils in your project directory, then create a file called cleaning.R inside it:
# File: utils/cleaning.R
# Modules only see base R, so quantile() must be imported from stats
box::use(stats[quantile])
#' Remove outliers using IQR method
#' @param x Numeric vector
#' @param multiplier IQR multiplier (default 1.5)
#' @export
remove_outliers <- function(x, multiplier = 1.5) {
  q <- quantile(x, c(0.25, 0.75))
  iqr <- q[2] - q[1]
  lower <- q[1] - multiplier * iqr
  upper <- q[2] + multiplier * iqr
  x[x >= lower & x <= upper]
}
#' Normalize vector to 0-1 range
#' @param x Numeric vector
#' @export
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
This is your first module. It lives in utils/cleaning.R. The #' @export tags mark functions that other files can import—exactly like you would in a formal R package.
Loading Modules and Packages
Now let’s use this module in your main script. The box::use() function is your primary tool for both modules and packages.
Loading a Module
To import your cleaning module, add this to your script:
box::use(utils/cleaning)
# Now you can use the functions
data <- c(1, 2, 3, 4, 100) # 100 is an outlier
clean_data <- cleaning$remove_outliers(data)
normalized <- cleaning$normalize(clean_data)
The $ notation accesses functions from the imported module—similar to how you would access list elements or namespace contents.
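To try the whole round trip without touching a real project, the following sketch writes a trimmed copy of the module into a temporary directory and loads it. The temporary location and the use of the box.path option to tell box where to search are assumptions made for this demo, not part of the original setup.

```r
# Sketch: create utils/cleaning.R in a temporary (hypothetical) project folder.
project <- tempfile("box_demo_")
dir.create(file.path(project, "utils"), recursive = TRUE)

writeLines(c(
  "box::use(stats[quantile])",
  "",
  "#' @export",
  "remove_outliers <- function(x, multiplier = 1.5) {",
  "  q <- quantile(x, c(0.25, 0.75))",
  "  iqr <- q[2] - q[1]",
  "  x[x >= q[1] - multiplier * iqr & x <= q[2] + multiplier * iqr]",
  "}",
  "",
  "#' @export",
  "normalize <- function(x) {",
  "  (x - min(x)) / (max(x) - min(x))",
  "}"
), file.path(project, "utils", "cleaning.R"))

# Assumption: point box's module search path at the demo project.
old_opts <- options(box.path = project)

box::use(utils/cleaning)

values <- c(1, 2, 3, 4, 100)
clean_values <- cleaning$remove_outliers(values)  # drops the outlier 100
scaled <- cleaning$normalize(clean_values)        # rescales to the 0-1 range

print(clean_values)
print(scaled)

# Restore the previous option value.
options(old_opts)
```

The module object is bound to the last component of the path, which is why the functions are reached through cleaning$.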
Loading with an Alias
If the module name is long or conflicts with something else, use an alias:
box::use(clean = utils/cleaning)
clean_data <- clean$remove_outliers(data)
Loading Specific Functions
You can import just the functions you need:
box::use(utils/cleaning[remove_outliers, normalize])
# These functions are now available directly
clean_data <- remove_outliers(data)
normalized <- normalize(clean_data)
This approach keeps your code explicit about what it uses.
Loading Packages
The same syntax works for loading packages. Use box::use(stats) instead of library(stats):
box::use(stats)
box::use(ggplot2)
box::use(dplyr)
# Access package functions through the imported name with $
df <- data.frame(x = 1:10, y = (1:10)^2) # illustrative data
ggplot2$ggplot(df, ggplot2$aes(x = x, y = y))
result <- stats$sd(df$y)
The key difference from library() is that these imports are locally scoped—they don’t attach to your search path or pollute your global environment.
Attaching All Exported Names
When you want all exported functions available directly (like traditional package loading), use the attachment syntax:
box::use(utils/cleaning[...])
# All exported functions now available
clean_data <- remove_outliers(data)
The ... syntax attaches every function marked with #' @export.
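The import forms can be compared side by side. This is a minimal sketch under stated assumptions: the arith module, its two functions, and the temporary project folder are hypothetical, and box.path is set only so the demo is self-contained.

```r
# Sketch: a tiny hypothetical module with two exported functions.
project <- tempfile("box_attach_")
dir.create(file.path(project, "utils"), recursive = TRUE)
writeLines(c(
  "double_it <- function(x) x * 2",
  "add_one <- function(x) x + 1",
  "box::export(double_it, add_one)"
), file.path(project, "utils", "arith.R"))

# Assumption: make the demo project searchable for unprefixed module paths.
old_opts <- options(box.path = project)

# Module alias: the whole module is bound to the name `ar`.
box::use(ar = utils/arith)
via_alias <- ar$double_it(4)

# Attach a single exported name so it can be called directly.
box::use(utils/arith[add_one])
via_attach <- add_one(4)

# Attach a name under a different local name.
box::use(utils/arith[twice = double_it])
via_renamed <- twice(4)

print(c(via_alias, via_attach, via_renamed))

options(old_opts)
```

Aliases are written as name = module, and the same name = value form inside the brackets renames an individual attached function.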
Writing and Exporting from Modules
You have two ways to mark functions for export from a module.
Method 1: Roxygen2 Tags
Use #' @export before each function you want to export:
#' Sum two numbers
#' @param a First number
#' @param b Second number
#' @export
add <- function(a, b) {
  a + b
}
This works seamlessly if you’re already using roxygen2 for documentation.
Method 2: box::export()
Alternatively, call box::export() explicitly:
add <- function(a, b) {
  a + b
}
box::export(add)
This is useful when you want conditional exports or dynamic function lists.
Helper Functions That Stay Private
Functions without #' @export or a box::export() call remain private to the module. They won’t be accessible from outside:
# File: utils/internal.R
# This function is private: only usable within the module
helper <- function(x) {
  x * 2
}
#' This one is exported
#' @export
public_function <- function(y) {
  helper(y) # Uses the private helper internally
}
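A quick way to confirm what a module exposes is to list the names on the loaded module object. The sketch below is an assumption-laden demo: it builds a variant of the module in a temporary folder, uses box::export() instead of the roxygen tag, and sets box.path so box can find it.

```r
# Sketch: a module with one private helper and one exported function.
project <- tempfile("box_private_")
dir.create(file.path(project, "utils"), recursive = TRUE)
writeLines(c(
  "helper <- function(x) {",
  "  x * 2",
  "}",
  "",
  "public_function <- function(y) {",
  "  helper(y) + 1",
  "}",
  "",
  "box::export(public_function)"
), file.path(project, "utils", "internal.R"))

# Assumption: make the demo project searchable for unprefixed module paths.
old_opts <- options(box.path = project)

box::use(utils/internal)

# Only exported names are visible on the module object.
exported <- ls(internal)
result <- internal$public_function(3)  # helper(3) + 1

print(exported)
print(result)

options(old_opts)
```

The private helper still runs when public_function() calls it, but it does not appear among the module's names.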
Key Features
Nested Modules
Create folders within folders to organize related functionality:
project/
├── app.R
├── models/
│   ├── __init__.R   # Module entry point
│   ├── linear.R
│   └── tree.R
└── utils/
    ├── __init__.R
    ├── cleaning.R
    └── validation.R
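The __init__.R file serves as the module entry point, similar to Python’s __init__.py. When you import the folder, box loads its __init__.R. To make the submodules reachable through the folder module, import them with relative paths and re-export them by placing #' @export in front of the box::use() call:
# File: models/__init__.R
#' @export
box::use(
  ./linear,
  ./tree
)
Now importing models gives you access to both submodules. Write ./models, because a bare models would be looked up as a package:
box::use(./models)
models$linear$fit(data)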
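The nested layout can be sketched end to end as follows. Several things here are assumptions for the demo: the temporary folder, the hypothetical demo/ prefix (used so the module can be found through box.path), the trivial fit() bodies, and re-exporting the submodules by tagging box::use() with #' @export.

```r
# Sketch: build demo/models/ with an __init__.R and two submodules.
project <- tempfile("box_nested_")
models_dir <- file.path(project, "demo", "models")
dir.create(models_dir, recursive = TRUE)

writeLines(c(
  "fit <- function(x) 'linear'",
  "box::export(fit)"
), file.path(models_dir, "linear.R"))

writeLines(c(
  "fit <- function(x) 'tree'",
  "box::export(fit)"
), file.path(models_dir, "tree.R"))

# The entry point imports the submodules relative to itself and re-exports them.
writeLines(c(
  "#' @export",
  "box::use(./linear, ./tree)"
), file.path(models_dir, "__init__.R"))

# Assumption: make the demo project searchable for unprefixed module paths.
old_opts <- options(box.path = project)

box::use(demo/models)

linear_result <- models$linear$fit(1:3)
tree_result <- models$tree$fit(1:3)

print(c(linear_result, tree_result))

options(old_opts)
```

Both submodules define a function called fit without clashing, because each lives in its own module.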
Reloading Modules During Development
When you’re actively developing a module, use box::reload() to pick up changes without restarting R:
box::use(utils/cleaning)
# Make changes to utils/cleaning.R, then reload
box::reload(cleaning) # Pass the loaded module object, not its path
This dramatically speeds up development cycles.
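A reload cycle might look like the sketch below. The greeting module and the temporary folder are hypothetical, and rewriting the file from R stands in for editing it by hand.

```r
# Sketch: load a module, change its source file, and reload it.
project <- tempfile("box_reload_")
dir.create(file.path(project, "utils"), recursive = TRUE)
module_file <- file.path(project, "utils", "greeting.R")

writeLines(c(
  "greet <- function() 'hello'",
  "box::export(greet)"
), module_file)

# Assumption: make the demo project searchable for unprefixed module paths.
old_opts <- options(box.path = project)

box::use(utils/greeting)
first <- greeting$greet()

# Simulate an edit to the module source.
writeLines(c(
  "greet <- function() 'goodbye'",
  "box::export(greet)"
), module_file)

# reload() takes the module object that box::use() created.
box::reload(greeting)
second <- greeting$greet()

print(c(first, second))

options(old_opts)
```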
Module Information Helpers
Two functions help you understand your module’s context:
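- box::name(): Returns the name of the module it is called from
- box::file(): Returns paths relative to the module’s directory; with no arguments, the directory itself
Both are meant to be called from inside a module:
# File: utils/cleaning.R
box::name() # "cleaning"
box::file() # "/path/to/project/utils"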
Initialization Hooks
Use .on_load() for module initialization code that runs when the module first loads:
# File: utils/database.R
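# box calls .on_load() with the module namespace as its argument
.on_load <- function(ns) {
  message("Initializing database connection...")
  # Set up connections, load configs, etc.
}
# query() and connect() are assumed to be defined elsewhere in this file
box::export(
  query,
  connect
)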
Common Pitfalls
Understanding what box does differently from traditional R loading will save you debugging time.
Only Base Package Is Attached
Inside a module, only the base package is automatically available. If you need functions from standard packages, import them explicitly:
# This won't work inside a module, because sd() lives in stats, not base:
sd(x) # Error: could not find function "sd"
# Do this instead:
box::use(stats[sd])
sd(x) # Works
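The difference is easy to reproduce with two small modules, one that calls sd() from stats without importing it and one that imports it first. The module names and the temporary folder in this sketch are hypothetical.

```r
# Sketch: the same function body with and without an explicit stats import.
project <- tempfile("box_base_only_")
dir.create(file.path(project, "utils"), recursive = TRUE)

writeLines(c(
  "spread <- function(x) sd(x)",
  "box::export(spread)"
), file.path(project, "utils", "broken.R"))

writeLines(c(
  "box::use(stats[sd])",
  "spread <- function(x) sd(x)",
  "box::export(spread)"
), file.path(project, "utils", "fixed.R"))

# Assumption: make the demo project searchable for unprefixed module paths.
old_opts <- options(box.path = project)

box::use(utils/broken, utils/fixed)

# Loading succeeds, but the call fails because sd() is not visible in the module.
broken_result <- tryCatch(broken$spread(c(1, 2, 3)), error = function(e) e)
fixed_result <- fixed$spread(c(1, 2, 3))

print(inherits(broken_result, "error"))
print(fixed_result)

options(old_opts)
```

Note that the failure only shows up when the function is called, not when the module is loaded, so missing imports can go unnoticed until that code path runs.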
Module vs Package Syntax
Remember the difference between modules and packages:
box::use(stats) # Loads the stats PACKAGE
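box::use(./utils) # Loads the utils MODULE next to the current file
box::use(../utils) # Loads the utils MODULE from the parent directory
A bare name with no slash is always treated as a package, so a local folder called utils must be imported as ./utils. A name containing a slash is a module; without a leading ./ or ../ it is looked up through the module search path (see the box.path option).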
Case Sensitivity
Module paths are case-sensitive. box::use(utils/Cleaning) will fail if your file is actually named cleaning.R:
box::use(utils/cleaning) # Correct
box::use(utils/Cleaning) # Wrong—check your file names
Relative Paths for Local Modules
When importing modules in the same directory, use ./:
box::use(./my_module) # Same directory
box::use(../shared) # Parent directory
box::use(./project/utils) # Subdirectory of the current file's directory
Namespace Differences
Note that box::use(pkg) does not attach the package—it makes it available via namespace access. This is intentional:
box::use(stats)
sd(x) # Error—not in search path
stats$sd(x) # Works correctly
Comparison with Alternatives
Here’s how box stacks up against other approaches you might consider.
vs library()
The library() function attaches packages to your search path, which can cause name conflicts and makes dependencies implicit. box::use() is explicit about what you’re importing and keeps your environment clean:
# Traditional approach—hidden dependencies
library(dplyr)
library(ggplot2)
filter(data, x > 0) # Which package's filter?
# Box approach—explicit
box::use(dplyr[filter])
box::use(ggplot2)
filter(data, x > 0) # Unambiguously from dplyr
vs devtools::load_all()
The devtools::load_all() function is designed for package development—it simulates installing and loading a package you’re working on. box, by contrast, is designed for production modularity in non-package projects. Use load_all() while building formal packages; use box for modular scripts and applications.
vs Namespace Manipulation
Traditional R namespaces (:: access) require you to create a formal package with a NAMESPACE file. box gives you namespace-like isolation without the package build process. You get the benefits of proper modularity with a fraction of the overhead.
Summary
The box package brings modern module thinking to R. Here’s what you learned:
- Modules are just files: Any R file in a folder can be a module
- Export with #' @export: Like roxygen2, mark functions as available to importers
- Import explicitly: Use box::use() for modules and packages
- Access via $: Imported modules work like named lists
- Stay local: Imports don’t pollute your global environment
- Reload during dev: box::reload() picks up changes instantly
- Nest folders: Organize code in directory hierarchies
For projects too small to warrant a full package but too large for a single script, box provides the structure you need. Your future self—and your collaborators—will thank you for writing modular code that’s easy to understand, test, and maintain.