Functions and Control Flow in R
Functions and control flow are the building blocks of any R program. Whether you’re cleaning data, running statistical analyses, or building Shiny applications, you’ll constantly need to write custom functions to encapsulate logic and use conditionals and loops to handle different scenarios.
In this tutorial, you’ll learn how to write your own functions in R, make decisions with if-else statements, and repeat operations with loops. By the end, you’ll be able to write reusable code that makes your analyses more efficient and easier to maintain.
Why Write Functions?
Before diving into the syntax, it’s worth understanding why functions matter. Without functions, you’d copy and paste the same code multiple times throughout your script. This creates several problems:
- Hard to maintain — If you find a bug, you have to fix it everywhere
- Hard to read — Long scripts with repeated code are difficult to navigate
- Error-prone — Subtle differences between copies can cause unexpected results
Functions solve all three problems. A well-written function is:
- Reusable — Call it anywhere in your code
- Testable — Verify it works once, then trust it everywhere
- Self-documenting — A good function name explains what it does
Writing Your First Function
Functions in R are created with the function keyword and are usually assigned to a name. The basic structure is:
function_name <- function(arguments) {
# function body
# computations
return(value) # optional
}
Here’s a simple example:
# A function that adds two numbers
add_numbers <- function(a, b) {
result <- a + b
return(result)
}
# Call the function
add_numbers(5, 3)
# [1] 8
When you call add_numbers(5, 3), R assigns 5 to a and 3 to b, executes the body, and returns 8.
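Arguments can also be matched by name instead of by position. The sketch below uses a hypothetical subtract() function (not part of the examples above), chosen because the order of its arguments changes the result:

```r
# Hypothetical helper for illustration: order of arguments matters here
subtract <- function(a, b) {
  a - b
}

# Positional matching: 5 goes to a, 3 goes to b
by_position <- subtract(5, 3)

# Named matching: the order in the call no longer matters
by_name <- subtract(b = 3, a = 5)

# Swapping positional arguments changes the result
swapped <- subtract(3, 5)

print(by_position)  # [1] 2
print(by_name)      # [1] 2
print(swapped)      # [1] -2
```

Naming arguments in the call tends to make longer calls easier to read, at the cost of a little extra typing.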
Implicit Returns
The return() statement is optional. If omitted, R returns the last evaluated expression:
multiply <- function(x, y) {
x * y # This gets returned automatically
}
multiply(4, 7)
# [1] 28
This works, but explicit return() statements make your code clearer, especially when returning early from a function.
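As a rough illustration of an early return, the sketch below uses a hypothetical safe_divide() helper. The name and the choice to return NA for a zero divisor are assumptions made for this example only:

```r
# Hypothetical helper: divide two numbers, giving NA when the divisor is zero
safe_divide <- function(numerator, denominator) {
  if (denominator == 0) {
    return(NA_real_)  # early exit: the rest of the body is skipped
  }
  numerator / denominator  # last expression, returned implicitly
}

print(safe_divide(10, 2))  # [1] 5
print(safe_divide(10, 0))  # [1] NA
```

Here the explicit return() marks the exceptional path, while the normal result is simply the last expression.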
Function Arguments with Default Values
R functions can have default argument values, making them more flexible:
greet <- function(name = "World") {
# paste0() joins the pieces without inserting extra spaces
greeting <- paste0("Hello, ", name, "!")
return(greeting)
}
greet() # Uses default
# [1] "Hello, World!"
greet("Alice") # Overrides default
# [1] "Hello, Alice!"
You can also require certain arguments and make others optional:
calculate_bmi <- function(weight_kg, height_m, round_result = TRUE) {
bmi <- weight_kg / (height_m ^ 2)
# Round to one decimal place unless the caller asks for the raw value
if (round_result) {
return(round(bmi, 1))
} else {
return(bmi)
}
}
calculate_bmi(70, 1.75) # Returns 22.9
calculate_bmi(70, 1.75, FALSE) # Returns 22.85714
Conditional Execution with if-else
The if statement executes code only when a condition is TRUE:
x <- 10
if (x > 5) {
print("x is greater than 5")
}
# [1] "x is greater than 5"
The condition must be a single TRUE or FALSE value. If you have a vector, use any() or all():
values <- c(1, 2, 3, 4, 5)
if (any(values > 3)) {
print("At least one value is greater than 3")
}
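To make the difference between any() and all() concrete, here is a small sketch on the same values vector. The variable names holding the results are illustrative only:

```r
values <- c(1, 2, 3, 4, 5)

any_above_3  <- any(values > 3)  # TRUE: 4 and 5 qualify
all_above_3  <- all(values > 3)  # FALSE: 1, 2 and 3 do not
all_positive <- all(values > 0)  # TRUE: every element passes

# Each result is a single logical value, so it is safe to use in if
if (all_positive) {
  print("Every value is positive")
}
```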
if-else Chains
For multiple conditions, use else if:
score <- 75
if (score >= 90) {
grade <- "A"
} else if (score >= 80) {
grade <- "B"
} else if (score >= 70) {
grade <- "C"
} else if (score >= 60) {
grade <- "D"
} else {
grade <- "F"
}
grade
# [1] "C"
R evaluates conditions top-to-bottom and stops at the first TRUE.
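Because if is an expression that yields a value in R, the same chain can serve as the body of a function. A minimal sketch, assuming a hypothetical score_to_grade() wrapper around the thresholds used above:

```r
# Hypothetical wrapper around the grading chain
score_to_grade <- function(score) {
  if (score >= 90) {
    "A"
  } else if (score >= 80) {
    "B"
  } else if (score >= 70) {
    "C"
  } else if (score >= 60) {
    "D"
  } else {
    "F"
  }
}

# The function handles one score at a time, so apply it element by element
grades <- vapply(c(95, 75, 42), score_to_grade, character(1))
print(grades)  # [1] "A" "C" "F"
```

The order of the branches matters: if the score >= 60 test came first, a score of 95 would match it and be graded "D".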
Vectorized ifelse
The ifelse() function provides vectorized conditional logic:
x <- c(1, 5, 10, -3)
ifelse(x > 0, "positive", "negative")
# [1] "positive" "positive" "positive" "negative"
For long vectors, this is typically much faster than an explicit loop over the elements.
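Calls to ifelse() can be nested when there are more than two outcomes. The sketch below extends the example with a zero and a missing value (both added here for illustration) to show how each is handled:

```r
x <- c(1, 5, 0, -3, NA)

# The inner ifelse() is only consulted where x > 0 is FALSE
labels <- ifelse(x > 0, "positive",
                 ifelse(x < 0, "negative", "zero"))

print(labels)
# [1] "positive" "positive" "zero"     "negative" NA
```

A missing value in the test produces a missing value in the result, so NA inputs are not silently assigned to either label.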
For Loops
For loops iterate over a sequence of values:
# Print numbers 1 to 5
for (i in 1:5) {
print(i)
}
# [1] 1
# [1] 2
# [1] 3
# [1] 4
# [1] 5
The loop variable i takes each value in the sequence 1:5 in turn.
Collecting Results
Growing a vector inside a loop forces R to copy it repeatedly, so pre-allocate space for the results to make loops faster:
squares <- numeric(5)
for (i in 1:5) {
squares[i] <- i^2
}
squares
# [1] 1 4 9 16 25
Better yet, use vectorized operations when possible:
squares <- (1:5) ^ 2
squares
# [1] 1 4 9 16 25
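When an operation has no vectorized form, vapply() is one middle ground: it applies a function to each element and checks that every result has the declared type and length. A small sketch comparing the three approaches (the variable names are illustrative):

```r
# 1. Pre-allocated loop
squares_loop <- numeric(5)
for (i in 1:5) {
  squares_loop[i] <- i^2
}

# 2. vapply(): each call must return exactly one numeric value
squares_vapply <- vapply(1:5, function(i) i^2, numeric(1))

# 3. Fully vectorized
squares_vec <- (1:5)^2

print(squares_vapply)  # [1]  1  4  9 16 25
```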
Iterating Over Vectors
You can loop over any vector:
fruits <- c("apple", "banana", "cherry")
for (fruit in fruits) {
print(paste("I like", fruit))
}
# [1] "I like apple"
# [1] "I like banana"
# [1] "I like cherry"
Or use indices:
for (i in seq_along(fruits)) {
print(paste(i, ":", fruits[i]))
}
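One reason to prefer seq_along() over 1:length(x) is how each behaves on an empty vector. The sketch below counts loop iterations in both cases; the counter names are made up for this example:

```r
empty <- character(0)

# 1:length(empty) is 1:0, which counts down: c(1, 0)
risky_iterations <- 0
for (i in 1:length(empty)) {
  risky_iterations <- risky_iterations + 1
}

# seq_along(empty) is an empty sequence, so the body never runs
safe_iterations <- 0
for (i in seq_along(empty)) {
  safe_iterations <- safe_iterations + 1
}

print(risky_iterations)  # [1] 2
print(safe_iterations)   # [1] 0
```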
While Loops
While loops repeat until a condition becomes FALSE:
countdown <- 5
while (countdown > 0) {
print(countdown)
countdown <- countdown - 1
}
print("Liftoff!")
# [1] 5
# [1] 4
# [1] 3
# [1] 2
# [1] 1
# [1] "Liftoff!"
Be careful with while loops — they can run forever if the condition never becomes FALSE.
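One common safeguard is to cap the number of iterations. A minimal sketch, assuming an arbitrary limit of 1000 and a loop that halves a value until it drops to 1 or below:

```r
value <- 100
iterations <- 0
max_iterations <- 1000  # assumed safety cap, chosen arbitrarily for this example

# Stop when the value is small enough OR the cap is reached
while (value > 1 && iterations < max_iterations) {
  value <- value / 2
  iterations <- iterations + 1
}

print(iterations)  # [1] 7
print(value)       # [1] 0.78125
```

If the main condition were mistyped so that it never became FALSE, the cap would still end the loop.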
Control Statements
break — Exit the Loop
Use break to exit a loop early:
for (i in 1:10) {
if (i == 6) {
break
}
print(i)
}
# Prints 1 through 5
next — Skip an Iteration
Use next to skip to the next iteration:
for (i in 1:5) {
if (i == 3) {
next # Skip printing 3
}
print(i)
}
# Prints 1, 2, 4, 5
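Both statements also work in while loops and in repeat, R's loop with no built-in condition, which relies entirely on break to stop. The following sketch, whose variable names are illustrative, sums the odd numbers below 10:

```r
total <- 0
i <- 0

repeat {
  i <- i + 1
  if (i >= 10) {
    break  # the only way out of a repeat loop
  }
  if (i %% 2 == 0) {
    next   # skip even numbers
  }
  total <- total + i
}

print(total)  # [1] 25
```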
Practical Example: A Temperature Converter
Let’s build a complete function that converts temperatures, handles invalid inputs gracefully, and demonstrates multiple control flow patterns:
convert_temp <- function(value, unit = "C") {
# Validate input
if (!is.numeric(value)) {
return("Error: Temperature must be a number")
}
# Convert based on unit
if (unit == "C") {
# Celsius to Fahrenheit
fahrenheit <- (value * 9/5) + 32
return(fahrenheit)
} else if (unit == "F") {
# Fahrenheit to Celsius
celsius <- (value - 32) * 5/9
return(celsius)
} else {
return("Error: Unit must be 'C' or 'F'")
}
}
# Test the function
convert_temp(100, "C")
# [1] 212
convert_temp(32, "F")
# [1] 0
convert_temp("hot", "C")
# [1] "Error: Temperature must be a number"
This example demonstrates:
- Input validation with if
- Multiple conditions with else if
- Early return statements
- Error handling for invalid inputs
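Returning an error message as a string means callers have to inspect the result to notice a failure. One alternative is to signal an error with stop() and let the caller decide how to recover with tryCatch(). The sketch below is a variation on the example, not a replacement for it; the name convert_temp_strict and the choice to fall back to NA are assumptions made for illustration:

```r
# Hypothetical variant of convert_temp() that signals errors with stop()
convert_temp_strict <- function(value, unit = "C") {
  if (!is.numeric(value)) {
    stop("Temperature must be a number")
  }
  if (unit == "C") {
    (value * 9/5) + 32   # Celsius to Fahrenheit
  } else if (unit == "F") {
    (value - 32) * 5/9   # Fahrenheit to Celsius
  } else {
    stop("Unit must be 'C' or 'F'")
  }
}

# The caller chooses what happens on failure
result <- tryCatch(
  convert_temp_strict("hot", "C"),
  error = function(e) {
    message("Conversion failed: ", conditionMessage(e))
    NA_real_  # fallback value
  }
)

print(convert_temp_strict(100, "C"))  # [1] 212
print(result)                         # [1] NA
```

With this design a valid call always returns a number, so downstream code does not need to check whether it received a string.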
Best Practices
Follow these guidelines to write clean, maintainable R code:
- Use meaningful names — Function names should describe what they do (calculate_bmi, not func1)
- Keep functions short — Each function should do one thing well
- Document your functions — Use comments to explain inputs, outputs, and purpose
- Handle errors gracefully — Check for invalid inputs at the start
- Prefer vectorization — Use vectorized operations instead of loops when possible
- Use snake_case — Many R style guides, including the tidyverse style guide, recommend snake_case for variable and function names
Common Pitfalls
Watch out for these common mistakes:
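Forgetting to return after a loop:
# Wrong: the last expression is the for loop, which evaluates to NULL
f <- function(x) { total <- 0; for (v in x) total <- total + v }
# Correct: return the result once the loop has finished
f <- function(x) { total <- 0; for (v in x) total <- total + v; return(total) }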
Not vectorizing conditions:
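# Wrong: a condition of length 2 is an error in R 4.2.0 and later (older versions used only the first element, with a warning)
if (c(TRUE, FALSE)) { "yes" }
# Correct: collapse the vector to a single value first
if (all(c(TRUE, FALSE))) { "yes" }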
Infinite loops:
# Wrong: never ends
i <- 1
while (i > 0) { i <- i + 1 }
# Correct: has termination condition
i <- 1
while (i <= 5) { print(i); i <- i + 1 }
Summary
You now know how to:
- Create functions with function() and use arguments
- Return values explicitly or implicitly
- Use default argument values for flexibility
- Control flow with if, else if, and else
- Iterate with for and while loops
- Use break and next for loop control
- Handle errors gracefully with input validation
These fundamentals will serve you well as you build more complex R programs. Functions let you encapsulate logic into reusable components, while control flow lets you handle the diverse conditions you’ll encounter in real-world data analysis.
In the next tutorial, we’ll cover importing and exporting data — the essential skill for reading your data into R and saving your results.