Calling Python from R with reticulate
What is reticulate?
If you work with R, you probably know that R has thousands of packages for statistics, data analysis, and visualization. But Python has its own rich ecosystem—particularly for machine learning (scikit-learn, TensorFlow) and scientific computing (NumPy, Pandas). What if you could use the best of both worlds?
That’s exactly what the reticulate package does. It creates a bridge between R and Python, letting you import Python libraries, call Python functions, and pass data back and forth—all from your R code.
Think of reticulate as a bilingual assistant. You speak to it in R, and it translates your requests into Python, gets the result, and translates it back into R for you.
Installing reticulate
The package lives on CRAN, so installation is straightforward:
# Install reticulate from CRAN
install.packages("reticulate")
After installation, load it like any other package:
library(reticulate)
Importing Python modules
The most common way to use Python in R is to import a Python module. In Python, you would write import numpy. In R with reticulate, you use the import() function:
# Import Python's numpy library and give it a name we can use
np <- import("numpy")
# Now we can call numpy functions through np
arr <- np$array(c(1, 2, 3, 4, 5)) # With automatic conversion, this comes back as an R array
np$mean(arr) # Calculate the mean: returns 3
Notice the $ syntax. Where Python uses a dot (np.mean), reticulate uses $ to reach a module’s functions and attributes, rather than the :: that R uses for packages.
You can import any Python library this way—pandas, os, sklearn, and so on:
# Import multiple Python libraries
pd <- import("pandas") # For dataframes
os <- import("os") # For filesystem operations
# For scikit-learn, import specific submodules
model_selection <- import("sklearn.model_selection")
datasets <- import("sklearn.datasets")
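Attributes nest the same way they do in Python, so a submodule's functions can be reached by chaining $. The following is a minimal sketch that uses only Python's standard library; the path components are made up for illustration.

```r
library(reticulate)

# Import Python's standard os module
os <- import("os")

# os.path.join() in Python becomes os$path$join() in R
# (the path components are hypothetical)
csv_path <- os$path$join("data", "raw", "scores.csv")

# os.path.basename() strips the directory part
file_name <- os$path$basename(csv_path)
print(file_name) # "scores.csv"
```

The same pattern applies to any nested module or object attribute: each Python dot becomes a $ in R.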
Running Python code directly
Sometimes you want to run a chunk of Python code rather than import a full module. Reticulate provides two functions for this: py_run_string() for inline code and source_python() for script files. Start with py_run_string():
# Run Python code as a string
py_run_string("
x = 10
y = 20
result = x + y
")
# Access the result in R through the py object
py$result # Returns 30
The py object lets you read and write Python variables from R. Any variable you create in Python with py_run_string() becomes accessible through py$variable_name.
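Writing works in the other direction too. As a small sketch (the variable names threshold and doubled are invented for this example), you can assign an R value through py and then use it from Python code:

```r
library(reticulate)

# Write an R value into Python's main module
py$threshold <- 5L

# Python code can now use that variable
py_run_string("doubled = threshold * 2")

# Read the new Python variable back into R
doubled <- py$doubled
print(doubled) # 10
```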
You can also run Python from a file:
# Execute a Python script and make its functions available
source_python("my_functions.py")
# Call a function defined in that Python file
output <- my_python_function(42)
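To try this without an existing script, the sketch below writes a tiny Python file to a temporary location first. The function add_one and the file contents are hypothetical and stand in for whatever your own script defines.

```r
library(reticulate)

# Write a tiny Python script to a temporary file (hypothetical example content)
script <- tempfile(fileext = ".py")
writeLines(c(
  "def add_one(x):",
  "    return x + 1"
), script)

# source_python() runs the file and exposes its functions to R
source_python(script)

# The Python function is now callable like an R function
result <- add_one(41)
print(result) # 42
```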
Understanding type conversion
One of reticulate’s most useful features is automatic type conversion. When you pass data from R to Python, reticulate converts it to the equivalent Python type. When the result comes back, it converts to R again.
Here’s how the conversion works:
| R type | Becomes in Python |
|---|---|
| c(1, 2, 3) | List [1, 2, 3] |
| data.frame(...) | Pandas DataFrame |
| matrix(...) | NumPy array |
| list(a = 1, b = 2) | Dictionary {'a': 1, 'b': 2} |
| TRUE / FALSE | True / False |
| NULL | None |
You can also convert explicitly when needed:
# Explicitly convert R data to Python
r_df <- data.frame(x = 1:3, y = c("a", "b", "c"))
py_df <- r_to_py(r_df)
# Explicitly convert a Python object back to R
r_object <- py_to_r(py_df)
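One way to check these mappings yourself is to inspect the class that reticulate attaches to a converted object. This is a rough sketch rather than an exhaustive test; the variable names are illustrative.

```r
library(reticulate)

# A multi-element vector becomes a Python list
py_list <- r_to_py(c(1, 2, 3))
is_list <- inherits(py_list, "python.builtin.list")

# A named list becomes a Python dict
py_dict <- r_to_py(list(a = 1, b = 2))
is_dict <- inherits(py_dict, "python.builtin.dict")

# Converting the dict back gives a named R list again
back <- py_to_r(py_dict)
print(c(is_list, is_dict))
print(back$a)
```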
Sometimes you might want to disable automatic conversion. Pass convert = FALSE to import():
# Import without automatic conversion
np <- import("numpy", convert = FALSE)
# Now np$array() returns a Python object, not an R array
arr <- np$array(c(1, 2, 3))
# Convert manually when ready
r_arr <- py_to_r(arr)
Common pitfalls for beginners
When using reticulate, there are a few quirks that trip up newcomers. Knowing about them upfront saves hours of debugging.
Indexing starts at 0, not 1
Python uses 0-based indexing, while R uses 1-based. This matters when calling Python methods:
# In R, the first element is at position 1
vec <- c(10, 20, 30)
vec[1] # Returns 10
# In Python, the first element is at position 0
# Import with convert = FALSE so np$array() returns a Python object
np <- import("numpy", convert = FALSE)
py_vec <- np$array(c(10, 20, 30))
py_vec$item(0L) # Returns 10.0 as a Python float - note the L for integer
# Once converted back to R, normal 1-based indexing applies
r_vec <- py_to_r(py_vec)
r_vec[[1]] # Returns 10
Use the L suffix for integers
R treats numeric literals such as 2 as doubles by default. Python distinguishes between integers and floats. When a Python function expects an integer, pass it with L:
# Create a 6-element array first, keeping it as a Python object
np <- import("numpy", convert = FALSE)
arr <- np$array(c(1, 2, 3, 4, 5, 6)) # 6 elements for a 2x3 reshape
# This fails: Python receives 2.0 and 3.0, and reshape() rejects floats
# arr$reshape(2, 3)
# This works correctly
arr$reshape(2L, 3L) # Python sees 2 and 3 as integers
The L suffix tells R to create an integer rather than a numeric value.
Single-element vectors become scalars
In R, c(5) is a vector of length 1. In Python, this often becomes a scalar (a single value). If a Python function expects a list, wrap it explicitly:
# R converts c(5) to a Python scalar, not a list
# Some Python functions will reject this
# Explicitly create a list
list(5L) # This becomes [5] in Python
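The difference is easy to see with Python's built-in len(), which accepts a list but not a bare number. The sketch below is illustrative only; scalar_result is a made-up name, and the error is caught so the script keeps running.

```r
library(reticulate)

# Access Python's built-in functions
builtins <- import_builtins()

# An R list of length 1 arrives in Python as [5], so len() works
len_of_list <- builtins$len(list(5L))

# A length-1 vector arrives as the scalar 5, and len(5) raises a TypeError
scalar_result <- tryCatch(
  builtins$len(c(5L)),
  error = function(e) "TypeError raised"
)

print(len_of_list)   # 1
print(scalar_result) # "TypeError raised"
```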
Iterators can be used only once
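A Python iterator can be traversed only once. After its elements have been consumed, it stays empty:
# Create a Python iterator over range(5)
builtins <- import_builtins(convert = FALSE)
py_iterator <- as_iterator(builtins$range(5L))
# Consume every element; iterate() returns them as a list
values <- iterate(py_iterator)
# The iterator is now exhausted
leftover <- iter_next(py_iterator) # Returns NULL - nothing left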
Practical example: using pandas
Let’s put this together with a realistic example. Suppose you have an R data frame and want to sort it with Pandas:
library(reticulate)
# Create an R data frame
r_df <- data.frame(
name = c("Alice", "Bob", "Carol"),
score = c(85, 92, 78)
)
# Import pandas
pd <- import("pandas")
# Convert to a Pandas DataFrame explicitly
py_df <- r_to_py(r_df)
# Assign the result - sort_values returns a new DataFrame
py_df <- py_df$sort_values("score", ascending = FALSE)
print(py_df)
# Rows now appear in descending score order:
# Bob (92), Alice (85), Carol (78)
You can also filter with Pandas methods such as query():
# Filter rows where score > 80
filtered <- py_df$query("score > 80")
print(filtered)
# Keeps Bob (92) and Alice (85); Carol (78) is dropped
Choosing your Python environment
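If you don’t specify anything, reticulate looks for a suitable Python installation on its own (for example, one on your system PATH). You can point it at a specific interpreter, conda environment, or virtual environment instead: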
# Use a specific Python interpreter
use_python("/usr/bin/python3")
# Use a conda environment
use_condaenv("my-environment")
# Use a virtual environment
use_virtualenv("my-venv")
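This is useful when you need specific package versions or want to keep your R and Python dependencies separate. Call these functions before the first import() or other Python call, because reticulate binds to one interpreter per R session.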
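To confirm which interpreter was actually picked up, you can inspect the configuration. A minimal sketch, assuming some Python installation is available to reticulate:

```r
library(reticulate)

# Report the interpreter reticulate is bound to.
# Note: py_config() initializes Python if it has not started yet.
config <- py_config()

print(config$python)  # Path to the Python executable in use
print(config$version) # Python version of that interpreter
```

Because py_config() initializes Python, run it after any use_python(), use_condaenv(), or use_virtualenv() call, not before.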
Summary
The reticulate package opens up Python’s ecosystem to R users. You can import any Python module with import(), run inline Python with py_run_string(), and let reticulate handle the data conversion automatically. Watch out for the 0-based indexing and integer type quirks, and you’ll have a powerful combination of both languages at your fingertips.
See also
- The install.packages() function for installing R packages from CRAN
- R’s library() function for loading R packages
- R’s data.frame() function for data exchange between R and Python