How to sample random rows from a data frame in R
· 2 min read · Updated March 14, 2026 · beginner
Sampling random rows from a data frame is a routine step when creating train/test splits, running bootstrap resampling, or randomizing row order.
With dplyr
The tidyverse way uses slice_sample():
library(dplyr)
set.seed(42) # for reproducibility
# Sample 5 random rows without replacement
sampled <- df %>% slice_sample(n = 5)
# Sample 10% of rows
sampled <- df %>% slice_sample(prop = 0.1)
# Sample with replacement (allows duplicate rows)
sampled <- df %>% slice_sample(n = 5, replace = TRUE)
# Weighted sampling (probability proportional to a column)
sampled <- df %>% slice_sample(n = 5, weight_by = some_column)
# Sample all rows in random order (shuffle)
shuffled <- df %>% slice_sample(prop = 1)
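To try these calls without your own data, here is a minimal sketch that assumes the built-in mtcars dataset stands in for df. The seed value and the choice of hp as the weighting column are arbitrary and only for illustration.

```r
library(dplyr)

set.seed(42)  # arbitrary seed, only so the draw is repeatable

# mtcars ships with R and has 32 rows; it stands in for `df` here
five_rows <- mtcars %>% slice_sample(n = 5)
quarter   <- mtcars %>% slice_sample(prop = 0.25)

# Weighted draw: rows with larger `hp` are more likely to be picked
weighted  <- mtcars %>% slice_sample(n = 5, weight_by = hp)

nrow(five_rows)  # 5
nrow(quarter)    # 8 (0.25 * 32)
nrow(weighted)   # 5
```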
With base R
Base R uses sample() with row indices:
# Sample 5 random row indices without replacement
set.seed(42) # for reproducibility
indices <- sample(nrow(df), size = 5)
sampled <- df[indices, ]
# Sample 10% of rows
n_to_sample <- round(nrow(df) * 0.1)
sampled <- df[sample(nrow(df), n_to_sample), ]
# Sample with replacement
sampled <- df[sample(nrow(df), 5, replace = TRUE), ]
# Randomize row order
shuffled <- df[sample(nrow(df)), ]
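One thing to watch with base R indexing: when a data frame has a single column, df[indices, ] returns a plain vector instead of a data frame. The sketch below wraps the pattern in a hypothetical helper, sample_rows() (not a base R function), that passes drop = FALSE to keep the result a data frame.

```r
# Hypothetical helper, not part of base R: sample rows and always
# return a data frame, even when `df` has only one column.
sample_rows <- function(df, n, replace = FALSE) {
  indices <- sample(nrow(df), size = n, replace = replace)
  df[indices, , drop = FALSE]
}

set.seed(42)  # arbitrary seed
one_col <- data.frame(x = 1:10)

class(one_col[sample(nrow(one_col), 3), ])  # "integer": the data frame was dropped
class(sample_rows(one_col, 3))              # "data.frame"
```

For data frames with two or more columns, the plain df[indices, ] form shown above already returns a data frame.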
With data.table
library(data.table)
dt <- as.data.table(df)
# Sample 5 random rows without replacement
sampled <- dt[sample(.N, 5)]
# Sample 10% of rows
sampled <- dt[sample(.N, round(.N * 0.1))]
# Sample with replacement
sampled <- dt[sample(.N, 5, replace = TRUE)]
# Shuffle all rows
shuffled <- dt[sample(.N)]
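These calls rely on .N, which inside dt[...] evaluates to the number of rows in the table. A minimal runnable sketch, assuming the built-in mtcars data in place of df and an arbitrary seed:

```r
library(data.table)

set.seed(42)  # arbitrary seed for a repeatable draw
dt <- as.data.table(mtcars)  # built-in data, stands in for `df`

# Inside dt[...], .N is the number of rows in dt (32 here),
# so sample(.N, 5) draws 5 distinct row numbers from 1..32
sampled <- dt[sample(.N, 5)]

nrow(sampled)           # 5
is.data.table(sampled)  # TRUE
```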
Train/Test Split Example
A common use case is creating training and testing sets:
library(dplyr)
set.seed(123)
# Split: 70% train, 30% test
n <- nrow(df)
train_indices <- sample(n, size = round(0.7 * n)) # round so the size is a whole number
train <- df %>% slice(train_indices)
test <- df %>% slice(-train_indices) # remaining rows
Or with base R:
set.seed(123)
n <- nrow(df)
indices <- sample(n) # shuffled row numbers
train_size <- round(0.7 * n)
train <- df[indices[1:train_size], ]
test <- df[indices[(train_size + 1):n], ]
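A quick check that the split is a true partition can catch indexing mistakes. The sketch below assumes the built-in iris data in place of df and uses negative indexing for the test set; the seed is arbitrary.

```r
set.seed(123)  # arbitrary seed

df <- iris  # built-in data with 150 rows, stands in for your data frame
n <- nrow(df)

train_indices <- sample(n, size = round(0.7 * n))
train <- df[train_indices, ]
test  <- df[-train_indices, ]

# The two sets should be disjoint and together cover every row
nrow(train)                                         # 105
nrow(test)                                          # 45
length(intersect(rownames(train), rownames(test)))  # 0
```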
Stratified Sampling
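Group the data first, then sample within each group. The example draws a fixed number of rows per group; pass prop instead of n to sample the same fraction of every group: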
library(dplyr)
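# Sample 5 rows from each group (groups with fewer than 5 rows are returned whole)
stratified <- df %>%
  group_by(category) %>%
  slice_sample(n = 5) %>%
  ungroup() # drop the grouping so later steps see one data frame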
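When groups differ in size, sampling the same fraction of each group keeps the group proportions intact. A minimal sketch, assuming the built-in iris data with Species as the grouping column (both are stand-ins for df and category) and an arbitrary seed:

```r
library(dplyr)

set.seed(1)  # arbitrary seed

# iris has 50 rows per Species; draw 20% of each group
stratified <- iris %>%
  group_by(Species) %>%
  slice_sample(prop = 0.2) %>%
  ungroup()

count(stratified, Species)  # 10 rows per species, 30 rows in total
```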
Common Variations
| Task | dplyr | data.table |
|---|---|---|
| Sample n rows | slice_sample(n = 5) | dt[sample(.N, 5)] |
| Sample proportion | slice_sample(prop = 0.1) | dt[sample(.N, round(.N * 0.1))] |
| Sample with replacement | slice_sample(n = 5, replace = TRUE) | dt[sample(.N, 5, replace = TRUE)] |
| Shuffle rows | slice_sample(prop = 1) | dt[sample(.N)] |
See Also
- sample() — Base R sampling function
- dplyr::slice() — dplyr row selection functions
- head() — Select first rows