How to Sample Without Replacement in R with sample()
To sample without replacement in R, call sample() with replace = FALSE, which is the default. Each element of the population can be selected at most once, so no element is drawn twice. This is the usual choice for train/test splits and random subset selection, where duplicates would be meaningless.
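# Sample 3 numbers from 1 to 10 without replacement
sample(1:10, size = 3)
# [1] 4 9 2   (example output; varies from run to run without a seed)
# Shuffle all elements (size defaults to the length of the input)
sample(1:10)
# [1] 3 9 5 10 6 1 2 4 8 7   (example output)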
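For data frames, use dplyr::slice_sample(df, n = 5) or subset by row index: df[sample(nrow(df), 5), ]. Call set.seed() before sampling so the result is reproducible. R 3.6.0 did not change the default random number generator itself, but it did change the default algorithm sample() uses to turn random numbers into indices (sample.kind went from "Rounding" to "Rejection"). The same seed can therefore give different draws in older versions, so record the R version alongside your seed for full reproducibility.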
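As a minimal sketch of the reproducibility advice above, the snippet below resets the seed and checks that the same draw comes back, then prints the details worth recording next to the seed. The seed value 2024 is arbitrary and chosen only for illustration.

```r
set.seed(2024)            # arbitrary seed value, for illustration only
first  <- sample(1:10, size = 3)

set.seed(2024)            # resetting the seed replays the same draw
second <- sample(1:10, size = 3)

identical(first, second)  # TRUE

# Details worth recording alongside the seed
R.version.string          # the running R version
RNGkind()                 # generator kinds, including the sample kind in R >= 3.6.0
```

If you need to reproduce results produced before R 3.6.0, RNGkind(sample.kind = "Rounding") restores the older sampling behaviour, at the cost of a warning.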
# Create train/test indices without replacement
set.seed(42)
indices <- sample(100, size = 70)
train <- indices
test <- setdiff(1:100, indices)
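The same index-based approach extends to a data frame. The following is a sketch under stated assumptions: mtcars (a data set bundled with R) stands in for your own data, and the 70/30 split and the seed are illustrative choices, not recommendations.

```r
# Illustrative only: mtcars stands in for your own data frame
df <- mtcars

set.seed(42)  # arbitrary seed, for reproducibility

# Draw 70% of the row indices without replacement
n_train   <- floor(0.7 * nrow(df))
train_idx <- sample(nrow(df), size = n_train)

train <- df[train_idx, ]
test  <- df[-train_idx, ]  # negative indexing keeps every row not in train_idx

# The two sets are disjoint and together cover every row
nrow(train) + nrow(test) == nrow(df)  # TRUE
anyDuplicated(train_idx) == 0         # TRUE

# A random 5-row subset with dplyr, if it is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  subset5 <- dplyr::slice_sample(df, n = 5)  # replace = FALSE is the default
}
```

Negative indexing is used for the test set because it guarantees the two sets never share a row, which is the property sampling without replacement is meant to provide.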
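Weighted sampling uses the prob argument: sample(x, n, prob = weights), where weights is a vector of weights the same length as x. The weights do not need to sum to one, because R normalises them automatically, but they must be non-negative and not all zero. Without replacement, the weights are applied sequentially: each draw is made in proportion to the weights of the elements that remain. Independently of weighting, sample() requires size <= length(x) when replace = FALSE; requesting more items than the population holds raises an error.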
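A short sketch of both points follows: weighted sampling without replacement, and the error raised when size exceeds the population. The vector x, the weights, and the seed are hypothetical values chosen for illustration.

```r
x       <- c("a", "b", "c", "d")
weights <- c(5, 3, 1, 1)  # hypothetical weights; they sum to 10, not 1

set.seed(1)  # arbitrary seed
picked <- sample(x, size = 2, prob = weights)
anyDuplicated(picked) == 0  # TRUE: no element is drawn twice

# Asking for more items than exist fails when replace = FALSE.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  sample(x, size = 5),
  error = function(e) conditionMessage(e)
)
result  # the error message, as a character string
```

Wrapping the oversized request in tryCatch() is only for demonstration. In real code it is usually simpler to check size <= length(x) before calling sample().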