Fast File Import with readr

4 min read · Updated March 7, 2026 · beginner

When working with data in R, reading flat files like CSVs is one of the most common tasks. The readr package provides a fast and friendly way to import data into R, handling the nitty-gritty details so you can focus on analysis.

Why use readr?

The base R functions like read.csv() have served us well for decades, but they come with some baggage. They are relatively slow on large files, and some of their defaults can surprise you: read.csv() rewrites column names to make them syntactically valid and, before R 4.0.0, converted strings to factors by default.

readr addresses these issues by being:

  • Faster: parses files more efficiently and, since readr 2.0, reads them with multiple threads
  • Smarter: guesses column types from the data, and read_delim() can guess the delimiter when you leave delim unset
  • Tidier: returns tibbles, a data frame subclass with more compact printing
  • More controllable: explicit column type specification when you need it

Installing and loading readr

If you haven’t already, install readr from CRAN:

install.packages("readr")
library(readr)

Reading CSV files

The most common use case is reading comma-separated values. Use read_csv() for this:

# Read a CSV file
df <- read_csv("my_data.csv")

# Print the tibble to inspect
df

When you print a readr tibble, you get a compact summary showing the dimensions, column names, types, and the first few rows (output truncated here):

# A tibble: 1,000 × 4
  name    age department  salary
  <chr> <dbl> <chr>        <dbl>
1 Alice    32 Engineering  75000
2 Bob      28 Sales        62000
3 Carol    45 Marketing    89000

This output format makes it easy to see at a glance what data you’re working with — much cleaner than the default data.frame printing in base R.
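To try this without a file of your own, here is a minimal sketch that writes a small CSV to a temporary file and reads it back. The file contents are made up for illustration and mirror the output shown above.

```r
library(readr)

# Write a small, made-up CSV to a temporary file so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c(
  "name,age,department,salary",
  "Alice,32,Engineering,75000",
  "Bob,28,Sales,62000",
  "Carol,45,Marketing,89000"
), path)

# show_col_types = FALSE silences the column specification message
df <- read_csv(path, show_col_types = FALSE)

# spec() returns the column types readr guessed for this file
spec(df)

# A tibble is still an ordinary data frame underneath
is.data.frame(df)
```

Note that readr guesses whole numbers such as age as double, not integer, unless you say otherwise with col_types.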

Reading TSV and other delimited files

For tab-separated values, use read_tsv():

# Read a tab-separated file
df_tsv <- read_tsv("data.tsv")

For files with other delimiters, read_delim() handles any character as a separator:

# Read a semicolon-delimited file
df_semi <- read_delim("data.txt", delim = ";")

# Read a pipe-delimited file
df_pipe <- read_delim("data.txt", delim = "|")
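As a rough illustration of the delimiter handling above, the sketch below reads two made-up semicolon-delimited files from temporary paths. It also uses read_csv2(), readr's shortcut for the common European layout where fields are separated by semicolons and the comma is the decimal mark.

```r
library(readr)

# Semicolon-delimited file with "." as the decimal mark (made-up data)
path <- tempfile(fileext = ".txt")
writeLines(c("id;city;score", "1;Lisbon;4.5", "2;Oslo;3.8"), path)
df_semi <- read_delim(path, delim = ";", show_col_types = FALSE)

# Semicolon-delimited file with "," as the decimal mark (made-up data)
path_eu <- tempfile(fileext = ".csv")
writeLines(c("id;price", "1;4,5", "2;3,8"), path_eu)
df_eu <- read_csv2(path_eu, show_col_types = FALSE)

df_semi
df_eu
```

read_csv2() prints a short note about the decimal and grouping marks it assumes. If your file mixes conventions, read_delim() with an explicit locale gives you more control.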

Specifying column types

One of readr’s superpowers is automatic type detection, but sometimes you need explicit control. The col_types argument lets you specify exactly what each column should be:

df <- read_csv("data.csv", 
               col_types = cols(
                 id = col_integer(),
                 name = col_character(),
                 value = col_double(),
                 flag = col_logical()
               ))

The available column types are:

  • col_integer() — Integer numbers
  • col_double() — Floating point numbers
  • col_character() — Strings
  • col_logical() — TRUE/FALSE values
  • col_factor() — Categorical factors
  • col_date(), col_datetime(), col_time() — Date/time types
  • col_skip() — Don’t import this column
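The sketch below combines several of these column types on a small, made-up file, and then repeats the import using readr's compact string form, where each character stands for one column in order. The column names here are illustrative only.

```r
library(readr)

# Made-up data written to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,group,joined,notes",
  "1,a,2024-01-15,first",
  "2,b,2024-02-20,second",
  "3,a,2024-03-05,third"
), path)

# Full form: one col_*() call per column
df <- read_csv(path, col_types = cols(
  id     = col_integer(),
  group  = col_factor(levels = c("a", "b")),
  joined = col_date(format = "%Y-%m-%d"),
  notes  = col_skip()
))

# Compact form: i = integer, f = factor, D = date, _ = skip
df2 <- read_csv(path, col_types = "ifD_")

df
```

The compact form is handy for quick scripts. The full form is easier to read months later and lets you pass options such as factor levels or a date format.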

Handling missing values

By default, readr treats only empty fields and the string "NA" as missing. Any other marker has to be listed in the na argument:

# Default: na = c("", "NA")
# Read as ordinary text unless listed in na: "N/A", "n/a", ".", "#NUM!"

df <- read_csv("data.csv", na = c("", "NA", "N/A"))

The na argument lets you customize what strings should be treated as missing values.
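To see why this matters, the following sketch reads the same made-up file twice: once with the default na and once with "N/A" added. With the default, the stray "N/A" forces the whole column to be read as text.

```r
library(readr)

# Made-up data: the score column mixes a number, "NA", "N/A", and an empty field
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,score",
  "1,10",
  "2,NA",
  "3,N/A",
  "4,"
), path)

# Default na = c("", "NA"): "N/A" is kept as text, so score is guessed as character
default_na <- read_csv(path, show_col_types = FALSE)

# Listing "N/A" as well lets score parse as a number
custom_na <- read_csv(path, na = c("", "NA", "N/A"), show_col_types = FALSE)

class(default_na$score)
class(custom_na$score)
```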

Reading files from URLs

readr can read directly from URLs, which is handy for accessing online datasets:

# Read a CSV directly from the web
df <- read_csv("https://example.com/data.csv")

# Read from GitHub raw URLs
df <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-11.csv")

Performance tips

When working with large files, these tips will speed things up:

Skip rows and limit columns

# Skip the first 10 lines (for example, comment lines above the header)
df <- read_csv("data.csv", skip = 10)

# Only read first 1000 rows
df <- read_csv("data.csv", n_max = 1000)

# Select specific columns
df <- read_csv("data.csv", col_select = c(id, name, value))
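These three arguments can be combined in one call. The sketch below uses a made-up file with two comment lines above the header, then keeps only two rows and two columns.

```r
library(readr)

# Made-up file: two comment lines, a header, then three data rows
path <- tempfile(fileext = ".csv")
writeLines(c(
  "# exported by a hypothetical tool",
  "# two comment lines precede the header",
  "id,name,value,extra",
  "1,a,0.5,x",
  "2,b,1.5,y",
  "3,c,2.5,z"
), path)

df <- read_csv(
  path,
  skip = 2,                    # skip the two comment lines
  n_max = 2,                   # read at most two data rows
  col_select = c(id, value),   # keep only these columns
  show_col_types = FALSE
)

df
```

skip is applied before the header is read, so the third line of the file becomes the column names.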

Specify column types upfront

Specifying types avoids the overhead of type inference:

# Faster: types known upfront
df <- read_csv("large_file.csv", 
               col_types = cols(.default = col_double()))
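If you do not know the types in advance, one possible workflow is to let readr guess them from a small sample, capture the result with spec(), and reuse it for the full read. This is a sketch on made-up data; with a real file you would check the guessed specification before relying on it.

```r
library(readr)

# Made-up data standing in for a large file
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,value", "1,a,0.5", "2,b,1.5", "3,c,2.5"), path)

# Read only the first rows and keep the column specification readr guessed
sample_df <- read_csv(path, n_max = 2, show_col_types = FALSE)
col_spec <- spec(sample_df)

# Reuse the specification so the full read does no guessing
full_df <- read_csv(path, col_types = col_spec)

col_spec
```

Passing an explicit specification also makes the import reproducible: a later file with unexpected values produces parsing problems instead of silently changing a column's type.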

Use show_col_types = FALSE

This one does not change parsing speed, but it suppresses the column specification message for cleaner output in scripts:

df <- read_csv("data.csv", show_col_types = FALSE)

Comparing with base R

Here’s a quick comparison between readr and base R functions:

Feature          readr        Base R
Returns          Tibble       Data frame
Speed            Faster       Slower
Type detection   Automatic    Automatic
String factors   No           No since R 4.0.0 (yes before)
Row names        None         Optional

The main practical differences you’ll notice are that readr returns a tibble and never creates row names. Since R 4.0.0, base R also leaves strings as character vectors by default, so automatic factor conversion only sets the two apart on older R versions.
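A quick way to see some of these differences is to read the same file with both functions. The sketch below uses a made-up file whose first column name contains a space.

```r
library(readr)

# Made-up data with a non-syntactic column name
path <- tempfile(fileext = ".csv")
writeLines(c("first name,score", "Alice,1.5", "Bob,2.5"), path)

base_df  <- read.csv(path)
readr_df <- read_csv(path, show_col_types = FALSE)

class(base_df)    # a plain data frame
class(readr_df)   # a tibble, which is also a data frame

names(base_df)    # read.csv() rewrites the name to first.name
names(readr_df)   # read_csv() keeps "first name" as written
```

To keep the original names in base R, pass check.names = FALSE to read.csv().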

Summary

The readr package is your go-to for importing flat files in R. Whether you’re loading customer data, survey results, or any tabular dataset, readr makes the process painless and predictable.

Key takeaways:

  • Use read_csv() for CSVs, read_tsv() for TSVs, and read_delim() for custom delimiters
  • Specify col_types when you need explicit control over column interpretation
  • Use na to handle missing value representations in your data
  • Take advantage of skip, n_max, and col_select for large files
  • Enjoy the tibble output and sensible defaults

With readr in your toolkit, you’ll spend less time fighting with file imports and more time doing actual analysis. The next time you need to bring data into R, reach for readr first — your future self will thank you.