Dates and Times with lubridate
Working with dates and times is one of the most frustrating parts of data analysis. Base R has the POSIXct and Date classes, but using them feels like fighting the language rather than working with it. Enter lubridate, a tidyverse package that makes dates and times genuinely enjoyable to work with.
This guide covers the essential lubridate functions you’ll use daily: parsing strings into date objects, extracting date components, and performing arithmetic with dates.
Why lubridate?
Base R stores dates as the number of days since 1970-01-01 and datetimes as seconds since that epoch. This underlying numeric representation is elegant but painful to work with directly. You end up writing cryptic strptime formats and wrestling with time zones.
lubridate solves this with intuitive function names that match the date formats you encounter. The function name itself tells R how to parse the string. No more memorizing format codes.
Installing and loading lubridate
install.packages("lubridate")
library(lubridate)
Or install the full tidyverse:
install.packages("tidyverse")
library(tidyverse)
Parsing dates
The parsing functions are lubridate’s signature feature. The function name encodes the order of year (y), month (m), and day (d) in your string.
# Year-Month-Day
ymd("2024-03-15")
ymd(20240315) # Also works with numeric input
# Day-Month-Year
dmy("15-03-2024")
# Month-Day-Year
mdy("03/15/2024")
# Year-Month
ym("2024-03")
# Various separators work automatically
ymd("2024/03/15")
ymd("2024.03.15")
dmy("15.03.2024")
What if your data uses different separators? As long as the component order is the same, lubridate handles them in one call:
# Multiple separators in one vector, all year-month-day
dates <- c("2024-01-15", "2024/01/15", "20240115")
ymd(dates) # Works because every element is in year-month-day order
The key insight: choose the function whose letters match the order in your string. That’s it.
Parsing date-times
For timestamps with hours, minutes, and seconds, add the time components to the function name. ymd_hms() handles the full timestamp, while variants like ymd_hm() or mdy_hms() adapt to your input format. By default, lubridate assigns UTC as the time zone, but you can override this with the tz argument — always specify it explicitly when your data comes from a known local time zone to avoid misinterpretation.
# Full timestamps with hours, minutes, seconds
ymd_hms("2024-03-15 14:30:00")
mdy_hm("03/15/2024 2:30 PM")
dmy_hms("15-03-2024 14:30:45")
# Specify time zone explicitly
ymd_hms("2024-03-15 14:30:00", tz = "America/New_York")
ymd_hms("2024-03-15 14:30:00", tz = "Europe/London")
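To make the effect of the tz argument concrete, here is a minimal sketch that parses the same clock string in two zones and compares the resulting instants. The variable names are illustrative only.

```r
library(lubridate)

# The same wall-clock string, parsed in two different zones
ny  <- ymd_hms("2024-03-15 14:30:00", tz = "America/New_York")
ldn <- ymd_hms("2024-03-15 14:30:00", tz = "Europe/London")

# On this date New York is on EDT (UTC-4) and London is on GMT (UTC+0),
# so the two values are different instants, four hours apart
gap_hours <- as.numeric(difftime(ny, ldn, units = "hours"))
gap_hours

# Viewing both in UTC shows the different clock times
with_tz(ny, "UTC")
with_tz(ldn, "UTC")
```

The string alone does not identify a moment in time; the zone you supply does.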
Getting current date and time
today() returns the current system date, and now() returns the current date-time in the local time zone. These are useful for calculating ages from a birth date, computing durations relative to the present, or timestamping program output. Both functions reflect the system clock, so results differ between machines and CI environments.
today() # Current date, e.g. [1] "2026-03-11"
now() # Current date-time in local timezone
Extracting date components
Once you have a parsed date, lubridate’s accessor functions pull out individual components as numeric values — ideal for grouping by year, filtering by month, or creating calendar-based features for a model. Adding label = TRUE returns ordered factors with human-readable labels instead of integers.
date <- ymd("2024-03-15")
year(date) #> 2024
month(date) #> 3
day(date) #> 15
# Human-readable labels
month(date, label = TRUE) #> Mar (ordered factor)
wday(date) #> 6 (Friday; 1 = Sunday by default)
wday(date, label = TRUE) #> Fri
# Calendar positioning
yday(date) #> 75 (day of year)
week(date) #> 11 (week of year)
For datetimes, you can extract time components too:
datetime <- ymd_hms("2024-03-15 14:30:45")
hour(datetime) # 14
minute(datetime) # 30
second(datetime) # 45
tz(datetime) # "UTC"
You can also set components using the same functions:
# Change the month
month(date) <- 6
date
# [1] "2024-06-15"
# Change the hour
hour(datetime) <- 18
datetime
# [1] "2024-03-15 18:30:45"
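As a small illustration of how the accessors in this section support grouping and filtering, the sketch below counts dates per month and flags weekends. The sales_dates vector is made-up sample data.

```r
library(lubridate)

# Hypothetical sample data
sales_dates <- ymd(c("2024-01-05", "2024-01-20", "2024-02-11", "2024-03-15"))

# Count observations per calendar month
by_month <- table(month(sales_dates))
by_month

# Flag weekends: with the default week start, 1 = Sunday and 7 = Saturday
is_weekend <- wday(sales_dates) %in% c(1, 7)
is_weekend
```

The same accessors work inside dplyr verbs such as mutate() and group_by() if you prefer a data frame workflow.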
Date arithmetic
This is where lubridate really shines. You can add and subtract timespans from dates using simple arithmetic.
Durations
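Durations represent an exact number of seconds. Their constructors carry a d prefix, which distinguishes them from the calendar-based periods described below:
# Add 10 days (exactly 10 * 86400 seconds)
today() + ddays(10)
# Subtract 3 hours
now() - dhours(3)
# Add multiple units
ymd("2024-03-15") + dweeks(2) + ddays(3)
Available duration functions: dseconds(), dminutes(), dhours(), ddays(), dweeks(), dmonths(), dyears(). The unprefixed seconds(), minutes(), hours(), days(), weeks(), and months() create periods instead.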
# Duration between two dates
start <- ymd("2024-01-01")
end <- ymd("2024-03-15")
end - start
# Time difference of 74 days
The result is a difftime object. To get the numeric value:
as.numeric(end - start, units = "days")
# 74
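The following sketch shows that a duration is just a count of seconds and that a difftime can be read in other units. The variable names are illustrative and the dates match the example above.

```r
library(lubridate)

# A duration of one day is exactly 86400 seconds
one_day <- ddays(1)
as.numeric(one_day)

first_day <- ymd("2024-01-01")
last_day  <- ymd("2024-03-15")
gap <- last_day - first_day

# Read the same difftime in different units
gap_weeks <- as.numeric(gap, units = "weeks")
gap_secs  <- as.numeric(gap, units = "secs")

# Adding the gap back as a duration recovers the end date
first_day + ddays(74) == last_day
```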
Periods
Periods are different. They represent human-friendly units like “1 month” or “2 hours”, whose length in seconds can vary. This matters for months and years:
# Adding a month to January 31st
jan31 <- ymd("2024-01-31")
jan31 + months(1)
# NA (February 31st does not exist, so lubridate returns NA)
jan31 %m+% months(1)
# 2024-02-29 (%m+% rolls back to the last valid day of the month)
# Adding a month to March 31st
mar31 <- ymd("2024-03-31")
mar31 %m+% months(1)
# 2024-04-30 (April has only 30 days)
Use %m+% and %m-% whenever the starting day may not exist in the target month. If you need exact seconds regardless of calendar quirks, use durations instead.
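Month arithmetic from a month-end date is a common stumbling block, so here is a short sketch comparing plain period addition with lubridate's rollback helpers. It uses the vectorised form months(0:3) to build a small sequence.

```r
library(lubridate)

jan31 <- ymd("2024-01-31")

# Plain addition yields NA wherever the 31st does not exist
plain <- jan31 + months(0:3)
plain

# %m+% rolls back to the last valid day of each month
rolled <- jan31 %m+% months(0:3)
rolled

# add_with_rollback() can instead roll to the first day of the following month
first_of_next <- add_with_rollback(jan31, months(1), roll_to_first = TRUE)
first_of_next
```

Which behaviour is right depends on the business rule, so it is worth choosing explicitly rather than relying on the default.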
Intervals
An interval is a span of time anchored to specific start and end points:
# Create an interval
start <- ymd("2024-01-01")
end <- ymd("2024-03-15")
interval <- interval(start, end)
# Check if a date falls within an interval
ymd("2024-02-15") %within% interval
# TRUE
# Or if two intervals overlap
another_interval <- interval(ymd("2024-03-01"), ymd("2024-04-01"))
int_overlaps(interval, another_interval)
# TRUE
Intervals are particularly useful for detecting overlaps in event data or calculating precise spans across DST boundaries.
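Because an interval knows its endpoints, it can be measured in several ways. The sketch below uses a variable called span (rather than interval, which would shadow the function name) and the same dates as above.

```r
library(lubridate)

span <- interval(ymd("2024-01-01"), ymd("2024-03-15"))

# Exact length in seconds
span_secs <- int_length(span)

# Length expressed in a chosen unit
span_days <- time_length(span, "days")

# The same span as a fixed duration and as a calendar period
as.duration(span)
span_period <- as.period(span)
span_period
```

The period form reads as calendar months and days, while the duration form is a fixed number of seconds.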
Time zones
Two functions handle time zones, and only one of them preserves the underlying moment in time:
# with_tz: display the same moment in a different time zone
now_ny <- now(tzone = "America/New_York")
with_tz(now_ny, tzone = "Europe/London")
# force_tz: keep the clock time, change the time zone (changes the moment!)
force_tz(now_ny, tzone = "Europe/London")
The difference is subtle but important. with_tz keeps the instant and changes the clock time that is displayed. force_tz keeps the clock time and attaches a different zone, so the result is a different instant.
Common patterns
A few patterns you’ll use constantly:
# Calculate age from birthdate
birthdate <- ymd("1990-05-15")
age <- interval(birthdate, today()) / years(1)
as.numeric(age)
# Find the last day of a month
floor_date(ymd("2024-03-15"), "month") + months(1) - days(1)
# Or round up to the next month and step back one day
ceiling_date(ymd("2024-03-15"), "month") - days(1)
# Round to nearest unit
round_date(ymd_hms("2024-03-15 14:30:30"), "hour")
# 2024-03-15 15:00:00
# Create sequences of dates
seq(from = today(), by = "1 week", length.out = 10)
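The age pattern above returns fractional years. If you want completed years, integer division of an interval by a period is one option. The sketch below uses a fixed reference date (as_of, a name chosen for illustration) so the result is reproducible.

```r
library(lubridate)

birthdate <- ymd("1990-05-15")
as_of     <- ymd("2024-03-15")  # fixed reference date instead of today()

# Fractional age in years
age_exact <- interval(birthdate, as_of) / years(1)

# Completed years only
age_whole <- interval(birthdate, as_of) %/% years(1)
age_whole
```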
Time zones in lubridate
Time zone handling is one of the most error-prone aspects of date-time programming. ymd_hms("2024-01-15 14:00:00", tz = "America/New_York") creates a datetime in a specific time zone. with_tz(dt, "UTC") converts to UTC while keeping the same absolute time. force_tz(dt, "America/Chicago") changes the time zone label without changing the clock time, which changes the instant. Use it to fix incorrectly labeled data.
Sys.timezone() returns the system time zone. Store datetimes in UTC in databases and convert to local time for display. lubridate::tz(dt) returns the current time zone label.
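The store-in-UTC, display-in-local advice can be sketched as follows. The stored value and the target zone are arbitrary examples.

```r
library(lubridate)

# Value as it might be stored: an instant in UTC
stored <- ymd_hms("2024-01-15 19:00:00", tz = "UTC")

# Convert only at display time; Tokyo is UTC+9 with no DST
tokyo <- with_tz(stored, "Asia/Tokyo")
shown <- format(tokyo, "%Y-%m-%d %H:%M")
shown

# The instant itself is unchanged, only the label differs
tokyo == stored
tz(tokyo)
```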
Arithmetic and durations
lubridate::interval(start, end) creates an interval object. as.duration(interval) converts it to a duration in seconds. as.period(interval) gives a human-readable period (years, months, days). Period arithmetic follows the calendar across daylight saving time changes: ymd("2024-03-10") + days(1) gives March 11, even though DST begins in most of the United States on March 10, 2024.
lubridate::ceiling_date(dt, "month") rounds up to the next month boundary. floor_date(dt, "week") rounds down to the start of the week. These functions are essential for aggregating time series data to regular intervals.
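As a rough sketch of aggregation with the rounding functions, the example below buckets a few made-up readings by hour using floor_date() and base R's tapply(). It assumes lubridate's default week start (Sunday) for the weekly example.

```r
library(lubridate)

# Hypothetical timestamps and readings
stamps   <- ymd_hms(c("2024-03-15 09:10:00", "2024-03-15 09:50:00", "2024-03-15 10:05:00"))
readings <- c(1, 2, 3)

# Round each timestamp down to the start of its hour, then sum per bucket
bucket <- floor_date(stamps, "hour")
hourly <- tapply(readings, format(bucket, "%H:%M"), sum)
hourly

# Month and week boundaries for a single date
month_up  <- ceiling_date(ymd("2024-03-15"), "month")
week_down <- floor_date(ymd("2024-03-15"), "week")
```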
Date sequences
seq(from_date, to_date, by = "month") generates monthly date sequences. lubridate::make_date(year, month, day) constructs a date from components, useful when year, month, and day are in separate columns. lubridate::days_in_month(dt) returns the number of days in the month, useful for computing month-end dates.
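A minimal sketch of these three helpers together, using a small made-up data frame named parts with separate year, month, and day columns:

```r
library(lubridate)

# Hypothetical data with date components in separate columns
parts <- data.frame(
  year  = c(2024, 2024, 2023),
  month = c(2, 4, 2),
  day   = c(1, 15, 10)
)

# Build Date values from the components
parts$date <- make_date(parts$year, parts$month, parts$day)

# Month-end date: same year and month, day = number of days in that month
parts$month_end <- make_date(
  parts$year, parts$month, unname(days_in_month(parts$date))
)
parts

# A monthly sequence between two dates
monthly <- seq(ymd("2024-01-01"), ymd("2024-04-01"), by = "month")
monthly
```

days_in_month() accounts for leap years, which is why the two February rows get different month-end dates.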
Timezones
Timezone handling is one of the most common sources of datetime bugs. lubridate's parsing functions default to UTC unless a timezone is specified. with_tz(dt, "America/New_York") converts a datetime to Eastern time: the instant is the same and only the display changes. force_tz(dt, "America/New_York") sets the timezone without converting the clock time. It says “this value should be interpreted as Eastern, not UTC”.
Use timezone-aware datetimes (POSIXct) rather than naive ones whenever data spans multiple timezones or crosses daylight saving time boundaries. Store datetimes in UTC internally and convert to local time only for display.
Duration vs period
A duration is a fixed number of seconds. A period is a calendar unit (days, months, years) that adjusts for calendar irregularities. Adding days(1) to 2024-03-10 00:00:00 in the Eastern US gives 2024-03-11 00:00:00, but adding ddays(1) (one day = 86400 seconds) gives 2024-03-11 01:00:00 EDT, because DST moved the clock forward that morning and the calendar day was only 23 hours long. Use periods for human-readable calendar arithmetic and durations for precise elapsed time calculations.
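The spring-forward case can be checked directly. This sketch assumes the America/New_York zone, where clocks moved forward on 2024-03-10.

```r
library(lubridate)

start <- ymd_hms("2024-03-10 00:00:00", tz = "America/New_York")

# Period: same clock time on the next calendar day
by_period <- start + days(1)
by_period

# Duration: exactly 86400 seconds later, which lands one hour past midnight
by_duration <- start + ddays(1)
by_duration

# The two results are one hour apart
difftime(by_duration, by_period, units = "hours")
```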
Parsing mixed formats
Real-world date data often has inconsistent formats across rows. parse_date_time() accepts a vector of candidate orders and tries them against each element: parse_date_time(x, orders = c("ymd", "dmy", "mdy")). Genuinely ambiguous strings such as "03/04/2024" are resolved by whichever order matches first, so check the results. For very messy data, anytime::anytime() from the anytime package recognizes a wide range of common date and datetime formats without a format specification.
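A small sketch of parse_date_time() on a mixed vector follows. The input strings are invented, and each one is unambiguous (a day of 15 cannot be a month), which is what lets the candidate orders sort themselves out.

```r
library(lubridate)

# Hypothetical messy input: three layouts for the same date
raw <- c("2024-03-15", "15/03/2024", "03/15/2024")

# Try several component orders; the result is POSIXct in UTC
parsed <- parse_date_time(raw, orders = c("ymd", "dmy", "mdy"))
as_date(parsed)

# Strings matching none of the orders become NA
bad <- parse_date_time("not a date", orders = c("ymd", "dmy", "mdy"), quiet = TRUE)
is.na(bad)
```

After parsing, counting the NA values is a quick way to see how many rows still need manual attention.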
The two date types in R
R has two main date and time types that serve different purposes. The Date class represents calendar dates without time information, such as a birthday, a filing date, or a publication date. POSIXct represents an instant in time: a timestamp with date, time, and timezone. Choosing the wrong type causes silent problems. If a calendar date is stored as a POSIXct without an explicit timezone, base R interprets it in the local system timezone, so the stored instant depends on the machine and can differ from one machine to another.
lubridate works with both types through a consistent interface. Functions that work with dates work on Date objects; functions that work with date-times work on POSIXct objects. Duration arithmetic and interval operations work on both. The parsing functions return the appropriate type based on whether time components are present in the input.
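To see which type a parser returns, and how to move between the two, a brief sketch:

```r
library(lubridate)

d  <- ymd("2024-03-15")               # no time component
dt <- ymd_hms("2024-03-15 14:30:00")  # with time component, UTC by default

class(d)
class(dt)

# Type checks provided by lubridate
is.Date(d)
is.POSIXct(dt)

# Converting between the two types
as_datetime(d)  # midnight UTC on that date
as_date(dt)     # drops the time of day
```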
Timezone handling
Timezone management is one of the most error-prone aspects of date-time programming. A POSIXct stores an absolute instant (seconds since the Unix epoch) and a timezone that determines how that instant is displayed. Changing the timezone with with_tz changes the display without changing the stored instant. Using force_tz keeps the displayed clock time and reinterprets it in a different timezone, which changes the stored instant and is rarely what you want.
The confusion arises because timezone conversions and timezone reinterpretations look similar but mean different things. Converting noon UTC to New York time in winter produces 7 AM EST: the same instant, a different local time. Reinterpreting noon UTC as noon New York time produces a different instant, 5 PM UTC. Use with_tz for display conversion; use force_tz only when correcting a wrongly labeled timezone.
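The noon example can be reproduced with a winter date, when New York is on EST (UTC-5). This is a sketch with illustrative variable names.

```r
library(lubridate)

noon_utc <- ymd_hms("2024-01-15 12:00:00", tz = "UTC")

# Conversion: same instant, shown on a New York clock
converted <- with_tz(noon_utc, "America/New_York")
converted
converted == noon_utc

# Reinterpretation: same clock reading, different instant
relabeled <- force_tz(noon_utc, "America/New_York")
relabeled
with_tz(relabeled, "UTC")
```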
Arithmetic edge cases
Adding fixed durations versus adding calendar periods gives different results when crossing DST boundaries. Adding 86400 seconds (one day in seconds) to a timestamp crosses the DST boundary and may end up at 11 PM or 1 AM rather than midnight. Adding one day as a Period (days(1)) keeps the clock time constant and adjusts the UTC offset, which is usually what users expect for business logic.
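The 11 PM case comes from the autumn change. The sketch below assumes America/New_York, where clocks moved back on 2024-11-03, making that calendar day 25 hours long.

```r
library(lubridate)

start <- ymd_hms("2024-11-03 00:00:00", tz = "America/New_York")

# Fixed 86400 seconds: still the same calendar day, at 11 PM
fixed <- start + dseconds(86400)
fixed

# Calendar day as a period: midnight on the following day
calendar <- start + days(1)
calendar
```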
For business calendar arithmetic, such as adding 30 business days to a date or finding the next working day, the separate bizdays package provides custom calendar support. Business calendars vary by locale and industry, so bizdays allows defining custom holiday calendars. Most date arithmetic without business-day semantics is handled correctly by lubridate’s Period arithmetic.
Summary
lubridate transforms date handling from a chore into something you don’t dread. The key functions to remember:
| Task | Functions |
|---|---|
| Parsing | ymd(), dmy(), mdy(), ymd_hms() |
| Current | today(), now() |
| Extract | year(), month(), day(), hour(), wday() |
| Arithmetic | days(), weeks(), months(), interval() |
| Time zones | with_tz(), force_tz() |
Start with parsing and component extraction. Add date arithmetic when you need it. You’ll wonder how you ever worked with dates in R without lubridate.
See also
- Data Wrangling with dplyr — Combine lubridate with dplyr for time-based data transformation
- Functional Programming with purrr — Apply lubridate functions across list columns using purrr
- Reading and Writing CSV Files in R — Load timestamped data from files