user7164669 - 16 days ago
R Question

Nested for loops for date differences

I am new to R and I am trying to calculate date differences from a baseline for every subject. I know how to calculate the day differences using difftime but I am having trouble doing it in a loop for every subject. Any help would be greatly appreciated.

Basically I want to go from:

ID DATE
1  1.1.2015
1  1.1.2016
2  1.1.2017
3  1.1.2015
3  1.1.2016
3  1.1.2017

to:

ID DATE     DATEDIFF
1  1.1.2015 0
1  1.1.2016 365
2  1.1.2017 0
3  1.1.2015 0
3  1.1.2016 365
3  1.1.2017 731

Answer

Use lubridate to parse the dates and dplyr to calculate the new column:

library(lubridate)
library(dplyr)  # provides %>%, group_by() and mutate()
df <- data.frame(
  id = c(1, 1, 2, 3, 3, 3),
  date = c('1.1.2015', '1.1.2016', '1.1.2017', '1.1.2015', '1.1.2016', '1.1.2017'),
  stringsAsFactors = FALSE)
# parse dates as day.month.year
df$date <- dmy(df$date)
# calculate the difference to the oldest date in each group;
# after group_by(), mutate() evaluates the expression separately for
# each group, so any expression can be used to compute the new
# column from that group's data alone
df %>% group_by(id) %>% mutate(datediff = date - min(date))

Result:

     id       date datediff
          
1     1 2015-01-01   0 days
2     1 2016-01-01 365 days
3     2 2017-01-01   0 days
4     3 2015-01-01   0 days
5     3 2016-01-01 365 days
6     3 2017-01-01 731 days
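
If you would rather avoid extra packages, or want plain numbers instead of "days" values, the same per-subject calculation can be sketched in base R. This is only one possible approach: it assumes the same example data as above and uses ave() to apply the calculation within each id, so no explicit loop is needed.

```r
# Base R sketch: per-subject day differences without an explicit loop.
# The data frame mirrors the example data used in the answer.
df <- data.frame(
  id = c(1, 1, 2, 3, 3, 3),
  date = c("1.1.2015", "1.1.2016", "1.1.2017",
           "1.1.2015", "1.1.2016", "1.1.2017"),
  stringsAsFactors = FALSE
)

# Parse day.month.year strings into Date objects
df$date <- as.Date(df$date, format = "%d.%m.%Y")

# ave() applies FUN to the values of each id group and returns a vector
# aligned with the original rows. The dates are passed as plain day
# counts so that the result is an ordinary numeric column.
df$datediff <- ave(as.numeric(df$date), df$id, FUN = function(d) d - min(d))

df$datediff
# 0 365 0 0 365 731
```

The last value is 731 because the span from 1.1.2015 to 1.1.2017 includes 2016, a leap year with 366 days.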