moman822 moman822 - 1 month ago 5
R Question

Find next date in series by group

I have some data like this:

sample.data <- rbind(data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
Group=c("A","B","C","D"), rnorm(20, 5)),
data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
Group=c("A","B","C","D"), rnorm(20, 3))
)


I would like to create an
end.date
column that equals the next earliest
start.date
value for each group.

So, for example, the first
start.date
for
Group==A
is
2010-01-01
. The next earliest
start.date
for
Group==A
is
2011-01-01
. So the final result should look like this when sorted by
Group
:

start.date Group end.date
2010-01-01 A 2011-01-01
2010-01-01 A 2011-01-01
2011-01-01 A 2012-01-01
2011-01-01 A 2012-01-01
2012-01-01 A 2013-01-01
2012-01-01 A 2013-01-01
2013-01-01 A 2014-01-01
2013-01-01 A 2014-01-01
2014-01-01 A NA
2014-01-01 A NA
2010-04-01 B 2011-04-01
2010-04-01 B 2011-04-01
2011-04-01 B 2012-04-01
2011-04-01 B 2012-04-01


And so on. Ideally I would like to do this by reference, like

sample.data[, end.date := EXPRESSION]


but am at a loss for where to start. Thanks for any help.

Answer

Okay so:

events = unique(sample.data[ , .(Group, start.date) ])
events[, next.date := shift(start.date, type="lead"), by=Group]

sample.data[events, on=c("Group", "start.date"), end.date := next.date ]

In my opinion, the OP should have a table like events anyways, in keeping with database design / tidy data. For more on the second step, you could check out the Docs page. The result looks like

> sample.data[ order(Group, start.date) ]

    start.date Group   end.date
 1: 2010-01-01     A 2011-01-01
 2: 2010-01-01     A 2011-01-01
 3: 2011-01-01     A 2012-01-01
 4: 2011-01-01     A 2012-01-01
 5: 2012-01-01     A 2013-01-01
 6: 2012-01-01     A 2013-01-01
 7: 2013-01-01     A 2014-01-01
 8: 2013-01-01     A 2014-01-01
 9: 2014-01-01     A       <NA>
10: 2014-01-01     A       <NA>
11: 2010-04-01     B 2011-04-01
12: 2010-04-01     B 2011-04-01
...
Comments