user2146441 - 1 month ago
R Question

R dataframe: Getting value from next row subject to criteria

I have data in the following format:

quotes <- read.csv(text = "
id,ts,origin,product,bid,ask,nextts
1,2016-10-18 20:20:54.733,SourceA,Dow,1.09812,1.0982,
2,2016-10-18 20:20:55.093,SourceA,Ftse,7010.5,7011.5,
3,2016-10-18 20:20:55.149,SourceA,Dow,18159.0,18161.0,
4,2016-10-18 20:20:55.871,SourceA,Ftse,18159.0,18161.0,")


How can I populate the column 'nextts' with the value of ts from the next row that has the same origin (source) and the same product? Essentially, I want to join the data to itself on origin and product and capture the ts of the following matching row.

I found the following answer, "Return next row in a dataframe R", but it is a strict lead/lag without any criteria.

Answer

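First ensure that ts is character or POSIXct rather than factor, either by converting it explicitly as shown here or by passing as.is = TRUE to read.csv. (From R 4.0.0 onward read.csv no longer creates factors by default, so the conversion is then a harmless no-op.) Then use ave with the indicated function to shift ts up by one row within each origin/product group. This relies on the rows already being in ts order within each group, as they are in the sample data.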

quotes$ts <- as.character(quotes$ts)
transform(quotes, nextts = ave(ts, origin, product, FUN = function(x) c(x[-1], NA)))

giving:

  id                       ts  origin product         bid        ask                   nextts
1  1 2016-10-18  20:20:54.733 SourceA     Dow     1.09812     1.0982 2016-10-18  20:20:55.149
2  2 2016-10-18  20:20:55.093 SourceA    Ftse  7010.50000  7011.5000 2016-10-18  20:20:55.871
3  3 2016-10-18  20:20:55.149 SourceA     Dow 18159.00000 18161.0000                     <NA>
4  4 2016-10-18  20:20:55.871 SourceA    Ftse 18159.00000 18161.0000                     <NA>
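To see why this works, it may help to look at the shifting function on its own. The following is a minimal sketch on toy data; the names shift_up, g, ts and shifted are made up for illustration and are not part of the original answer.

```r
# Drop the first element and pad the end with NA, so each position
# holds the value that originally came right after it.
shift_up <- function(x) c(x[-1], NA)

shift_up(c("a", "b", "c"))
# "b" "c" NA

# Toy data: two interleaved groups (hypothetical values).
g  <- c("Dow", "Ftse", "Dow", "Ftse", "Dow")
ts <- c("t1", "t2", "t3", "t4", "t5")

# ave() splits ts by g, applies shift_up to each piece, and puts the
# results back in the original row positions.
shifted <- ave(ts, g, FUN = shift_up)
shifted
# "t3" "t4" "t5" NA NA
```

Each row receives the ts of the next row in its own group, and the last row of every group gets NA because nothing follows it.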

Note: We used this as input:

quotes <- read.csv(text = "
  id,ts,origin,product,bid,ask,nextts
  1,2016-10-18  20:20:54.733,SourceA,Dow,1.09812,1.0982,
  2,2016-10-18  20:20:55.093,SourceA,Ftse,7010.5,7011.5,
  3,2016-10-18  20:20:55.149,SourceA,Dow,18159.0,18161.0,
  4,2016-10-18  20:20:55.871,SourceA,Ftse,18159.0,18161.0")
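The as.is = TRUE route mentioned above could look roughly like the sketch below. It reads the question's sample data directly as character, so the as.character() step is not needed; the variable name result is just an illustration.

```r
# Read the sample data, keeping strings as character instead of factor.
quotes <- read.csv(text = "
id,ts,origin,product,bid,ask,nextts
1,2016-10-18 20:20:54.733,SourceA,Dow,1.09812,1.0982,
2,2016-10-18 20:20:55.093,SourceA,Ftse,7010.5,7011.5,
3,2016-10-18 20:20:55.149,SourceA,Dow,18159.0,18161.0,
4,2016-10-18 20:20:55.871,SourceA,Ftse,18159.0,18161.0,", as.is = TRUE)

class(quotes$ts)
# "character"

# Same group-wise shift as in the answer: within each origin/product
# group, take the ts of the following row (NA for the last row).
result <- transform(quotes,
  nextts = ave(ts, origin, product, FUN = function(x) c(x[-1], NA)))

result$nextts
# "2016-10-18 20:20:55.149" "2016-10-18 20:20:55.871" NA NA
```

Note that transform returns a new data frame and leaves quotes itself unchanged, so the result has to be assigned if it is needed later.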