Milosz - 1 year ago 70
R Question

# How to get the maximum value by id over the last 4 quarters in r data.table?

I have a dataset with id's and states. I need to get the maximum status over the last 4 quarters. As an example, I created this table:

``````library(data.table)
library(zoo)

DT =
data.table(id=c("1","1","1","1","1","1","1","1","2"),
date=as.yearqtr(c("2015Q1", "2014Q4", "2014Q3", "2014Q2", "2014Q1", "2013Q4","2013Q3","2013Q2","2015Q2")),
status=c(0,0,0,0,0,0,1,0,0))
``````

I want to get the maximum value for
`status`
over the last 4 quarters available:

``````DT =
data.table(id=c("1","1","1","1","1","1","1","1","2"),
date=as.yearqtr(c("2015Q1", "2014Q4", "2014Q3", "2014Q2", "2014Q1","2013Q4","2013Q3","2013Q2","2015Q2")),
status=c(0,0,0,0,0,0,1,0,0),
Max_status=c(0,0,0,1,1,1,1,0,0))
``````

Does anyone have any clue to do that neatly and quickly in data.table? I tried something like this:

``````DT[,Date_1yPrior:=date-1]
DT[,Max_Status:=max(status[date>=Date_1yPrior & date<=Date_1yPrior]),by='id']
``````

but the j and i refers are not correct. I would prefer a
`data.table`
solution.

I think this does the trick:

``````library(zoo)
library(data.table)
DT[, date_num := as.numeric(date)]
DT[, NextStatus1Q := .SD[match(date_num - .25, .SD\$date_num), status], by=id]
DT[, NextStatus2Q := .SD[match(date_num - .5, .SD\$date_num), status], by=id]
DT[, NextStatus3Q := .SD[match(date_num - .75, .SD\$date_num), status], by=id]
DT[, NextStatus4Q := .SD[match(date_num - 1, .SD\$date_num), status], by=id]
DT[,MaxStatus:=max(NextStatus1Q,NextStatus2Q,NextStatus3Q,status,0,na.rm=T),by=c('id','date')]
#DT[,MaxStatus:=max(NextStatus1Q,NextStatus2Q,NextStatus3Q,NextStatus4Q,status,0,na.rm=T),by=c('id','date')] # If you also want 4q (actually 5 quarters back)
``````

Create new columns where you lookup the value 1-3 quarters ago and after that select the max.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download