user1471980 user1471980 - 10 months ago 33
R Question

how do you extract the peak datetime from a data frame given the peak usage per date

I am trying to determine the peak usage hour per day given the max(used) column per server:

DATE DATETIME Server MACH Capacity Used
9/16/2016 9/16/2016 12:00 cpu1 A 22256 17939
9/16/2016 9/16/2016 13:00 cpu1 A 22256 12591
9/16/2016 9/16/2016 14:00 cpu1 A 22256 15834
9/16/2016 9/16/2016 15:00 cpu2 B 22256 14095
9/16/2016 9/16/2016 16:00 cpu2 B 22256 18186
9/16/2016 9/16/2016 17:00 cpu2 B 22256 12637

In this data frame, for date 9/16/2016 for cpu1, max used is 17939 and it happened on 9/16/2016 12:00

I am trying to use the data.table package and can pick the max Used like this:

df<-df[,peak_used:=max(Used), by=c("Server","DATE")]

I also need to extract the peak hour and create a new column as peak column and insert the date and time in there?

Any ideas how can I extract the peak DATETIME for max Used for that DATE?

Answer Source

We can use which.max to get the index of max rows, after grouping by 'Server', 'DATE', subset the 'DATETIME' using that index, and create the 'peaktime' by assinging (:=) the value to it

df[, peaktime := DATETIME[which.max(Used)], by = .(Server, DATE)]

If we need both 'peakused' and 'peaktime', then

df[, c("peakused", "peaktime") := {
       i1 <- which.max(Used)
      .(DATETIME[i1], Used[i1])},  by = .(Server, DATE)]