Moore Moore - 2 months ago 9
R Question

How to subset consecutive rows if they meet a condition

I am using R to analyze a number of time series (1951-2013) containing daily values of Max and Min temperatures. The data has the following structure:

YEAR MONTH DAY MAX MIN
1985 1 1 22.8 9.4
1985 1 2 28.6 11.7
1985 1 3 24.7 12.2
1985 1 4 17.2 8.0
1985 1 5 17.9 7.6
1985 1 6 17.7 8.1


I need to find the frequency of heat waves based on this definition: A period of three or more consecutive days ‎with a daily maximum and minimum temperature exceeding the 90th percentile of the maximum ‎and minimum temperatures for all days in the studied period.

Basically, I want to subset those consecutive days (three or more) when the Max and Min temp exceed a threshold value. The output would be something like this:

YEAR MONTH DAY MAX MIN
1989 7 18 45.0 23.5
1989 7 19 44.2 26.1
1989 7 20 44.7 24.4
1989 7 21 44.6 29.5
1989 7 24 44.4 31.6
1989 7 25 44.2 26.7
1989 7 26 44.5 25.0
1989 7 28 44.8 26.0
1989 7 29 44.8 24.6
1989 8 19 45.0 24.3
1989 8 20 44.8 26.0
1989 8 21 44.4 24.0
1989 8 22 45.2 25.0


I have tried the following to subset my full dataset to just the days that exceed the 90th percentile temperature:

HW<- subset(Mydata, Mydata$MAX >= (quantile(Mydata$MAX,.9)) &
Mydata$MIN >= (quantile(Mydata$MIN,.9)))


However, I got stuck in how I can subset only consecutive days that have met the condition.

Answer

An approach with data.table which is slightly different from @jlhoward's approach (using the same data):

library(data.table)

setDT(df)
df[, hotday := +(MAX>=44.5 & MIN>=24.5)
   ][, hw.length := with(rle(hotday), rep(lengths,lengths))
     ][hotday == 0, hw.length := 0]

this produces a datatable with a heat wave length variable (hw.length) instead of a TRUE/FALSE variable for a specific heat wave length:

> df
    YEAR MONTH DAY  MAX  MIN hotday hw.length
 1: 1989     7  18 45.0 23.5      0         0
 2: 1989     7  19 44.2 26.1      0         0
 3: 1989     7  20 44.7 24.4      0         0
 4: 1989     7  21 44.6 29.5      1         1
 5: 1989     7  22 44.4 31.6      0         0
 6: 1989     7  23 44.2 26.7      0         0
 7: 1989     7  24 44.5 25.0      1         3
 8: 1989     7  25 44.8 26.0      1         3
 9: 1989     7  26 44.8 24.6      1         3
10: 1989     7  27 45.0 24.3      0         0
11: 1989     7  28 44.8 26.0      1         1
12: 1989     7  29 44.4 24.0      0         0
13: 1989     7  30 45.2 25.0      1         1