m_c m_c - 17 days ago 13
R Question

12th consecutive TRUE row in data frame

How can I find 12 consecutive TRUE values in $crit? I am trying something like this:

for(i in 12:nrow(df)) {
if(sum(df$crit[(i-12):i])=12)
print(df$date[i])
}


I would like to get the 12th consecutive TRUE value within $crit. Is this code OK for looping over groups of 12 consecutive rows?

My data:

                  date rain temp rh accumulation  crit
1  2015-04-02 10:00:00  0.5  9.8 96           NA FALSE
2  2015-04-02 11:00:00  0.1 10.0 95           NA  TRUE
3  2015-04-02 12:00:00  0.0 10.1 95           NA  TRUE
4  2015-04-02 13:00:00  0.1 10.7 95           NA  TRUE
5  2015-04-02 14:00:00  0.0 10.7 94           NA  TRUE
6  2015-04-02 15:00:00  0.1 10.7 95           NA  TRUE
7  2015-04-02 16:00:00  0.6 11.2 96           NA  TRUE
8  2015-04-02 17:00:00  0.1 11.7 96           NA  TRUE
9  2015-04-02 18:00:00  0.4 11.6 96           NA  TRUE
10 2015-04-02 19:00:00  0.2 11.3 96           NA  TRUE
11 2015-04-02 20:00:00  0.6 11.3 97           NA  TRUE
12 2015-04-02 21:00:00  0.2 11.6 97           NA  TRUE
13 2015-04-02 22:00:00  0.0 12.0 96            1  TRUE
14 2015-04-02 23:00:00  0.3 11.8 96            2  TRUE
15 2015-04-03 00:00:00  0.0 11.8 97            3  TRUE
16 2015-04-03 01:00:00  0.0 11.9 97            4  TRUE
17 2015-04-03 02:00:00  0.1 12.2 95            5  TRUE
18 2015-04-03 03:00:00  0.8 11.4 93            6  TRUE
19 2015-04-03 04:00:00  0.6 10.9 92            7  TRUE
20 2015-04-03 05:00:00  0.0 10.3 89           NA FALSE

Answer

Sounds like a rolling sum: you want to add up the last 12 crit values and check whether the total is 12. There are many ways to compute a rolling sum, but a particularly easy one to implement is a lagged cumsum.

## some data
set.seed(47)
crit = runif(100) < 0.8 

## rolling sum of last 12 elements
rs = cumsum(crit) - cumsum(c(rep(0, 12), head(crit, -12)))

## see where we get to 12
which(rs == 12)
# [1] 28 29 30 31 32 33 34 62 63 64 65 66

## verify
names(crit) = seq_along(crit)
crit[16:29]
#    16    17    18    19    20    21    22    23    24    25    26    27    28    29 
# FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE 

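The first cumsum counts the TRUEs up to each position, and the second, shifted by 12 positions, counts the TRUEs up to 12 positions earlier, so their difference is the number of TRUEs among the last 12 elements. In the verify step we can see that the 28th element (the first output of which) is indeed the 12th in a series of 12 TRUEs.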
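
As a cross-check, the lagged cumsum can be compared with an explicit loop on a small input. This is only a sketch: the vector and the window size k = 3 are made up for illustration. Note that a window of k elements ending at row i spans (i - k + 1):i, so the range (i-12):i in the question's loop covers 13 rows, and the comparison needs == rather than =.

```r
## hypothetical small input: one run of 5 TRUEs, window of k = 3
crit = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
k = 3

## lagged-cumsum rolling sum over the last k elements
rs = cumsum(crit) - cumsum(c(rep(0, k), head(crit, -k)))

## explicit loop: each window holds exactly k elements, (i - k + 1):i
hits = integer(0)
for (i in k:length(crit)) {
  if (sum(crit[(i - k + 1):i]) == k) hits = c(hits, i)
}

which(rs == k)
hits
```

Both approaches should flag the same positions; the loop is easier to read, while the cumsum version avoids re-summing every window.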


Translating to a data frame application:

set.seed(47)
## build the dates from 1:100 so the call does not depend on a global `crit`
dd = data.frame(crit = runif(100) < 0.8, date = as.Date("2016-01-01") + 1:100)

rs = with(dd, cumsum(crit) - cumsum(c(rep(0, 12), head(crit, -12))))

dd[which(rs == 12), ]
#    crit       date
# 28 TRUE 2016-01-29
# 29 TRUE 2016-01-30
# 30 TRUE 2016-01-31
# 31 TRUE 2016-02-01
# 32 TRUE 2016-02-02
# 33 TRUE 2016-02-03
# 34 TRUE 2016-02-04
# 62 TRUE 2016-03-03
# 63 TRUE 2016-03-04
# 64 TRUE 2016-03-05
# 65 TRUE 2016-03-06
# 66 TRUE 2016-03-07
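
The rolling sum flags every row whose last 12 values are all TRUE, so a long run produces several hits (rows 28 through 34 above). If only the 12th row of each run is wanted, run-length encoding with rle is one option. A minimal sketch, assuming crit contains no NA values; the input vector and k = 3 are made up for illustration:

```r
## hypothetical input: one run of 5 TRUEs and one run of 2
crit = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)
k = 3  # would be 12 for the question's data

r = rle(crit)
ends = cumsum(r$lengths)        # last index of each run
starts = ends - r$lengths + 1   # first index of each run

## k-th element of every TRUE run that is at least k long
kth = (starts + k - 1)[r$values & r$lengths >= k]
kth
```

With a data frame, the resulting indices can be used directly, for example dd$date[kth].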