Gerald T - 2 years ago 94
R Question

# r identify groups based on conditions

I am trying a clear way to identify groups based on condition on multiple column.

As an example imagine we have a table of points ordered by date and another numeric column:

``````DT <- data.table(a = paste0("Date", 1:10), V1 = c(1, 2, 11, 9, 11, 11, 2, 2, 11, 11))

a V1
1:  Date1  1
2:  Date2  2
3:  Date3 11
4:  Date4  9
5:  Date5 11
6:  Date6 11
7:  Date7  2
8:  Date8  2
9:  Date9 11
10: Date10 11
``````

Every time a value in
`V1 > 10`
we have a new group obtaining the following table:

``````         a V1 ID
1:  Date1  1  1
2:  Date2  2  1
3:  Date3 11  2
4:  Date4  9  2
5:  Date5 11  3
6:  Date6 11  4
7:  Date7  2  4
8:  Date8  2  4
9:  Date9 11  5
10: Date10 11  6
``````

The tables are over 3million point so no loops please. Any help? Thank you very much!!!

We need `cumsum` on the logical `vector` (`V1 > 10`), add 1 to it in case it starts from 0 and assign (`:=`) to create the 'ID' column
``````DT[, ID := cumsum(V1>10)+1]