Tareva - 1 year ago 50
R Question

Implementing sequential counter of decreasing values

I need to implement a counter, `dec_cnt`, that decrements by 1 based on certain conditions.

Below is my data frame, `df`.

``````
ID   A
1   0
2   0
3   0
4   1
5   1
6   0
7   0
8   0
9   0
10    0
11    0
12    0
13    0
14    0
15    0
16   -1
17    1
18    0
19    1
20    0
21   -1
22    0
23    0
24   -1
25    0
26    0
27    0
28    0
29    0
30    0
31    0
32    0
33    0
34    0
``````

The conditions are:

a. The counter should start at a row where `A` is `1` or `-1` and run for the next `16` rows. For example, `A == 1` at `ID == 4`, so from `ID == 4` through `ID == 19` the counter should count down from `15` to `0`. Any other `A == 1` or `A == -1` inside that range should be ignored.
b. I also need a `retain_A` column that holds the value of `A` that started the counter for the whole length of the countdown.

Below is my expected output.

``````
ID   A       retain_A   dec_cnt
1   0         NA         NA
2   0         NA         NA
3   0         NA         NA
4   1         1          15
5   1         1          14
6   0         1          13
7   0         1          12
8   0         1          11
9   0         1          10
10    0         1          9
11    0         1          8
12    0         1          7
13    0         1          6
14    0         1          5
15    0         1          4
16   -1         1          3
17    1         1          2
18    0         1          1
19    1         1          0
20    0         NA         NA
21   -1         -1         15
22    0         -1         14
23    0         -1         13
24   -1         -1         12
25    0         -1         11
26    0         -1         10
27    0         -1          9
28    0         -1          8
29    0         -1          7
30    0         -1          6
31    0         -1          5
32    0         -1          4
33    0         -1          3
34    0         -1          2
``````

A similar question was posted a couple of days ago, and its solution uses a `for` loop. That loop also fails to execute if there are more than `35` data points. I want to avoid a `for` loop because its execution time grows when dealing with a large amount of data.

The data frame is taken from the question posted here.

Below is the script that I tried, based on the post referenced above.

``````
dec_cnt <- 0
Retain_A <- NA
for (i in seq_along(df$A)) {
  if (dec_cnt == 0) {
    if (df$A[i] == 0) next
    dec_cnt <- 15
    Retain_A <- df$A[i]
    df$Retain_A[i] <- df$A[i]
    df$dec_cnt[i] <- dec_cnt
  } else {
    dec_cnt <- dec_cnt - 1
    df$Retain_A[i] <- Retain_A
    df$dec_cnt[i] <- dec_cnt
  }
}
``````

I don't think it's realistic to avoid any kind of loop, `for` or otherwise. Perhaps a more realistic goal would be to avoid loops that iterate over every single value, regardless of whether it is relevant.
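To follow along, the 2-column input can be rebuilt from the table in the question. This is only a sketch: the values are copied from that table, the name `dat` is simply what the snippets in this answer use, and the `rep()` shorthand is my own way of typing the runs of zeros.

```r
# Rebuild the question's sample data (34 rows, values taken from the posted table).
dat <- data.frame(
  ID = 1:34,
  A  = c(0, 0, 0,        # IDs 1-3
         1, 1,           # IDs 4-5
         rep(0, 10),     # IDs 6-15
         -1, 1, 0, 1,    # IDs 16-19
         0,              # ID 20
         -1, 0, 0, -1,   # IDs 21-24
         rep(0, 10))     # IDs 25-34
)

# Rows where A is non-zero, i.e. candidate starting points for a countdown.
which(dat$A != 0)
```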

Starting from your 2-column input, let's pre-set the empty columns:

``````
dat$retain_A <- NA
dat$dec_cnt  <- NA
``````

Here's where we can gain some efficiency: instead of comparing `A` against -1/1 on every pass, we can find all the matching rows once, up front:

``````
ind <- which(dat$A %in% c(-1, 1))
last_match <- 0
ind
# [1]  4  5 16 17 19 21 24
``````

The trick is to keep track of `last_match`, the last row covered by the current countdown, and to discard any matching indices at or before it, since they fall inside a countdown that is already running.

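``````
ind <- ind[ind > last_match]
while (length(ind) > 0) {
  # rows covered by this countdown, clipped at the end of the data
  i <- seq(ind[1], min(ind[1] + 15, nrow(dat)))
  dat$dec_cnt[i] <- head(15:0, n = length(i))
  dat$retain_A[i] <- dat$A[ ind[1] ]
  # drop every match that falls inside the countdown just written
  last_match <- ind[1] + 15
  ind <- ind[ind > last_match]
}
dat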
#    ID  A retain_A dec_cnt
# 1   1  0       NA      NA
# 2   2  0       NA      NA
# 3   3  0       NA      NA
# 4   4  1        1      15
# 5   5  1        1      14
# 6   6  0        1      13
# 7   7  0        1      12
# 8   8  0        1      11
# 9   9  0        1      10
# 10 10  0        1       9
# 11 11  0        1       8
# 12 12  0        1       7
# 13 13  0        1       6
# 14 14  0        1       5
# 15 15  0        1       4
# 16 16 -1        1       3
# 17 17  1        1       2
# 18 18  0        1       1
# 19 19  1        1       0
# 20 20  0       NA      NA
# 21 21 -1       -1      15
# 22 22  0       -1      14
# 23 23  0       -1      13
# 24 24 -1       -1      12
# 25 25  0       -1      11
# 26 26  0       -1      10
# 27 27  0       -1       9
# 28 28  0       -1       8
# 29 29  0       -1       7
# 30 30  0       -1       6
# 31 31  0       -1       5
# 32 32  0       -1       4
# 33 33  0       -1       3
# 34 34  0       -1       2
``````

Your initial loop iterates once per row, whereas this solution iterates at most once per non-zero value: once for each countdown that actually starts, which is twice for this data.
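If you need this in more than one place, the same idea can be wrapped in a function. The following is a minimal sketch, not a tuned implementation: the function name `add_countdown` and its parameter `n` (the number of rows each countdown spans, 16 in your case) are hypothetical names I made up for illustration, and the sample data is rebuilt from the table in the question.

```r
# Hypothetical helper: same while-loop approach as above, with the window
# length `n` as a parameter instead of the hard-coded 15/16.
add_countdown <- function(dat, n = 16) {
  dat$retain_A <- NA
  dat$dec_cnt  <- NA
  ind <- which(dat$A %in% c(-1, 1))        # candidate starting rows
  while (length(ind) > 0) {
    start <- ind[1]
    end   <- min(start + n - 1, nrow(dat)) # clip the window at the last row
    rows  <- start:end
    dat$dec_cnt[rows]  <- (n - 1) - (seq_along(rows) - 1)  # n-1, n-2, ...
    dat$retain_A[rows] <- dat$A[start]     # value that started this countdown
    ind <- ind[ind > end]                  # ignore matches inside the window
  }
  dat
}

# Sample data copied from the question's table.
dat <- data.frame(
  ID = 1:34,
  A  = c(0, 0, 0, 1, 1, rep(0, 10), -1, 1, 0, 1, 0, -1, 0, 0, -1, rep(0, 10))
)

out <- add_countdown(dat)
out[c(3:5, 19:21, 34), ]  # spot-check the window boundaries
```

Passing a different `n` changes the length of every countdown. Rows outside any countdown stay `NA` in both new columns.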
