User23 - 2 months ago
R Question

Sum of values in column V1 based on conditions set in V2 in R

I want to compute the sum of some of the values in a column, based on the following conditions.

Say I have the following data:

Z <- matrix(c(1,2,3,4,5,6,7,8,9,10,0,0,1,1,0,0,0,0,0,1), nrow = 10, ncol = 2)


giving me

V1 V2
1 0
2 0
3 1
4 1
5 0
6 0
7 0
8 0
9 0
10 1


Now I want the sum of the values in V1 between the first 1 in V2 and the first value in V2 that is followed by four zeroes. In this example, that is the sum of [3,1] and [4,1], since [3,2] contains the first 1 and [4,2] is the first value that is followed by four zeroes, in [5,2], [6,2], [7,2] and [8,2] respectively.

I tried the following loop and variations on it, but it keeps giving errors.

for(j in 1:10){
ifelse(V2(j) == 1,
(for i in (j:(10-j+1)){
ifelse (V2(i+1) == 0 & V2(i+2) == 0 & V2(i+3) == 0 & V2(i+4) == 0, total <- sum(V1(c(j:i))), next)})
, next)
}

Answer

With a simple for loop:

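index1 <- which(Z[, 2] == 1)[1]  # row of the first 1 in V2 (assumes V2 contains at least one 1)
index2 <- NULL
# rows from index1 onwards where V2 is 0
indices <- index1 - 1 + which(Z[index1:nrow(Z), 2] == 0)
# seq_len() avoids a backwards 1:0 loop when there are fewer than four zeroes
for (i in seq_len(max(0, length(indices) - 3))) {
  if (all((indices[i] + (0:3)) == indices[i:(i + 3)])) {
    index2 <- indices[i] - 1  # indices[i] is where the first run of four consecutive 0s after the first 1 starts
    break
  }
}
# sum from the first 1 up to the row before that run, or 0 if no such run exists
if (!is.null(index2)) sum(Z[index1:index2, 1]) else 0
#[1] 7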
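
As an alternative sketch (not part of the answer above), the same rule can probably be written without an explicit loop by using base R's rle(), which gives the lengths of runs of equal values. The function name sum_until_zero_run and its parameter k are made up here for illustration. It assumes V1 is column 1 and V2 is column 2 of the matrix, and it returns 0 when there is no 1 or no long enough run of zeroes, mirroring the fallback used above.

```r
Z <- matrix(c(1,2,3,4,5,6,7,8,9,10,0,0,1,1,0,0,0,0,0,1), nrow = 10, ncol = 2)

# Hypothetical helper: sum column 1 from the first 1 in column 2 up to the
# row just before the first run of at least k zeroes in column 2.
sum_until_zero_run <- function(Z, k = 4) {
  v1 <- Z[, 1]
  v2 <- Z[, 2]
  start <- which(v2 == 1)[1]            # row of the first 1 in V2
  if (is.na(start)) return(0)           # no 1 at all: nothing to sum

  tail_v2 <- v2[start:length(v2)]       # V2 from the first 1 onwards
  runs <- rle(tail_v2)                  # runs of consecutive equal values
  # position within tail_v2 at which each run starts
  run_starts <- cumsum(runs$lengths) - runs$lengths + 1
  # first run that consists of zeroes and is at least k long
  hit <- which(runs$values == 0 & runs$lengths >= k)[1]
  if (is.na(hit)) return(0)             # no qualifying run of zeroes

  end <- start - 1 + run_starts[hit] - 1  # row just before that run
  sum(v1[start:end])
}

sum_until_zero_run(Z)
#[1] 7
```

Because tail_v2 always begins with a 1, the first qualifying run of zeroes can never start at row start, so start:end is never a backwards range. Changing k adjusts how many consecutive zeroes are required.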