Alexander - 5 months ago 34
R Question

# Summing rows based on conditional in groups

Previously I asked related to this question but I need more elegant and general way to solve this.
I have data separated in groups and I want to sum some rows in range based on conditional. I prefer to use 'dplyr' to do this because it's more straight forward for me to understand.

The conditionals which I need as follows;

1: for group 1 ;
find the first occurrence of '10' and sum the rows after this occurrence to the end of the group and count how many rows.

2: for group 2;'find the last occurrence of '10' and and sum the rows before this occurrence to the beginning of the group and count how many rows!

3: for group 3; find the first occurrence of '10' and and sum the rows before this occurrence to the starting row of the group and count how many rows.

``````df <- data.frame(gr=rep(c(1,2,3),c(7,9,11)),
y_value=c(c(0,0,10,8,8,6,0),c(10,10,10,8,7,6,2,0,0), c(8,5,8,7,6,2,10,10,8,7,0)))

> df
gr y_value
1   1       0
2   1       0
3   1      10
4   1       8
5   1       8
6   1       6
7   1       0
8   2      10
9   2      10
10  2      10
11  2       8
12  2       7
13  2       6
14  2       2
15  2       0
16  2       0
17  3       8
18  3       5
19  3       8
20  3       7
21  3       6
22  3       2
23  3      10
24  3      10
25  3       8
26  3       7
27  3       0
``````

It guess something like this should work but cannot figured out how to implement this to
`dplyr`

``````count <- function(y,gr){
if (any(y==10)&(gr==1)) {
*
*
*
if (any(y==10)&(gr==2))
*
*
*
*
``````

}
}

``````df%>%
library(dplyr)

df %>%
group_by(gr) %>%
do(data.frame(.,count_rows=count(y_value,gr)))
``````

expected output

``````  > df
gr y_value sum nrow
1   1       0  22   4
2   1       0  22   4
3   1      10  22   4
4   1       8  22   4
5   1       8  22   4
6   1       6  22   4
7   1       0  22   4
8   2      10  23   6
9   2      10  23   6
10  2      10  23   6
11  2       8  23   6
12  2       7  23   6
13  2       6  23   6
14  2       2  23   6
15  2       0  23   6
16  2       0  23   6
17  3       8  28   6
18  3       5  28   6
19  3       7  28   6
20  3       6  28   6
21  3       2  28   6
22  3      10  28   6
23  3      10  28   6
24  3       8  28   6
25  3       7  28   6
26  3       0  28   6
``````

Hope this helps!

``````#sample data
df <- data.frame(gr=rep(c(1,2,3),c(7,9,11)),
y_value=c(c(0,0,10,8,8,6,0),c(10,10,10,8,7,6,2,0,0), c(8,5,8,7,6,2,10,10,8,7,0)))

library(dplyr)
df_temp <- df %>%
group_by(gr) %>%
mutate(rows_to_aggregate=cumsum(y_value==10)) %>%
filter(ifelse(gr==1 | gr==2, rows_to_aggregate !=0 & y_value!=10, rows_to_aggregate ==0)) %>%
mutate(nrow=n(), sum=sum(y_value)) %>%
select(gr,sum,nrow) %>%
distinct()

#final output
df<- left_join(df,df_temp, by='gr')
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download