Alexander - 22 days ago 6

R Question

Previously I asked related to this question but I need more elegant and general way to solve this.

I have data separated in groups and I want to sum some rows in range based on conditional. I prefer to use 'dplyr' to do this because it's more straight forward for me to understand.

The conditionals which I need as follows;

1: for group 1 ;

find the first occurrence of '10' and sum the rows **after** this occurrence to the end of the group and count how many rows.

2: for group 2;'find the last occurrence of '10' and and sum the rows **before** this occurrence to the beginning of the group and count how many rows!

3: for group 3; find the first occurrence of '10' and and sum the rows **before** this occurrence to the starting row of the group and count how many rows.

`df <- data.frame(gr=rep(c(1,2,3),c(7,9,11)),`

y_value=c(c(0,0,10,8,8,6,0),c(10,10,10,8,7,6,2,0,0), c(8,5,8,7,6,2,10,10,8,7,0)))

> df

gr y_value

1 1 0

2 1 0

3 1 10

4 1 8

5 1 8

6 1 6

7 1 0

8 2 10

9 2 10

10 2 10

11 2 8

12 2 7

13 2 6

14 2 2

15 2 0

16 2 0

17 3 8

18 3 5

19 3 8

20 3 7

21 3 6

22 3 2

23 3 10

24 3 10

25 3 8

26 3 7

27 3 0

It guess something like this should work but cannot figured out how to implement this to

`dplyr`

`count <- function(y,gr){`

if (any(y==10)&(gr==1)) {

*

*

*

if (any(y==10)&(gr==2))

*

*

*

*

}

}

`df%>%`

library(dplyr)

df %>%

group_by(gr) %>%

do(data.frame(.,count_rows=count(y_value,gr)))

expected output

`> df`

gr y_value sum nrow

1 1 0 22 4

2 1 0 22 4

3 1 10 22 4

4 1 8 22 4

5 1 8 22 4

6 1 6 22 4

7 1 0 22 4

8 2 10 23 6

9 2 10 23 6

10 2 10 23 6

11 2 8 23 6

12 2 7 23 6

13 2 6 23 6

14 2 2 23 6

15 2 0 23 6

16 2 0 23 6

17 3 8 28 6

18 3 5 28 6

19 3 7 28 6

20 3 6 28 6

21 3 2 28 6

22 3 10 28 6

23 3 10 28 6

24 3 8 28 6

25 3 7 28 6

26 3 0 28 6

Answer Source

Hope this helps!

```
#sample data
df <- data.frame(gr=rep(c(1,2,3),c(7,9,11)),
y_value=c(c(0,0,10,8,8,6,0),c(10,10,10,8,7,6,2,0,0), c(8,5,8,7,6,2,10,10,8,7,0)))
library(dplyr)
df_temp <- df %>%
group_by(gr) %>%
mutate(rows_to_aggregate=cumsum(y_value==10)) %>%
filter(ifelse(gr==1 | gr==2, rows_to_aggregate !=0 & y_value!=10, rows_to_aggregate ==0)) %>%
mutate(nrow=n(), sum=sum(y_value)) %>%
select(gr,sum,nrow) %>%
distinct()
#final output
df<- left_join(df,df_temp, by='gr')
```