Rilcon42 - 1 month ago
R Question

What is the right way to reference part of a dataframe after piping?

What is the correct way to do something like this? I am trying to get the colSums of each group for specific columns. The . syntax seems incorrect with this type of subsetting.

csv<-data.frame(id_num=c(1,1,1,2,2),c(1,2,3,4,5),c(1,2,3,3,3))
temp<-csv%>%group_by(id_num)%>%colSums(.[,2:3],na.rm=T)

Answer

This can be done with summarise_each. More recent versions of dplyr also introduced summarise_at and summarise_if, which are more convenient when only some of the columns are needed.

csv %>%
    group_by(id_num) %>%
    summarise_each(funs(sum))

csv %>%
    group_by(id_num) %>%
    summarise_at(2:3, sum)
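
For readers on a newer dplyr, the same grouped sums can be sketched with across(). This assumes dplyr 1.0.0 or later, where across() is available; it is offered as an alternative spelling, not as part of the original answer. Inside a grouped summarise, everything() skips the grouping column, so only the two value columns are summed.

```r
library(dplyr)

# Data frame from the question (2nd and 3rd columns left unnamed on purpose)
csv <- data.frame(id_num = c(1, 1, 1, 2, 2), c(1, 2, 3, 4, 5), c(1, 2, 3, 3, 3))

# Sum every non-grouping column within each id_num group.
# Assumes dplyr >= 1.0.0 for across().
res_across <- csv %>%
    group_by(id_num) %>%
    summarise(across(everything(), sum))
```

With this data the result has one row per id_num: sums of 6 and 6 for id_num 1, and 9 and 6 for id_num 2.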

If we are using column names, either pass them as a character vector (as below) or wrap the unquoted names in vars() inside summarise_at

csv %>%
    group_by(id_num) %>%
    summarise_at(names(csv)[-1], sum)
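
A minimal sketch of the vars() form, for comparison with the character-vector form above. The column names a and b are hypothetical; they are not in the OP's data and are assigned here only so that there is something readable to pass to vars().

```r
library(dplyr)

# Same values as in the question, but with hypothetical names (a, b)
# given to the two columns the OP left unnamed.
csv_named <- data.frame(id_num = c(1, 1, 1, 2, 2),
                        a = c(1, 2, 3, 4, 5),
                        b = c(1, 2, 3, 3, 3))

# Unquoted column names go inside vars(); extra arguments such as
# na.rm are passed on to sum().
res_vars <- csv_named %>%
    group_by(id_num) %>%
    summarise_at(vars(a, b), sum, na.rm = TRUE)
```

Here res_vars has a equal to 6 and 9, and b equal to 6 and 6, for id_num 1 and 2 respectively.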

NOTE: In the OP's dataset, the column names for the 2nd and 3rd columns were not specified, so data.frame generated names like c.1..2..3..4..5. for them.
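
To see the generated names and replace them, something like the following should work in base R. The replacement names x and y are placeholders chosen for illustration, not names from the original post.

```r
# Unnamed arguments to data.frame() get names built from the expression text
csv <- data.frame(id_num = c(1, 1, 1, 2, 2), c(1, 2, 3, 4, 5), c(1, 2, 3, 3, 3))

# Keep a copy of the automatically generated names for inspection
auto_names <- names(csv)
print(auto_names)

# Assign explicit names to the 2nd and 3rd columns ("x", "y" are placeholders)
names(csv)[2:3] <- c("x", "y")
```

Naming the columns up front, either in the data.frame() call or as shown here, makes the summarise_at calls easier to read than selecting by position.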