I have the following data.frame (df):
ID1 ID2 Col1 Col2 Col3 Grp
A B 1 3 6 G1
C D 3 5 7 G1
E F 4 5 7 G2
G h 5 6 8 G2
I can summarize it by Grp like this:
df %>% group_by(Grp) %>% summarize(ID1s=toString(ID1), ID2s=toString(ID2), Col1=sum(Col1), Col2=sum(Col2), Col3=sum(Col3))
I would like to implement this so that it works for any data frame that always has the columns ID1, ID2, and Grp under exactly those names, plus any number of additional numeric columns with unknown names.
You can overwrite the ID columns first and then group by them as well:
DF %>%
  group_by(Grp) %>%
  mutate_each(funs(. %>% unique %>% sort %>% toString), ID1, ID2) %>%
  group_by(ID1, ID2, add=TRUE) %>%
  summarise_each(funs(sum))

# Source: local data frame [2 x 6]
# Groups: Grp, ID1 [?]
#
#     Grp   ID1   ID2  Col1  Col2  Col3
#   (chr) (chr) (chr) (int) (int) (int)
# 1    G1  A, C  B, D     4     8    13
# 2    G2  E, G  F, h     9    11    15
I think you'll want to uniqify and sort before collapsing to a string, so I've added those steps.
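On newer dplyr versions, mutate_each, summarise_each and funs are deprecated, so the same idea can probably be written with across() instead. This is only a sketch, assuming dplyr 1.0.0 or later; the data frame is rebuilt from the question's example, and the name result is just for illustration:

```r
library(dplyr)

# Example data rebuilt from the question
DF <- data.frame(
  ID1  = c("A", "C", "E", "G"),
  ID2  = c("B", "D", "F", "h"),
  Col1 = c(1L, 3L, 4L, 5L),
  Col2 = c(3L, 5L, 5L, 6L),
  Col3 = c(6L, 7L, 7L, 8L),
  Grp  = c("G1", "G1", "G2", "G2"),
  stringsAsFactors = FALSE
)

result <- DF %>%
  group_by(Grp) %>%
  summarise(
    # Collapse the two known ID columns: unique values, sorted, comma-separated
    across(c(ID1, ID2), ~ toString(sort(unique(.x)))),
    # Sum every numeric column, whatever it happens to be called
    across(where(is.numeric), sum),
    .groups = "drop"
  )

print(result)
```

Because the numeric columns are picked with where(is.numeric), nothing besides ID1, ID2 and Grp needs to be named explicitly, and everything happens in a single summarise() call, so there is no need to overwrite the ID columns and regroup first.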