Leo Leo - 3 months ago 13
R Question

Aggregating to String and Summing the Values Associated to Aggregate in R 3.3.0 Dplyr v 0.5.0

Question:

I have the following data frame that I want to simplify

Fruit <- c("Apple","Apple","Orange","Orange","Banana","Banana")
Farmer <- c("Bob","Ben","Bill","Bob","George","Bob")
Tons.Jan <- c(20,40,10,20,35,15)
Tons.Feb <- c(30,40,20,15,25,30)
Tons.Mar <- c(10,10,15,10,20,30)
Tons.Apr <- c(15,20,15,30,30,30)
Tons.May <- c(20,5,20,20,20,10)

df <- cbind(Fruit,Farmer)
df <- cbind(df,Tons.Jan)
df <- cbind(df,Tons.Feb)
df <- cbind(df,Tons.Mar)
df <- cbind(df,Tons.Apr)
df <- tbl_df(cbind(df,Tons.May))


I want to be able to summarise the Farmers to a single comma delimited strong and sum the Tons to the observations so that it looks like the following

I want to get to the following

Fruit2 <- c("Apple","Orange","Banana")
Farmer2 <- c("Bob,Ben","Bill,Bob","George,Bob")
Tons.Jan2 <- c(60,30,50)
Tons.Feb2 <- c(70,35,55)
Tons.Mar2 <- c(20,25,50)
Tons.Apr2 <- c(35,45,60)
Tons.May2 <- c(25,40,30)

df2 <- cbind(Fruit2,Farmer2)
df2 <- cbind(df2,Tons.Jan2)
df2 <- cbind(df2,Tons.Feb2)
df2 <- cbind(df2,Tons.Mar2)
df2 <- cbind(df2,Tons.Apr2)
df2 <- tbl_df(cbind(df2,Tons.May2))


What I have tried:

I have tried using dplyr function group_by and summarise_each below

df <- df %>% group_by(Fruit) %>%
summarise_each_(funs(toString))


However I am not sure how to integrate summing the numeric values without specifically calling out each column using the summarise function,

Any Help is appreciated.

Answer
library(dplyr)

# Convert the relevant columns to numeric
df <- mutate_each(df, funs(as.numeric), -Fruit, -Farmer)

# or as mentioned in the comments by jazzurro
df <- mutate_at(df, vars(starts_with("Tons")), as.numeric)

df %>% 
    group_by(Fruit) %>% 
    mutate(Farmer = toString(Farmer)) %>%
    group_by(Fruit, Farmer) %>%
    summarise_all(funs(sum))


#Source: local data frame [3 x 7]
#Groups: Fruit [?]
#
#   Fruit      Farmer Tons.Jan Tons.Feb Tons.Mar Tons.Apr Tons.May
#   <chr>       <chr>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
#1  Apple    Bob, Ben       60       70       20       35       25
#2 Banana George, Bob       50       55       50       60       30
#3 Orange   Bill, Bob       30       35       25       45       40