Kelsey - 9 months ago 55

R Question

I have used the aggregate function to get a summary of results based on their collection location. The summary returns 3 obs. of 2 variables. One variable is the group name, one is the summary statistic per group.

How do I get R to view each column (Group, min, 1st quartile, median, etc.) as unique in my data frame? Ultimately I'd like this to be 3 obs. of 7 variables, one for each column. OR I'd like to know how to cleanly get min, median, and max by Location. Thanks!

`Result <- c(1,1,2,100,50,30,45,20, 10, 8)`

Location <- c("Alpha", "Beta", "Gamma", "Alpha", "Beta", "Gamma", "Alpha", "Beta", "Gamma", "Alpha")

df <- data.frame(Result, Location)

head(df)

Agg <- aggregate(df$Result, list(df$Location), summary)

head(Agg)

Group.1 x.Min. x.1st Qu. x.Median x.Mean x.3rd Qu. x.Max.

1 Alpha 1.00 6.25 26.50 38.50 58.75 100.00

2 Beta 1.00 10.50 20.00 23.67 35.00 50.00

3 Gamma 2.00 6.00 10.00 14.00 20.00 30.00

Answer

Since `aggregate`

's `simplify`

parameter defaults to `TRUE`

, it's simplifying the results of calling the function (here, `summary`

) to a matrix. You can reconstruct the data.frame, coercing the column into its own data.frame:

```
with(Agg, data.frame(Group.1, as.data.frame(x)))
## Group.1 Min. X1st.Qu. Median Mean X3rd.Qu. Max.
## 1 Alpha 1 6.25 26.5 38.50 58.75 100
## 2 Beta 1 10.50 20.0 23.67 35.00 50
## 3 Gamma 2 6.00 10.0 14.00 20.00 30
```

Alternately, dplyr's `summarise`

family of functions can handle multiple summary statistics well:

```
library(dplyr)
df %>% group_by(Location) %>% summarise_all(funs(min, median, max))
## # A tibble: 3 × 4
## Location min median max
## <fctr> <dbl> <dbl> <dbl>
## 1 Alpha 1 26.5 100
## 2 Beta 1 20.0 50
## 3 Gamma 2 10.0 30
```

If you really want all of `summary`

, you can use `broom::tidy`

to turn the results into a data.frame:

```
df %>% group_by(Location) %>% do(broom::tidy(summary(Result)))
## Source: local data frame [3 x 7]
## Groups: Location [3]
##
## Location minimum q1 median mean q3 maximum
## <fctr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Alpha 1 6.25 26.5 38.50 58.75 100
## 2 Beta 1 10.50 20.0 23.67 35.00 50
## 3 Gamma 2 6.00 10.0 14.00 20.00 30
```

Source (Stackoverflow)