Alex Coppock - 2 months ago
R Question

Use dplyr's summarise_each to return one row per function?

I'm using dplyr's summarise_each to apply a function to multiple columns of data. One nice thing is that you can apply multiple functions at once. The annoying part is that the output is a data frame with a single row. It seems like it should return one row per function and one column per summarised column.

library(dplyr)
default <-
  iris %>%
  summarise_each(funs(min, max), matches("Petal"))


This returns

> default
  Petal.Length_min Petal.Width_min Petal.Length_max Petal.Width_max
1                1             0.1              6.9             2.5


I'd prefer something like

library(reshape2)
desired <-
  iris %>%
  select(matches("Petal")) %>%
  melt() %>%
  group_by(variable) %>%
  summarize(min = min(value), max = max(value)) %>%
  t()


which returns something close (not a dataframe, but you all get the idea)

> desired
         [,1]           [,2]
variable "Petal.Length" "Petal.Width"
min      "1.0"          "0.1"
max      "6.9"          "2.5"


Is there an option in summarise_each to do this? If not, Hadley, would you mind adding it?

Answer

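You can get similar output by combining the dplyr and tidyr packages: gather() stacks the summary columns into name/value pairs, separate() splits each name at the underscore into the variable and the statistic, and spread() turns the variables back into columns, leaving one row per statistic. Something along these lines should help: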

library(dplyr)
library(tidyr)

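iris %>%
  select(matches("Petal")) %>%
  summarise_each(funs(min, max)) %>%                   # one row: Petal.Length_min, ...
  gather(variable, value) %>%                          # long: one row per column/statistic pair
  separate(variable, c("var", "stat"), sep = "_") %>%  # "Petal.Length_min" -> var, stat
  spread(var, value)                                   # wide again: one row per statistic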
##   stat Petal.Length Petal.Width
## 1  max          6.9         2.5
## 2  min          1.0         0.1
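
In newer dplyr releases, summarise_each() and funs() are deprecated, so the pipeline above may emit warnings or stop working. Here is a minimal sketch of the same idea, assuming dplyr 1.0.0 or later (for across()) and tidyr 1.0.0 or later (for pivot_longer()). The name `result` is only for illustration.

```r
library(dplyr)
library(tidyr)

result <- iris %>%
  # across() names its outputs "<column>_<function>", e.g. Petal.Length_min
  summarise(across(matches("Petal"), list(min = min, max = max))) %>%
  # ".value" keeps the part before "_" as the output column names;
  # the part after "_" goes into a new "stat" column
  pivot_longer(everything(),
               names_to = c(".value", "stat"),
               names_sep = "_")

result
```

This should give one row per statistic, with Petal.Length and Petal.Width as columns. As with separate() above, it relies on the original column names not containing an underscore themselves.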