Hernando Casas - 2 months ago
R Question

dplyr summarize: create variables from named vector

Here's my problem:

I am using a function that returns a named vector. Here's a toy example:

toy_fn <- function(x) {
  y <- c(mean(x), sum(x), median(x), sd(x))
  names(y) <- c("Right", "Wrong", "Unanswered", "Invalid")
  y
}


I am using group_by in dplyr to apply this function for each group (typical split-apply-combine). So, here's my toy data.frame:

set.seed(1234567)
toy_df <- data.frame(id = 1:1000,
                     group = sample(letters, 1000, replace = TRUE),
                     value = runif(1000))


And here's the result I am aiming for:

library(dplyr)

toy_summary <-
  toy_df %>%
  group_by(group) %>%
  summarize(Right = toy_fn(value)["Right"],
            Wrong = toy_fn(value)["Wrong"],
            Unanswered = toy_fn(value)["Unanswered"],
            Invalid = toy_fn(value)["Invalid"])

> toy_summary
Source: local data frame [26 x 5]

  group     Right    Wrong Unanswered   Invalid
1     a 0.5038394 20.15358  0.5905526 0.2846468
2     b 0.5048040 15.64892  0.5163702 0.2994544
3     c 0.5029442 21.62660  0.5072733 0.2465612
4     d 0.5124601 14.86134  0.5382463 0.2681955
5     e 0.4649483 17.66804  0.4426197 0.3075080
6     f 0.5622644 12.36982  0.6330269 0.2850609
7     g 0.4675324 14.96104  0.4692404 0.2746589


It works! But it is just not cool to call the same function four times. I would rather have dplyr take the named vector and create a new variable for each element in it. Something like this:

toy_summary <-
  toy_df %>%
  group_by(group) %>%
  summarize(toy_fn(value))


Unfortunately, this does not work: it fails with "Error: expecting a single value".

I thought, OK, let's just convert the vector to a data.frame using data.frame(as.list(x)). But this does not work either. I tried many things, but I couldn't trick dplyr into thinking it is actually receiving one single value (observation) for 4 different variables. Is there any way to help dplyr realize that?
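
For reference, the conversion on its own does give a one-row data frame with one column per name, which suggests the obstacle is what summarize accepts rather than the conversion itself. A quick check in base R, using 1:10 as an arbitrary input chosen only for illustration:

```r
# Same toy function as above, repeated so the snippet runs on its own.
toy_fn <- function(x) {
  y <- c(mean(x), sum(x), median(x), sd(x))
  names(y) <- c("Right", "Wrong", "Unanswered", "Invalid")
  y
}

# as.list() turns the named vector into a named list of length-1 elements;
# data.frame() then makes one column per element, in a single row.
one_row <- data.frame(as.list(toy_fn(1:10)))

names(one_row)  # "Right" "Wrong" "Unanswered" "Invalid"
dim(one_row)    # 1 4
```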

Answer

You can also try this with do():

toy_df %>%
  group_by(group) %>%
  do(res = toy_fn(.$value))
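
Note that naming the argument (res = ...) keeps each group's vector in a single list-column called res. If the goal is one column per element, a possible variation is to have do() return a one-row data frame, left unnamed, so that the rows are bound together with the group label. This is a sketch only, reusing the toy function and data from the question:

```r
library(dplyr)

# Toy function and data from the question, repeated so this runs on its own.
toy_fn <- function(x) {
  y <- c(mean(x), sum(x), median(x), sd(x))
  names(y) <- c("Right", "Wrong", "Unanswered", "Invalid")
  y
}

set.seed(1234567)
toy_df <- data.frame(id = 1:1000,
                     group = sample(letters, 1000, replace = TRUE),
                     value = runif(1000))

# Inside do(), `.` is the data for the current group. Returning an unnamed
# one-row data frame yields one output row per group, with one column for
# each element of the named vector.
toy_summary <-
  toy_df %>%
  group_by(group) %>%
  do(data.frame(as.list(toy_fn(.$value))))

names(toy_summary)  # "group" "Right" "Wrong" "Unanswered" "Invalid"
```

The function is called once per group here instead of four times. In recent dplyr versions (1.0.0 and later, as far as I know), summarize() itself should accept an unnamed data frame expression in the same way, but the do() form above is the one that matches this answer.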