Matias Andina - 1 month ago
R Question

Use grouped summary to operate in another data.frame column by factor

I want to compute a summary of a grouped data.frame, for example:

library(dplyr)
df_summ = mtcars %>% group_by(am) %>% summarise(mean_mpg = mean(mpg))

     am mean_mpg
  (dbl)    (dbl)
1     0 17.14737
2     1 24.39231


I then want to use that result to transform another data.frame that shares the same factor levels but has a different number of rows, for example by calculating the absolute difference between each single value and its group's mean.

Here's the toy example:

toy=data.frame(am=c(1,1,0,0),mpg=c(1,2,3,4))


The calculation I would like to do is y = abs(toy$mpg - df_summ$mean_mpg), applied by factor, so that each row of toy uses the mean of its own am group.

My head tells me dplyr must be able to do this, but I can't come up with a way.
I want to keep every row of the original data.frame (as happens when using mtcars %>% group_by(am) %>% mutate(...)).

The expected output looks like this:

toy
  am mpg expected
1  1   1 23.39231
2  1   2 22.39231
3  0   3 14.14737
4  0   4 13.14737

Answer

Join the two data frames and then perform the calculation:

toy %>% 
    left_join(df_summ) %>% 
    mutate(y = abs(mpg - mean_mpg))

giving:

Joining, by = "am"
  am mpg mean_mpg        y
1  1   1 24.39231 23.39231
2  1   2 24.39231 22.39231
3  0   3 17.14737 14.14737
4  0   4 17.14737 13.14737
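
If the helper column mean_mpg is not wanted in the final result, one possible variation is sketched below. It names the join key explicitly (which also suppresses the "Joining, by" message) and drops the helper column afterwards. The name result is only illustrative, and the column is called expected to match the question. A base R equivalent using match() is included for comparison; treat both as sketches rather than the only way to do it.

```r
library(dplyr)

# Rebuild the two inputs from the question
df_summ <- mtcars %>% group_by(am) %>% summarise(mean_mpg = mean(mpg))
toy <- data.frame(am = c(1, 1, 0, 0), mpg = c(1, 2, 3, 4))

# dplyr: join on the explicit key, compute the difference, drop the helper column
# (`result` is an illustrative name, not from the original post)
result <- toy %>%
    left_join(df_summ, by = "am") %>%
    mutate(expected = abs(mpg - mean_mpg)) %>%
    select(-mean_mpg)

# Base R: match() gives, for each row of toy, the row of df_summ with the same am
toy$expected <- abs(toy$mpg - df_summ$mean_mpg[match(toy$am, df_summ$am)])
```

Both approaches leave toy with its original four rows; the join version scales more naturally when there is more than one grouping column.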