dapperdan dapperdan - 2 months ago 7
R Question

How can I manipulate dataframe columns with different values from an external vector (with dplyr)

In R, I would like to manipulate (say multiply) data.frame columns with appropriately named values stored in a vector (or data.frame, if that's easier).

Let's say, I want to first

summarise
the variables
disp
,
hp
, and
wt
from the
mtcars
dataset.

vars <- c("disp", "hp", "wt")
mtcars %>%
summarise_at(vars, funs(sum(.))


(throw a
group_by(cyl)
into the mix, or use
mutate_at
if you'd like to have more rows)

Now I'd like to multiply each of the resulting columns with a particular value, given by

multiplier <- c("disp" = 2, "hp" = 3, "wt" = 4)


Is it possible to refer to these within the
summarise_at
function?

The result should look like this (and I don't want to have to refer to the variable names directly while getting there):

disp hp wt
14766.2 14082 411.808


UPDATE:

Maybe my MWE was too minimal. Let's say I want to do the same operation with a data.frame grouped by
cyl


mtcars %>%
group_by(cyl) %>%
summarise_at(vars, sum)


The result should thus be:

cyl disp hp wt
1 4 2313.0 2727 100.572
2 6 2566.4 2568 87.280
3 8 9886.8 8787 223.956


UPDATE 2:

Maybe I was not explicit enough here either, but the columns in the data.frame should be multiplied by the respective values in the vector (and only those columns mentioned in the vector), so e.g.
disp
should be multiplied by 2,
hp
by 3 and
wt
by 4, all other variables (e.g.
cyl
) should remain untouched by the multiplication.

Answer

We could also do this with map function from purrr

library(purrr)
mtcars %>%
    summarise_at(vars, sum) %>%
    map2_df(multiplier, `*`)
#      disp    hp      wt
#     <dbl> <dbl>   <dbl>
# 1 14766.2 14082 411.808

For the updated question

d1 <- mtcars %>% 
         group_by(cyl) %>% 
         summarise_at(vars, sum) 
d1 %>% 
   select(one_of(vars)) %>% 
   map2_df(multiplier[vars], ~ .x * .y) %>%
   bind_cols(d1 %>% select(-one_of(vars)), .) 
#    cyl   disp    hp      wt
#  <dbl>  <dbl> <dbl>   <dbl>
#1     4 2313.0  2727 100.572
#2     6 2566.4  2568  87.280
#3     8 9886.8  8787 223.956

Or we can use gather/spread

library(tidyr)
mtcars %>% 
    group_by(cyl) %>% 
    summarise_at(vars, sum) %>% 
    gather(var, val, -cyl) %>% 
    mutate(val = val*multiplier[match(var, names(multiplier))]) %>% 
    spread(var, val)
#     cyl   disp    hp      wt
#   <dbl>  <dbl> <dbl>   <dbl>
#1     4 2313.0  2727 100.572
#2     6 2566.4  2568  87.280
#3     8 9886.8  8787 223.956