Tower_Watch - 4 months ago
R Question

Recode and Mutate_all in dplyr

I am trying to use recode and mutate_all to recode columns, but I am getting an error. I believe this post is similar to "how to recode (and reverse code) variables in columns with dplyr", but the answer to that post uses the lapply function.

Here's what I tried after reading the dplyr package's help PDF.

by_species<-matrix(c(1,2,3,4),2,2)
tbl_species<-as_data_frame(by_species)
tbl_species %>% mutate_all(funs(. * 0.4))
# A tibble: 2 x 2
     V1    V2
  <dbl> <dbl>
1   0.4   1.2
2   0.8   1.6


So, this works well.

However, this doesn't work:

grades<-matrix(c("A","A-","B","C","D","B-","C","C","F"),3,3)
tbl_grades <- as_data_frame(grades)
tbl_grades %>% mutate_all(funs(dplyr::recode(.,A = '4.0')))


I get this error:

Error in vapply(dots[missing_names], function(x) make_name(x$expr), character(1)) :
values must be length 1,
but FUN(X[[1]]) result is length 3


Can someone please explain what the problem is and why the above code isn't working?

I'd appreciate any help.

Thanks

Answer

@Mir has done a good job describing the problem. Here's one possible workaround. Since the problem is in generating the name, you can supply your own name:

tbl_grades %>% mutate_all(funs(recode=recode(.,A = '4.0')))

Note that this adds new columns rather than replacing the existing ones. Here's a function that will "forget" that you supplied those names:

dropnames<-function(x) {if(is(x,"lazy_dots")) {attr(x,"has_names")<-FALSE}; x}
tbl_grades %>% mutate_all(dropnames(funs(recode=dplyr::recode(.,A = '4.0'))))

This should behave like the original. Really, though, it is better to drop the namespace prefix and write

tbl_grades %>% mutate_all(dropnames(funs(recode(.,A = '4.0'))))

because dplyr often has special C++ versions of some functions that it can use if it recognizes the function (lag, for example), but this will not happen if you also specify the namespace (that is, if you use dplyr::lag).
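
Since the trouble comes from funs() having to generate a name for the call, another way to sidestep it may be to not build a funs() call at all. The sketch below is not from the original answer; it assumes a dplyr version in which mutate_all accepts a bare function plus extra arguments, and it uses a plain data frame (df_grades, a name introduced here for illustration) so that it runs without as_data_frame.

```r
library(dplyr)

# Same grades matrix as in the question, stored as a plain data frame.
# df_grades is a hypothetical name used only for this sketch.
grades <- matrix(c("A", "A-", "B", "C", "D", "B-", "C", "C", "F"), 3, 3)
df_grades <- as.data.frame(grades, stringsAsFactors = FALSE)

# Pass recode itself and forward A = "4.0" through mutate_all's `...`.
# No funs() call is involved, so no column name has to be generated.
recoded <- df_grades %>% mutate_all(recode, A = "4.0")

# The columns are replaced in place: V1 is now "4.0", "A-", "B".
print(recoded)
```

Because the columns keep their original names, this replaces values rather than adding V1_recode-style columns, so no dropnames helper is needed in this variant.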