Neil - 3 months ago
R Question

Conditionally select column values in dplyr and then change the data type

I have the following data frame in R:

      Lead.Stage Number.of.Followup.Calls
1 Not Interested                   Select
2    Unreachable                       ""
3      Qualified                        1
4    Unreachable                        2
5      Qualified                        2
6      Junk Lead                   Select


Number.of.Followup.Calls is of character type. I want to group by Lead.Stage and calculate the average number of follow-up calls for each Lead.Stage.

In dplyr I am filtering out the "Select" and empty-string values and then converting the remaining digits to numeric. I am using the following code in R, but it does not seem to work:

train %>%
    group_by(Lead.Stage) %>%
    filter((Number.of.Followup.Calls == "" | Number.of.Followup.Calls == "Select")) %>%
    mutate_each_(funs(as.numeric), Number.of.Followup.Calls) %>%
    summarise(Total = mean(Number.of.Followup.Calls))


Thanks in advance :)

Answer

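It is easier to do this with %in%. Note that the filter in the question keeps exactly the rows that should be dropped ("" and "Select"), so the condition has to be negated: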

train %>% 
    group_by(Lead.Stage)  %>%
    filter(!Number.of.Followup.Calls %in% c("", "Select")) %>%
    summarise(Total = mean(as.numeric(Number.of.Followup.Calls)))
#   Lead.Stage Total
#       <chr> <dbl>
#1   Qualified   1.5
#2 Unreachable   2.0
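
To see what the negated %in% condition does row by row, here is a minimal base R sketch. It rebuilds the six rows shown in the question under the name train; stringsAsFactors = FALSE is an assumption made to match the character type stated in the question, and keep, kept and result are illustrative names only.

```r
# Rebuild the six rows from the question; the count column stays character.
train <- data.frame(
    Lead.Stage = c("Not Interested", "Unreachable", "Qualified",
                   "Unreachable", "Qualified", "Junk Lead"),
    Number.of.Followup.Calls = c("Select", "", "1", "2", "2", "Select"),
    stringsAsFactors = FALSE
)

# %in% binds tighter than !, so this is !(x %in% c("", "Select")):
# TRUE for rows holding a usable count, FALSE for "" and "Select".
keep <- !train$Number.of.Followup.Calls %in% c("", "Select")

# Base R counterpart of the filter + grouped mean above.
kept <- train[keep, ]
result <- tapply(as.numeric(kept$Number.of.Followup.Calls),
                 kept$Lead.Stage, mean)
```

Only rows 3, 4 and 5 survive the mask, which is why "Not Interested" and "Junk Lead" do not appear in the summary.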

Alternatively, we can skip the filter step entirely: as.numeric converts every non-numeric element to NA, so mean(., na.rm = TRUE) ignores those values. Groups that have no numeric values at all end up with NaN, and na.omit() drops those rows:

train %>% 
    group_by(Lead.Stage)  %>%
    summarise(Total = mean(as.numeric(Number.of.Followup.Calls), na.rm = TRUE)) %>%
    na.omit()
#   Lead.Stage Total
#       <chr> <dbl>
#1   Qualified   1.5
#2 Unreachable   2.0
#Warning messages:
#1: In mean(as.numeric(c("", "2")), na.rm = TRUE) :
# NAs introduced by coercion

The warning is expected here: it only reports that as.numeric converted non-numeric elements to NA.
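
The coercion behaviour can be checked on its own with a short base R sketch. The vector calls simply repeats the values from the question's column; nums and all_missing are illustrative names, and wrapping the conversion in suppressWarnings() is an optional choice, not part of the code above.

```r
# Values as they appear in the question's character column.
calls <- c("Select", "", "1", "2", "2", "Select")

# suppressWarnings() hides the "NAs introduced by coercion" warning;
# only do this when the NAs are expected, as they are here.
nums <- suppressWarnings(as.numeric(calls))
nums
# [1] NA NA  1  2  2 NA

# A group with no numeric values at all yields NaN rather than NA...
all_missing <- mean(c(NA_real_, NA_real_), na.rm = TRUE)
all_missing
# [1] NaN

# ...and is.na() is TRUE for NaN, which is why na.omit() drops such groups.
is.na(all_missing)
# [1] TRUE
```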