yake84 - 3 months ago
R Question

Dplyr pipe (%>%) within mutate()?

The piping in dplyr is cool, and sometimes I want to clean up one column by applying multiple commands to it. Is there a way to use the pipe within the mutate() command? I notice this most when using regex, but it also comes up in other contexts. In the example below, I can clearly see the different manipulations I am applying to the column "Clean", and I am curious whether there is a way to do something that mimics %>% within mutate().

library(dplyr)
phone <- data.frame(Numbers = c("1234567890", "555-3456789", "222-222-2222",
                                "5131831249", "123.321.1234", "(333)444-5555",
                                "+1 123-223-3234", "555-666-7777 x100"),
                    stringsAsFactors = FALSE)

phone2 <- phone %>%
  mutate(Clean = gsub("[A-Za-z].*", "", Numbers),               # remove extensions
         Clean = gsub("[^0-9]", "", Clean),                     # remove parentheses, dashes, etc.
         Clean = substr(Clean, nchar(Clean) - 9, nchar(Clean)), # grab the rightmost 10 characters
         Clean = gsub("(^\\d{3})(\\d{3})(\\d{4}$)", "(\\1)\\2-\\3", Clean)) # format

phone2


I know there might be a better gsub() command, but for the purposes of this question I want to know whether there is a way to pipe these gsub() calls together, so that I don't have to keep writing Clean = gsub(...) and also don't have to nest the calls inside each other.

It would be fine with me if you answer this question using a simpler example.

Answer

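You can pipe inside mutate(): apply the first gsub() to Numbers, then pass each intermediate result to the next call through the . placeholder: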

phone %>% 
     mutate(Clean = gsub("[A-Za-z].*", "", Numbers) %>%
                    gsub("[^0-9]", "", .) %>%
                    substr(., nchar(.)-9, nchar(.)) %>% 
                    gsub("(^\\d{3})(\\d{3})(\\d{4}$)", "(\\1)\\2-\\3", .))
#            Numbers         Clean
#1        1234567890 (123)456-7890
#2       555-3456789 (555)345-6789
#3      222-222-2222 (222)222-2222
#4        5131831249 (513)183-1249
#5      123.321.1234 (123)321-1234
#6     (333)444-5555 (333)444-5555
#7   +1 123-223-3234 (123)223-3234
#8 555-666-7777 x100 (555)666-7777
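
If the same cleanup is needed in more than one place, one option is to move the piped steps into a small function and call it from mutate(). The sketch below assumes that approach; clean_phone is a hypothetical helper name, not something from dplyr or from the original post, and the data frame is a shortened copy of the one in the question.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr or the original post):
# the same four steps, written once as a function of a character vector.
clean_phone <- function(x) {
  x %>%
    gsub("[A-Za-z].*", "", .) %>%           # remove extensions
    gsub("[^0-9]", "", .) %>%               # keep digits only
    substr(., nchar(.) - 9, nchar(.)) %>%   # keep the rightmost 10 digits
    gsub("(^\\d{3})(\\d{3})(\\d{4}$)", "(\\1)\\2-\\3", .)  # format
}

# Shortened copy of the example data from the question
phone <- data.frame(Numbers = c("1234567890", "+1 123-223-3234",
                                "555-666-7777 x100"),
                    stringsAsFactors = FALSE)

phone3 <- phone %>%
  mutate(Clean = clean_phone(Numbers))

phone3$Clean
```

Inside the nested pipe, the . refers to the value coming from the previous step of that inner pipe, not to the data frame, which is why the same pattern works both in the helper and directly inside mutate().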