Kristofersen - 8 days ago 6
R Question

Put a space before emojis

I am trying to clean up some text. I have a list of emojis that I do not want removed from the text. I would like to put a space before these emojis only if there is not one already.

emojis = as.character(outer(c(":", ";", ":-", ";-","="),c(")", "(", "]", "[", "D", "o", "O", "P", "p","8"),FUN = paste,sep=""))


If I had a tweet that said:

Tweet = "I am so happy:)"


I would like that to be

Tweet = "I am so happy :)"


after the code is run.

It's a pretty simple idea, but I haven't been able to find any code that does this.

Full list of emojis that need a space before them:

":)" ";)" ":-)" ";-)" "=)" ":(" ";(" ":-(" ";-(" "=(" ":]" ";]" ":-]" ";-]" "=]" ":[" ";[" ":-[" ";-[" "=[" ":D" ";D" ":-D" ";-D" "=D" ":o" ";o" ":-o" ";-o" "=o" ":O" ";O" ":-O" ";-O" "=O" ":P" ";P" ":-P" ";-P" "=P" ":p" ";p" ":-p" ";-p" "=p" ":8" ";8" ":-8" ";-8" "=8"
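
For reference, a small sketch of how that list falls out of the outer() call shown earlier: every prefix is pasted onto every suffix, and as.character() flattens the resulting 5 x 10 matrix column by column. Nothing here is new apart from the spacing of the call.

```r
# Same call as in the question: 5 prefixes combined with 10 suffixes
emojis <- as.character(outer(c(":", ";", ":-", ";-", "="),
                             c(")", "(", "]", "[", "D", "o", "O", "P", "p", "8"),
                             FUN = paste, sep = ""))

length(emojis)   # number of prefix/suffix combinations
head(emojis, 5)  # the five prefixes paired with the first suffix, ")"
```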

Answer

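A regular expression can help. Escape the parentheses and brackets in the emoji list so they match literally, join the emojis into a single alternation, and capture the word characters that sit directly before an emoji. The replacement then puts a space between the two captured groups. If the emoji already has a space in front of it, there is no word character directly before it, so the text is left unchanged.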

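# ) ( ] [ are regex metacharacters, so they are escaped to match literally
emojis = as.character(outer(c(":", ";", ":-", ";-","="),c("\\)", "\\(", "\\]", "\\[", "D", "o", "O", "P", "p","8"),FUN = paste,sep=""))
# Group 1: the word characters right before the emoji; group 2: any one emoji
pat <- paste0("(\\w+)(", paste(emojis, collapse="|"), ")")
Tweet = "I am so happy:)"
# gsub() handles every emoji in the tweet; sub() would stop after the first one
gsub(pat, "\\1 \\2", Tweet)
#[1] "I am so happy :)"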
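
One limitation of the pattern above is that it only fires when a word character comes right before the emoji, so something like "wow!:)" would stay as it is. A possible variant, offered only as a sketch: wrap the logic in a helper (the name space_before_emojis is made up for this example) and use a Perl-style lookbehind, (?<=\\S), so that a space is inserted after any non-whitespace character. It assumes the same emoji list as in the question.

```r
# Hypothetical helper; the name and the lookbehind approach are illustrative
# and not part of the original answer.
space_before_emojis <- function(text) {
  # Same emoji list as above, with the regex metacharacters escaped
  emojis <- as.character(outer(
    c(":", ";", ":-", ";-", "="),
    c("\\)", "\\(", "\\]", "\\[", "D", "o", "O", "P", "p", "8"),
    FUN = paste, sep = ""
  ))
  # (?<=\\S) requires a non-whitespace character just before the emoji,
  # so emojis that already follow a space (or start the string) are skipped
  pat <- paste0("(?<=\\S)(", paste(emojis, collapse = "|"), ")")
  # perl = TRUE is needed for the lookbehind; gsub() is vectorised over text
  gsub(pat, " \\1", text, perl = TRUE)
}

space_before_emojis(c("I am so happy:)", "Already spaced :)", "wow!;-D and sad=("))
#[1] "I am so happy :)"    "Already spaced :)"   "wow! ;-D and sad =("
```

Like the original pattern, this treats any occurrence of the listed character sequences as an emoji, so text such as "note:Please" would also get a space before ":P".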