snowneji - 4 months ago
R Question

R: how to use regex to replace repeated occurrences of an element with a single one

I have:

txt= 'finance . . . . . lottery ticket . . . community'


trying to get:

txt2 = 'finance.lottery ticket.community'


but the following didn't work:

gsub('[[:punct:]]{2,}','',txt)


Did I do anything wrong here? Thanks!

Answer

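There are spaces between the dots in your text, so [[:punct:]]{2,} never finds two adjacent punctuation characters. Allow for the optional whitespace in your regex, and replace each run with '.' instead of '' so that a single dot is kept: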

gsub('(\\s?[[:punct:]]\\s?){2,}','.',txt)
# [1] "finance.lottery ticket.community"
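
If only dots should be collapsed, and other punctuation left alone, a narrower pattern may be safer. The following is a minimal sketch under that assumption, not part of the original answer; the names unchanged and txt2 are only illustrative. It first shows that the original attempt matches nothing, then applies a dot-only pattern.

```r
txt <- 'finance . . . . . lottery ticket . . . community'

# Original attempt: no two punctuation characters are adjacent
# (a space sits between every pair of dots), so nothing is replaced.
unchanged <- gsub('[[:punct:]]{2,}', '', txt)
identical(unchanged, txt)
# [1] TRUE

# Dot-only variant (assumption: only '.' needs collapsing):
# optional leading whitespace, then one or more dots,
# each followed by optional whitespace.
txt2 <- gsub('\\s*(\\.\\s*)+', '.', txt, perl = TRUE)
txt2
# [1] "finance.lottery ticket.community"
```

Unlike the {2,} version, this pattern also tightens a single spaced dot ('a . b' becomes 'a.b'), and it does not turn runs of other punctuation such as '!?' into a dot.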