statquant statquant - 20 days ago 6
R Question

How can I replace part of a string if it is included in a pattern

I am looking for a way to replace every _ (with, say, '') in each element of the following character vector:

x <- c('test_(match)','test_xMatchToo','test_a','test_b')


if and only if the _ is followed by ( or x. So the desired output is:

x <- c('test(match)','testxMatchToo','test_a','test_b')


How can this be done (using any package is fine)?

Answer

Using a lookahead:

_(?=[(x])

A lookahead asserts that its pattern matches at the current position, but it does not consume the text it looks ahead at. So here the matched text consists of only the underscore, while the lookahead requires that the underscore is immediately followed by an x or a (.
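To see that the lookahead really leaves the following character out of the match, here is a small sketch that extracts the matched text from the vector in the question. It assumes base R only; the variable names `hits` and `matched` are just for illustration.

```r
# The example vector from the question
x <- c('test_(match)', 'test_xMatchToo', 'test_a', 'test_b')

# Which elements contain an underscore followed by "(" or "x"?
hits <- grepl("_(?=[(x])", x, perl = TRUE)

# Extract the matched text itself: only the underscore is part of the match,
# the "(" or "x" is checked by the lookahead but not consumed.
# regmatches() returns one entry per element that matched.
matched <- regmatches(x, regexpr("_(?=[(x])", x, perl = TRUE))

print(hits)
print(matched)
```

Because only the underscore is matched, replacing the match with an empty string removes the underscore and leaves the `(` or `x` in place.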


Your R code would look a bit like this (one arg per line for clarity):

gsub(
    "_(?=[(x])",                             # The regex
    "",                                      # Replacement text
    c("your_string", "your_(other)_string"), # Character vector of strings
    perl = TRUE                              # Use PCRE, which is needed for lookaheads
)
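Applied to the vector from the question, the same call can be checked against the desired output. This is a minimal sketch in base R; the capture-group variant at the end is an alternative that is not part of the original answer, shown only for comparison, and the names `result`, `expected`, and `result_capture` are illustrative.

```r
# The example vector and the desired output from the question
x <- c('test_(match)', 'test_xMatchToo', 'test_a', 'test_b')
expected <- c('test(match)', 'testxMatchToo', 'test_a', 'test_b')

# Lookahead version: remove "_" only when followed by "(" or "x"
result <- gsub("_(?=[(x])", "", x, perl = TRUE)
stopifnot(identical(result, expected))

# Alternative without a lookahead: capture the following character
# and put it back via the backreference \\1
result_capture <- gsub("_([(x])", "\\1", x)
stopifnot(identical(result_capture, expected))

print(result)
```

The lookahead form keeps the replacement string empty, while the capture-group form works with the default regex engine and so does not need perl = TRUE.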