TrebiLime - 1 month ago 6
R Question

How to delete a string or digits after a certain pattern?

Suppose there is a vector x:

x <- c('/name12/?ad_2','/name13/?ad_3','/name14/?ad_4')


Is there a way to delete the numbers that follow 'ad_', so that the converted x appears as follows?

'/name12/?ad_' '/name13/?ad_' '/name14/?ad_'


I tried to use the gsub function, but it didn't work because of the digits that follow 'name'.

Answer

You may use a regex with sub (since you perform a single search and replace, you do not need gsub) and use a pattern depending on what you need to include or exclude in the result.
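
As a rough illustration of the sub versus gsub distinction, the sketch below applies an unanchored digit pattern to the vector from the question. The names first_only and all_matches are made up for this example. It also shows why a bare digit pattern runs into the digits after 'name'.

```r
x <- c('/name12/?ad_2', '/name13/?ad_3', '/name14/?ad_4')

# sub() replaces only the first match in each element,
# which here is the digits after 'name', not the ones after 'ad_'
first_only <- sub("[0-9]+", "", x)

# gsub() replaces every match in each element,
# so both groups of digits are removed
all_matches <- gsub("[0-9]+", "", x)

print(first_only)
print(all_matches)
```

Neither result is the desired one, so the pattern needs more context (such as the ?ad_ prefix or an end-of-string anchor) to target only the trailing digits.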

You might use "(\\?ad_)[0-9]+$" to match ?ad_ followed by digits at the end of the string and replace the match with "\\1", which restores the captured ?ad_ value. Alternatively, just match the _ and the trailing digits with "_[0-9]+$" and replace them with "_".

See demo code:

> x <- c('/name12/?ad_2','/name13/?ad_3','/name14/?ad_4')
> sub("(\\?ad_)[0-9]+$", "\\1", x)
[1] "/name12/?ad_" "/name13/?ad_" "/name14/?ad_"
> sub("_[0-9]+$", "_", x)
[1] "/name12/?ad_" "/name13/?ad_" "/name14/?ad_"


Pattern details:

  • _ - matches an underscore
  • [0-9]+ - 1 or more digits (the + quantifier matches one or more occurrences, as many as possible)
  • $ - the end of string.
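
The effect of each pattern part can be checked with a small sketch like the one below. The lookbehind variant and the input y are not from the answer above. They are assumptions added for illustration: the lookbehind needs perl = TRUE, and y is a hypothetical string in which the digits are not at the end.

```r
x <- c('/name12/?ad_2', '/name13/?ad_3', '/name14/?ad_4')

# Anchored pattern: only the underscore plus digits at the very end match
trimmed <- sub("_[0-9]+$", "_", x)

# Alternative sketch: a PCRE lookbehind keeps 'ad_' without a capture group
# (perl = TRUE is required for lookbehind support)
trimmed_lookbehind <- sub("(?<=ad_)[0-9]+$", "", x, perl = TRUE)

# Hypothetical input where the digits are not at the end of the string:
# the $ anchor prevents a match, so the string is returned unchanged
y <- '/name12/?ad_2&ref=home'
unchanged <- sub("_[0-9]+$", "_", y)

print(trimmed)
print(trimmed_lookbehind)
print(unchanged)
```

If the digits may be followed by other text, dropping the $ anchor and keeping the ?ad_ prefix in the pattern is one possible adjustment.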