And And - 3 months ago 9
R Question

Split the character vector into two parts

Consider the following character vector of length 1:

l <- "http://www.idealo.de/preisvergleich/OffersOfProduct/4983410_-iphone-se-64gb-spacegrau-apple.html"


I want to split it into two parts, so that the first part is:

p1 <- "http://www.idealo.de/preisvergleich/OffersOfProduct/4983410"


and the second one:

p2 <- "_-iphone-se-64gb-spacegrau-apple.html"


I assume a regular expression is needed to solve this. Could you also point me to a resource where I can learn to work with regular expressions easily? I would be sincerely thankful for any help.

Answer

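Using "(?<=[^_])(?=_)" with strsplit gives you what you need. The pattern matches the zero-width position that is preceded by a non-underscore character and followed by an underscore, so the split consumes no characters. Lookarounds require perl = TRUE:

strsplit(l, "(?<=[^_])(?=_)", perl = TRUE)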

# [[1]]
# [1] "http://www.idealo.de/preisvergleich/OffersOfProduct/4983410"
# [2] "_-iphone-se-64gb-spacegrau-apple.html"
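
If a regular expression feels heavy for this task, the same two parts can be obtained by locating the first underscore and cutting the string there. This is a minimal sketch in base R, not part of the answer above; it assumes the string contains at least one underscore and that the split should happen at the first one. The variable name pos is introduced here for illustration only.

```r
l <- "http://www.idealo.de/preisvergleich/OffersOfProduct/4983410_-iphone-se-64gb-spacegrau-apple.html"

# Position of the first underscore; fixed = TRUE treats "_" as a literal
# string rather than a regular expression. as.integer() drops the
# match attributes that regexpr() attaches to its result.
pos <- as.integer(regexpr("_", l, fixed = TRUE))

# Everything before the underscore, then everything from it onwards.
p1 <- substr(l, 1, pos - 1)
p2 <- substr(l, pos, nchar(l))

p1
# [1] "http://www.idealo.de/preisvergleich/OffersOfProduct/4983410"
p2
# [1] "_-iphone-se-64gb-spacegrau-apple.html"
```

Unlike the strsplit call, this always yields exactly two pieces, even if further underscores appear later in the string. If no underscore is present, regexpr returns -1, so that case should be checked before calling substr.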