chunxuan - 1 month ago 6x
R Question

# How to split a string from right-to-left, like Python's rsplit()?

Suppose a vector:

xx.1 <- c("zz_ZZ_uu_d", "II_OO_d")

I want a new vector in which each string is split only once, at the rightmost separator. The expected result would be:

c("zz_ZZ_uu", "d", "II_OO", "d").

It would be like Python's rsplit() function. My current idea is to reverse the string and split it with str_split() in stringr.
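For reference, a minimal sketch of that reverse-and-split idea. It uses base R instead of stringr so that it runs without extra packages, and the names str_rev and rsplit_once_rev are made up for illustration:

```r
# Reverse the characters of each string in a character vector.
str_rev <- function(x) {
  vapply(strsplit(x, ""), function(ch) paste(rev(ch), collapse = ""), character(1))
}

# Hypothetical helper: split each string once at the last occurrence of sep.
rsplit_once_rev <- function(x, sep = "_") {
  reversed <- str_rev(x)
  # regexpr() reports only the first match, so invert = TRUE gives at most two pieces.
  pieces <- regmatches(reversed, regexpr(sep, reversed, fixed = TRUE), invert = TRUE)
  # Undo the reversal: reverse each piece, then restore left-to-right order.
  unlist(lapply(pieces, function(p) rev(str_rev(p))))
}

xx.1 <- c("zz_ZZ_uu_d", "II_OO_d")
rsplit_once_rev(xx.1)
# [1] "zz_ZZ_uu" "d"        "II_OO"    "d"
```

It works, but it needs two reversals per string, which is why a more direct approach would be nicer.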

Any better solutions?

unlist(strsplit(xx.1, "_(?!.*_)", perl = TRUE))
# [1] "zz_ZZ_uu" "d"        "II_OO"    "d"

Here a(?!b) is a negative lookahead: it matches an a that is not followed by b. In this case the lookahead body is .*_, so the pattern matches only a _ with no further _ anywhere after it (.* allows any distance), i.e. the last _ in the string.
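To see which separator the lookahead actually selects, one can compare the positions of all underscores with the position of the single match. This is only an illustrative check; the variable names all_pos and last_pos are arbitrary:

```r
xx.1 <- c("zz_ZZ_uu_d", "II_OO_d")

# Positions of every "_" in each string.
all_pos <- lapply(gregexpr("_", xx.1, fixed = TRUE), as.integer)
all_pos
# [[1]]
# [1] 3 6 9
#
# [[2]]
# [1] 3 6

# Position of the only "_" that is not followed by another "_".
last_pos <- as.integer(regexpr("_(?!.*_)", xx.1, perl = TRUE))
last_pos
# [1] 9 6
```

In both strings the match is the last underscore, which is why strsplit() cuts there and nowhere else.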

However, this idea is not that easy to generalise. First, note that it can be rewritten as a positive lookahead, _(?=[^_]*$): find a _ followed only by characters other than _ up to the end of the string ($). A not very elegant generalisation would then be:

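rsplit <- function(x, s, n) {
  # s is assumed to be a single literal character such as "_",
  # because it is placed inside a [^...] character class.
  p <- paste0("[^", s, "]*")
  # Match s only when exactly n - 1 further separators follow before the end.
  rx <- paste0(s, "(?=", paste(rep(paste0(p, s), n - 1), collapse = ""), p, "$)")
  unlist(strsplit(x, rx, perl = TRUE))
}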

vec <- c("a_b_c_d_e_f_g", "a_b")
rsplit(vec, "_", 1)
# [1] "a_b_c_d_e_f" "g"           "a"           "b"
rsplit(vec, "_", 3)
# [1] "a_b_c_d" "e_f_g"   "a_b"

where, e.g., for n = 3 this function uses _(?=[^_]*_[^_]*_[^_]*$), so it splits once at the third _ from the right.
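The generated pattern can be inspected directly, which makes it easier to see what the function does for a given n. The sketch below repeats the pattern-building step in a helper called rsplit_pattern (a made-up name). It also adds rsplit_max, a hypothetical variant that is not part of the answer above and that splits at each of the last n separators, which is probably closer to what Python's rsplit(sep, maxsplit) returns. Both assume that s is a single literal character.

```r
# Hypothetical helper: build the lookahead pattern for separator s and count n.
rsplit_pattern <- function(s, n) {
  p <- paste0("[^", s, "]*")
  paste0(s, "(?=", paste(rep(paste0(p, s), n - 1), collapse = ""), p, "$)")
}

rsplit_pattern("_", 1)
# [1] "_(?=[^_]*$)"
rsplit_pattern("_", 3)
# [1] "_(?=[^_]*_[^_]*_[^_]*$)"

# Hypothetical variant: match s when at most n - 1 further separators follow,
# so the string is cut at each of the last n separators.
rsplit_max <- function(x, s, n) {
  rx <- paste0(s, "(?=[^", s, "]*(?:", s, "[^", s, "]*){0,", n - 1, "}$)")
  unlist(strsplit(x, rx, perl = TRUE))
}

rsplit_max(c("a_b_c_d_e_f_g", "a_b"), "_", 3)
# [1] "a_b_c_d" "e"       "f"       "g"       "a"       "b"
```

The only difference between the two patterns is "exactly n - 1 more separators" versus "at most n - 1 more separators" inside the lookahead.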