Miguel Miguel - 3 months ago 13
R Question

R: Remove consecutive duplicates from comma separated string

I'm having issues removing just the right amount of information from the following data:


18,14,17,2,9,8

17,17,17,14

18,14,17,2,1,1,1,1,9,8,1,1,1


I'm applying !duplicated() in order to remove the duplicates.

SplitFunction <- function(x) {
  b <- unlist(strsplit(x, '[,]'))
  # duplicated() flags every later repeat, not only adjacent ones
  c <- b[!duplicated(b)]
  return(paste(c, collapse=","))
}


I'm having issues removing only consecutive duplicates. The result below is what I'm getting.


18,14,17,2,9,8

17,14

18,14,17,2,1,9,8


The data below is what I want to obtain.


18,14,17,2,9,8

17,14

18,14,17,2,1,9,8,1


Can you suggest a way to perform this? Ideally a vectorized approach...

Thanks,

Miguel

Answer

You can use the rle function to solve this.

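xx <- c("18,14,17,2,9,8","17,17,17,14","18,14,17,2,1,1,1,1,9,8,1,1,1")
zz <- strsplit(xx, ",")
# rle() keeps one value per run of equal adjacent elements; paste() rebuilds the string
sapply(zz, function(x) paste(rle(x)$values, collapse=","))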
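
For reference, rle() run-length encodes a vector: it returns the length of each run of equal adjacent elements together with the value of each run. Keeping only the values therefore drops consecutive repeats while preserving values that reappear later. A small illustration on the third input (the names v and runs are only for illustration):

```r
# Third input from the question, already split into its elements
v <- c("18", "14", "17", "2", "1", "1", "1", "1", "9", "8", "1", "1", "1")

runs <- rle(v)

runs$lengths  # 1 1 1 1 4 1 1 3
runs$values   # "18" "14" "17" "2" "1" "9" "8" "1"
```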

See also the related question: How to remove/collapse consecutive duplicate values in sequence in R?
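
Since the question asks for a vectorized approach, here is an alternative sketch that skips the split and the per-element loop by using a regular expression with a backreference. This is not the method from the answer above, and it assumes the fields are plain digit strings separated by commas with no spaces. The name out is only for illustration.

```r
xx <- c("18,14,17,2,9,8", "17,17,17,14", "18,14,17,2,1,1,1,1,9,8,1,1,1")

# \\b(\\d+)    : a whole number, captured as group 1
# (,\\1\\b)+   : one or more immediate repeats of that same number
# The word boundaries stop "1" from matching inside "18" or "21".
out <- gsub("\\b(\\d+)(,\\1\\b)+", "\\1", xx, perl = TRUE)

print(out)
# "18,14,17,2,9,8" "17,14" "18,14,17,2,1,9,8,1"
```

gsub() is vectorized over its input, so the whole character vector is processed in one call. If the fields can contain anything other than digits, the \\d+ part would need adjusting.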
