aL3xa - 1 year ago 72
R Question

# rle-like function that catches "run" of adjacent integers

I'm pretty sure you all agree that `rle` is one of those "gotcha" functions in R. Is there any similar function that can "catch" a "run" of adjacent integer values?

So, if I have a vector like this one:

``````
x <- c(3:5, 10:15, 17, 22, 23, 35:40)
``````

and I call that esoteric function, I'll get a response like this one:

``````
lengths: 3, 6, 1, 2, 6
values: (3,4,5), (10,11,12... # you get the point
``````

It's not that hard to write a function like this, but still... any ideas?

1) Calculate values and then lengths based on values

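``````
# Start a new group wherever the step between neighbours is not exactly 1
s <- split(x, cumsum(c(0, diff(x) != 1)))
# The run lengths are simply the sizes of the groups
run.info <- list(lengths = unname(sapply(s, length)), values = unname(s))
``````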

Running it using `x` from the question gives this:

``````
> str(run.info)
List of 2
 $ lengths: int [1:5] 3 6 1 2 6
 $ values :List of 5
  ..$ : num [1:3] 3 4 5
  ..$ : num [1:6] 10 11 12 13 14 15
  ..$ : num 17
  ..$ : num [1:2] 22 23
  ..$ : num [1:6] 35 36 37 38 39 40
``````
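The grouping step in solution 1) may be easier to follow with the intermediate values spelled out. This is only a sketch reusing `x` from the question; the names `breaks` and `run_id` are illustrative and not part of the original answer:

```r
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# Step between each element and its predecessor;
# any step other than 1 marks the start of a new run
breaks <- diff(x) != 1

# Prepend 0 for the first element, then accumulate the breaks
# to obtain one run id per element of x
run_id <- cumsum(c(0, breaks))
run_id
# [1] 0 0 0 1 1 1 1 1 1 2 3 3 4 4 4 4 4 4
```

Passing `run_id` as the grouping factor to `split()` then yields one list element per run.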

2) Calculate lengths and then values based on lengths

Here is a second solution based on Gregor's length calculation:

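``````
# x - seq_along(x) is constant within a run of consecutive integers
lens <- rle(x - seq_along(x))$lengths
# Expand the run ids by their lengths and split x on them
list(lengths = lens, values = unname(split(x, rep(seq_along(lens), lens))))
``````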
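The reason `x - seq_along(x)` works: inside a run of consecutive integers the value and the index both increase by 1, so their difference does not change, and two neighbouring differences are equal exactly when the step between the two elements is 1. A small sketch with `x` from the question (the name `offset` is only illustrative):

```r
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# Value minus position: constant inside a run, changes when a run is broken
offset <- x - seq_along(x)
offset
# [1]  2  2  2  6  6  6  6  6  6  7 11 11 22 22 22 22 22 22

# rle() then counts how long each constant stretch is
rle(offset)$lengths
# [1] 3 6 1 2 6
```

Because the break points coincide with those found by `diff(x) != 1`, this should give the same runs as solution 1).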

3) Calculate lengths and values, each without using the other

This one seems inefficient, since it calculates `lengths` and `values` each from scratch, and it is more complex than it needs to be. It does get everything down to a single statement, though, so I thought I would add it as well. It is just a mix of solutions 1) and 2) above, with nothing really new relative to them.

``````
list(lengths = rle(x - seq_along(x))$lengths,
     values = unname(split(x, cumsum(c(0, diff(x) != 1)))))
``````
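Since the question asks for a function, any of the approaches above could be wrapped into one. The following is only a sketch based on solution 1): the name `seq_runs` is hypothetical (it does not come from base R or any package), the empty-input branch is an added assumption about how such a function should behave, and `NA` values are not handled.

```r
# Hypothetical helper; the name seq_runs is made up for this example
seq_runs <- function(x) {
  # Assumption: an empty input should give empty components
  if (length(x) == 0L) {
    return(list(lengths = integer(0), values = list()))
  }
  # One run id per element; it increases whenever the step is not 1
  run_id <- cumsum(c(0, diff(x) != 1))
  values <- unname(split(x, run_id))
  # lengths() (R >= 3.2.0) returns the size of each list element
  list(lengths = lengths(values), values = values)
}

res <- seq_runs(c(3:5, 10:15, 17, 22, 23, 35:40))
res$lengths
# [1] 3 6 1 2 6
```

A vector with no adjacent values, such as `c(1, 5, 9)`, would come back as three runs of length 1.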