Daniel F - 3 years ago 126

Python Question

Say I have an array `y` like this:

array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419, 667137, 299800])

obtained as the result of a "top 10" `argpartition`:

`y = np.argpartition(-x, np.arange(10))[:10]`

Now I want to remove the elements that are sequential (runs of consecutive integers), keeping only the first (maximum) element of each run, so that `y_new` is:

array([299839, 667136, 665420, 299799])

This seems like it should be simple, but I'm not seeing an efficient way to do it (or even a good way to start). Assume the real-world application will take the top 1000 or so and will need to do this many times.

Answer Source

Here's one approach based on sorting -

```
# Get the indices that would sort y
sidx = y.argsort()
# Get the sorted array
ys = y[sidx]
# Get the positions in ys at which each island of sequential numbers starts
cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(ys) != 1)))
# For each island, take the smallest original index (the element that comes
# first in y), then index into the input for the desired output
y_new = y[np.minimum.reduceat(sidx, cut_idx)]
```

If you would like the output to keep the elements in the order they appear in `y`, sort those indices before indexing in the last step -

```
y[np.sort(np.minimum.reduceat(sidx, cut_idx))]
```
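For repeated use, the steps above could be wrapped into a single function. This is only a minimal sketch: the name `first_of_each_run` and the `keep_order` flag are made up for illustration, and it assumes `y` holds unique integers (as indices returned by `argpartition` do).

```python
import numpy as np

def first_of_each_run(y, keep_order=True):
    """From each island of consecutive integers in y, keep the element
    that appears first in y. Assumes the values in y are unique integers."""
    y = np.asarray(y)
    if y.size == 0:
        return y.copy()
    # Indices that sort y, and the sorted values
    sidx = np.argsort(y, kind="stable")
    ys = y[sidx]
    # Positions in ys where a new island of consecutive values starts
    starts = np.flatnonzero(np.concatenate(([True], np.diff(ys) != 1)))
    # Smallest original position within each island
    first_pos = np.minimum.reduceat(sidx, starts)
    if keep_order:
        # Restore the order in which the kept elements appear in y
        first_pos = np.sort(first_pos)
    return y[first_pos]

y = np.array([299839, 667136, 665420, 665418, 665421,
              667135, 299799, 665419, 667137, 299800])
print(first_of_each_run(y))                    # [299839 667136 665420 299799]
print(first_of_each_run(y, keep_order=False))  # [299799 299839 665420 667136]
```

The cost is dominated by the sort, so for a top-1000 list each call is roughly O(n log n) with no Python-level loop.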

Sample input, output -

```
In [56]: y
Out[56]:
array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
       667137, 299800])
In [57]: y_new
Out[57]: array([299799, 299839, 665420, 667136])
In [58]: y[np.sort(np.minimum.reduceat(sidx, cut_idx))]
Out[58]: array([299839, 667136, 665420, 299799])
```
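To see why the first element of each island is also its maximum, here is a small end-to-end sketch starting from `x`. The signal values are invented purely for illustration; the point is that `np.argpartition(-x, np.arange(k))[:k]` returns the top-`k` indices ordered by descending value, so the element of an island that appears first in `y` is the one with the largest `x`.

```python
import numpy as np

# Hypothetical signal with three peaks; neighbouring indices belong to one peak
x = np.zeros(50)
x[[10, 11, 12]] = [5.0, 9.0, 7.0]
x[[30, 31]] = [8.0, 6.0]
x[40] = 4.0

# Top-k indices, ordered from largest to smallest value of x
k = 6
y = np.argpartition(-x, np.arange(k))[:k]

# Same sorting-based steps as in the answer, keeping the order of y
sidx = y.argsort()
cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(y[sidx]) != 1)))
peaks = y[np.sort(np.minimum.reduceat(sidx, cut_idx))]

print(y)      # [11 30 12 31 10 40]
print(peaks)  # [11 30 40]
```

Each kept index (11, 30, 40) is the location of the largest value within its run of neighbouring indices.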
