Gabriel - 1 day ago
Python Question

Remove elements from array updating list of stored indexes accordingly

Consider a numpy array of the form:

> a = np.random.uniform(0., 100., (10, 1000))


and a list of indexes to elements in that array that I want to keep track of:

> idx_s = [0, 5, 7, 9, 12, 17, 19, 32, 33, 35, 36, 39, 40, 41, 42, 45, 47, 51, 53, 57, 59, 60, 61, 62, 63, 65, 66, 70, 71, 73, 75, 81, 83, 85, 87, 88, 89, 90, 91, 93, 94, 96, 98, 100, 106, 107, 108, 118, 119, 121, 124, 126, 127, 128, 129, 133, 135, 138, 142, 143, 144, 146, 147, 150]


I also have a list of indexes of elements I need to remove from a:

> idx_d = [4, 12, 18, 20, 21, 22, 26, 28, 29, 31, 37, 43, 48, 54, 58, 74, 80, 86, 99, 109, 110, 113, 117, 134, 139, 140, 141, 148, 154, 156, 160, 166, 169, 175, 183, 194, 198, 199, 219, 220, 237, 239, 241, 250]


which I delete with:

> a_d = np.delete(a, idx_d, axis=1)


But this process alters the indexes of the elements in a_d. The indexes in idx_s no longer point to the same elements in a_d as they did in a, since np.delete() moved them. For example: if I delete the element at index 4 from a, then every index after 4 in idx_s now points one element too far to the right in a_d.

                   v Index 5 points to 'f' in a
         0 1 2 3 4 5 6
a   ->   a b c d e f g ...  # Remove the element at index 4 ('e') from a
a_d ->   a b c d f g h ...  # Now index 5 no longer points to 'f' in a_d, but to 'g'
         0 1 2 3 4 5 6


How do I update the idx_s list of indexes so that they point to the same elements in a_d as they did in a?

If an element of idx_s is also present in idx_d (and is thus removed from a and absent from a_d), its index should also be discarded.

Answer

You could use np.searchsorted to get the shift for each element in idx_s (the number of deleted indexes that lie before it), then simply subtract those shifts from idx_s to get the new shifted-down values, like so -

idx_s - np.searchsorted(idx_d, idx_s)
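
To see where the shifts come from, here is a minimal sketch on small hand-picked arrays (the values are made up for illustration, and it assumes idx_d is already sorted and shares no entries with idx_s):

```python
import numpy as np

idx_d = np.array([4, 12, 18])        # sorted indexes of the deleted columns
idx_s = np.array([0, 5, 9, 17, 19])  # tracked indexes, none of them deleted

# For each tracked index, count how many deleted indexes lie before it.
shifts = np.searchsorted(idx_d, idx_s)

# Each tracked index moves down by that count.
new_idx = idx_s - shifts

print(shifts)   # [0 1 1 2 3]
print(new_idx)  # [ 0  4  8 15 16]
```

Index 19, for instance, has three deleted indexes before it (4, 12 and 18), so it ends up at 16.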

If idx_d is not already sorted, we need to feed in a sorted version. We also need to drop from idx_s any index that appears in idx_d. Thus, assuming for simplicity that both are arrays, we would have -

idx_s = idx_s[~np.in1d(idx_s, idx_d)]
out = idx_s - np.searchsorted(np.sort(idx_d), idx_s)

A sample run to give a better picture -

In [530]: idx_s
Out[530]: array([19,  5, 17,  9, 12,  7,  0])

In [531]: idx_d
Out[531]: array([12,  4, 18])

In [532]: idx_s = idx_s[~np.in1d(idx_s, idx_d)] # Remove matching ones

In [533]: idx_s
Out[533]: array([19,  5, 17,  9,  7,  0])

In [534]: idx_s - np.searchsorted(np.sort(idx_d), idx_s) # Updated idx_s
Out[534]: array([16,  4, 15,  8,  6,  0])
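
As a sanity check, the updated indexes can be compared against what np.delete actually produces. The sketch below reuses the sample values from the run above on a random array; the seeded generator and the names kept, new_idx and same are assumptions made for illustration, and np.isin is swapped in for np.in1d because the latter is deprecated in recent NumPy releases:

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only to make the run repeatable
a = rng.uniform(0., 100., (10, 1000))

idx_s = np.array([19, 5, 17, 9, 12, 7, 0])  # indexes to keep track of
idx_d = np.array([12, 4, 18])               # indexes to delete (unsorted)

# Delete the columns, as in the question.
a_d = np.delete(a, idx_d, axis=1)

# Drop tracked indexes that were deleted, then shift the rest down.
kept = idx_s[~np.isin(idx_s, idx_d)]
new_idx = kept - np.searchsorted(np.sort(idx_d), kept)

# The surviving tracked columns of a should match the re-indexed columns of a_d.
same = np.array_equal(a[:, kept], a_d[:, new_idx])
print(new_idx, same)
```

If same comes out False, the most likely culprits are an unsorted idx_d passed to np.searchsorted or tracked indexes that were not filtered out first.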