John Crow - 23 days ago
Python Question

Efficient way to delete elements in one numpy array from another

What is the best way to delete the elements of one numpy array from another? Essentially I'm after a version of np.delete() where the order of the arrays doesn't matter.

import numpy as np
a = np.array([2, 1, 3])
print(a)
b = np.array([4, 1, 2, 5, 2, 3])
b = np.delete(b, a)  # doesn't work as desired: a is treated as indices, not values
print(b)  # want [4,5,2]


Iterating over the elements of a is very slow for large arrays.

Answer

You can use np.argmax to find the index of the first True element along each row or column of a boolean array. Comparing b against a[:, np.newaxis] broadcasts to one row per element of a, so a vectorized version of the operation looks like this:

>>> a = np.array([2,1,3])
>>> b = np.array([4,1,2,5,2,3])
>>> np.delete(b, np.argmax(b == a[:, np.newaxis], axis=1))
array([4, 5, 2])
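
To see what that one-liner is doing, it can help to split it into its intermediate arrays. The following is only an illustrative breakdown of the same expression; the names mask, first_match and result are introduced here for readability and are not part of the original answer.

```python
import numpy as np

a = np.array([2, 1, 3])
b = np.array([4, 1, 2, 5, 2, 3])

# Broadcast a (as a column) against b: mask[i, j] is True where b[j] == a[i].
mask = b == a[:, np.newaxis]  # boolean array of shape (len(a), len(b))

# On a boolean array, argmax returns the index of the first True in each row,
# i.e. the position in b of the first occurrence of each element of a.
first_match = np.argmax(mask, axis=1)

# Delete those positions from b.
result = np.delete(b, first_match)
print(first_match)
print(result)
```

Each row of mask corresponds to one element of a, which is why the reduction runs along axis=1.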

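Of course, as with many numpy vectorized operations, the speed comes at the cost of allocating a boolean array of size len(a) * len(b), so depending on your inputs this may not be appropriate. Note also that np.argmax returns 0 for a row with no True entries, so this assumes every element of a occurs in b; otherwise b[0] would be deleted by mistake.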
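
If the len(a) * len(b) allocation is too large, one possible alternative is to sort instead of broadcast. This is a sketch of a different technique, not the approach above: it uses a stable np.argsort plus np.searchsorted to locate first occurrences in memory proportional to len(a) + len(b). The function name delete_first_occurrences is hypothetical, and the handling of duplicates and missing values (each repeated value in a removes one more occurrence from b; values absent from b are ignored) is an assumption about the desired behaviour.

```python
import numpy as np

def delete_first_occurrences(b, a):
    """Remove from b the first occurrence of each value in a (hypothetical helper)."""
    # Stable sort keeps equal values of b in their original relative order.
    order = np.argsort(b, kind="stable")
    sorted_b = b[order]
    a_sorted = np.sort(a)

    # Position in sorted_b of the first occurrence of each value of a.
    pos = np.searchsorted(sorted_b, a_sorted, side="left")

    # For repeated values in a, shift the k-th repeat to the k-th occurrence.
    run_start = np.searchsorted(a_sorted, a_sorted, side="left")
    pos = pos + (np.arange(len(a_sorted)) - run_start)

    # Discard positions past the end, then positions whose value does not match
    # (these correspond to values of a that are missing from b).
    in_range = pos < len(sorted_b)
    pos, vals = pos[in_range], a_sorted[in_range]
    hit = sorted_b[pos] == vals

    # Map positions in the sorted array back to indices in the original b.
    return np.delete(b, order[pos[hit]])

b = np.array([4, 1, 2, 5, 2, 3])
a = np.array([2, 1, 3])
print(delete_first_occurrences(b, a))
```

The sort makes this O((len(a) + len(b)) log len(b)) in time, so for small inputs the broadcasted version may still be faster; timing both on representative data is the safest way to choose.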