John Crow - 4 months ago
Python Question

Efficient way to delete elements in one numpy array from another

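What is the best way to delete the elements of one numpy array from another? Essentially I'm after an operation that removes from one array a single occurrence of each element of the other (see the example below), where the order of the arrays doesn't matter.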

import numpy as np
a = np.array([2,1,3])
print(a)
b = np.array([4,1,2,5,2,3])
b = np.delete(b, a) # doesn't work as desired: a is treated as indices, giving [4 2 3]
print(b) # want [4,5,2]

Iterating over the elements in a Python loop is very slow for large arrays.


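You can use np.argmax to find the index of the first True element along an axis of a boolean array. Comparing b against a[:, np.newaxis] broadcasts to a boolean array of shape (len(a), len(b)), and taking argmax along axis 1 gives, for each element of a, the index of its first match in b. Those indices can then be passed to np.delete: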

>>> import numpy as np
>>> a = np.array([2,1,3])
>>> b = np.array([4,1,2,5,2,3])
>>> np.delete(b, np.argmax(b == a[:, np.newaxis], axis=1))
array([4, 5, 2])
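
To make the one-liner easier to follow, here is a rough step-by-step sketch of the same idea. The intermediate names (mask, first_match, has_match) are only for illustration. The has_match guard is an addition that is not in the one-liner above: argmax returns 0 for a row with no True at all, so a value of a that never occurs in b would otherwise delete b[0] by mistake.

```python
import numpy as np

a = np.array([2, 1, 3])
b = np.array([4, 1, 2, 5, 2, 3])

# a[:, np.newaxis] has shape (3, 1); comparing it with b (shape (6,))
# broadcasts to shape (3, 6), where mask[i, j] is True when b[j] == a[i].
mask = b == a[:, np.newaxis]

# For each element of a, the index in b of its first match.
first_match = np.argmax(mask, axis=1)

# Added guard (not part of the one-liner): skip rows with no match,
# because argmax reports index 0 for an all-False row.
has_match = mask.any(axis=1)

result = np.delete(b, first_match[has_match])
print(result)  # [4 5 2]
```

Note that if a contains the same value twice, both rows point at the same index in b, so only one copy is removed.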

Of course, as with many NumPy vectorized operations, the speed comes at the cost of allocating a temporary boolean array of len(a) * len(b) elements, so for large inputs this may not be appropriate.
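
If that temporary array is too large, one possible alternative is a sort-based approach that needs only memory proportional to len(a) + len(b). This is a different technique from the broadcast above, offered as a sketch rather than a tested recipe: remove_one_each is a hypothetical helper name, not a NumPy function, and it assumes 1-D arrays of sortable values and a NumPy version that accepts kind="stable" in np.argsort. It removes the first occurrences, ignores values of a that are missing from b, and removes as many copies as a asks for (up to what b has).

```python
import numpy as np

def remove_one_each(b, a):
    """Remove from b one occurrence per element of a, keeping b's order.

    Hypothetical helper for illustration; not part of NumPy.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.size == 0 or b.size == 0:
        return b.copy()

    # How many copies of each distinct value should be removed.
    a_vals, a_counts = np.unique(a, return_counts=True)

    # Stable sort so equal values keep their original relative order.
    order = np.argsort(b, kind="stable")
    sorted_b = b[order]

    # rank[k] = 0 for the first occurrence of a value, 1 for the second, ...
    run_start = np.searchsorted(sorted_b, sorted_b, side="left")
    rank = np.arange(b.size) - run_start

    # Look up each element of sorted_b among the values to remove.
    pos = np.minimum(np.searchsorted(a_vals, sorted_b), a_vals.size - 1)
    found = a_vals[pos] == sorted_b
    to_remove = np.where(found, a_counts[pos], 0)

    # Drop an element if its occurrence rank is below the removal count.
    keep = np.ones(b.size, dtype=bool)
    keep[order[rank < to_remove]] = False
    return b[keep]

print(remove_one_each(np.array([4, 1, 2, 5, 2, 3]), np.array([2, 1, 3])))  # [4 5 2]
```

The sorting makes this O(n log n) instead of O(len(a) * len(b)), so for small inputs the broadcast version may well be faster in practice.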