Octoplus - 4 months ago
Python Question

Find the row indexes of several values in a numpy array

I have an array X:

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])


I want to find the row index in X of each of several rows:

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])


For this example I would like a result like:

[0,3,4]


I have code that does this, but I think it is overly complicated:

import numpy as np

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])

result = []

for s in searched_values:
    # Row j of X matches s when every column of (X - s) is zero.
    idx = np.argwhere([np.all((X-s)==0, axis=1)])[0][1]
    result.append(idx)

print(result)


I found this answer for a similar question but it works only for 1d arrays.

Is there a way to do what I want in a simpler way?

Answer

One approach would be to use NumPy broadcasting, like so -

np.where((X==searched_values[:,None]).all(-1))[1]

Sample run -

In [47]: X
Out[47]: 
array([[4, 2],
       [9, 3],
       [8, 5],
       [3, 3],
       [5, 6]])

In [48]: searched_values
Out[48]: 
array([[4, 2],
       [3, 3],
       [5, 6]])

In [49]: np.where((X==searched_values[:,None]).all(-1))[1]
Out[49]: array([0, 3, 4])
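
To make the broadcasting step easier to follow, here is a small sketch that splits the one-liner into its intermediate arrays. The names eq, match, search_idx and row_idx are only illustrative and are not part of the one-liner above.

```python
import numpy as np

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])

# searched_values[:, None] has shape (3, 1, 2). Comparing it with X, shape (5, 2),
# broadcasts to a boolean array of shape (3, 5, 2).
eq = X == searched_values[:, None]

# Reduce over the last axis: match[i, j] is True when searched_values[i]
# equals X[j] in every column.
match = eq.all(-1)

# np.where returns one index array per axis:
# positions in searched_values, then positions in X.
search_idx, row_idx = np.where(match)

print(eq.shape, match.shape)   # (3, 5, 2) (3, 5)
print(row_idx)                 # [0 3 4]
```

The intermediate array eq holds len(searched_values) * len(X) * number_of_columns booleans, which is why this approach can get expensive in memory for large inputs.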

A more memory-efficient approach, assuming the arrays hold non-negative integers, would be to convert each row to its linear index equivalent and then use np.in1d (np.isin in recent NumPy releases), like so -

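# Size the grid from both arrays so a searched value larger than X.max(0) does not raise.
dims = np.maximum(X.max(0), searched_values.max(0)) + 1
out = np.where(np.in1d(np.ravel_multi_index(X.T, dims),
                       np.ravel_multi_index(searched_values.T, dims)))[0]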
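
As a rough illustration of what "linear index equivalents" means here, the sketch below prints the intermediate values for the sample data. It assumes non-negative integer entries, uses np.isin in place of np.in1d, and the names X1D, S1D and mask are just for illustration.

```python
import numpy as np

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])

# Assumption: all entries are non-negative integers.
# Grid size per column, taken over both arrays: here (10, 7).
dims = np.maximum(X.max(0), searched_values.max(0)) + 1

# Each row [a, b] is mapped to the single number a * dims[1] + b.
X1D = np.ravel_multi_index(X.T, dims)                 # [30 66 61 24 41]
S1D = np.ravel_multi_index(searched_values.T, dims)   # [30 24 41]

# Membership test on the 1D codes instead of a row-by-row comparison.
mask = np.isin(X1D, S1D)
out = np.where(mask)[0]

print(X1D, S1D)
print(out)   # [0 3 4]
```

Note that out lists the matching rows in the order they appear in X, not in the order of searched_values; for the sample data the two orders happen to coincide.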

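Another memory-efficient approach, based on the same idea of converting rows to linear index equivalents but using np.searchsorted, would be like so. It returns the indices in the order of searched_values and assumes every searched row is present in X -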

dims = np.maximum(X.max(0), searched_values.max(0)) + 1
X1D = np.ravel_multi_index(X.T, dims)
searched_values1D = np.ravel_multi_index(searched_values.T, dims)
sidx = X1D.argsort()
# Assumes every searched row is present in X.
out = sidx[np.searchsorted(X1D, searched_values1D, sorter=sidx)]
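
If some searched rows might be missing from X, the searchsorted result needs an extra check. The following is only a sketch under that assumption: find_rows is a hypothetical helper name, the -1 marker for "not found" is my own choice, and the inputs are again assumed to be non-negative integers.

```python
import numpy as np

def find_rows(X, searched_values):
    """Hypothetical helper: index in X of each searched row, or -1 if absent."""
    dims = np.maximum(X.max(0), searched_values.max(0)) + 1
    X1D = np.ravel_multi_index(X.T, dims)
    S1D = np.ravel_multi_index(searched_values.T, dims)

    sidx = X1D.argsort()
    pos = np.searchsorted(X1D, S1D, sorter=sidx)

    # searchsorted returns len(X1D) for keys larger than every element,
    # so clip before indexing into sidx.
    pos = np.clip(pos, 0, len(X1D) - 1)
    candidates = sidx[pos]

    # A candidate is a real match only if its code equals the searched code.
    found = X1D[candidates] == S1D
    return np.where(found, candidates, -1)

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

queries = np.array([[5, 6],
                    [4, 2],
                    [7, 7],   # not present in X
                    [3, 3]])

print(find_rows(X, queries))   # [ 4  0 -1  3]
```

If X contains duplicate rows, this returns the index of one of them; which one is not guaranteed with the default sort.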