acdr - 1 year ago 147
Python Question

# Efficiently find indices of all values in an array

I have a very large array, consisting of integers between 0 and N, where each value occurs at least once.

I'd like to know, for each value k, all the indices in my array where the array's value equals k.

For example:

``````arr = np.array([0,1,2,3,2,1,0])
desired_output = {
0: np.array([0,6]),
1: np.array([1,5]),
2: np.array([2,4]),
3: np.array([3]),
}
``````

Right now I am accomplishing this with a loop over
`range(N+1)`
, and calling
`np.where`
N times.

``````indices = {}
for value in range(max(arr)+1):
indices[value] = np.where(arr == value)[0]
``````

This loop is by far the slowest part of my code. (Both the
`arr==value`
evaluation and the
`np.where`
call take up significant chunks of time.) Is there a more efficient way to do this?

I also tried playing around with
`np.unique(arr, return_index=True)`
but that only tells me the very first index, rather than all of them.

A pythonic way is using `collections.defaultdict()`:

``````>>> from collections import defaultdict
>>>
>>> d = defaultdict(list)
>>>
>>> for i, j in enumerate(arr):
...     d[j].append(i)
...
>>> d
defaultdict(<type 'list'>, {0: [0, 6], 1: [1, 5], 2: [2, 4], 3: [3]})
``````

And here is a Numpythonic way using a dictionary comprehension and `numpy.where()`:

``````>>> {i: np.where(arr == i)[0] for i in np.unique(arr)}
{0: array([0, 6]), 1: array([1, 5]), 2: array([2, 4]), 3: array([3])}
``````

And here is a pure Numpythonic approach if you don't want to involve the dictionary:

``````>>> uniq = np.unique(arr)
>>> args, indices = np.where((np.tile(arr, len(uniq)).reshape(len(uniq), len(arr)) == np.vstack(uniq)))
>>> np.split(indices, np.where(np.diff(args))[0] + 1)
[array([0, 6]), array([1, 5]), array([2, 4]), array([3])]
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download