Imanol Luengo - 1 year ago 153
Python Question

# Numpy finding element index in another array

I have an array/set with unique positive integers, i.e.

``````>>> unique = np.unique(np.random.choice(100, 4, replace=False))
``````

And an array containing multiple elements sampled from this previous array, such as

``````>>> A = np.random.choice(unique, 100)
``````

I want to map the values of the array
`A`
to the position of which those values occur in
`unique`
.

So far the best solution I found is through a mapping array:

``````>>> table = np.zeros(unique.max()+1, unique.dtype)
>>> table[unique] = np.arange(unique.size)
``````

The above assigns to each element the index on the array, and thus, can be used later to map
`A`

``````>>> table[A]
array([2, 2, 3, 3, 3, 3, 1, 1, 1, 0, 2, 0, 1, 0, 2, 1, 0, 0, 2, 3, 0, 0, 0,
0, 3, 3, 2, 1, 0, 0, 0, 2, 1, 0, 3, 0, 1, 3, 0, 1, 2, 3, 3, 3, 3, 1,
3, 0, 1, 2, 0, 0, 2, 3, 1, 0, 3, 2, 3, 3, 3, 1, 1, 2, 0, 0, 2, 0, 2,
3, 1, 1, 3, 3, 2, 1, 2, 0, 2, 1, 0, 1, 2, 0, 2, 0, 1, 3, 0, 2, 0, 1,
3, 2, 2, 1, 3, 0, 3, 3], dtype=int32)
``````

Which already gives me the proper solution. However, if the unique numbers in
`unique`
are very sparse and large, this approach implies creating a very large
`table`
array just to store a few numbers for later mapping.

Is there any better solution?

NOTE: both
`A`
and
`unique`
are sample arrays, not real arrays. So the question is not how to generate positional indexes, it is just how to efficiently map elements of
`A`
to indexes in
`unique`
, the pseudocode of what I'd like to speedup in numpy is as follows,

``````B = np.zeros_like(A)
for i in range(A.size):
B[i] = unique.index(A[i])
``````

(assuming
`unique`
is a list in the above pseudocode).

The table approach described in your question is the best option when `unique` if pretty dense, but `unique.searchsorted(A)` should produce the same result and doesn't require `unique` to be dense. `searchsorted` is great with ints, if anyone is trying to do this kind of thing with floats which have precision limitations, consider something like this.