Imanol Luengo - 6 months ago
Python Question

Numpy finding element index in another array

I have an array/set with unique positive integers, i.e.

>>> unique = np.unique(np.random.choice(100, 4, replace=False))


And an array containing multiple elements sampled from this previous array, such as

>>> A = np.random.choice(unique, 100)


I want to map the values of the array A to the positions at which those values occur in unique.

So far the best solution I found is through a mapping array:

>>> table = np.zeros(unique.max()+1, unique.dtype)
>>> table[unique] = np.arange(unique.size)


The above assigns to each element of unique its index in that array, so table can later be used to map A through advanced indexing:

>>> table[A]
array([2, 2, 3, 3, 3, 3, 1, 1, 1, 0, 2, 0, 1, 0, 2, 1, 0, 0, 2, 3, 0, 0, 0,
0, 3, 3, 2, 1, 0, 0, 0, 2, 1, 0, 3, 0, 1, 3, 0, 1, 2, 3, 3, 3, 3, 1,
3, 0, 1, 2, 0, 0, 2, 3, 1, 0, 3, 2, 3, 3, 3, 1, 1, 2, 0, 0, 2, 0, 2,
3, 1, 1, 3, 3, 2, 1, 2, 0, 2, 1, 0, 1, 2, 0, 2, 0, 1, 3, 0, 2, 0, 1,
3, 2, 2, 1, 3, 0, 3, 3], dtype=int32)


This already gives me the correct result. However, if the numbers in unique are very sparse and large, this approach means creating a very large table array just to store a few numbers for later mapping.

Is there any better solution?

NOTE: both A and unique are sample arrays, not real arrays. So the question is not how to generate positional indexes, but how to efficiently map elements of A to indexes in unique. The pseudocode of what I'd like to speed up in numpy is as follows:

B = np.zeros_like(A)
for i in range(A.size):
    B[i] = unique.index(A[i])


(assuming unique is a list in the above pseudocode).
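
For reference, the loop above can be made runnable roughly as follows. This is only a slow baseline sketch for checking faster versions against; the seeded generator and the name unique_list are assumptions added for illustration, not part of the original setup.

```python
import numpy as np

# Seeded generator so the sample data is reproducible (an illustrative choice).
rng = np.random.default_rng(0)
unique = np.unique(rng.choice(100, 4, replace=False))
A = rng.choice(unique, 100)

# list.index exists only on Python lists, so convert the array first.
unique_list = unique.tolist()

# Slow reference: look up each value of A one at a time.
B = np.zeros_like(A)
for i in range(A.size):
    B[i] = unique_list.index(A[i])

# The mapping-table approach from the question should give the same result.
table = np.zeros(unique.max() + 1, unique.dtype)
table[unique] = np.arange(unique.size)
assert np.array_equal(table[A], B)
```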

Answer

The table approach described in your question is the best option when unique is pretty dense, but unique.searchsorted(A) should produce the same result and doesn't require unique to be dense. searchsorted is great with ints; if anyone is trying to do this kind of thing with floats, which have precision limitations, consider something like this.
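
For the integer case, the searchsorted suggestion can be sketched as below. The sample values and the seeded generator are assumptions chosen to make unique sparse and large; the sketch relies on unique being sorted (np.unique returns sorted output) and on every value of A being present in unique.

```python
import numpy as np

# Seeded generator so the sample data is reproducible (an illustrative choice).
rng = np.random.default_rng(0)

# Sorted, unique, sparse and large positive integers (illustrative values).
unique = np.array([7, 1_000, 123_456_789, 9_000_000_000], dtype=np.int64)
A = rng.choice(unique, 100)

# Binary search of each value of A in the sorted array; no table of
# size unique.max() + 1 has to be allocated.
B = np.searchsorted(unique, A)

# searchsorted returns insertion points, so the result is a valid index map
# only for values that actually occur in unique; this check confirms that.
assert np.array_equal(unique[B], A)
```

The final check is worth keeping in real code: for a value that is missing from unique, searchsorted still returns a position, and that position can even equal unique.size, which is out of bounds as an index.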