dranobob - 15 days ago
Python Question

How to split an array based on minimum row value using vectorization

I am trying to vectorize the following for loop, which splits an array based on the index of the lowest value in each row. I've looked at this link and have been trying to use the numpy.where function, but so far without success.

For example, if an array has n columns, then all the rows where col[0] has the lowest value are put in one array, all the rows where col[1] has the lowest value are put in another, etc.

Here's the code using a for loop.

import numpy

a = numpy.array([[0., 1., 3.],
                 [0., 1., 3.],
                 [0., 1., 3.],
                 [1., 0., 2.],
                 [1., 0., 2.],
                 [1., 0., 2.],
                 [3., 1., 0.],
                 [3., 1., 0.],
                 [3., 1., 0.]])

result_0 = []
result_1 = []
result_2 = []
for value in a:
    if value[0] <= value[1] and value[0] <= value[2]:
        result_0.append(value)
    elif value[1] <= value[0] and value[1] <= value[2]:
        result_1.append(value)
    else:
        result_2.append(value)

print(result_0)
>>[array([ 0.,  1.,  3.]), array([ 0.,  1.,  3.]), array([ 0.,  1.,  3.])]
print(result_1)
>>[array([ 1.,  0.,  2.]), array([ 1.,  0.,  2.]), array([ 1.,  0.,  2.])]
print(result_2)
>>[array([ 3.,  1.,  0.]), array([ 3.,  1.,  0.]), array([ 3.,  1.,  0.])]

Answer

First, use argsort to see where the lowest value in each row is:

>>> a.argsort(axis=1)

array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [1, 0, 2],
       [1, 0, 2],
       [1, 0, 2],
       [2, 1, 0],
       [2, 1, 0],
       [2, 1, 0]])

Note that the first column of this result holds, for each row, the index of the column with the smallest value.

Now you can build the results:

>>> sortidx = a.argsort(axis=1)
>>> [a[sortidx[:,0] == i] for i in range(a.shape[1])]

[array([[ 0.,  1.,  3.],
        [ 0.,  1.,  3.],
        [ 0.,  1.,  3.]]),
 array([[ 1.,  0.,  2.],
        [ 1.,  0.,  2.],
        [ 1.,  0.,  2.]]),
 array([[ 3.,  1.,  0.],
        [ 3.,  1.,  0.],
        [ 3.,  1.,  0.]])]
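
Since only the position of the minimum matters, the same split can also be sketched with argmin, which returns the index of the first smallest value in each row, so ties go to the lower column index just like the <= chain in the question's loop. The function name split_by_argmin and the extra fourth row are illustrative assumptions, not part of the original post.

```python
import numpy as np

def split_by_argmin(a):
    """Return one sub-array per column, holding the rows whose minimum is in that column."""
    # argmin gives the index of the first minimum in each row,
    # so ties are assigned to the lower column index.
    min_col = a.argmin(axis=1)
    # One boolean mask per column; the only Python loop is over the columns.
    return [a[min_col == i] for i in range(a.shape[1])]

a = np.array([[0., 1., 3.],
              [1., 0., 2.],
              [3., 1., 0.],
              [1., 2., 0.]])   # hypothetical extra row: minimum is in column 2

groups = split_by_argmin(a)
for i, g in enumerate(groups):
    print(i, g.tolist())
```

A row such as [1, 2, 0] is a useful check for any solution of this kind, because its minimum is in column 2 while column 0 holds the middle value.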

This needs only a single loop over the columns, so it is much faster than the row-by-row loop when the number of rows is much larger than the number of columns.
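
If the number of columns is large as well, the remaining loop over columns could in principle be avoided too. The following is only a sketch under that assumption: it labels each row with argmin, groups the rows with a stable sort of those labels, and cuts the result with numpy.split. The function name split_without_column_loop and the sample data are hypothetical.

```python
import numpy as np

def split_without_column_loop(a):
    """Split the rows of `a` by the column index of each row's minimum, without a Python loop."""
    min_col = a.argmin(axis=1)                           # one label per row
    order = np.argsort(min_col, kind="stable")           # group rows by label, keep original order
    counts = np.bincount(min_col, minlength=a.shape[1])  # rows per column, zeros included
    boundaries = np.cumsum(counts)[:-1]                  # cut points between consecutive groups
    return np.split(a[order], boundaries)

a = np.array([[3., 1., 0.],
              [0., 1., 3.],
              [1., 2., 0.],
              [0., 5., 9.]])

parts = split_without_column_loop(a)
print([p.tolist() for p in parts])
```

Columns that never hold a row minimum come back as empty arrays, so the result always has one entry per column.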
