dranobob - 1 month ago

Python Question

I am trying to figure out how to vectorize the following for loop, which splits an array by the index of the lowest value in each row. I've looked at this link and have been trying to use the numpy.where function, but so far without success.

For example, if an array has *n* columns, then all the rows where *col[0]* has the lowest value are put in one array, all the rows where *col[1]* has the lowest value are put in another, etc.

Here's the code using a for loop.

```
import numpy

a = numpy.array([[0., 1., 3.],
                 [0., 1., 3.],
                 [0., 1., 3.],
                 [1., 0., 2.],
                 [1., 0., 2.],
                 [1., 0., 2.],
                 [3., 1., 0.],
                 [3., 1., 0.],
                 [3., 1., 0.]])

result_0 = []
result_1 = []
result_2 = []

# Put each row in the list matching the column that holds its lowest value.
for value in a:
    if value[0] <= value[1] and value[0] <= value[2]:
        result_0.append(value)
    elif value[1] <= value[0] and value[1] <= value[2]:
        result_1.append(value)
    else:
        result_2.append(value)

print(result_0)
# >> [array([ 0.,  1.,  3.]), array([ 0.,  1.,  3.]), array([ 0.,  1.,  3.])]

print(result_1)
# >> [array([ 1.,  0.,  2.]), array([ 1.,  0.,  2.]), array([ 1.,  0.,  2.])]

print(result_2)
# >> [array([ 3.,  1.,  0.]), array([ 3.,  1.,  0.]), array([ 3.,  1.,  0.])]
```

Answer

First, use `argsort` to see where the lowest value in each row is:

```
>>> a.argsort(axis=1)
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [1, 0, 2],
       [1, 0, 2],
       [1, 0, 2],
       [2, 1, 0],
       [2, 1, 0],
       [2, 1, 0]])
```

Note that each row of this output lists the column indices in order of increasing value, so the first entry in a row is the index of the column holding that row's smallest value.

Now you can build the results by comparing that first entry against each column index:

```
>>> sortidx = a.argsort(axis=1)
>>> [a[sortidx[:, 0] == i] for i in range(a.shape[1])]
[array([[ 0.,  1.,  3.],
        [ 0.,  1.,  3.],
        [ 0.,  1.,  3.]]),
 array([[ 1.,  0.,  2.],
        [ 1.,  0.,  2.],
        [ 1.,  0.,  2.]]),
 array([[ 3.,  1.,  0.],
        [ 3.,  1.,  0.],
        [ 3.,  1.,  0.]])]
```
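Since only the position of each row's minimum is needed here, a full sort is not strictly required. Below is a minimal sketch of the same split using `argmin` in place of `argsort` (a swapped-in call, not the one used above). The helper name `split_by_argmin` and the two extra rows at the end of the array are made up for illustration. `argmin` returns the first occurrence of the minimum, so ties go to the lowest column index, which should match the `<=` comparisons in the question's loop.

```python
import numpy


def split_by_argmin(a):
    """Hypothetical helper: group the rows of a 2-D array by the column
    that holds each row's minimum value."""
    # Index of the minimum in every row; on ties the first column wins.
    lowest = a.argmin(axis=1)
    # One boolean mask per column selects the rows belonging to that group.
    return [a[lowest == i] for i in range(a.shape[1])]


a = numpy.array([[0., 1., 3.],
                 [1., 0., 2.],
                 [3., 1., 0.],
                 [1., 2., 0.],   # illustrative row: minimum in column 2
                 [2., 2., 5.]])  # illustrative row: tie between columns 0 and 1

groups = split_by_argmin(a)
for i, group in enumerate(groups):
    print(i, group.tolist())
```

With this input, the tied row `[2., 2., 5.]` lands in group 0 and `[1., 2., 0.]` lands in group 2, the same places the original for loop would put them.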

So the only Python-level loop left is the one over the columns, which gives a large speedup when the number of rows is much larger than the number of columns.
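To get a rough feel for the difference, one could time a row-by-row loop against a per-column mask split on a tall, narrow array. This is only a sketch: the function names `split_loop` and `split_masks`, the seed, and the array size are all arbitrary choices for illustration, and the timings will vary by machine, so no numbers are claimed here.

```python
import timeit

import numpy


def split_loop(a):
    """Row-by-row split; generalises the question's if/elif chain to n columns."""
    groups = [[] for _ in range(a.shape[1])]
    for row in a:
        values = row.tolist()
        # list.index returns the first match, so ties go to the lowest column.
        groups[values.index(min(values))].append(row)
    return groups


def split_masks(a):
    """One boolean mask per column, built from the first argsort index of each row."""
    lowest = a.argsort(axis=1)[:, 0]
    return [a[lowest == i] for i in range(a.shape[1])]


rng = numpy.random.default_rng(0)  # arbitrary seed
a = rng.random((20000, 3))         # arbitrary size: many rows, few columns

loop_groups = split_loop(a)
mask_groups = split_masks(a)

# Both versions should produce the same rows, in the same order, per group.
same = all(
    numpy.array_equal(numpy.array(lg).reshape(-1, a.shape[1]), mg)
    for lg, mg in zip(loop_groups, mask_groups)
)
print("same groups:", same)

print("loop :", timeit.timeit(lambda: split_loop(a), number=3))
print("masks:", timeit.timeit(lambda: split_masks(a), number=3))
```

The random floats here make exact ties vanishingly unlikely; if the real data can contain ties, note that `argsort` with its default sort kind does not promise which tied column comes first.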