akshay akshay - 3 months ago 11
Python Question

Order of repetition per row and column in Python

I have been trying to figure out the order of repetition per row and just couldn't do it. Let's consider an ndarray of shape

(2, 11, 10)


import numpy as np

a = np.array([
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
[0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
[1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
[0, 1, 1, 1, 0, 0, 1, 1, 0, 1],
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 1, 0, 0, 1, 1],
[0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
],
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
[0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
[1, 1, 1, 0, 0, 1, 1, 1, 0, 1],
[1, 0, 0, 1, 1, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
]
])


What I want is the order of every 1 per row, based on its column. The order starts at 0 at the first row in which a 1 is found; if the next row also contains a 1, that 1 gets order 1, unless a 1 already appeared at the same column index in a previous row, in which case it is ignored. For example, let's consider these rows:

    0  1  2  3  4  5  6  7  8  9    -> column index
0  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  -> no 1's, no order here
1  [1, 1, 0, 0, 0, 1, 1, 1, 0, 0],  -> order starts at 0
2  [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],  -> order starts at 1


At row index 0 there are no 1s, so nothing happens. At row index 1 there are 1s at column indices [0,1,5,6,7]; these all get order 0, so the output should be

column  order
0       0
1       0
2       -
3       -
4       -
5       0
6       0
7       0
8       -
9       -


At row index 2 there are 1s at column indices [1,5,8], whose order is 1. Columns 1 and 5 are ignored because they already have order 0; column 8, which has no order yet, gets 1. The final output should be

column  order
0       0
1       0
2       -
3       -
4       -
5       0
6       0
7       0
8       1
9       -


I have tried using NumPy's np.where to get the index values, something like this:

index = np.asarray(np.where(a == 1)).T


I have no idea what to do next. Can anyone please help me?

Answer

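The desired result (clarified in the comments) is the row index of the first 1 in each column. Because argmax returns the index of the first occurrence of the maximum, and the maximum of a 0/1 column that contains a 1 is 1, the result can be found with a.argmax(axis=1), provided every column is guaranteed to contain at least one 1: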

In [77]: a
Out[77]: 
array([[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
        [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
        [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
        [1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
        [0, 1, 1, 1, 0, 0, 1, 1, 0, 1],
        [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0, 1, 0, 0, 1, 1],
        [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]],

       [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
        [0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
        [1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
        [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
        [1, 1, 1, 0, 0, 1, 1, 1, 0, 1],
        [1, 0, 0, 1, 1, 0, 1, 0, 1, 0],
        [1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
        [1, 1, 1, 0, 0, 1, 1, 1, 0, 1]]])

In [78]: a.argmax(axis=1)
Out[78]: 
array([[1, 1, 4, 4, 3, 1, 1, 1, 2, 7],
       [3, 1, 5, 2, 1, 3, 3, 2, 1, 1]])

This method returns 0 for any column that is all zeros. In the example array a, the first row of both 2-D arrays is all zeros, so no column can have its first 1 at row index 0. If that is always the case, a 0 in a.argmax(axis=1) reliably identifies a column that is all zeros.
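The ambiguity is easiest to see when row 0 is allowed to contain a 1. The following is a minimal sketch on a made-up 3x3 array (x is hypothetical, not taken from the question): argmax reports 0 both for a column whose first 1 is in row 0 and for a column with no 1 at all, so a separate any() check is needed to tell them apart.

```python
import numpy as np

# Hypothetical 2-D example, not from the question.
x = np.array([[1, 0, 0],
              [0, 0, 1],
              [1, 0, 1]])

# Row index of the first maximum in each column.
first = x.argmax(axis=0)

# True where the column contains at least one nonzero entry.
has_one = x.any(axis=0)

print(first)    # column 0 and column 1 both report 0
print(has_one)  # only column 1 is actually all zeros
```

Here column 0 has a 1 in row 0 and column 1 has no 1 at all, yet both produce 0 in first; has_one is what distinguishes them.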

Alternatively, with some additional processing, -1 can be put in the result when a column is all zeros. For example,

In [135]: b
Out[135]: 
array([[[1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0],
        [0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
        [0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1],
        [1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]],

       [[1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1],
        [1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1],
        [0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1]]])

In [136]: result = b.argmax(axis=1)

In [137]: result[(b == 0).all(axis=1)] = -1

In [138]: result
Out[138]: 
array([[ 0,  1,  2, -1,  0,  3,  1,  1,  0,  0,  0,  2],
       [ 0,  1, -1,  0,  0,  3,  0,  2,  0,  0,  1,  0]])
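The same idea can be wrapped in a small helper that does not modify its result in place. This is only a sketch: the name first_one_index and its fill parameter are made up for illustration, and it assumes that any nonzero entry counts as a 1. It uses np.where to pick either the argmax result or the fill value for each column.

```python
import numpy as np

def first_one_index(arr, axis=1, fill=-1):
    """Index of the first nonzero entry along `axis`, or `fill` if there is none.

    Hypothetical helper, not part of NumPy.
    """
    mask = np.asarray(arr) != 0
    has_one = mask.any(axis=axis)      # does the column contain a 1 at all?
    first = mask.argmax(axis=axis)     # first True along the axis (0 if none)
    return np.where(has_one, first, fill)

# First 2-D block of the array b shown above.
b0 = np.array([[1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0],
               [0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
               [0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1],
               [1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]])

# b0 is 2-D, so the rows are axis 0 here.
result0 = first_one_index(b0, axis=0)
print(result0)
```

For the full 3-D array b, first_one_index(b) with the default axis=1 should give the same values as the result shown in Out[138].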