What's the right numpy syntax to compare one column against others in a 2d ndarray?
After reading some docs on array broadcasting, I am still not quite sure what the correct way to do this is.
Example: Suppose I have a 2d array of goals scored by each player (row) in each game (column).
import numpy as np

# goals[i, j] = number of goals scored by player i in game j (NaN if the player did not play)
# column = game
goals = np.array([[np.nan, 0, 1],       # row = player
                  [1,      2, 0],
                  [0,      0, np.nan],
                  [np.nan, 1, 1],
                  [0,      0, 1]])
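I want to know which players scored more goals in the last game than in every earlier game they played, ignoring nan entries; the result should be True for exactly those players.
The direct comparison fails:
goals[:,2] > goals[:,:2]
ValueError: operands could not be broadcast together with shapes (5,) (5,2)
The shapes (5,) and (5,2) do not line up, so I added the missing axis by hand with np.newaxis, which works: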
with np.errstate(invalid='ignore'):
    # A game counts as "beaten" if it was not played (NaN) or the last game had more goals
    personalBest = ( np.isnan(goals[:,:2]) |
                     (goals[:,2][:,np.newaxis] > goals[:,:2])
                   ).all(axis=1)
print(personalBest) # returns desired solution
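The shape mismatch above can be reproduced in isolation. The following is a minimal sketch, relying only on NumPy's rule that shapes are aligned starting from the trailing axis; the names last, earlier, broadcast_failed and compared are illustrative and not part of the original post.

```python
import numpy as np

goals = np.array([[np.nan, 0, 1],
                  [1,      2, 0],
                  [0,      0, np.nan],
                  [np.nan, 1, 1],
                  [0,      0, 1]])

last = goals[:, 2]       # shape (5,)
earlier = goals[:, :2]   # shape (5, 2)

# Shapes are aligned from the right, so the 5 in (5,) is matched
# against the trailing 2 in (5, 2), which cannot be broadcast.
try:
    last > earlier
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# A trailing singleton axis turns (5,) into (5, 1), which stretches to (5, 2).
with np.errstate(invalid='ignore'):
    compared = last[:, np.newaxis] > earlier

print(broadcast_failed)
print(compared.shape)
```

The errstate context only silences the warning that some NumPy versions emit when comparing against NaN; it does not change the result.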
You could do something like this:
np.flatnonzero((goals[:,None,-1] > goals[:,:-1]).any(1))
Let's go through it in steps.
Step #1: We introduce a new axis on the sliced last column so that it stays 2D, with the last axis being a singleton dimension. The idea is to compare each of its elements against all the other elements in the same row:
In [3]: goals[:,None,-1]
Out[3]:
array([[ 1.],
       [ 0.],
       [ nan],
       [ 1.],
       [ 1.]])
In [4]: goals[:,None,-1].shape # Check the shapes for broadcasting alignment
Out[4]: (5, 1)
In [5]: goals.shape
Out[5]: (5, 3)
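There are several spellings that produce the same (5, 1) column. The sketch below compares them and checks the broadcast shape against the remaining columns; the variable names a, b, c, d and result_shape are only illustrative.

```python
import numpy as np

goals = np.array([[np.nan, 0, 1],
                  [1,      2, 0],
                  [0,      0, np.nan],
                  [np.nan, 1, 1],
                  [0,      0, 1]])

a = goals[:, None, -1]            # None is an alias for np.newaxis
b = goals[:, -1][:, np.newaxis]   # slice first, then add the axis
c = goals[:, -1:]                 # a length-1 slice keeps the column axis
d = goals[:, -1].reshape(-1, 1)   # explicit reshape to a column

shapes = [x.shape for x in (a, b, c, d)]
print(shapes)

# Shape that results from broadcasting the column against the other columns
result_shape = np.broadcast(a, goals[:, :-1]).shape
print(result_shape)
```

Which spelling to use is mostly a matter of taste; the length-1 slice avoids creating the axis explicitly, while None or np.newaxis makes the intent visible at the point of comparison.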
Step #2: Next, we compare that column against all the other columns of the array, skipping the last column itself since it is the one we sliced out in the previous step:
In [7]: goals[:,None,-1] > goals[:,:-1]
Out[7]:
array([[False,  True],
       [False, False],
       [False, False],
       [False, False],
       [ True,  True]], dtype=bool)
Step #3: Then, we check whether there is ANY match along each row:
In [8]: (goals[:,None,-1] > goals[:,:-1]).any(axis=1)
Out[8]: array([ True, False, False, False,  True], dtype=bool)
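Note that .any(axis=1) asks whether the last game beat at least one earlier game, whereas the NaN-aware .all(axis=1) in the question asks whether it beat every earlier game that was actually played. On the example data both give the same answer, but in general they can differ. A small sketch of the difference, using a hypothetical array demo that is not part of the original data:

```python
import numpy as np

# Hypothetical data: two players, three games.
demo = np.array([[0.0,    2.0,    1.0],    # last game beats game 0 but not game 1
                 [np.nan, np.nan, 1.0]])   # no earlier games played

with np.errstate(invalid='ignore'):
    # Comparisons against NaN evaluate to False
    greater = demo[:, None, -1] > demo[:, :-1]

# At least one earlier game was beaten
beats_any = greater.any(axis=1)

# Every earlier game was either not played (NaN) or beaten
beats_all = (np.isnan(demo[:, :-1]) | greater).all(axis=1)

print(beats_any)   # [ True False]
print(beats_all)   # [False  True]
```

If the intent is a personal best, as in the question, the masked .all form is probably the one to use; the .any form answers a weaker question.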
Step #4: Finally, we get the indices of the matching rows with np.flatnonzero:
In [9]: np.flatnonzero((goals[:,None,-1] > goals[:,:-1]).any(axis=1))
Out[9]: array([0, 4])
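Putting the four steps together, a minimal self-contained sketch might look like the following. The function name rows_last_exceeds_any is hypothetical and chosen here only for illustration; the np.errstate context is an addition carried over from the question to silence NaN comparison warnings on NumPy versions that emit them.

```python
import numpy as np

def rows_last_exceeds_any(arr):
    """Return indices of rows whose last entry is greater than
    at least one earlier entry in the same row."""
    last = arr[:, None, -1]    # step 1: last column as shape (n_rows, 1)
    earlier = arr[:, :-1]      # all columns except the last
    with np.errstate(invalid='ignore'):
        mask = (last > earlier).any(axis=1)   # steps 2 and 3
    return np.flatnonzero(mask)               # step 4

goals = np.array([[np.nan, 0, 1],
                  [1,      2, 0],
                  [0,      0, np.nan],
                  [np.nan, 1, 1],
                  [0,      0, 1]])

result = rows_last_exceeds_any(goals)
print(result)   # [0 4]
```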