piRSquared - 9 months ago 70

Python Question

Consider the array `a`:

```
np.random.seed([3,1415])
a = np.random.randint(0, 10, (10, 2))
a

array([[0, 2],
       [7, 3],
       [8, 7],
       [0, 6],
       [8, 6],
       [0, 2],
       [0, 4],
       [9, 7],
       [3, 2],
       [4, 3]])
```

What is a vectorized way to get the cumulative argmax?

```
array([[0, 0],  <-- both start off as max position
       [1, 1],  <-- 7 > 0 so 1st col = 1, 3 > 2 so 2nd col = 1
       [2, 2],  <-- 8 > 7 so 1st col = 2, 7 > 3 so 2nd col = 2
       [2, 2],  <-- 0 < 8 so 1st col stays the same, 6 < 7 so 2nd col stays the same
       [2, 2],
       [2, 2],
       [2, 2],
       [7, 2],  <-- 9 is the new max of 1st col, so its argmax is now 7
       [7, 2],
       [7, 2]])
```

Here is a non-vectorized way to do it. Notice that as the window expands, argmax applies to the growing window.

```
pd.DataFrame(a).expanding().apply(np.argmax).astype(int).values

array([[0, 0],
       [1, 1],
       [2, 2],
       [2, 2],
       [2, 2],
       [2, 2],
       [2, 2],
       [7, 2],
       [7, 2],
       [7, 2]])
```
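To pin down what "cumulative argmax" means here, the expanding-window definition can also be spelled out in plain NumPy. This is only a slow reference sketch for checking faster versions against. The function name `cumargmax_loop` is made up for illustration, and the array is typed out literally rather than regenerated from the seed.

```python
import numpy as np

# Same values as the array `a` in the question, written out literally.
a = np.array([[0, 2], [7, 3], [8, 7], [0, 6], [8, 6],
              [0, 2], [0, 4], [9, 7], [3, 2], [4, 3]])

def cumargmax_loop(arr):
    """Reference (slow) cumulative argmax: one argmax per expanding window."""
    # Row i holds the column-wise argmax of arr[0..i]; ties keep the first index.
    return np.array([arr[:i + 1].argmax(axis=0) for i in range(len(arr))])

expected = cumargmax_loop(a)
print(expected)
```

Because `argmax` returns the first occurrence of the maximum, a later value that only ties the current maximum does not move the index.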

Answer Source

I would write a function that computes the cumulative argmax of a 1-D array and then apply it to every column. This is the code:

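```
import numpy as np
np.random.seed([3,1415])
a = np.random.randint(0, 10, (10, 2))

def cumargmax(v):
    # Binary function on indices: keep i unless v[j] is strictly larger than v[i].
    uargmax = np.frompyfunc(lambda i, j: j if v[j] > v[i] else i, 2, 1)
    # Accumulate over the index positions. `object` replaces the `np.object`
    # alias, which was removed in NumPy 1.24.
    return uargmax.accumulate(np.arange(0, len(v)), 0, dtype=object).astype(v.dtype)

# Apply the 1-D function to each column.
np.apply_along_axis(cumargmax, 0, a)
```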
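The `accumulate` call above is a running fold over index positions: at each step it keeps the current index unless the value at the next index is strictly larger. A minimal pure-Python sketch of the same fold, using `itertools.accumulate` on the first column of the question's array, may make that easier to see. The helper name `pick_larger` is only illustrative.

```python
from itertools import accumulate

# First column of the question's array, as a plain list.
v = [0, 7, 8, 0, 8, 0, 0, 9, 3, 4]

def pick_larger(i, j):
    """Keep index i unless the value at the new index j is strictly larger."""
    return j if v[j] > v[i] else i

# Fold over the index positions 0..len(v)-1, keeping every intermediate result.
running_argmax = list(accumulate(range(len(v)), pick_larger))
print(running_argmax)  # [0, 1, 2, 2, 2, 2, 2, 7, 7, 7]
```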

The reason for converting to `np.object` (plain `object` in current NumPy) and then converting back is a workaround for NumPy 1.9, as mentioned in "generalized cumulative functions in NumPy/SciPy?"
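Note that `np.frompyfunc` still calls a Python function once per element, so it is not vectorized in the performance sense. One possible alternative, offered as a sketch and not as part of the answer above, uses only `np.maximum.accumulate`. It assumes a 2-D numeric array without NaNs, and the name `cumargmax_vectorized` is made up for illustration.

```python
import numpy as np

def cumargmax_vectorized(arr):
    """Column-wise cumulative argmax using only array operations (a sketch)."""
    arr = np.asarray(arr)
    running_max = np.maximum.accumulate(arr, axis=0)
    # A row sets a new maximum when it strictly exceeds the running max so far;
    # the first row always counts as a new maximum.
    is_new_max = np.ones(arr.shape, dtype=bool)
    is_new_max[1:] = arr[1:] > running_max[:-1]
    # Record the row index at each new maximum (0 elsewhere), then carry the
    # most recent one forward with a running maximum over those indices.
    row_index = np.arange(arr.shape[0])[:, None]
    return np.maximum.accumulate(np.where(is_new_max, row_index, 0), axis=0)

# Same values as the array `a` in the question, written out literally.
a = np.array([[0, 2], [7, 3], [8, 7], [0, 6], [8, 6],
              [0, 2], [0, 4], [9, 7], [3, 2], [4, 3]])
result = cumargmax_vectorized(a)
print(result)
```

The strict `>` comparison means ties keep the earlier index, which matches the behaviour of `np.argmax` on each expanding window.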