muon - 1 year ago 145
Python Question

# Sum groups of rows of a NumPy matrix using a list of lists of indices

I want to slice a NumPy array using lists of row indices and apply a function (here, a sum) to each slice. Is it possible to vectorize this, or is there a good non-vectorized way to do it? A vectorized solution would be ideal for large matrices.

```python
import numpy as np

index = [[1, 3], [2, 4, 5]]
a = np.array(
    [[ 3,  4,  6,  3],
     [ 0,  1,  2,  3],
     [ 4,  5,  6,  7],
     [ 8,  9, 10, 11],
     [12, 13, 14, 15],
     [ 1,  1,  4,  5]])
```

I want to sum the rows in each group of row indices in `index`, giving:

```python
np.array([[ 8, 10, 12, 14],
          [17, 19, 24, 27]])
```
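For reference, the non-vectorized version of this can be sketched as a list comprehension over the groups. The helper name `sumrows_loop` is hypothetical and only serves as a baseline to compare vectorized answers against.

```python
import numpy as np

index = [[1, 3], [2, 4, 5]]
a = np.array(
    [[ 3,  4,  6,  3],
     [ 0,  1,  2,  3],
     [ 4,  5,  6,  7],
     [ 8,  9, 10, 11],
     [12, 13, 14, 15],
     [ 1,  1,  4,  5]])

def sumrows_loop(a, index):
    # Hypothetical baseline: gather each group's rows, then sum down the columns
    return np.array([a[rows].sum(axis=0) for rows in index])

result = sumrows_loop(a, index)
print(result)
# [[ 8 10 12 14]
#  [17 19 24 27]]
```

This runs one Python-level iteration per group, so it is likely fine for a handful of groups but may become slow when `index` holds many short lists.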

Approach #1 : Here's an almost* vectorized approach -

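```python
def sumrowsby_index(a, index):
    # Flatten all groups into one array of row indices and record group sizes
    index_arr = np.concatenate(index)
    lens = np.array([len(i) for i in index])
    # Start offset of each group within index_arr
    cut_idx = np.concatenate(([0], lens[:-1].cumsum()))
    # Sum each consecutive run of gathered rows (one run per group)
    return np.add.reduceat(a[index_arr], cut_idx, axis=0)
```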

*Almost, because `lens` is computed with a list comprehension. Since that step only collects lengths and does no numerical work, it shouldn't sway the timings in any big way.
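To make the intermediate arrays concrete, here is a minimal sketch that traces them for the sample input. It assumes the per-group sums are taken with `np.add.reduceat`, which sums consecutive slices that start at the given offsets.

```python
import numpy as np

index = [[1, 3], [2, 4, 5]]
a = np.array(
    [[ 3,  4,  6,  3],
     [ 0,  1,  2,  3],
     [ 4,  5,  6,  7],
     [ 8,  9, 10, 11],
     [12, 13, 14, 15],
     [ 1,  1,  4,  5]])

index_arr = np.concatenate(index)                     # [1 3 2 4 5]
lens = np.array([len(i) for i in index])              # [2 3]
cut_idx = np.concatenate(([0], lens[:-1].cumsum()))   # [0 2]

# Rows of `a` laid out group after group: a[1], a[3], a[2], a[4], a[5]
gathered = a[index_arr]

# Sum gathered[0:2] and gathered[2:] along the row axis
out = np.add.reduceat(gathered, cut_idx, axis=0)
print(out)
# [[ 8 10 12 14]
#  [17 19 24 27]]
```

One caveat: `np.add.reduceat` does not produce a zero row for an empty segment, so this sketch assumes every list in `index` is non-empty.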

Sample run -

```
In [716]: a
Out[716]:
array([[ 3,  4,  6,  3],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [ 1,  1,  4,  5]])

In [717]: index
Out[717]: [[1, 3], [2, 4, 5]]

In [718]: sumrowsby_index(a, index)
Out[718]:
array([[ 8, 10, 12, 14],
       [17, 19, 24, 27]])
```

Approach #2 : We could leverage fast matrix-multiplication with `numpy.dot` to perform those sum-reductions, giving us another method as listed below -

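```python
def sumrowsby_index_v2(a, index):
    lens = np.array([len(i) for i in index])
    # Indicator matrix: one row per group, one column per row of a
    # (default float dtype, so the result is a float array)
    id_ar = np.zeros((len(lens), a.shape[0]))
    c = np.concatenate(index)                       # column = source row index
    r = np.repeat(np.arange(len(index)), lens)      # row = group number
    id_ar[r, c] = 1
    # Each output row is the sum of the rows of a selected by that group
    return id_ar.dot(a)
```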
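As a rough illustration of why the matrix product performs the grouped sum, the sketch below builds the indicator matrix for the sample input and prints it before multiplying. It repeats the same steps as the function above, with the sample data inlined so that it runs on its own.

```python
import numpy as np

index = [[1, 3], [2, 4, 5]]
a = np.array(
    [[ 3,  4,  6,  3],
     [ 0,  1,  2,  3],
     [ 4,  5,  6,  7],
     [ 8,  9, 10, 11],
     [12, 13, 14, 15],
     [ 1,  1,  4,  5]])

lens = np.array([len(i) for i in index])
id_ar = np.zeros((len(index), a.shape[0]))     # shape (2, 6), float64
r = np.repeat(np.arange(len(index)), lens)     # [0 0 1 1 1]
c = np.concatenate(index)                      # [1 3 2 4 5]
id_ar[r, c] = 1
print(id_ar)
# [[0. 1. 0. 1. 0. 0.]
#  [0. 0. 1. 0. 1. 1.]]

# Row g of the product adds up the rows of `a` flagged in row g of id_ar
out = id_ar.dot(a)
print(out)
# [[ 8. 10. 12. 14.]
#  [17. 19. 24. 27.]]
```

Two things to keep in mind with this sketch: the result is a float array because `id_ar` is float, and a row index repeated within one group is counted only once, since the assignment just sets that entry to 1.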