Bananach - 1 month ago
Python Question

# Speed up double for loop in numpy

I currently have the following double loop in my Python code:

``````for i in range(a):
    for j in range(b):
        A[:, i] *= B[j][:, C[i, j]]
``````

(A is a float matrix, B is a list of float matrices, and C is a matrix of integers; by "matrices" I mean m x n np.arrays.

To be precise, the sizes are: A is m x a; B is a list of b matrices, each of size m x l (with l different for each matrix); C is a x b. Here m and a are very large, b is small, and the l's are even smaller than b.)
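For concreteness, here is a small self-contained setup that runs the loop above. The sizes and random data are made up for illustration; in the question m and a are much larger:

``````python
import numpy as np

rng = np.random.default_rng(0)

m, a, b = 50, 20, 4                 # toy sizes; in practice m and a are very large
ls = [2, 3, 2, 3]                   # each B[j] has its own width l_j, smaller than b

A = rng.random((m, a))
B = [rng.random((m, l)) for l in ls]
# C[i, j] must be a valid column index into B[j], i.e. 0 <= C[i, j] < l_j
C = np.stack([rng.integers(0, l, size=a) for l in ls], axis=1)

A0 = A.copy()                       # keep the original A around for comparison
for i in range(a):
    for j in range(b):
        A[:, i] *= B[j][:, C[i, j]]
``````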

I tried to speed it up by doing

``````for j in range(b):
    A[:, :] *= B[j][:, C[:, j]]
``````

but surprisingly (to me) this performed worse.

More precisely, it did improve performance for small values of m and a (the "large" numbers), but from m=7000, a=700 onwards the first approach is roughly twice as fast.

Is there anything else I can do?

Maybe I could parallelize? But I don't really know how.

(I am not committed to either Python 2 or 3)

Here's a vectorized approach, assuming `B` is a list of arrays that all have the same shape -

``````import numpy as np

# Convert B (a list of b arrays, each m x l) to a 3D array of shape (b, m, l)
B_arr = np.asarray(B)

# Use advanced indexing to index into the last axis of B_arr with C,
# then do a product-reduction along the second axis (over j).
# Finally, perform elementwise multiplication with A.
A *= B_arr[np.arange(B_arr.shape[0]), :, C].prod(1).T
``````
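A quick sanity check that this matches the original double loop, using made-up data where all the `B[j]` share the same width `l` (as this answer assumes):

``````python
import numpy as np

rng = np.random.default_rng(1)
m, a, b, l = 30, 10, 4, 3
A = rng.random((m, a))
B = [rng.random((m, l)) for _ in range(b)]
C = rng.integers(0, l, size=(a, b))

# reference: the original double loop
ref = A.copy()
for i in range(a):
    for j in range(b):
        ref[:, i] *= B[j][:, C[i, j]]

# vectorized version from the answer
B_arr = np.asarray(B)                           # shape (b, m, l)
out = A * B_arr[np.arange(b), :, C].prod(1).T   # (a, b, m) -> (a, m) -> (m, a)
``````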

For cases with smaller `a`, we could instead run a loop of length `a`. Also, for more performance, it might be better to store those products in a separate 2D array and perform the elementwise multiplication with `A` only once, after the loop.

Thus, we would have an alternative implementation like so -

``````range_arr = np.arange(B_arr.shape[0])
out = np.empty_like(A)
for i in range(a):
    # product over j of B[j][:, C[i, j]] for column i
    out[:, i] = B_arr[range_arr, :, C[i, :]].prod(0)
A *= out
``````
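This variant can be checked against the double loop the same way, again with made-up equal-width `B` matrices:

``````python
import numpy as np

rng = np.random.default_rng(2)
m, a, b, l = 30, 10, 4, 3
A = rng.random((m, a))
B = [rng.random((m, l)) for _ in range(b)]
C = rng.integers(0, l, size=(a, b))
B_arr = np.asarray(B)               # shape (b, m, l)

# reference: the original double loop
ref = A.copy()
for i in range(a):
    for j in range(b):
        ref[:, i] *= B[j][:, C[i, j]]

# alternative: build the products column by column, multiply once at the end
range_arr = np.arange(B_arr.shape[0])
out = np.empty_like(A)
for i in range(a):
    out[:, i] = B_arr[range_arr, :, C[i, :]].prod(0)
result = A * out
``````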
Source (Stackoverflow)