xolodec - 9 months ago 41

Python Question

I have come across a problem which is to rewrite a piece of code in vectorized form. The code shown below is a simplified illustration of initial problem

`K = 20`

h, w = 15, 20

H, W = 1000-h, 2000-w

q = np.random.randint(0, 20, size=(H, W, K)) # random just for illustration

Q = np.zeros((H+h, W+w, K))

for n in range(H):

for m in range(W):

Q[n:n+h, m:m+w, :] += q[n, m, :]

This code takes long to execute and it seems to me it is rather simple to allow vectorized implementation.

I am aware of numpy's

`s_`

`Q`

`q`

I guess that

`np.add.at`

`IndexError: failed to coerce slice entry of type numpy.ndarray to integer`

for any attempt i make.

Maybe there is some another numpy's magic which I am unaware of and which could help me in my task but it seems extremely difficult to google for it.

Answer Source

Well you are basically summing across sliding windows along the first and second axes, which in signal processing domain is termed as `convolution`

. For two axes that would be `2D convolution`

. Now, Scipy has it implemented as `convolve2d`

and could be used for each slice along the third axis.

Thus, we would have an implementation with it, like so -

```
from scipy.signal import convolve2d
kernel = np.ones((h,w),dtype=int)
m,n,r = q.shape[0]+h-1, q.shape[1]+w-1, q.shape[2]
out = np.empty((m,n,r),dtype=q.dtype)
for i in range(r):
out[...,i] = convolve2d(q[...,i],kernel)
```

As it turns out, we can use `fftconvolve`

from the same repo that allows us to work with higher-dimensional arrays. This would get us the output in a fully vectorized way, like so -

```
from scipy.signal import fftconvolve
out = fftconvolve(q,np.ones((h,w,1),dtype=int))
```

**Runtime test**

Function definitions -

```
def original_app(q,K,h,w,H,W):
Q = np.zeros((H+h-1, W+w-1, K))
for n in range(H):
for m in range(W):
Q[n:n+h, m:m+w, :] += q[n, m, :]
return Q
def convolve2d_app(q,K,h,w,H,W):
kernel = np.ones((h,w),dtype=int)
m,n,r = q.shape[0]+h-1, q.shape[1]+w-1, q.shape[2]
out = np.empty((m,n,r),dtype=q.dtype)
for i in range(r):
out[...,i] = convolve2d(q[...,i],kernel)
return out
def fftconvolve_app(q,K,h,w,H,W):
return fftconvolve(q,np.ones((h,w,1),dtype=int))
```

Timings and verification -

```
In [128]: # Setup inputs
...: K = 20
...: h, w = 15, 20
...: H, W = 200-h, 400-w
...: q = np.random.randint(0, 20, size=(H, W, K))
...:
In [129]: %timeit original_app(q,K,h,w,H,W)
1 loops, best of 3: 2.05 s per loop
In [130]: %timeit convolve2d_app(q,K,h,w,H,W)
1 loops, best of 3: 2.05 s per loop
In [131]: %timeit fftconvolve_app(q,K,h,w,H,W)
1 loops, best of 3: 233 ms per loop
In [132]: np.allclose(original_app(q,K,h,w,H,W),convolve2d_app(q,K,h,w,H,W))
Out[132]: True
In [133]: np.allclose(original_app(q,K,h,w,H,W),fftconvolve_app(q,K,h,w,H,W))
Out[133]: True
```

So, it seems `fftconvolve`

based approach is doing really well there!