Jean - 1 year ago 46

Python Question

I have coded a kriging algorithm but I find it quite slow. Especially, do you have an idea on how I could vectorise the piece of code in the cons function below:

`import time`

import numpy as np

B = np.zeros((200, 6))

P = np.zeros((len(B), len(B)))

def cons():

time1=time.time()

for i in range(len(B)):

for j in range(len(B)):

P[i,j] = corr(B[i], B[j])

time2=time.time()

return time2-time1

def corr(x,x_i):

return np.exp(-np.sum(np.abs(np.array(x) - np.array(x_i))))

time_av = 0.

for i in range(30):

time_av+=cons()

print "Average=", time_av/100.

Edit: Bonus questions

- What happens to the broadcasting solution if I want with C the same dimension than B
`corr(B[i], C[j])`

- What happens to the scipy solution if my p-norm orders are an array:

`p=np.array([1.,2.,1.,2.,1.,2.])`

def corr(x, x_i):

return np.exp(-np.sum(np.abs(np.array(x) - np.array(x_i))**p))

For 2., I triedbut scipy is expecting a scalar.`P = np.exp(-cdist(B, C,'minkowski', p))`

Answer Source

Your problem seems very simple to vectorize. For each pair of rows of `B`

you want to compute

```
P[i,j] = np.exp(-np.sum(np.abs(B[i,:] - B[j,:])))
```

You can make use of array broadcasting and introduce a third dimension, summing along the last one:

```
P2 = np.exp(-np.sum(np.abs(B[:,None,:] - B),axis=-1))
```

The idea is to reshape the first occurence of `B`

to shape `(N,1,M)`

while the second `B`

is left with shape `(N,M)`

. With array broadcasting, the latter is equivalent to `(1,N,M)`

, so

```
B[:,None,:] - B
```

is of shape `(N,N,M)`

. Summing along the last index will then result in the `(N,N)`

-shape correlation array you're looking for.

Note that if you were using `scipy`

, you would be able to do this using `scipy.spatial.distance.cdist`

(or, equivalently, a combination of `scipy.spatial.distance.pdist`

and `scipy.spatial.distance.squareform`

), without unnecessarily computing the lower triangular half of this symmetrix matrix. Using @Divakar's suggestion in comments for the simplest solution this way:

```
from scipy.spatial.distance import cdist
P3 = 1/np.exp(cdist(B, B, 'minkowski',1))
```

`cdist`

will compute the Minkowski distance in 1-norm, which is exactly the sum of the absolute values of coordinate differences.