Python Question

# How can I speed up nearest neighbor search with python?

I have a code, which calculates the nearest voxel (which is unassigned) to a voxel ( which is assigned). That is i have an array of voxels, few voxels already have a scalar (1,2,3,4....etc) values assigned, and few voxels are empty (lets say a value of '0'). This code below finds the nearest assigned voxel to an unassigned voxel and assigns that voxel the same scalar. So, a voxel with a scalar '0' will be assigned a value (1 or 2 or 3,...) based on the nearest voxel. This code below works, but it takes too much time.
Is there an alternative to this ? or if you have any feedback on how to improve it further?

""" #self.voxels is a 3D numpy array"""

``````def fill_empty_voxel1(self,argx, argy, argz):
""" where # argx, argy, argz are the voxel location where the voxel is zero"""
argx1, argy1, argz1 = np.where(self.voxels!=0)   # find the non zero voxels
a = np.column_stack((argx1, argy1, argz1))
b = np.column_stack((argx, argy, argz))
tree = cKDTree(a, leafsize=a.shape[0]+1)
distances, ndx = tree.query(b, k=1, distance_upper_bound= self.mean) # self.mean is a mean radius search value
argx2, argy2, argz2 = a[ndx][:][:,0],a[ndx][:][:,1],a[ndx][:][:,2]
self.voxels[argx,argy,argz] = self.voxels[argx2,argy2,argz2] # update the voxel array
``````

# Example

""" Here is a small example with small dataset:"""

``````import numpy as np
from scipy.spatial import cKDTree
import timeit

voxels = np.zeros((10,10,5), dtype=np.uint8)
voxels[1:2,:,:] = 5.
voxels[5:6,:,:] = 2.
voxels[:,3:4,:] = 1.
voxels[:,8:9,:] = 4.
argx, argy, argz = np.where(voxels==0)

tic=timeit.default_timer()
argx1, argy1, argz1 = np.where(voxels!=0)   # non zero voxels
a = np.column_stack((argx1, argy1, argz1))
b = np.column_stack((argx, argy, argz))
tree = cKDTree(a, leafsize=a.shape[0]+1)
distances, ndx = tree.query(b, k=1, distance_upper_bound= 5.)
argx2, argy2, argz2 = a[ndx][:][:,0],a[ndx][:][:,1],a[ndx][:][:,2]
voxels[argx,argy,argz] = voxels[argx2,argy2,argz2]
toc=timeit.default_timer()
timetaken = toc - tic #elapsed time in seconds
print '\nTime to fill empty voxels', timetaken
``````

# for visualization:

``````from mayavi import mlab
data = voxels.astype('float')
scalar_field = mlab.pipeline.scalar_field(data)
iso_surf = mlab.pipeline.iso_surface(scalar_field)
surf = mlab.pipeline.surface(scalar_field)
vol = mlab.pipeline.volume(scalar_field,vmin=0,vmax=data.max())
mlab.outline()
mlab.show()
``````

Now, if I have the dimension of the voxels array as something like (500,500,500), then the time it takes to compute the nearest search is no longer efficient. How can I overcome this? Could parallel computation reduce the time (I have no idea whether I can parallelize the code, if you do, please let me know)?

# A potential fix:

I could substantially improve the computation time by adding the n_jobs = -1 parameter in the cKDTree query.

``````distances, ndx = tree.query(b, k=1, distance_upper_bound= 5., n_jobs=-1)
``````

I was able to compute the distances in less than a hour for an array of (400,100,100) on a 13 core CPU. I tried with 1 processor and it takes around 18 hours to complete the same array.
Thanks to @gsamaras for the answer!

It would be interesting to try sklearn.neighbors.NearestNeighbors, which offers `n_jobs` parameter:

The number of parallel jobs to run for neighbors search.

This package also provides the Ball Tree algorithm, which you can test versus the kd-tree one, however my hunch is that the kd-tree will be better (but that again does depend on your data, so research that!).

You might also want to use dimensionality reduction, which is easy. The idea is that you reduce your dimensions, thus your data contain less info, so that tackling the Nearest Neighbour Problem can be done much faster. Of course, there is a trade off here, accuracy!

You might/will get less accuracy with dimensionality reduction, but it might worth the try. However, this usually applies in a high dimensional space, and you are just in 3D. So I don't know if for your specific case it would make sense to use sklearn.decomposition.PCA.

A remark:

If you really want high performance though, you won't get it with , you could switch to , and use CGAL for example.