Xanos - 1 year ago 79
Python Question

# find nearest value in list, for all values in list

I have a list of complex numbers for which I want to find the closest value in another list of complex numbers.

My current approach with numpy:

``````
import numpy as np

refArray = np.random.random(16)
myArray = np.random.random(1000)

def find_nearest(array, value):
    # Index of the element in `array` that is closest to `value`
    idx = (np.abs(array - value)).argmin()
    return idx

for value in np.nditer(myArray):
    index = find_nearest(refArray, value)
    print(index)
``````

Unfortunately, this takes ages for a large number of values.
Is there a faster or more "Pythonic" way of matching each value in myArray to the closest value in refArray?

FYI: I don't necessarily need numpy in my script.

Important: the order of both myArray and refArray matters and should not be changed. If sorting is applied, the original indices should be retained in some way.

Here's one vectorized approach with `np.searchsorted` based on `this post` -

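``````
def closest_argmin(yy, refArray):
    # Sort the reference values, remembering their original positions
    sidx = refArray.argsort()
    xx = refArray[sidx]
    # Insertion position of each query value in the sorted reference
    idx = np.searchsorted(xx, yy)
    L = xx.size
    idx[idx==L] = L-1
    # True where the left neighbour is closer than the right one
    mask = (idx > 0) &  \
        ( (idx == L) | (np.abs(yy - xx[idx-1]) < np.abs(yy - xx[idx])) )
    # Step back where the left neighbour wins, then map to original indices
    return sidx[idx-mask]
``````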
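To make the steps in the function above concrete, here is a small worked sketch of the same sort-and-search idea on hand-picked real values. The names `nearest_index_sorted`, `ref`, and `queries` are made up for this illustration. It assumes real-valued 1-D inputs, and on an exact tie it picks the right-hand neighbour in sorted order, which may differ from what a plain `argmin` would choose.

```python
import numpy as np

def nearest_index_sorted(queries, ref):
    """For each query, return the index into the *unsorted* ref of its nearest value."""
    order = ref.argsort()              # order[k] = original position of k-th smallest
    ref_sorted = ref[order]
    # Insertion position of each query in the sorted reference values.
    pos = np.searchsorted(ref_sorted, queries)
    # Queries larger than every reference value would index out of bounds; clip them.
    pos[pos == ref_sorted.size] = ref_sorted.size - 1
    # True where the left neighbour is strictly closer than the right one.
    left_closer = (pos > 0) & (
        np.abs(queries - ref_sorted[pos - 1]) < np.abs(queries - ref_sorted[pos])
    )
    # Subtracting the boolean mask steps back by one where needed;
    # indexing `order` maps sorted positions back to original indices.
    return order[pos - left_closer]

ref = np.array([0.9, 0.1, 0.5])        # deliberately unsorted
queries = np.array([0.0, 0.2, 0.45, 0.8, 1.2])
result = nearest_index_sorted(queries, ref)
print(result)                          # indices refer to the original order of `ref`
```

Neither input array is reordered: the sort only happens on a copy, and `order` carries the original indices through, which is what the question asks for.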

Benchmarking

Approaches -

``````
# Original approach
def org_app(myArray, refArray):
    out1 = np.empty(myArray.size, dtype=int)
    for i, value in enumerate(myArray):
        # find_nearest from posted question
        index = find_nearest(refArray, value)
        out1[i] = index
    return out1

def closest_argmin(yy, refArray):
    sidx = refArray.argsort()
    xx = refArray[sidx]
    idx = np.searchsorted(xx, yy)
    L = xx.size
    idx[idx==L] = L-1
    mask = (idx > 0) &  \
        ( (idx == L) | (np.abs(yy - xx[idx-1]) < np.abs(yy - xx[idx])) )
    # Map positions in the sorted array back to original indices
    return sidx[idx-mask]
``````

Timings and verification -

``````
In [188]: refArray = np.random.random(16)
     ...: myArray = np.random.random(1000)
     ...:

In [189]: %timeit org_app(myArray, refArray)
100 loops, best of 3: 1.95 ms per loop

In [190]: %timeit closest_argmin(myArray, refArray)
10000 loops, best of 3: 36.6 µs per loop

In [191]: np.allclose(closest_argmin(myArray, refArray), org_app(myArray, refArray))
Out[191]: True
``````

`50x+` speedup for the posted sample and hopefully more for larger datasets!
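
The question mentions complex numbers, while the sample data and the `np.searchsorted` approach work with real values. NumPy sorts complex values lexicographically (real part first, then imaginary part), and that order does not correspond to distance in the complex plane. A hedged sketch for the complex case, using plain broadcasting instead of sorting, might look like the following. The function name `closest_argmin_complex` and the sample arrays are assumptions made for this illustration, and this is not the approach benchmarked above. It builds a full distance matrix of shape `(len(values), len(ref))`, so memory grows with the product of both sizes.

```python
import numpy as np

def closest_argmin_complex(values, ref):
    # Pairwise distances |values[i] - ref[j]|, shape (len(values), len(ref)).
    dist = np.abs(values[:, None] - ref[None, :])
    # Index of the nearest reference value per row; neither input is reordered.
    return dist.argmin(axis=1)

ref = np.array([1 + 1j, -1 + 0j, 0 - 2j])
values = np.array([0.9 + 1.2j, -0.5 - 0.1j, 0.1 - 1.5j, 2 + 2j])
nearest = closest_argmin_complex(values, ref)
print(nearest)
```

With a small reference array such as the 16 values in the question, the distance matrix stays modest even for many query values. For very large inputs, processing `values` in chunks would keep memory bounded.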
