DilithiumMatrix - 1 year ago 45

Python Question

This answer explains how to find the nearest (sorted) array element to a *single point*, in a manner efficient for large arrays (slightly modified):

`def arg_nearest(array, value):`

idx = np.searchsorted(array, value, side="left")

if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):

return idx-1

else:

return idx

If, instead, we want to find the array elements nearest a

Some test cases:

`>>> xx = [0.2, 0.8, 1.3, 1.5, 2.0, 3.1, 3.8, 3.9, 4.5, 5.1, 5.5]`

>>> yy = [1, 2, 3, 4, 5]

>>> of_x_nearest_y(xx, yy)

[0.5, 2.0, 3.1, 3.9, 5.1]

>>> xx = [0.2, 0.8, 1.3, 1.5, 2.0, 3.1, 3.8, 3.9, 4.5, 5.1, 5.5]

>>> yy = [-2, -1, 4.6, 5.8]

>>> of_x_nearest_y(xx, yy)

[0.2, 0.2, 4.5, 5.5]

Edit: assuming both arrays are sorted, you can do a

`def args_nearest(options, targets):`

locs = np.zeros(targets.size, dtype=int)

prev = 0

for ii, tt in enumerate(targets):

locs[ii] = prev + arg_nearest(options[prev:], tt)

prev = locs[ii]

return locs

Answer Source

You can make few changes to extend it for an array of elements in `value`

, like so -

```
idx = np.searchsorted(xx, yy, side="left").clip(max=xx.size-1)
mask = (idx > 0) & \
( (idx == len(xx)) | (np.fabs(yy - xx[idx-1]) < np.fabs(yy - xx[idx])) )
out = xx[idx-mask]
```

**Explanation**

Nomenclature : `array`

is the array in which we are looking to place elements from `value`

to maintain the sorted nature of `array`

.

Changes needed to extend the solution for a single element to many elements for searching :

1] Clip the indices array `idx`

obtained from `np.searchsorted`

at a max. of `array.size-1`

, because for elements in `value`

that are larger than the maximum of `array`

, we need to make `idx`

indexable by `array`

.

2] Introduce `numpy`

to replace `math`

to do those operations in a vectorized manner.

3] Replace the conditional statement by the trick of `idx - mask`

. In this case, internally Python would up-convert `mask`

to an `int`

array to match up with the datatype of `idx`

. Thus, all the `True`

elements become `1`

and thus for `True`

elements we would effectively have `idx-1`

, which is the `True`

case of the IF conditional statement in the original code.