
Python Question

How do you optimize this code?

At the moment it is running too slowly for the amount of data that goes through this loop. This code runs 1-nearest neighbor: it predicts the label of `training_element` based on `p_data_set`.

    import numpy as np
    from scipy.spatial import distance

    # Argument shapes: [x], [[x1], [x2], [x3]], [l1, l2, l3]
    def prediction(training_element, p_data_set, p_label_set):
        temp = np.array([], dtype=float)
        for p in p_data_set:
            temp = np.append(temp, distance.euclidean(training_element, p))
        minIndex = np.argmin(temp)
        return p_label_set[minIndex]


Answer Source

Use a *k*-D tree for fast nearest-neighbour lookups, e.g. `scipy.spatial.cKDTree`:

```
import numpy as np
from scipy.spatial import cKDTree

# I assume that p_data_set is (nsamples, ndims)
tree = cKDTree(p_data_set)
# training_elements is also assumed to be 2D: (nqueries, ndims)
dist, idx = tree.query(training_elements, k=1)
# np.asarray lets p_label_set be a plain list as well as an array
predicted_labels = np.asarray(p_label_set)[idx]
```
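For a fuller picture, here is a minimal self-contained sketch of the same idea. The helper name `predict_labels` and the random data are made up for illustration, and I'm assuming the inputs convert cleanly to NumPy arrays. It builds the tree once, queries all points in a single call, and cross-checks the result against a vectorised brute-force search using `scipy.spatial.distance.cdist`. Part of the slowness in the original loop likely comes from `np.append`, which copies the whole array on every iteration.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist


def predict_labels(queries, p_data_set, p_label_set):
    """1-nearest-neighbour label for each row of `queries` (hypothetical helper)."""
    data = np.asarray(p_data_set, dtype=float)
    labels = np.asarray(p_label_set)
    # atleast_2d lets a single point of shape (ndims,) be passed as well
    queries = np.atleast_2d(np.asarray(queries, dtype=float))
    tree = cKDTree(data)  # built once, reused for every query row
    _, idx = tree.query(queries, k=1)
    return labels[idx]


# Synthetic data, made up purely for illustration
rng = np.random.default_rng(0)
p_data_set = rng.random((1000, 3))
p_label_set = rng.integers(0, 5, size=1000)
training_elements = rng.random((200, 3))

predicted = predict_labels(training_elements, p_data_set, p_label_set)

# Cross-check against a vectorised brute-force search
brute_idx = cdist(training_elements, p_data_set).argmin(axis=1)
assert np.array_equal(predicted, p_label_set[brute_idx])
```

If you have many batches of queries against the same `p_data_set`, it is probably worth keeping the tree around instead of rebuilding it inside the function, since construction is the expensive step. In very high dimensions a *k*-D tree tends to lose its advantage, and the `cdist` brute-force line above may be the simpler choice.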
