TheCurlyManLives - 7 months ago 38

R Question

I have generated 5 coordinates (each consisting of an x and a y variable), which are considered the "truth."

`D <- 2 #amount of dimensions`

`K <- 5`

`events <- 2*K #number of events`

`truth <- matrix(data=runif(events, min = 0, max = 1), nrow=K)`

Then I generated another set of coordinates, in this case two:

`E <- 2`

`test <- matrix(data=runif(2*E, min = 0, max = 1), nrow=E)`

Now I want to know which of the first five coordinates is closest (in the Euclidean sense) to each of the two test coordinates. What is the easiest way to do this?

Answer

If you want to avoid calculating the distance for every pair of rows (as base `dist` does when given the combined matrix), and do it without any external packages, you can code the Euclidean distance manually by first building two conforming matrices.

```
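# Pair every truth row with every test row: truth rows cycle once per test row, each test row repeats K times
diffs <- truth[rep(1:nrow(truth), nrow(test)), ] - test[rep(1:nrow(test), each = nrow(truth)), ]
# Euclidean norm of each row of differences
eucdiff <- function(x) sqrt(rowSums(x^2))
# One row per test point; max.col() on the negated distances returns the index of the nearest truth row
max.col(-matrix(eucdiff(diffs), nrow = nrow(test), byrow = TRUE), "first")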
#[1] 4 3
```

Using @aichao's data above.
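
To try the same idea on the matrices from the question, here is a minimal self-contained sketch. The seed, the helper name `nearest_row`, and the two cross-checks (a brute-force loop and base `dist` on the stacked matrix) are my own assumptions for illustration and are not part of the original answer. No output is shown because the result depends on the random draw.

```r
# Sketch only: the seed is an arbitrary choice so the run is repeatable.
set.seed(1)
D <- 2; K <- 5; E <- 2
truth <- matrix(runif(K * D, min = 0, max = 1), nrow = K)
test  <- matrix(runif(E * D, min = 0, max = 1), nrow = E)

# nearest_row is a hypothetical helper name wrapping the approach above.
nearest_row <- function(truth, test) {
  ti <- rep(seq_len(nrow(truth)), times = nrow(test))  # 1..K, repeated once per test row
  si <- rep(seq_len(nrow(test)),  each  = nrow(truth)) # each test row index repeated K times
  diffs <- truth[ti, , drop = FALSE] - test[si, , drop = FALSE]
  # Row i of d holds the distances from test row i to every truth row
  d <- matrix(sqrt(rowSums(diffs^2)), nrow = nrow(test), byrow = TRUE)
  max.col(-d, ties.method = "first")
}

fast <- nearest_row(truth, test)

# Cross-check 1: brute force, one test row at a time.
# t(truth) is D x K, so subtracting a length-D vector recycles down each column.
brute <- vapply(seq_len(nrow(test)), function(i) {
  which.min(sqrt(colSums((t(truth) - test[i, ])^2)))
}, integer(1))

# Cross-check 2: base dist() on the stacked matrix. It also computes the
# truth-truth and test-test distances, which are then discarded.
all_d <- as.matrix(dist(rbind(truth, test)))
cross <- all_d[K + seq_len(E), seq_len(K), drop = FALSE]
via_dist <- unname(max.col(-cross, ties.method = "first"))

stopifnot(identical(fast, brute), all(via_dist == fast))
fast  # index of the nearest truth row for each test row
```

For small matrices like these, all three approaches should agree; the manual version simply skips the within-set distances that `dist` would compute.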

Source (Stackoverflow)