schlomm - 1 year ago 123
R Question

# Calculate the distance between two points of two datasets (nearest neighbor)

I want to calculate the distance between two points in two different datasets. I don't want to calculate the distance between all points - just to the nearest point of datasetB.

Some examples:

Dataset A - Persons

http://pastebin.com/HbaeqACi

Dataset B - Waterfeatures:

http://pastebin.com/UdDvNtHs

Dataset C - City:

http://pastebin.com/nATnkMRk

So...I want to calculate the distance of each person to the nearest waterfeature point.

I've already tried to work with the rgeos package and after struggling with some projections errors, I've got it to work. But this calculate (at least I assume it) all distances to every point, but, as already said, I've only interested in the distance to the nearest waterfeature point.

``````# load csv files
# change dataframes to SpatialPointDataFrame and assign a projection
library(sp)
library(rgeos)
coordinates(persons) <- c("POINT_X", "POINT_Y")
proj4string(persons) <- CRS("+proj=utm +datum=WGS84")
coordinates(water) <- c("POINT_X", "POINT_Y")
proj4string(water) <- CRS("+proj=utm +datum=WGS84")

# use rgoes package to calculate the distance
distance <- gDistance(persons, water, byid=TRUE)
# works, but calculates a huge number of distances
``````

Is there any parameter, which I've missed. Or do I need to use another package or function? I've also looked at spatstat, which is able to calculate the distance to the nearest neighbor, but not of two different datasets: http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/spatstat/html/nndist.html

Edit:

The complete R-Script including plotting of the datasets:

``````library(RgoogleMaps)
library(ggplot2)
library(ggmap)
library(sp)
library(fossil)

# plot data
persons_ggplot2 <- persons
city_ggplot2 <- city
water_ggplot2 <- water
gc <- geocode('new york, usa')
center <- as.numeric(gc)
G <- ggmap(get_googlemap(center = center, color = 'bw', scale = 1, zoom = 11, maptype = "terrain", frame=T), extent="device")
G1 <- G + geom_point(aes(x=POINT_X, y=POINT_Y ),data=city, shape = 22, color="black", fill = "yellow", size = 4) + geom_point(aes(x=POINT_X, y=POINT_Y ),data=persons, shape = 8, color="red", size=2.5) + geom_point(aes(x=POINT_X, y=POINT_Y ),data=water_ggplot2, color="blue", size=1)
plot(G1)

#### calculate distance
# Generate unique coordinates dataframe
UniqueCoordinates <- data.frame(unique(persons[,4:5]))
UniqueCoordinates\$Id <- formatC((1:nrow(UniqueCoordinates)), width=3,flag=0)

# Generate a function that looks for the closest waterfeature for each id coordinates
NearestW <- function(id){
tmp <- UniqueCoordinates[UniqueCoordinates\$Id==id, 1:2]
WaterFeatures <- rbind(tmp,water[,2:3])
tmp1 <- earth.dist(WaterFeatures, dist=TRUE)[1:(nrow(WaterFeatures)-1)]
tmp1 <- which.min(tmp1)
tmp1 <- water[tmp1,1]
tmp1 <- data.frame(tmp1, WaterFeature=tmp)
return(tmp1)
}

#apply to each id and the merge
CoordinatesWaterFeature <- ldply(UniqueCoordinates\$Id, NearestW)
persons <- merge(persons, CoordinatesWaterFeature, by.x=c(4,5), by.y=c(2,3))
``````

What about writing a function that looks for the nearest waterfeature for every person?

``````#requires function earth.dist from "fossil" package
require(fossil)

#Generate unique coordinates dataframe
UniqueCoordinates <- data.frame(unique(persons[,4:5]))
UniqueCoordinates\$Id <- formatC((1:nrow(UniqueCoordinates)), width=3,flag=0)

#Generate a function that looks for the closest waterfeature for each id coordinates
NearestW <- function(id){
tmp <- UniqueCoordinates[UniqueCoordinates\$Id==id, 1:2]
WaterFeatures <- rbind(tmp,water[,2:3])
tmp1 <- earth.dist(WaterFeatures, dist=TRUE)[1:(nrow(WaterFeatures)-1)]
tmp1 <- min(tmp1)
tmp1 <- data.frame(tmp1, WaterFeature=tmp)
return(tmp1)
}

#apply to each id and the merge
CoordinatesWaterFeature <- ldply(UniqueCoordinates\$Id, NearestW)
persons <- merge(persons, CoordinatesWaterFeature, by.x=c(4,5), by.y=c(2,3))
``````

NOTE: I've added a stringsAsFactors parameter to the original read.csv , it make the merging easier at the end

NOTE:Column tmp1 notes the number of METERS to the nearest water feature

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download