user3231352 - 21 days ago
R Question

Find nearest points of latitude and longitude from different data sets with different length

I have two data sets of different stations. Both are data.frames with longitude and latitude coordinates. For each station in the first data set (or vice versa), I want to find the nearest station in the other data set. My main problem is that the coordinates are not ordered and the data sets have different lengths: for example, the first one contains 2228 stations and the second one 1782. So I don't know how to handle this.
I know about the function rdist.earth and I tried to use it. Here is a short sample:

#First data set of stations
set1 <- structure(list(lon = c(13.671114, 12.866947, 15.94223, 11.099736,
12.958342, 14.203892, 11.86389, 16.526674, 16.193064, 17.071392
), lat = c(48.39167, 48.148056, 48.721111, 47.189167, 47.054443,
47.129166, 47.306667, 47.84, 47.304167, 48.109444)), .Names = c("lon",
"lat"), row.names = c(NA, 10L), class = "data.frame")

#Second data set
set2 <- structure(list(lon = structure(c(14.4829998016357, 32.4000015258789,
-8.66600036621094, 15.4670000076294, 18.9160003662109, 19.0160007476807,
31.0990009307861, 14.3660001754761, 9.59899997711182, 11.0830001831055
), .Dim = 10L), lat = structure(c(35.8499984741211, 34.75, 70.9329986572266,
78.25, 69.6829986572266, 74.515998840332, 70.3659973144531, 67.265998840332,
63.6990013122559, 60.1990013122559), .Dim = 10L)), .Names = c("lon",
"lat"), row.names = c(NA, 10L), class = "data.frame")
# Compute the distance matrix in km (rdist.earth comes from the fields package)
library(fields)
dd <- rdist.earth(set1, set2, miles = FALSE)


Now I have the matrix dd with the distances, but I don't know how to find the information for each point. For example, for the first point of data set 1, which is the nearest station in the second data set? Any ideas?

Thanks a lot.

Answer Source

Here is another possible solution:

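library(sp)     # SpatialPoints
library(rgeos)  # gDistance
# Convert both data frames to spatial point objects (no projection is set)
set1sp <- SpatialPoints(set1)
set2sp <- SpatialPoints(set2)
# Pairwise distance matrix; which.min picks the index of the closest point
set1$nearest_in_set2 <- apply(gDistance(set1sp, set2sp, byid=TRUE), 1, which.min)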

head(set1)
##        lon      lat nearest_in_set2
## 1 13.67111 48.39167              10
## 2 12.86695 48.14806              10
## 3 15.94223 48.72111              10
## 4 11.09974 47.18917               1
## 5 12.95834 47.05444               1
## 6 14.20389 47.12917               1
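
Since no projection is set on the SpatialPoints objects, the distances above are planar, measured in degrees. If great-circle distances are preferred, the same "row-wise which.min" idea can be sketched in base R without extra packages. This is only a minimal sketch: the function names haversine_matrix and nearest_station are hypothetical, and it assumes a spherical Earth with a mean radius of 6371 km and columns named lon and lat. The two inputs may have different numbers of rows.

```r
# Great-circle (haversine) distances in km between every row of a and every row of b.
# a, b: data frames with numeric columns lon and lat in decimal degrees.
# Assumption: spherical Earth, mean radius 6371 km.
haversine_matrix <- function(a, b, radius_km = 6371) {
  to_rad <- pi / 180
  lat_a <- as.numeric(a$lat) * to_rad
  lon_a <- as.numeric(a$lon) * to_rad
  lat_b <- as.numeric(b$lat) * to_rad
  lon_b <- as.numeric(b$lon) * to_rad
  # outer() yields an nrow(a) x nrow(b) matrix of coordinate differences
  dlat <- outer(lat_a, lat_b, function(x, y) y - x)
  dlon <- outer(lon_a, lon_b, function(x, y) y - x)
  h <- sin(dlat / 2)^2 + outer(cos(lat_a), cos(lat_b)) * sin(dlon / 2)^2
  h[h > 1] <- 1  # guard against rounding slightly above 1
  2 * radius_km * asin(sqrt(h))
}

# For each row of a: index of the nearest row of b and the distance to it.
nearest_station <- function(a, b) {
  d <- haversine_matrix(a, b)
  idx <- apply(d, 1, which.min)  # one index per row of a
  data.frame(nearest = idx,
             dist_km = d[cbind(seq_len(nrow(a)), idx)])
}

# Small example with different lengths (values taken from the question, rounded)
a <- data.frame(lon = c(13.671, 12.867, 15.942),
                lat = c(48.392, 48.148, 48.721))
b <- data.frame(lon = c(14.483, 32.400, 9.599, 11.083),
                lat = c(35.850, 34.750, 63.699, 60.199))
res <- nearest_station(a, b)
print(cbind(a, res))
```

The same row-wise step should also work on the dd matrix from the question, e.g. apply(dd, 1, which.min), because rdist.earth returns one row per station of its first argument and one column per station of its second.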