Gaurav Chaturvedi - 2 years ago 95
R Question

# Geographic distance between 2 lists of lat/lon coordinates

I have 2 lists (

`list1`
,
`list2`
) with latitude / longitudes of various locations. One list (
`list2`
) has locality names that
`list1`
does not have.

I want an approximate locality for every point in list1 as well. So I want to take a point in
`list1`
, try to look for the nearest point in
`list2`
and take that locality. I repeat for every point in
`list1`
. It also want the distance (in meters) and the index of the point (in
`list1`
) so I can build some business rules around it - essentially these are 2 new cols that should be added to
`list1`
(
`near_dist`
,
`indx`
).

I am using the
`gdist`
function, but I'm unable to get this to work with data frame inputs.

Example input lists:

``````list1 <- data.frame(longitude = c(80.15998, 72.89125, 77.65032, 77.60599,
72.88120, 76.65460, 72.88232, 77.49186,
72.82228, 72.88871),
latitude = c(12.90524, 19.08120, 12.97238, 12.90927,
19.08225, 12.81447, 19.08241, 13.00984,
18.99347, 19.07990))
list2 <- data.frame(longitude = c(72.89537, 77.65094, 73.95325, 72.96746,
77.65058, 77.66715, 77.64214, 77.58415,
77.76180, 76.65460),
latitude = c(19.07726, 13.03902, 18.50330, 19.16764,
12.90871, 13.01693, 13.00954, 12.92079,
13.02212, 12.81447),
locality = c("A", "A", "B", "B", "C", "C", "C", "D", "D", "E"))
``````

Answer Source

To calculate the geographic distance between two points with latitude/longitude coordinates, you can use several formula's. The package `geosphere` has the `distCosine`, `distHaversine`, `distVincentySphere` and `distVincentyEllipsoid` for calculating the distance. Of these, the `distVincentyEllipsoid` is considered the most accurate one, but is computationally more intensive than the other ones.

With one of these functions, you can make a distance matrix. Based on that matrix you can then assign `locality` names based on shortest distance with `which.min` and the corresponding distance with `min` (see for this the last part of the answer) like this:

``````library(geosphere)

# create distance matrix
mat <- distm(list1[,c('longitude','latitude')], list2[,c('longitude','latitude')], fun=distVincentyEllipsoid)

# assign the name to the point in list1 based on shortest distance in the matrix
list1\$locality <- list2\$locality[apply(mat, 1, which.min)]
``````

this gives:

``````> list1
longitude latitude locality
1   80.15998 12.90524        D
2   72.89125 19.08120        A
3   77.65032 12.97238        C
4   77.60599 12.90927        D
5   72.88120 19.08225        A
6   76.65460 12.81447        E
7   72.88232 19.08241        A
8   77.49186 13.00984        D
9   72.82228 18.99347        A
10  72.88871 19.07990        A
``````

Another possibility is to assign the `locality` based on the average longitude and latitude values of the `locality`s in `list2`:

``````library(dplyr)
list2a <- list2 %>% group_by(locality) %>% summarise_each(funs(mean)) %>% ungroup()
mat2 <- distm(list1[,c('longitude','latitude')], list2a[,c('longitude','latitude')], fun=distVincentyEllipsoid)
list1\$locality2 <- list2a\$locality[apply(mat2, 1, which.min)]
``````

or with `data.table`:

``````library(data.table)
list2a <- setDT(list2)[,lapply(.SD, mean), by=locality]
mat2 <- distm(setDT(list1)[,.(longitude,latitude)], list2a[,.(longitude,latitude)], fun=distVincentyEllipsoid)
list1\$locality2 <- list2a\$locality[apply(mat2, 1, which.min)]
``````

this gives:

``````> list1
longitude latitude locality locality2
1   80.15998 12.90524        D         D
2   72.89125 19.08120        A         B
3   77.65032 12.97238        C         C
4   77.60599 12.90927        D         C
5   72.88120 19.08225        A         B
6   76.65460 12.81447        E         E
7   72.88232 19.08241        A         B
8   77.49186 13.00984        D         C
9   72.82228 18.99347        A         B
10  72.88871 19.07990        A         B
``````

As you can see, this leads in most (7 out of 10) occasions to another assigned `locality`.

You can add the distance with:

``````list1\$near_dist <- apply(mat2, 1, min)
``````

the result:

``````> list1
longitude latitude locality locality2   near_dist
1:  80.15998 12.90524        D         D 269966.8970
2:  72.89125 19.08120        A         B  65820.2047
3:  77.65032 12.97238        C         C    739.1885
4:  77.60599 12.90927        D         C   9209.8165
5:  72.88120 19.08225        A         B  66832.7223
6:  76.65460 12.81447        E         E      0.0000
7:  72.88232 19.08241        A         B  66732.3127
8:  77.49186 13.00984        D         C  17855.3083
9:  72.82228 18.99347        A         B  69456.3382
10:  72.88871 19.07990        A         B  66004.9900
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download