Peter Hickman - 3 months ago 9
R Question

# Geocoding Colloquial Place Names: Zero Results, but Can Get Manual Results (R ggmap)

I'd like to know the latitudes and longitudes of the district offices on the island of Java, Indonesia. Districts are administrative regions one level below provinces, roughly comparable to counties in the USA. Most of my geocode queries return inaccurate results: the latitude and longitude are for the district as a whole, not the district office. Yet if I type the same query into Google Maps manually, I find what I want.

```r
library("ggmap")
# list of district names
# (dists is a data frame with columns distName and provinceName; its definition is not shown)
# vector of queries for Google Maps
queries <- paste("Kantor Bupati ", dists$distName, ", ", dists$distName,
                 ", ", dists$provinceName, ", Indonesia", sep="")
# impute latitude and longitude
dists[c("lon", "lat")] <- geocode(queries)
```

The expression "Kantor Bupati" means District Office in Indonesian.

E.g., if I type "Kantor Bupati BOGOR, BOGOR, JAWA BARAT, Indonesia" into Google Maps, I find the district office at lat=-6.479745, lon=106.824742. But `geocode` returns lat=-6.597147, lon=106.806. That is roughly 13 km away in a straight line, which is not precise enough for my purposes.
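For reference, the straight-line gap between the two coordinate pairs above can be checked with the haversine formula. This is a minimal sketch in base R; `haversineKm` is a hypothetical helper name, and the 6371 km mean Earth radius is an assumption of the spherical model, so road distance will differ.

```r
# Great-circle distance in kilometres between two points in decimal degrees.
# haversineKm is a hypothetical helper; radiusKm assumes a spherical Earth.
haversineKm <- function(lat1, lon1, lat2, lon2, radiusKm = 6371) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * radiusKm * asin(sqrt(a))
}

# Coordinates quoted in the question for the Bogor district office
manual <- c(lat = -6.479745, lon = 106.824742) # found by hand in Google Maps
auto   <- c(lat = -6.597147, lon = 106.806)    # returned by geocode()

gapKm <- haversineKm(manual[["lat"]], manual[["lon"]],
                     auto[["lat"]], auto[["lon"]])
print(round(gapKm, 1))
```

For these two points the straight-line distance comes out at roughly 13 km.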

I've solved this by using the Google Places API, as SymbolixAU suggested. The vectorized function below takes two arguments: the colloquial place names we want to geocode, and a second vector of non-colloquial place names that `ggmap`'s `geocode` can resolve. For each query it returns the latitude, longitude, and name of the top match. You will need a Google Places API key.

```r
library("ggmap")   # regular geocode function
library("RJSONIO") # read JSON

# API key for Google Places
key <- "YOUR_API_KEY" # your key here

geoCodeColloquial <- function(queries, bases) {

  # need coordinates of base to focus search
  print("Getting coordinates of bases...")
  baseCoords <- geocode(bases, source="google")

  # request to Google Places
  print("Requesting coordinates of queries...")
  # NOTE: the opening of this paste() call (the Places endpoint URL) and,
  # apparently, the parameter text between the coordinates and the query are
  # missing from the post as published. placesUrlPrefix and queryParamPrefix
  # are placeholders for those lost strings and must be defined by the caller.
  requests <- paste(placesUrlPrefix,
                    baseCoords$lat, ",", baseCoords$lon,
                    queryParamPrefix,
                    queries,
                    "&key=",
                    key,
                    sep="")

  # results from Google Places; take only top result for each query
  info <- lapply(requests,
                 function(request)
                   fromJSON(request)$results[[1]])

  # lat and lon
  coords <- lapply(info, function(i) i$geometry$location)

  # name of top result
  geoCodeNames <- lapply(info, function(i) i$name)
  geoCodeNamesDf <- data.frame(matrix(unlist(geoCodeNames),
                                      nrow=length(geoCodeNames), byrow=TRUE))

  # add lat, lon, and discovered names to dataframe
  outDf <- data.frame(matrix(unlist(coords),
                             nrow=length(coords), byrow=TRUE))
  names(outDf) <- c("lat", "lon")
  outDf["geoCodeName"] <- geoCodeNamesDf
  return(outDf)
}
```
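As posted, the function has two fragile spots: the request URL has to be assembled correctly (the queries contain spaces and commas that need percent-encoding), and `results[[1]]` throws an error when a query comes back with no matches. Below is a rough sketch of how those two pieces might look. It assumes the Places Text Search endpoint with its `query`, `location`, `radius`, and `key` parameters, which may not be what the original answer used; `buildPlacesRequests`, `topResult`, and `radiusM` are hypothetical names, and the mock responses are made-up data shaped like a parsed Places reply.

```r
# ASSUMPTION: Google Places Text Search endpoint; the endpoint and radius
# used in the original answer are not known.
placesUrl <- "https://maps.googleapis.com/maps/api/place/textsearch/json"

# Build one request URL per query, biased toward the matching base coordinate.
buildPlacesRequests <- function(queries, lat, lon, key, radiusM = 50000) {
  # Percent-encode the free-text query so spaces and commas survive in the URL
  encoded <- vapply(queries, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0(placesUrl,
         "?query=", encoded,
         "&location=", lat, ",", lon,
         "&radius=", format(radiusM, scientific = FALSE),
         "&key=", key)
}

# Pull lat, lon, and name from one parsed response; NA row when nothing matched.
topResult <- function(response) {
  if (length(response$results) == 0) {
    return(data.frame(lat = NA_real_, lon = NA_real_,
                      geoCodeName = NA_character_, stringsAsFactors = FALSE))
  }
  top <- response$results[[1]]
  loc <- top$geometry$location
  data.frame(lat = loc[["lat"]], lon = loc[["lng"]],
             geoCodeName = top$name, stringsAsFactors = FALSE)
}

# Mock responses (made-up data, no network call) to show the shape handled
mock <- list(status = "OK", results = list(list(
  name = "Kantor Bupati Bogor",
  geometry = list(location = list(lat = -6.479745, lng = 106.824742)))))
empty <- list(status = "ZERO_RESULTS", results = list())

url <- buildPlacesRequests("Kantor Bupati BOGOR", -6.597147, 106.806, "KEY")
out <- do.call(rbind, lapply(list(mock, empty), topResult))
print(url)
print(out)
```

With helpers like these, the `lapply` over requests could call `topResult(fromJSON(request))` and `rbind` the rows, so queries with zero results show up as `NA` instead of stopping the whole run.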