Tomas Alonso Rehor - 8 days ago
R Question

Decreasing processing time for shp map choropleth

I'm making a choropleth map of Argentina on which I'm going to plot some data.

I can put the map up with no problems and also plot some data on it. For example like this:

enter image description here

The problem is that I think R is rendering the map at a much higher quality than I need, and it takes ages (~3 minutes) to display that choropleth. This is the code I'm using:

library(rgdal)    # readOGR()
library(ggplot2)  # fortify(), ggplot(), geom_map()

arg_shp <- readOGR("ARG_adm_shp/ARG_adm1.shp", "ARG_adm1")

puntos <- read.csv("puntos.csv", sep = ",", header = TRUE)

arg_pv <- fortify(arg_shp, region = "NAME_1")

gg <- ggplot()
gg <- gg + geom_map(data=arg_pv, map=arg_pv,
                    aes(long, lat, map_id=id),
                    color="#2b2b2b", size=0.15, fill=NA)
gg <- gg + coord_map()
gg <- gg + ggthemes::theme_map()

gg + geom_map(data = puntos, aes(map_id = Provincia, fill = Puntos),
              map = arg_pv)


Alternatively, I have tried something like this to see if it made any difference:

ggplot() + geom_map(data = puntos, aes(map_id = Provincia, fill = Puntos),
map = arg_pv) + expand_limits(x = arg_pv$long , y = arg_pv$lat)


After some testing, it's clear that the part making the processing take so long is expand_limits, since it takes in all 259k data points in the fortified table.

Any ideas on how to cope with this?

Answer

This:

library(maptools)
library(rgdal)
library(raster)   # getData()
library(rgeos)    # gSimplify()
library(ggplot2)
library(ggalt)    # coord_proj()
library(ggthemes)
library(viridis)
library(magrittr)

# as stated in the other answer, this is the same as your shapefile
arg_adm <- getData('GADM', country='ARG', level=1)

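# simplify the polygons to cut the vertex count (tolerance is in map units, here degrees);
# gSimplify() returns bare polygons, so re-attach the attribute table afterwards
gSimplify(arg_adm, 0.01, topologyPreserve=TRUE) %>% 
  SpatialPolygonsDataFrame(data=arg_adm@data) -> arg_adm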

arg_map <- fortify(arg_adm, region="NAME_1")

arg_proj <- "+proj=aeqd +lat_0=-37.869859624840764 +lon_0=-66.533203125"

# reproducibly simulate some data
set.seed(1492)
puntos <- data.frame(id=c("Buenos Aires", "Córdoba", "Catamarca", "Chaco", "Chubut",
                          "Ciudad de Buenos Aires", "Corrientes", "Entre Ríos", "Formosa", 
                          "Jujuy", "La Pampa", "La Rioja", "Mendoza", "Misiones", "Neuquén", 
                          "Río Negro", "Salta", "San Juan", "San Luis", "Santa Cruz", 
                          "Santa Fe", "Santiago del Estero", "Tierra del Fuego", "Tucumán"),
                     value=sample(100, 24))

gg <- ggplot() 
# draw the base polygon layer
gg <- gg + geom_map(data=arg_map, map=arg_map, 
                    aes(long, lat, map_id=id),
                    color="#b2b2b2", size=0.15, fill=NA)
# fill in the polygons
gg <- gg + geom_map(data=puntos, map=arg_map,
                    aes(fill=value, map_id=id),
                    color="#b2b2b2", size=0.15)
gg <- gg + scale_fill_viridis(name="Scale Title")
gg <- gg + coord_proj(arg_proj)
gg <- gg + theme_map()
gg <- gg + theme(legend.position=c(0.8, 0.1))
gg

enter image description here

Renders really fast on my system:

benchplot(gg)

##        step user.self sys.self elapsed
## 1 construct     0.000    0.000   0.000
## 2     build     0.029    0.002   0.031
## 3    render     0.206    0.006   0.217
## 4      draw     0.049    0.001   0.051
## 5     TOTAL     0.284    0.009   0.299
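
The speed-up presumably comes mostly from the gSimplify step, which leaves far fewer vertices to project and draw. According to the rgeos documentation, gSimplify uses the Douglas-Peucker algorithm (via GEOS). To give a feel for what that does to a vertex count, here is a rough base-R sketch of the same idea on a made-up line. The helpers perp_dist and dp_keep are hypothetical names written for this illustration, not part of rgeos, and this is not how GEOS implements it.

```r
# Perpendicular distance from points (px, py) to the line through
# (x1, y1) and (x2, y2).
perp_dist <- function(px, py, x1, y1, x2, y2) {
  dx <- x2 - x1
  dy <- y2 - y1
  len <- sqrt(dx^2 + dy^2)
  if (len == 0) {
    # Degenerate case (both endpoints coincide): use plain point distance
    return(sqrt((px - x1)^2 + (py - y1)^2))
  }
  abs(dy * (px - x1) - dx * (py - y1)) / len
}

# Recursive Douglas-Peucker: returns the indices of the vertices to keep.
dp_keep <- function(x, y, tol, first = 1L, last = length(x)) {
  if (last - first < 2L) return(c(first, last))
  inner <- (first + 1L):(last - 1L)
  d <- perp_dist(x[inner], y[inner], x[first], y[first], x[last], y[last])
  # Every intermediate vertex is within tolerance: keep only the endpoints
  if (max(d) <= tol) return(c(first, last))
  # Otherwise split at the farthest vertex and simplify both halves
  split <- inner[which.max(d)]
  left  <- dp_keep(x, y, tol, first, split)
  right <- dp_keep(x, y, tol, split, last)
  c(left, right[-1L])  # the split index appears in both halves; keep it once
}

# Made-up, densely sampled line standing in for a detailed border
x <- seq(0, 10, length.out = 5000)
y <- sin(x)

kept <- dp_keep(x, y, tol = 0.01)
cat("vertices before:", length(x), " after:", length(kept), "\n")
```

With real shapefiles you would compare nrow() of the fortified data frame before and after gSimplify instead. A larger tolerance removes more vertices at the cost of coarser borders.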

Try following the idiom above rather than your current approach, or post the output of dput(puntos) in your question so it's reproducible. Also, continuing to include the entire RStudio window in your questions is neither helpful nor minimal.
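
In case dput is unfamiliar, here is a small sketch of what posting its output looks like. The column names Provincia and Puntos are taken from your aes() call; the three rows are invented stand-ins, since I don't know what your CSV contains.

```r
# Hypothetical stand-in for the real puntos.csv contents
puntos <- data.frame(
  Provincia = c("Chaco", "Chubut", "Salta"),
  Puntos = c(12L, 7L, 30L),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; paste that text
# into the question
dput(puntos)

# Anyone reading can then rebuild the data frame from the pasted text
pasted <- paste(capture.output(dput(puntos)), collapse = "\n")
puntos_copy <- eval(parse(text = pasted))
all.equal(puntos, puntos_copy)
```

If the real table is large, dput(head(puntos, 20)) is usually enough to reproduce the problem.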