sinQueso sinQueso - 1 month ago 17
R Question

How to make raster from unique values?

Let's say I have a data frame that looks like


category longitude latitude
a -83.7 26.2
b -83.7 26.2
b -84.6 26.2
c -82.4 25.7
a -80.1 27.7
b -75.9 34.5


What would be the best approach to create a raster map based on the number of unique categories that occur in each cell?

Not being very familiar with geospatial data, I put below the approach I have been using. However, there might be a better/faster solution.

library(raster)
library(ggplot2)
foo <- structure(list(category = c("a", "b", "b", "c", "a", "b"),
longitude = c(-83.7, -83.7, -84.6, -82.4, -80.1, -75.9),
latitude = c(26.2, 26.2, 26.2, 25.7, 27.7, 34.5)),
.Names = c("category", "longitude", "latitude"),
class = "data.frame",
row.names = c(NA, -6L))
split_foo <- split(foo, foo$category)
us_raster <- raster(xmn = -127, ymn = 23, xmx = -61, ymx = 50, res = 1)
raster_lst <- lapply(split_foo, function(x) {
pts <- SpatialPoints(data.frame(lon = foo$longitude, lat = foo$latitude))
rasterize(pts, us_raster, fun="count")
})
raster_foo <- Reduce("merge", raster_lst)
gg_foo <- as.data.frame(as(raster_foo, "SpatialPixelsDataFrame"))
colnames(gg_foo) <- c("value", "x", "y")
ggplot() +
geom_raster(data = gg_foo, aes(x = x, y = y, fill = value)) +
coord_quickmap()


output

dww dww
Answer

Here's a method that should be fast even on large data.

I'll use data table notation here, although you could equally well do this in base R or dplyr notation if you prefer.

library(data.table)
dt = data.table(foo)
r = raster(vals=0, xmn=-85, xmx=-75, ymn=25, ymx=35, res=1)

# add a new column to dt specifying which raster cell each row is in
dt[, rastercell := cellFromXY(r, .SD[,.(longitude, latitude)])]
# aggregate how many unique categories there are in each raster cell
dt.tab <- dt[, length(unique(category)), by = rastercell]
# update the values of the raster with these values
r[dt.tab[, rastercell]] <- dt.tab[, V1] 
# and plot
plot(r)

enter image description here

The same thing in base R notation, in case you prefer:

r = raster(vals=0, xmn=-85, xmx=-75, ymn=25, ymx=35, res=1)
foo$rastercell <- cellFromXY(r, foo[,2:3])
foo.tab <- aggregate(category~rastercell, data=foo, FUN=function(x) length(unique(x)))
r[foo.tab[, "rastercell"]] <- foo.tab[, "category"]
plot(r)
Comments