NikMas NikMas - 6 months ago 33
R Question

Extract and Resample Functions in R raster package: Area-Weighted Values

I made this post in other forum as well, but since I really need a reply, i post it here once more.

I am working in R and want to calculate the value of a polygon derived from the intersecting cells of a raster. The value should consider the weights on each intersecting cell. When I try to run the "extract" function with a sample raster and polygon I get different weights that the ones I calculate manually, resulting in different final value.

Here is my sample code:

require(raster)
r <- raster(nrow=2, ncol=2, xmn=-180, xmx=60, ymn=-30, ymx=90)
r[] <- c(1,2,4,5)
s <- raster(xmn=-120, xmx=-40, ymn=20, ymx=60, nrow=1, ncol=1)
s.pl <- as(s, 'SpatialPolygons')
w <- raster::extract(r, s.pl, method="simple",weights=T, normalizeWeights=F)
mean.value <- raster::extract(r, s.pl, method="simple",weights=T, fun=mean)


The value I get is 2.14 but according to the actual weights of the cells it should be 2. More specifically for every part of the polygon that intersects with different cell the data are:

Area Value
1800 1
600 2
600 4
200 5


So the final value of the polygon based on the above should be 2.

Can it be because of the projection that is in lat/lon? But even when I assign projection in meters I have the same result. How can I get the value of 2 that I am interested in? I also tried with "resample" function but I get different results as well.

My final target is to create a new raster with different resolution and extents from the original one and assign the values based on the weights of the cells of the original raster that intersect with the cells of the new raster. But it seems that neither resample nor extract functions give the expected outcome.

Answer Source

Here is what I managed to do based on the replies of this post.

require(raster)
require(rgeos)
r <- raster(nrow=2, ncol=2, xmn=-180, xmx=60, ymn=-30, ymx=90)    
r[] <- c(1,2,4,5)    
r <- stack(r, r*2, r^2)
s <- raster(xmn=-120, xmx=-40, ymn=20, ymx=60, nrow=1, ncol=1)    
s.pl <- as(s, 'SpatialPolygons')    
r.s <- as(r, 'SpatialPolygonsDataFrame')
pi1 <- gIntersection(r.s, s.pl, byid = T)
areas1 <- data.frame(area=sapply(pi1@polygons, FUN=function(x) {slot(x, 'area')}))
row.names(areas1) <- sapply(pi1@polygons, FUN=function(x) {slot(x, 'ID')})
areas1$Pol.old <- as.numeric(vapply(strsplit(rownames(areas1), " "), `[`, 1, FUN.VALUE=character(1)))
areas1$pol.new <- as.numeric(vapply(strsplit(rownames(areas1), " "), `[`, 2, FUN.VALUE=character(1)))
f <- r.s@data
seqs <- match(areas1$Pol.old, rownames(f))
ar <- cbind(areas1, f[seqs,])
ar[,-(1:3)] <- ar[,-(1:3)]*ar$area
f <- aggregate.data.frame(ar, by=list(ar$pol.new), FUN=sum)
f[,-(1:4)] <- f[,-(1:4)]/f$area  
ar.v <- as.matrix(f[, -c(1:4)])
s2 <- stack(s)
s1 <- setValues(s2, ar.v)

If somebody can suggest nicer and/or faster code, please let me know, cause I don't like my approach very much.