gamemastersr gamemastersr - 4 months ago 11
R Question

Contour plot: density of sites with latitude and logitude locations

I have data of this format and want to make a contour plot. When I try to use the

density(z)
I get error message
"x must be numeric"
. I'm not sure how transform my data to make it the correct format to generate the contour. I just want it to be based on the density of points as the two columns represent my long/lat.

z <- c(
c(8.83,8.89),
c(8.89,8.94),
c(8.84,8.9),
c(8.79,8.852),
c(8.79,8.88),
c(8.8,8.82),
c(8.75,8.78),
c(8.8,8.8),
c(8.74,8.81),
c(8.89,8.99),
c(8.97,8.97),
c(9.04,9.08),
c(9,9.01),
c(8.99,8.99),
c(8.93,8.97)
)
z <- matrix(z, ncol = 2, byrow = TRUE)

Answer

density() is used for univariate density estimation. Since you have two independent variables: long and lat, you should use kde2d() from R's default package MASS.

library(MASS)
fit <- kde2d(z[,1], z[,2])
contour(fit$x, fit$y, fit$z)
## show original data locations
points(z, pch = 19, col = 4)

contour


Follow-up

If you look at ?kde2d:

Usage:

     kde2d(x, y, h, n = 25, lims = c(range(x), range(y)))

The default number of cells along each of x and y are n = 25, which gives you a 25 * 25 grid. Density estimation is done on this grids. Perhaps you are wondering why estimation is done on a regular grid. Because such grid is like pixels of a digital photo. Grid/raster like object is convenient for visualization. Actually, if you want computer to proceed 3D graph, you have to give it a raster like object.

In practice, you should choose n according to how many data you have. Note that a 25 * 25 grid has 625 cells, this is quite fair when you have 1000 data points. You can also try n = 50. Setting n is very similar to setting number of bins when you produce a histogram. As n increases, your resulting estimation is more jagged. Consider the histogram example if you are unclear:

x <- rnorm(200)
hist(x, breaks = 10)
hist(x, breaks = 20)

Precisely, density estimation is different to histogram; the former is a kernel smoother, while the latter is a primitive bin smoother. But the choice of n (refinement) does has equal effect.

Comments