gamemastersr - 4 months ago 11
R Question

# Contour plot: density of sites with latitude and logitude locations

I have data of this format and want to make a contour plot. When I try to use the

`density(z)`
I get error message
`"x must be numeric"`
. I'm not sure how transform my data to make it the correct format to generate the contour. I just want it to be based on the density of points as the two columns represent my long/lat.

``````z <- c(
c(8.83,8.89),
c(8.89,8.94),
c(8.84,8.9),
c(8.79,8.852),
c(8.79,8.88),
c(8.8,8.82),
c(8.75,8.78),
c(8.8,8.8),
c(8.74,8.81),
c(8.89,8.99),
c(8.97,8.97),
c(9.04,9.08),
c(9,9.01),
c(8.99,8.99),
c(8.93,8.97)
)
z <- matrix(z, ncol = 2, byrow = TRUE)
``````

`density()` is used for univariate density estimation. Since you have two independent variables: `long` and `lat`, you should use `kde2d()` from R's default package `MASS`.

``````library(MASS)
fit <- kde2d(z[,1], z[,2])
contour(fit\$x, fit\$y, fit\$z)
## show original data locations
points(z, pch = 19, col = 4)
``````

Follow-up

If you look at `?kde2d`:

``````Usage:

kde2d(x, y, h, n = 25, lims = c(range(x), range(y)))
``````

The default number of cells along each of `x` and `y` are `n = 25`, which gives you a 25 * 25 grid. Density estimation is done on this grids. Perhaps you are wondering why estimation is done on a regular grid. Because such grid is like pixels of a digital photo. Grid/raster like object is convenient for visualization. Actually, if you want computer to proceed 3D graph, you have to give it a raster like object.

In practice, you should choose `n` according to how many data you have. Note that a 25 * 25 grid has 625 cells, this is quite fair when you have 1000 data points. You can also try `n = 50`. Setting `n` is very similar to setting number of bins when you produce a histogram. As `n` increases, your resulting estimation is more jagged. Consider the histogram example if you are unclear:

``````x <- rnorm(200)
hist(x, breaks = 10)
hist(x, breaks = 20)
``````

Precisely, density estimation is different to histogram; the former is a kernel smoother, while the latter is a primitive bin smoother. But the choice of `n` (refinement) does has equal effect.