silkita silkita - 3 months ago 56
R Question

Transform color scale to probability-transformed color distribution with scale_fill_gradientn()

I am trying to visualize heavy-tailed raster data, and I would like a non-linear mapping of colors to the range of the values. There are a couple of similar questions (see links below), but they don't really solve my specific problem.

library(ggplot2)
library(scales)

set.seed(42)
dat <- data.frame(
  x = floor(runif(10000, min = 1, max = 100)),
  y = floor(runif(10000, min = 2, max = 1000)),
  z = rlnorm(10000, 1, 1)
)

# colors for the colour scale:
col.pal <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))
fill.colors <- col.pal(64)


This is what the data look like without any transformation:

ggplot(dat, aes(x = x, y = y, fill = z)) +
  geom_tile(width = 2, height = 30) +
  scale_fill_gradientn(colours = fill.colors)


My question is a follow-up to this one or this one, and the solution given here actually yields exactly the plot I want, except for the legend:

qn <- rescale(quantile(dat$z, probs=seq(0, 1, length.out=length(fill.colors))))
ggplot(dat, aes(x = x, y = y, fill = z)) +
  geom_tile(width = 2, height = 30) +
  scale_fill_gradientn(colours = fill.colors, values = qn)



Now I want the colour scale in the legend to reflect the non-linear distribution of the values as well (at the moment only the red part of the scale is visible), i.e. the legend should also be based on quantiles. Is there a way to accomplish this?
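
To see why the legend ends up dominated by the red end of the palette, it can help to look at where the rescaled quantiles fall along the legend bar. The following is only a small base-R sketch: it rebuilds the data from the question and recomputes qn with a hand-written min-max rescaling, which is assumed here as a stand-in for scales::rescale().

```r
set.seed(42)
dat <- data.frame(
  x = floor(runif(10000, min = 1, max = 100)),
  y = floor(runif(10000, min = 2, max = 1000)),
  z = rlnorm(10000, 1, 1)
)

# 64 evenly spaced probabilities, matching length(fill.colors) in the question
probs <- seq(0, 1, length.out = 64)
q <- quantile(dat$z, probs = probs, names = FALSE)

# Min-max rescaling to [0, 1]; assumed stand-in for scales::rescale()
qn <- (q - min(q)) / (max(q) - min(q))

# Legend positions of the 1st, 32nd, 63rd and 64th colour
print(round(qn[c(1, 32, 63, 64)], 3))
```

Since the legend bar is drawn linearly in z, the first half of the colours should be squeezed into a small fraction at the low end of the bar, while the last few colours (the reds) stretch over most of it.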

I thought the trans argument within the colour scale might do the trick, as suggested here, but that throws an error, I think because qnorm(pnorm(dat$z)) results in some infinite values (I don't completely understand the function, though).

# trans = 'norm' is resolved to norm_trans(): pnorm() forward, qnorm() as the inverse
norm_trans <- function() {
  trans_new('norm', function(x) pnorm(x), function(x) qnorm(x))
}
ggplot(dat, aes(x = x, y = y, fill = z)) +
  geom_tile(width = 2, height = 30) +
  scale_fill_gradientn(colours = fill.colors, trans = 'norm')
> Error in seq.default(from = best$lmin, to = best$lmax, by = best$lstep) : 'from' must be of length 1
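
The suspicion about infinite values can be checked directly in base R. This is a minimal sketch, assuming the same seed and log-normal parameters as in the question (the z vector here is drawn on its own, so its values will differ from dat$z); roundtrip is just an illustrative name.

```r
# pnorm() saturates at exactly 1 in double precision for large inputs,
# and qnorm(1) is Inf, so the round trip qnorm(pnorm(x)) is not the identity.
print(pnorm(10) == 1)  # TRUE
print(qnorm(1))        # Inf

set.seed(42)
z <- rlnorm(10000, 1, 1)
roundtrip <- qnorm(pnorm(z))

# Share of values that come back as Inf after the round trip
print(mean(is.infinite(roundtrip)))
```

Any value of z large enough for pnorm() to return exactly 1 is mapped back to Inf, which is consistent with the automatic break calculation failing.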


So, does anybody know how to have a quantile-based colour distribution in the plot and in the legend?

Answer

This code will make manual breaks with a pnorm transformation (it relies on norm_trans() as defined in the question). Is this what you are after?

ggplot(dat, aes(x = x, y = y, fill = z)) + 
  geom_tile(width=2, height=30) +
  scale_fill_gradientn(colours=fill.colors, 
                       trans = 'norm', 
                       breaks = quantile(dat$z, probs = c(0, 0.25, 1))
  )
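
An alternative, sketched here as an assumption rather than as part of the answer above: map the fill to the empirical cumulative probability of z instead of z itself, and label the legend breaks with the corresponding z quantiles. This avoids a custom transformation altogether. The names z.prob, probs, z.labels and plt are illustrative only.

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(
  x = floor(runif(10000, min = 1, max = 100)),
  y = floor(runif(10000, min = 2, max = 1000)),
  z = rlnorm(10000, 1, 1)
)

col.pal <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F",
                              "yellow", "#FF7F00", "red", "#7F0000"))
fill.colors <- col.pal(64)

# Empirical cumulative probability of each z value, in (0, 1]
dat$z.prob <- ecdf(dat$z)(dat$z)

# Legend breaks in probability space, labelled with the matching z quantiles
probs <- c(0, 0.25, 0.5, 0.75, 1)
z.labels <- as.character(signif(quantile(dat$z, probs = probs, names = FALSE), 2))

plt <- ggplot(dat, aes(x = x, y = y, fill = z.prob)) +
  geom_tile(width = 2, height = 30) +
  scale_fill_gradientn(colours = fill.colors,
                       limits = c(0, 1),
                       breaks = probs,
                       labels = z.labels,
                       name = "z")
print(plt)
```

Because the fill is now a probability, the colour bar is linear in quantile space and each colour gets an equal share of the legend. The trade-off is that the spacing between legend labels no longer reflects distances in z.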