Cebs Cebs - 1 year ago 118
R Question

How to arrange a sloppy heatmap in R

I want to create a heatmap of p-values from the output of pairwise.wilcox.test. After performing the test, I reshape the result:

test <- pairwise.wilcox.test(world$mean, world$con, p.adjust.method = "bonferroni", conf.level = 0.95)
test.result <- melt(test[[3]], na.rm = TRUE)

The outcome is the following:

X1 X2 value
1 europe africa 7.216273e-20
2 namerica africa 2.694228e-23
3 samerica africa 1.001953e-01
4 asia africa 3.515077e-66
5 europe europe NA
6 namerica europe 6.551144e-02
7 samerica europe 2.615654e-05
8 asia europe 2.148064e-09
9 europe namerica NA
10 namerica namerica NA
11 samerica namerica 4.894171e-10
12 asia namerica 3.642124e-02
13 europe samerica NA
14 namerica samerica NA
15 samerica samerica NA
16 asia samerica 5.999172e-25
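
For readers without the `world` data, here is a minimal sketch of the same reshape on the built-in `airquality` dataset. It is only an illustration: the dataset and the names `res`, `pmat` and `long` are my own choices, and `as.data.frame(as.table())` is a base-R stand-in for `melt()`, not the call used in the question. It shows why the first group never appears in X1 and the last group never appears in X2: the p-value matrix is lower triangular.

```r
# Built-in data used as a stand-in for the asker's `world` data frame.
res <- suppressWarnings(  # ties in Ozone trigger harmless warnings
  pairwise.wilcox.test(airquality$Ozone, airquality$Month,
                       p.adjust.method = "bonferroni")
)
pmat <- res$p.value  # the same element the question reads as test[[3]]

# Rows drop the first group, columns drop the last group.
rownames(pmat)  # "6" "7" "8" "9"
colnames(pmat)  # "5" "6" "7" "8"

# Base-R replacement for melt(): one row per matrix cell, then drop the NAs.
long <- as.data.frame(as.table(pmat), stringsAsFactors = FALSE)
names(long) <- c("X1", "X2", "value")
long <- long[!is.na(long$value), ]
```
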

Then I run a ggplot2 script to get the heatmap:

test.result$X1 <- factor(test.result$X1, levels = c("europe", "namerica", "samerica", "asia"))
test.result$X2 <- factor(test.result$X2, levels = c("europe", "namerica", "samerica","asia"))

test.result$value<-cut(test.result$value, breaks=c(-Inf,0.001,0.05,1),right=F)

ggplot(data = test.result, aes(X1, X2, fill = value)) +
geom_tile(aes(fill=test.result$value),color="white") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, vjust = 1,
size = 12, hjust = 1)) +

The outcome is the following figure:
enter image description here

As you can see, the tiles are not sorted along the diagonal, so the figure looks sloppy. I don't know how to arrange the figure so that all the p-values line up along the diagonal. Thanks for your help.

The figure that I'm looking for is like this:
enter image description here

Answer Source

I think this is what you want:

Calling your data tr:

tr = structure(list(X1 = structure(c(2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L), .Label = c("asia", "europe", 
"namerica", "samerica"), class = "factor"), X2 = structure(c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("africa", 
"europe", "namerica", "samerica"), class = "factor"), value = c(7.216273e-20, 
2.694228e-23, 0.1001953, 3.515077e-66, NA, 0.06551144, 2.615654e-05, 
2.148064e-09, NA, NA, 4.894171e-10, 0.03642124, NA, NA, NA, 5.999172e-25
)), .Names = c("X1", "X2", "value"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16"))

Swarch's comment was correct in that we need the factors to have the same levels/same order. The comment didn't quite work because africa was omitted. Fixing that:

lev = c("europe", "namerica", "samerica", "asia", "africa")
tr$X1 <- factor(tr$X1, levels = lev)
tr$X2 <- factor(tr$X2, levels = lev)
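
To see why the omitted level matters, here is a small sketch with made-up vectors (`x2`, `lev_missing` and `lev_full` are illustrative names, not objects from the question). Any value that is not listed in `levels` silently becomes NA, so its tiles end up on an NA axis position instead of where you expect them.

```r
lev_missing <- c("europe", "namerica", "samerica", "asia")  # no "africa"
lev_full    <- c(lev_missing, "africa")

x2 <- c("africa", "europe", "namerica", "samerica")

bad  <- factor(x2, levels = lev_missing)
good <- factor(x2, levels = lev_full)

sum(is.na(bad))   # 1: "africa" was turned into NA
sum(is.na(good))  # 0: every value has a matching level
```
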

We can now make a plot. Some corrections first:

  1. Never use data$column inside aes(); use unquoted column names.
  2. If you specify fill = value in the top-level ggplot() call, there is no need to repeat it in the geom_tile() layer.
  3. Your value seems to be continuous. scale_fill_brewer implies a discrete scale, so it cannot be used here. The plot seems fine without it, but you could also try scale_fill_distiller.
  4. The code in your question was missing a +.
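
If you would rather keep the binned p-values from your question, a discrete scale does apply. The following is a rough sketch, not a drop-in fix: the data frame `pv`, the column `bin`, the labels and the palette are all my own choices, and only four rows from your output are copied in. One thing to watch: with `right = FALSE` and an upper break of 1, a Bonferroni-adjusted p-value of exactly 1 would fall outside the last interval and become NA, so this sketch uses right-closed intervals with `include.lowest = TRUE` instead.

```r
library(ggplot2)

# A few rows copied from the question's output, for illustration only.
pv <- data.frame(
  X1    = c("europe", "samerica", "namerica", "asia"),
  X2    = c("africa", "africa", "europe", "namerica"),
  value = c(7.216273e-20, 1.001953e-01, 6.551144e-02, 3.642124e-02)
)

# Bin into [0, 0.001], (0.001, 0.05], (0.05, 1]; a p-value of exactly 1 is kept.
pv$bin <- cut(pv$value,
              breaks = c(0, 0.001, 0.05, 1),
              labels = c("p <= 0.001", "p <= 0.05", "p > 0.05"),
              include.lowest = TRUE)

# Discrete fill, so scale_fill_brewer is appropriate here.
p <- ggplot(pv, aes(X1, X2, fill = bin)) +
  geom_tile(color = "white") +
  scale_fill_brewer(palette = "Blues", direction = -1) +
  theme_minimal()
```

Printing `p` draws the heatmap; mapping `fill = value` instead would put you back on a continuous scale.
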

This code works:

ggplot(data = tr, aes(X1, X2, fill = value)) +
    geom_tile(color = "white") +
    theme_minimal() +
    theme(axis.text.x = element_text(
        angle = 45,
        vjust = 1,
        size = 12,
        hjust = 1
    ))

enter image description here

Also note that the exact diagonal of 1's is missing here (unlike in your mtcars example) because it is missing from your data. That is, africa is completely absent from X1 and asia is completely absent from X2. If you want to plot those tiles, you will need to augment your data with those rows.
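
One possible way to add those rows is sketched below. It assumes that a group compared with itself should show p = 1, which is my assumption and not something the test returns; `tr_demo`, `diag_rows` and `full` are hypothetical names, and only three rows of your data are copied in to keep it short.

```r
# A few rows in the same shape as the reshaped test output.
tr_demo <- data.frame(
  X1    = c("europe", "asia", "asia"),
  X2    = c("africa", "africa", "europe"),
  value = c(7.216273e-20, 3.515077e-66, 2.148064e-09),
  stringsAsFactors = FALSE
)

lev <- c("europe", "namerica", "samerica", "asia", "africa")

# Assumption: self-comparisons get p = 1 so the diagonal tiles are drawn.
diag_rows <- data.frame(X1 = lev, X2 = lev, value = 1,
                        stringsAsFactors = FALSE)

full <- rbind(tr_demo, diag_rows)

# Give both axes the same levels in the same order.
full$X1 <- factor(full$X1, levels = lev)
full$X2 <- factor(full$X2, levels = lev)
```

Plotting `full` with the same ggplot call then includes one tile per group on the diagonal.
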
