Haroon Rashid - 8 days ago
R Question

Hide/Drop missing values in heat map with ggplot2

I have a data frame with a continuous run of missing values from 11 Jan to 14 Jan 2016, built as follows:

library(lubridate)
library(ggplot2)  # needed for the ggplot() call further down
set.seed(123)
timestamp1 <- seq(as.POSIXct("2016-01-01"),as.POSIXct("2016-01-10 23:59:59"), by = "hour")
timestamp2 <- seq(as.POSIXct("2016-01-15"),as.POSIXct("2016-01-20 23:59:59"), by = "hour")
data_obj <- data.frame(value = c(rnorm(length(timestamp1), 150, 5),
                                 rnorm(length(timestamp2), 110, 3)),
                       timestamp = c(timestamp1, timestamp2))
data_obj$day <- lubridate::date(data_obj$timestamp)
data_obj$hour <- lubridate::hour(data_obj$timestamp)


When I plot a heat map using

ggplot(data_obj,aes(day,hour,fill=value)) + geom_tile()


I get a heat map with a blank rectangular region (marked in red in my screenshot) that corresponds to the missing values.

How should I entirely hide this blank area and make a continuous heat map?

Note that I do not want to change the format of x-axis date and I don't want to show missing values with some other color.

Answer

This is a slightly different approach from @Jacob's answer; it preserves the date label format and order:

library(lubridate)
library(dplyr)    # attaches %>%, which the ordering step uses
library(ggplot2)

set.seed(123)

timestamp1 <- seq(as.POSIXct("2016-01-01"),as.POSIXct("2016-01-10 23:59:59"), by = "hour")
timestamp2 <- seq(as.POSIXct("2016-01-15"),as.POSIXct("2016-01-20 23:59:59"), by = "hour")

data_obj <- data.frame(value = c (rnorm(length(timestamp1),150,5),
                                  rnorm(length(timestamp2),110,3)),
                       timestamp = c(timestamp1,timestamp2))  
data_obj$day <- lubridate::date(data_obj$timestamp)
data_obj$hour <- lubridate::hour(data_obj$timestamp)

# preserve the date order manually in a factor

data_obj$day_f <- format(data_obj$day, "%b %d")

dplyr::arrange(data_obj, day) %>% 
  dplyr::distinct(day_f) -> day_f_order

data_obj$day_f <- factor(data_obj$day_f, levels=day_f_order$day_f)

ggplot(data_obj, aes(day_f, hour, fill=value)) + 
  geom_tile() +
  scale_x_discrete(expand=c(0,0), breaks=c("Jan 04", "Jan 18")) +
  scale_y_continuous(expand=c(0,0)) +
  viridis::scale_fill_viridis(name=NULL) +
  coord_equal() +
  labs(x=NULL, y=NULL) +
  theme(panel.background=element_blank()) +
  theme(panel.grid=element_blank()) +
  theme(axis.ticks=element_blank()) +
  theme(legend.position="bottom")

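If you would rather not pull in dplyr just for the ordering step, the same factor can probably be built in base R. This is only a minimal sketch on a small stand-in data frame (`demo` and `lvls` are made-up names, not part of the code above); the idea is to take the formatted labels in chronological order and use them as the factor levels:

```r
# Stand-in for data_obj: a few days with a gap, rows deliberately out of order
days <- as.Date(c("2016-01-15", "2016-01-02", "2016-01-10",
                  "2016-01-01", "2016-01-02"))
demo <- data.frame(day = days)

# Format each day as a label, e.g. "Jan 01" in an English locale
demo$day_f <- format(demo$day, "%b %d")

# Sort the labels by the real date, drop duplicates, and use that as the levels
lvls <- unique(demo$day_f[order(demo$day)])
demo$day_f <- factor(demo$day_f, levels = lvls)

levels(demo$day_f)
```

Setting the levels explicitly matters because `factor()` otherwise sorts the labels alphabetically. With zero-padded labels inside a single month that happens to match date order, but once the data spans two months (say "Jan 31" and "Feb 01") the alphabetical order would put February first.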

Note: this still misrepresents the data to your audience unless you add an explicit, highly visible note explaining that data is missing.
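One way to make that note hard to miss is to put it in the plot itself. The sketch below is only an illustration on toy data (`toy`, `missing_days` and `gap_note` are made-up names): it works out which calendar days are absent and writes them into a caption via `labs(caption = ...)`, which needs a reasonably recent ggplot2. It assumes there is a single contiguous gap; with several gaps you would need to list them separately.

```r
library(ggplot2)

# Toy data with the same kind of hole: Jan 01-03 and Jan 06-07 present, Jan 04-05 missing
toy <- expand.grid(day = as.Date("2016-01-01") + c(0:2, 5:6), hour = 0:23)
toy$value <- seq_len(nrow(toy))

# Discrete day labels in chronological order, as in the answer above
toy$day_f <- factor(format(toy$day, "%b %d"),
                    levels = unique(format(sort(toy$day), "%b %d")))

# Find the calendar days between the first and last observation that have no rows
all_days <- seq(min(toy$day), max(toy$day), by = "day")
missing_days <- all_days[!all_days %in% toy$day]

# Build the caption from the data (assumes one contiguous gap)
gap_note <- sprintf("No data for %s to %s; these days are omitted from the x-axis.",
                    format(min(missing_days), "%b %d"),
                    format(max(missing_days), "%b %d"))

p <- ggplot(toy, aes(day_f, hour, fill = value)) +
  geom_tile() +
  labs(x = NULL, y = NULL, caption = gap_note)
```

Printing `p` draws the continuous heat map with the gap spelled out underneath. Deriving the caption from the data rather than typing it by hand means it stays correct if the date range changes.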