BarneyC - 3 months ago
R Question

ggplot will not plot missing category

I'm struggling with ggplot (I always do). There are a number of very similar questions about forcing ggplot to include zero-value categories in legends (here and here, for example), but I think my requirement is slightly different, and all my experimenting with scale_x_discrete and scale_fill_manual has not helped.

Requirement: As you can see, the right-hand plot has no data in the TM=5 category, so that category is missing from the axis. I need the right-hand plot to show category 5 on the x-axis, but obviously with no points or box.

enter image description here

Current Plot Script:

# packages
library(ggplot2)

# data
plotData <- data.frame(
  TM = c(3,2,3,3,3,4,3,2,3,3,4,3,4,3,2,3,2,2,3,2,3,3,3,2,3,1,3,2,2,4,4,3,2,3,4,2,3),
  Score = c(5,4,4,4,3,5,5,5,5,5,5,3,5,5,4,4,5,4,5,4,5,4,5,4,4,4,4,4,5,4,4,5,3,5,5,5,5)
)

# x-axis title: T with subscript M
xTitle <- bquote("T"["M"])

# plot (TM is still numeric here, so x is a continuous axis)
p <- ggplot(plotData, aes(x = TM, y = Score, color = TM)) +
  geom_point() +
  geom_jitter(alpha = 0.8, position = position_jitter(width = 0.2, height = 0.2)) +
  geom_boxplot(width = 0.75, alpha = 0.5, aes(group = TM)) +
  theme_bw() +
  labs(x = xTitle, y = NULL) +
  theme(legend.position = "none",
        axis.text = element_text(size = 10, face = "bold"),
        axis.title = element_text(size = 16))


Attempted Solutions:


  1. Adding drop=FALSE to the scales (suggested by @Jarretinha here) totally borks the margins and x-axis labels:

    > p + scale_x_discrete(drop=FALSE) + scale_fill_manual(drop=FALSE)



enter image description here


  2. Following the logic from here and manually setting the labels in scale_fill_manual does nothing and results in the same right-hand plot as in the example above.

    > p + scale_fill_manual(values = c("red", "blue", "green", "purple", "pink"),
    labels = c("Cat1", "Cat2", "Cat3", "Cat4", "Cat5"),
    drop=FALSE)

  3. Playing with this logic and trying something with scale_x_discrete changes the category names on the x-axis, but the fifth is still missing AND the margins are borked again (as in attempt 1). So it seems that scale_x_discrete is important but NOT the whole answer.

    > p + scale_x_discrete(limits = c("Cat1", "Cat2", "Cat3", "Cat4", "Cat5"), drop=FALSE)



enter image description here

Plea: Pointers, help or indeed AN ANSWER please. Thanks

Answer

Here's a workaround you could use:

library(ggplot2)

# generate dummy data
set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = TRUE),
                  y = rnorm(20), stringsAsFactors = FALSE)
# define factor, including the missing category as a level
df1$lets <- factor(df1$lets, levels = letters[1:5])
# make plot
ggplot(df1, aes(x = lets, y = y)) +
    geom_boxplot(aes(fill = lets)) +
    # invisible point (pch = NA) at x = 'e' so the category appears on the axis
    geom_point(data = NULL, aes(x = 'e', y = 0), pch = NA) +
    scale_fill_brewer(drop = FALSE, palette = 'Set1') +
    theme_bw()

enter image description here

Basically, we plot an "empty" point (i.e. pch = NA) so that the category shows up on the x-axis but has no visible geom associated with it. We also define our discrete variable, lets, as a factor with five levels even though only four are present in the data.frame. The missing category is the letter e.
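
If it helps to see what "a factor with five levels when only four are present" means, here is a minimal check on the same dummy data. It is just a sketch using base R; nothing in it is specific to ggplot.

```r
# same dummy data as in the answer
set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = TRUE),
                  y = rnorm(20), stringsAsFactors = FALSE)
df1$lets <- factor(df1$lets, levels = letters[1:5])

levels(df1$lets)              # all five levels are declared, including "e"
table(df1$lets)               # the count for "e" is zero
levels(droplevels(df1$lets))  # dropping unused levels removes "e" again
```

The unused level is what a scale can later be told to keep; without it, ggplot has no way of knowing that a fifth category exists.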

NB: You'll have to adjust the positioning of this "empty" point so that it doesn't skew your y axis.
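
One way to handle that, as a rough sketch: put the invisible point somewhere inside the observed range of y, for example at the median. The one-row data frame dummy is a name I made up for illustration, and the median is only one reasonable choice.

```r
library(ggplot2)

set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = TRUE),
                  y = rnorm(20), stringsAsFactors = FALSE)
df1$lets <- factor(df1$lets, levels = letters[1:5])

# hypothetical helper: a single invisible point for the empty category,
# placed at the median of y so it cannot stretch the y-axis
dummy <- data.frame(lets = factor("e", levels = letters[1:5]),
                    y = median(df1$y))

p <- ggplot(df1, aes(x = lets, y = y)) +
    geom_boxplot(aes(fill = lets)) +
    geom_point(data = dummy, shape = NA) +
    scale_fill_brewer(drop = FALSE, palette = 'Set1') +
    theme_bw()
```

Expect a warning about a removed row with missing values when the plot is drawn; that is the invisible point being dropped at render time.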

Otherwise, you could use the result from this answer (scale_x_discrete(drop = FALSE) on a factor that includes the missing level) to avoid having to plot an "empty" point.

library(ggplot2)

# generate dummy data
set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = TRUE),
                  y = rnorm(20), stringsAsFactors = FALSE)
# define factor, including the missing category as a level
df1$lets <- factor(df1$lets, levels = letters[1:5])
# make plot; drop = FALSE keeps the unused level 'e' on the x-axis and in the fill scale
ggplot(df1, aes(x = lets, y = y)) +
    geom_boxplot(aes(fill = lets)) +
    scale_x_discrete(drop = FALSE) +
    scale_fill_brewer(drop = FALSE, palette = 'Set1') +
    theme_bw()

enter image description here
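
Applied to the data in the question, the same idea might look like the sketch below. I'm assuming TM can only take the values 1 to 5, and I've dropped the separate geom_point() layer because geom_jitter() already draws the points. Treat it as a starting point rather than a drop-in fix.

```r
library(ggplot2)

plotData <- data.frame(
  TM = c(3,2,3,3,3,4,3,2,3,3,4,3,4,3,2,3,2,2,3,2,3,3,3,2,3,1,3,2,2,4,4,3,2,3,4,2,3),
  Score = c(5,4,4,4,3,5,5,5,5,5,5,3,5,5,4,4,5,4,5,4,5,4,5,4,4,4,4,4,5,4,4,5,3,5,5,5,5)
)

# assumption: the full set of categories is 1 to 5
plotData$TM <- factor(plotData$TM, levels = 1:5)

p <- ggplot(plotData, aes(x = TM, y = Score, color = TM)) +
  geom_boxplot(width = 0.75, alpha = 0.5) +
  geom_jitter(alpha = 0.8, width = 0.2, height = 0.2) +
  scale_x_discrete(drop = FALSE) +      # keep the empty category 5 on the axis
  scale_color_discrete(drop = FALSE) +  # keep colours stable when a level is empty
  theme_bw() +
  labs(x = bquote("T"["M"]), y = NULL) +
  theme(legend.position = "none",
        axis.text = element_text(size = 10, face = "bold"),
        axis.title = element_text(size = 16))
```

Because TM is now a factor, the x-axis is genuinely discrete, which is probably why scale_x_discrete misbehaved on the numeric column in the original attempts.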
