haggis haggis - 2 months ago 19
R Question

x-axis in reverse order

I've created three histograms with the approach below. The fourth one suddenly shows the x-axis in reverse order, even though nothing in the snippet (at least nothing I know of) should affect the order.

The x-axis is expected to start with the lowest value on the left.

Here's the R code:

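library(dplyr)
library(ggplot2)

# Bin the counts into log-spaced classes, then count the rows per class
df <- mydata %>%
  mutate(length.class = cut(count,
                            breaks = c(1, 10, 100, 1000, 10000, 100000, 1000000, 10000000),
                            include.lowest = TRUE, dig.lab = 8)) %>%
  group_by(length.class) %>%
  summarise(count = n())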
dftext <- as.data.frame(table(df$length.class))
colnames(dftext)[1] <- "x"
dftext$lab[dftext$x == "[1,10]"] <- 1063393
dftext$lab[dftext$x == "(10,100]"] <- 65986
dftext$lab[dftext$x == "(100,1000]"] <- 3206
dftext$lab[dftext$x == "(1000,10000]"] <- 386
dftext$lab[dftext$x == "(10000,100000]"] <- 32
dftext$lab[dftext$x == "(100000,1000000]"] <- 0
dftext$lab[dftext$x == "(1000000,10000000]"] <- 1

df$count[df$length.class == "(1000000,10000000]"] <- 1.1 # To make its bar visible

# Returns a label function that prints axis values without scientific notation
fmt <- function(decimals = 0) {
  function(x) format(x, scientific = FALSE)
}

ggplot(df, aes(length.class, count)) +
  geom_bar(stat = "identity", width = 0.9, fill = "#999966") +
  scale_y_log10(labels = fmt()) +
  labs(x = "", y = "") +
  geom_text(data = dftext, aes(x = x, y = 2, label = lab), size = 6) +
  theme(text = element_text(size = 20)) +
  theme(axis.line = element_line(colour = "black"),
        panel.grid.major = element_line(color = "grey"),
        panel.grid.minor = element_line(color = "grey"),
        panel.background = element_blank(),
        axis.title.x = element_text(margin = margin(t = 15, unit = "pt")),
        axis.text.x = element_text(angle = 45, hjust = 1))


What is causing the reverse order and how can I get rid of it?

Edit:
You guys are fast! :) The answer from @mark-peterson looks pretty solid, but I couldn't get it to work.
Here's the requested data:
mydata.csv

Answer

Your two datasets have the same levels for the factors length.class and x, but your first dataset, df, has no row for (100000,1000000]. This is because summarise has no drop = FALSE option to keep every level of a factor in the result regardless of whether it has any observations.
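
To see the mismatch for yourself, here is a minimal sketch. The toy data frame and its breaks are made up for illustration (they are not your mydata), and the last step assumes dplyr 0.8.0 or later, where group_by accepts a .drop argument.

```r
library(dplyr)

# Hypothetical toy data standing in for mydata: nothing falls in (100,1000]
toy <- data.frame(count = c(1, 5, 20, 50, 5000))

binned <- toy %>%
  mutate(length.class = cut(count,
                            breaks = c(1, 10, 100, 1000, 10000),
                            include.lowest = TRUE, dig.lab = 8))

summary_df <- binned %>%
  group_by(length.class) %>%
  summarise(count = n())

nlevels(summary_df$length.class)  # number of levels defined on the factor
nrow(summary_df)                  # number of rows that made it into the summary
table(binned$length.class)        # table() keeps the empty class with a count of 0

# Assumption: dplyr >= 0.8.0, where group_by() can keep empty groups
summary_all <- binned %>%
  group_by(length.class, .drop = FALSE) %>%
  summarise(count = n())
nrow(summary_all)
```

The factor still knows about every class, but the summarised data has one row fewer than table() reports, which is the same situation as df versus dftext.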

Because you built the plot from the dataset that is missing a row for one level, it looks like ggplot2 gets confused when you add a layer whose data contains more factor levels, and the axis ends up ordered oddly.

A fix is to stop the x-axis from dropping unused factor levels by setting drop = FALSE in scale_x_discrete. That way both datasets use the same factor levels on the x-axis and nothing gets mis-ordered.

+ scale_x_discrete(drop = FALSE)
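
For context, here is a cut-down sketch of where that line goes. The two data frames are invented stand-ins for df and dftext (same factor levels, with one row missing from the bar data), so the names bars, bar_labels and lv are assumptions, not part of your code.

```r
library(ggplot2)

# Hypothetical stand-ins for df and dftext: identical factor levels,
# but the bar data has no row for "(100,1000]"
lv <- c("[1,10]", "(10,100]", "(100,1000]", "(1000,10000]")
bars <- data.frame(
  length.class = factor(c("[1,10]", "(10,100]", "(1000,10000]"), levels = lv),
  count = c(120, 30, 2)
)
bar_labels <- data.frame(
  x = factor(lv, levels = lv),
  lab = c(120, 30, 0, 2)
)

p <- ggplot(bars, aes(length.class, count)) +
  geom_bar(stat = "identity", width = 0.9, fill = "#999966") +
  geom_text(data = bar_labels, aes(x = x, y = 2, label = lab)) +
  scale_x_discrete(drop = FALSE)  # keep every factor level on the x-axis

# The trained x scale should list the classes in factor-level order
layer_scales(p)$x$get_limits()
# print(p) to draw the plot
```

In your own plot the scale_x_discrete line can go anywhere in the chain of + calls, for example right after scale_y_log10.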