ccamara - 1 month ago
R Question

Sorting a ggplot axis' factor according to another factor's levels

I have successfully created a boxplot that displays the score of several neighborhoods of a city and have coloured them according to the district they belong to. The result looks like this:

library(ggplot2)

df = read.csv("http://pastebin.com/raw/rpPLwSXn")

ggplot(df, aes(x = neighbourhood, y = score, fill = district)) +
  geom_boxplot() +
  ggtitle("Neighbourhoods' score") +
  labs(x = "Neighbourhoods", y = "Score", fill = "District") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))



It looks quite good, except that instead of sorting the neighbourhoods on the x axis alphabetically (the neighbourhood column of the data frame), I would like them sorted according to the district they belong to (the district variable of the data frame).

I've read that I could use factor() to relevel the values of the neighbourhood column, but I haven't succeeded because the vector lengths differ (there are fewer districts than neighbourhoods).

Answer

I like the facet idea in Ulrik's answer - that will probably be the nicest visualization. To order the factor levels of the neighbourhood column the easiest way is probably like this:

# order the data frame as desired
df = df[order(df$district, df$neighbourhood), ]
# set the neighbourhood levels in the order they occur in the data frame
df$neighbourhood = factor(df$neighbourhood, levels = unique(df$neighbourhood))

Once the levels are in the order you want, the x axis will follow, because ggplot2 draws a discrete axis in factor-level order.
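
To check the reordering without downloading the CSV, here is a minimal sketch on made-up data. The toy data frame, its neighbourhood names and its scores are all assumptions for illustration, not the asker's data. It applies the same two steps and prints the resulting levels; the plot is only built if ggplot2 happens to be installed.

```r
# Hypothetical toy data: four neighbourhoods spread over two districts
toy <- data.frame(
  neighbourhood = rep(c("Alpha", "Bravo", "Charlie", "Delta"), times = 2),
  district      = rep(c("North", "South", "North", "South"), times = 2),
  score         = c(3, 5, 4, 6, 2, 7, 5, 4),
  stringsAsFactors = FALSE
)

# Step 1: sort the rows by district, then by neighbourhood within each district
toy <- toy[order(toy$district, toy$neighbourhood), ]

# Step 2: take the levels in order of first appearance in the sorted rows
toy$neighbourhood <- factor(toy$neighbourhood, levels = unique(toy$neighbourhood))

# Neighbourhoods are now grouped by district rather than sorted alphabetically
print(levels(toy$neighbourhood))
# "Alpha" "Charlie" "Bravo" "Delta"

# Optional: build the same kind of boxplot as in the question
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = neighbourhood, y = score, fill = district)) +
    geom_boxplot()
}
```

The length mismatch mentioned in the question does not arise here, because the levels come from the neighbourhood column itself (after sorting), not from the district column.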
