arun - 4 months ago
R Question

R ggplot bar plot with month on X-axis

I want a bar plot with months on the X-axis, counts on the Y-axis and a binary column (status) as fill. Here is the code with the errors, warnings and the plot I am getting. How do I get the correct plot?

library(ggplot2)
library(scales)  # provides date_format(), used in scale_x_date() below

# to read in date correctly
setClass("myDate")
setAs("character",
      "myDate",
      function(from) as.Date(from, format = "%Y-%m-%d"))

csvData <- "id,dt,status
1,2015-12-03,1
2,2015-12-05,1
3,2015-12-05,0
4,2015-11-24,1
5,2015-10-17,0
6,2015-12-18,0
7,2016-06-30,0
8,2016-05-21,1
9,2016-03-31,0
10,2015-12-31,0"

tmp <- read.csv(textConnection(csvData),
                colClasses = c("integer", "myDate", "factor"))
tmp$mon <- as.Date(cut(tmp$dt, breaks = "month"))

# The plot must have this time frame on the X-axis
dtLimits <- as.Date(c("2015-01-01", "2016-08-01"))

# This does not work
# since x is a factor here and scale uses date
ggplot(data = tmp, aes(x = as.factor(mon))) +
    geom_bar(aes(fill = status)) +
    scale_x_date(date_breaks = "1 month",
                 labels = date_format("%y/%m"),
                 limits = dtLimits)
# Error: Invalid input: date_trans works with objects of class Date only

# wrong plot with warning message
ggplot(data = tmp, aes(x = mon)) +
    geom_bar(aes(fill = status)) +
    scale_x_date(date_breaks = "1 month",
                 labels = date_format("%y/%m"),
                 limits = dtLimits) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
# Warning message:
# position_stack requires non-overlapping x intervals


The plot produced by the last statement is wrong (image not reproduced here).

The following code produces the correct plot, but does not have the required limits and is missing the months where the counts are 0.

ggplot(data = tmp,
       aes(x = as.factor(format(mon, format = "%y/%m")))) +
    geom_bar(aes(fill = status)) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))



Answer

Because you are working with dates, the x-axis is on a scale of days. If you don't set the width argument, the bar width defaults to 90% of the resolution of the data, which is derived from the spacing of the x values and not from calendar months, so the bars do not come out one month wide. Set width to 30 to get bars of about a month.

ggplot(data = tmp, aes(x = mon)) + 
    geom_bar(aes(fill = status), width = 30) + 
    scale_x_date(date_breaks = "1 month", 
                 labels = date_format("%y/%m"),
                 limits = dtLimits)  +
    theme(axis.text.x = element_text(angle = 90, vjust = .5))
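
A width of 30 is only an approximation, because calendar months are 28 to 31 days long. As a quick check in base R (monthStarts and daysPerMonth are names introduced here just for illustration), the spacing between month starts over the required time frame can be inspected like this:

```r
# First day of every month in the required time frame
monthStarts <- seq(as.Date("2015-01-01"), as.Date("2016-08-01"), by = "month")

# Number of days from each month start to the next one
daysPerMonth <- as.numeric(diff(monthStarts))

# Shortest and longest gap between consecutive month starts
range(daysPerMonth)
```

Since each bar is centred on its x value (the first of the month here), a 30-day bar will not match every month exactly, so neighbouring bars may leave a small gap or overlap slightly.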

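
If the bars should line up exactly with calendar months and months with a count of 0 should still appear on the axis, another option is to keep the discrete labels from the question's last attempt but give the factor one level for every month in the time frame. The following is a sketch under that assumption, not part of the original answer. The names allMonths, monLabel, counts and p are made up for the illustration, and the data frame is rebuilt inline so the block runs on its own.

```r
library(ggplot2)

# Same dates and status values as in the question, rebuilt inline
tmp <- data.frame(
    dt = as.Date(c("2015-12-03", "2015-12-05", "2015-12-05", "2015-11-24",
                   "2015-10-17", "2015-12-18", "2016-06-30", "2016-05-21",
                   "2016-03-31", "2015-12-31")),
    status = factor(c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0))
)

# One label per month in the required time frame (hypothetical helper vector)
allMonths <- seq(as.Date("2015-01-01"), as.Date("2016-08-01"), by = "month")

# Month label as a factor whose levels cover every month, including empty ones
tmp$monLabel <- factor(format(tmp$dt, "%y/%m"),
                       levels = format(allMonths, "%y/%m"))

# Counts per month and status, useful for checking the plot against the data
counts <- table(tmp$monLabel, tmp$status)

# drop = FALSE keeps the unused (empty) month levels on the x-axis
p <- ggplot(data = tmp, aes(x = monLabel)) +
    geom_bar(aes(fill = status)) +
    scale_x_discrete(drop = FALSE) +
    theme(axis.text.x = element_text(angle = 90, vjust = .5))
```

Calling print(p) draws the plot. Because the axis is discrete in this variant, every month gets a slot of equal width and no width argument is needed. The trade-off is that the axis is no longer a true date scale.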