user3819981 - 2 months ago
R Question

Months displayed incorrectly when using ggplot2

Hi, I'm having a problem where March appears twice in my graph but not in my data.

My data frame is called try1 and looks like this:

Month Year tcol
2016-01-01 00:00:00 06 1461.0
2016-02-01 00:00:00 06 259.5
2016-03-01 00:00:00 06 191.2
2016-04-01 01:00:00 06 151.5
2016-05-01 01:00:00 06 119.6
2016-06-01 01:00:00 06 1372.5
2016-07-01 01:00:00 06 954.0
2016-08-01 01:00:00 06 1784.0
2016-09-01 01:00:00 06 1369.0
2016-10-01 01:00:00 06 6077.0
2016-11-01 00:00:00 06 1638.0
2016-12-01 00:00:00 06 3308.0


And my code looks like this:

library(ggplot2)
library(scales)  # provides date_breaks() and date_format()

ggplot(try1, aes(Month, tcol)) +
  geom_point(aes(colour = Year), size = 2) +
  geom_line(aes(colour = Year), size = 0.73) +
  theme_bw() +
  guides(col = guide_legend(ncol = 2)) +
  scale_x_datetime(
    breaks = date_breaks("1 months"),
    labels = date_format("%B")) +
  xlab("") +  # no x axis label
  ylab("Total Coliforms")


The problem is that when I plot my graph, March appears twice and October appears to be left out.

The resulting graph

Thanks for your help.

Answer

I suspect it is a timezone issue. For example, with this data:

try1 <- structure(list(Month = structure(list(sec = c(0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 0L, 0L), mday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), mon = 0:11, year = c(116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L), wday = c(5L, 1L, 2L, 5L, 
0L, 3L, 5L, 1L, 4L, 6L, 2L, 4L), yday = c(0L, 31L, 60L, 91L, 
121L, 152L, 182L, 213L, 244L, 274L, 305L, 335L), isdst = c(0L, 
0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L), zone = c("GMT", 
"GMT", "GMT", "BST", "BST", "BST", "BST", "BST", "BST", "BST", 
"GMT", "GMT"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", 
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", 
"zone", "gmtoff"), class = c("POSIXlt", "POSIXt"), tzone = c("Europe/London", 
"GMT", "BST")), Year = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L), tcol = c(1461, 259.5, 191.2, 151.5, 119.6, 1372.5, 
954, 1784, 1369, 6077, 1638, 3308)), .Names = c("Month", "Year", 
"tcol"), row.names = c(NA, -12L), class = "data.frame")

I can reproduce your chart. Try changing the timezone:

attr(try1$Month, "tzone") <- "UTC"

and replot.
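
As a rough illustration of what resetting "tzone" does, here is a minimal base R sketch. It assumes the column is stored as POSIXct (the dput output above shows POSIXlt, which may behave differently); the variable names are made up for the example. The attribute only changes how the instant is displayed, not the instant itself.

```r
# Assumption: a POSIXct value standing in for one entry of try1$Month.
month <- as.POSIXct("2016-04-01 01:00:00", tz = "Europe/London")

# Seconds since the epoch before the change.
before <- as.numeric(month)

# Same assignment as in the answer: only the display timezone changes.
attr(month, "tzone") <- "UTC"
after <- as.numeric(month)

# 01:00 BST is 00:00 UTC, so the value now prints as midnight.
shown <- format(month, "%Y-%m-%d %H:%M")

identical(before, after)  # the underlying instant is unchanged
shown
```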


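Update: I was wondering why changing the timezone to "UTC" works. It turns out that date_format() takes a tz argument that defaults to "UTC" (see ?date_format), so the axis labels are formatted in UTC whatever the timezone of the data is. A break at local midnight on 1 April (BST) is 23:00 on 31 March in UTC, so it is labelled "March"; the same shift moves every label from April to October back by one month, which is why October seems to disappear. This means that instead of changing the timezone of Month to "UTC", you can also fix your problem by setting the tz argument in date_format() to the original timezone of Month, which you can inspect via attr(try1$Month, "tzone").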
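
The label shift can be checked without plotting. The sketch below is base R only and assumes the axis breaks fall at local midnight on the first of each month in Europe/London; the names breaks, labels_utc and labels_local are invented for the example. It uses the numeric month (%m) rather than %B so the result does not depend on the locale.

```r
# Assumption: monthly breaks at local midnight in Europe/London.
breaks <- seq(as.POSIXct("2016-01-01", tz = "Europe/London"),
              by = "month", length.out = 12)

# Formatting in UTC mimics a labeller whose tz is "UTC".
labels_utc <- format(breaks, "%m", tz = "UTC")

# Formatting in the data's own timezone gives one label per month.
labels_local <- format(breaks, "%m", tz = "Europe/London")

labels_utc    # "03" appears twice and "10" is missing
labels_local  # "01" through "12"
```

In the plot itself, the corresponding change would be labels = date_format("%B", tz = "Europe/London") inside scale_x_datetime(), assuming "Europe/London" is what attr(try1$Month, "tzone") reports.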
