Martin Martin - 2 months ago 5
R Question

R - Strange factor behavior in ggplot

I am trying to produce a pie chart from a small data frame. At first everything worked well:

library(ggplot2)
library(data.table)

c1 <- c(2,3)
c2 <- c("second","third")
c2 <- factor(c2, levels = c("first","second","third","fourth"))
c3 <- c(0.7,0.3)
cs <- data.frame(c1,c2,c3)
ct <- data.table(cs)
colx <- c("blue","red")
midpoint <- cumsum(ct$c3) - ct$c3/2

keycols = c("c1")
setkeyv(ct,keycols)
ct



   c1     c2  c3
1:  2 second 0.7
2:  3  third 0.3

vysg <- ggplot(ct, aes(x=1,y=c3,fill=c2)) +
  geom_bar(stat="identity",width=2) +
  coord_polar(theta='y') +
  theme(axis.ticks=element_blank(), axis.title=element_blank(),
        axis.text.y = element_blank(), panel.grid = element_blank(),
        axis.text.x = element_text(color=colx,size=15,hjust=0)) +
  scale_y_continuous(breaks = midpoint, labels = ct$c2) +
  scale_fill_manual(values=colx) +
  scale_x_continuous(limits=c(-1,2.5))
vysg



The problems start when I add new rows to the data frame (a data.table) holding the zero results for the levels first and fourth:

ctlab <- data.table(levels(c2))
nlabs <- ctlab[!V1%in%ct$c2]
nlabs[, V1 := factor(V1,levels=c("first","second","third","fourth"))]
nct <- data.frame(c1=c(1,4),c2=nlabs[,V1],c3=0)
ct <- rbind(ct,nct)
colx <- c("green","blue","red","brown")
ct$c2 <- factor(ct$c2,levels=c("first","second","third","fourth"))
ct$c4 <- as.character(ct$c2)
keycols = c("c1")
setkeyv(ct, keycols)
ct

   c1     c2  c3     c4
1:  1  first 0.0  first
2:  2 second 0.7 second
3:  3  third 0.3  third
4:  4 fourth 0.0 fourth


The data.table looks fine, but the chart does not:

midpoint <- cumsum(ct$c3) - ct$c3/2
vysg <- ggplot(ct, aes(x=1,y=c3,fill=c2)) +
  geom_bar(stat="identity",width=2) +
  coord_polar(theta='y') +
  theme(axis.ticks=element_blank(), axis.title = element_blank(),
        axis.text.y = element_blank(), panel.grid = element_blank(),
        axis.text.x=element_text(color=colx,size=15,hjust=0)) +
  scale_y_continuous(breaks = midpoint, labels = ct$c2) +
  scale_fill_manual(values = colx) +
  scale_x_continuous(limits = c(-1,2.5))
vysg

Warning message:
In `[[<-.factor`(`*tmp*`, n, value = "first/fourth") :
invalid factor level, NA generated



After replacing c2 with c4 (a character column) in labels, the warning no longer appears, but the chart is still wrong:

midpoint <- cumsum(ct$c3) - ct$c3/2
vysg <- ggplot(ct, aes(x=1,y=c3,fill=c2)) +
  geom_bar(stat="identity",width=2) +
  coord_polar(theta = 'y') +
  theme(axis.ticks=element_blank(), axis.title=element_blank(),
        axis.text.y = element_blank(), panel.grid = element_blank(),
        axis.text.x = element_text(color=colx,size=15,hjust=0)) +
  scale_y_continuous(breaks = midpoint, labels = ct$c4) +
  scale_fill_manual(values=colx) +
  scale_x_continuous(limits=c(-1,2.5))
vysg



I suspect the problem lies in the factor (c2), but I cannot find a way to fix it. I explicitly set the levels in both the old data.frame and the new one.

Answer

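Your problem is that 0° and 360° are the same angle, and ggplot2 knows this: the zero-sized slices first and fourth have midpoints 0 and 1, so their axis breaks land on the same spot. You'd see that if you plotted only this: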

ggplot(ct, aes(x=1, y=c3, fill=c2)) + 
  geom_bar(stat="identity",width=2) + 
  coord_polar(theta='y')
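
To make the coincidence concrete, here is a minimal base-R sketch that recomputes the question's midpoints without drawing anything. The name angle_deg is introduced only for this illustration, and the conversion to degrees assumes the stacked shares span exactly 0 to 1 around the circle.

```r
# Slice shares in key order: first, second, third, fourth
c3 <- c(0, 0.7, 0.3, 0)

# Same midpoint formula as in the question
midpoint <- cumsum(c3) - c3 / 2
print(midpoint)

# Assumption for illustration: the y range 0..1 is wrapped onto a full circle,
# so a midpoint m sits at 360 * m degrees. angle_deg is a hypothetical name.
angle_deg <- 360 * midpoint
print(angle_deg)

# The first and last angles differ by a full turn, i.e. they point the same way
print(angle_deg[4] - angle_deg[1])
```

The breaks for first and fourth therefore sit on top of each other, which would explain why the labels collide.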

Thus, you need to do some preparation for the labels:

midpoint[midpoint == 1] <- 0
labs <- aggregate(ct$c2, list(midpoint), FUN = function(x) paste(x, collapse = "/"))
vysg<-ggplot(ct, aes(x=1, y=c3, fill=c2)) + 
  geom_bar(stat="identity",width=2) + 
  coord_polar(theta='y')+
  theme(axis.ticks=element_blank(), axis.title=element_blank(), 
        axis.text.y=element_blank(),  panel.grid  = element_blank(), 
        axis.text.x=element_text(color=colx,size=15,hjust=0))+ 
  scale_y_continuous(breaks=labs$Group.1, labels=labs$x)+
  scale_fill_manual(values=colx)+
  scale_x_continuous(limits=c(-1,2.5))
vysg

resulting plot
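
As a rough check of what the label preparation produces, the sketch below rebuilds the two relevant columns by hand and runs the same aggregate step. Two details are my own assumptions rather than part of the answer: the comparison with 1 uses a small tolerance instead of ==, and the factor is converted with as.character() before pasting so that no merged string is ever assigned into a factor.

```r
# Rebuild the relevant columns of ct in key order
c2 <- factor(c("first", "second", "third", "fourth"),
             levels = c("first", "second", "third", "fourth"))
c3 <- c(0, 0.7, 0.3, 0)

midpoint <- cumsum(c3) - c3 / 2

# Fold the break at the upper limit onto the lower limit.
# Assumption: a tolerance is used here instead of an exact == comparison.
midpoint[abs(midpoint - 1) < 1e-8] <- 0

# One merged label per distinct break position
labs <- aggregate(as.character(c2), list(midpoint),
                  FUN = function(x) paste(x, collapse = "/"))
print(labs)
```

labs then has one row per distinct break, with Group.1 holding the break position and x the merged label, which is what scale_y_continuous receives above.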

Different colors in the scale label are not possible (without editing at the grid graphics level). Maybe scale_fill_manual(values=colx[as.integer(factor(midpoint))]) would suit your needs.
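
To see what that expression evaluates to, here is a small sketch with the midpoints written out by hand (assuming the break at 1 has already been set to 0, as above). The name idx is used only for this illustration.

```r
colx <- c("green", "blue", "red", "brown")

# Midpoints for first, second, third, fourth after folding 1 onto 0
midpoint <- c(0, 0.35, 0.85, 0)

# factor() numbers the distinct break positions in increasing order
idx <- as.integer(factor(midpoint))
print(idx)

# Rows that share a break position share a color
print(colx[idx])
```

With this mapping, first and fourth both receive the first color and the fourth color is left unused, so it is worth checking whether that is the outcome you want.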

Finally, the obligatory advice: There is usually a much better option than a pie chart to illustrate such data.
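
One possible alternative, offered only as a sketch and not as the single right choice, is a plain bar chart. It keeps one axis position per level, so zero-valued categories stay visible as empty slots and no labels can collide. The data frame name shares is made up for this example; the colors are taken from the question.

```r
library(ggplot2)

# Hypothetical stand-in for ct with all four levels present
shares <- data.frame(
  c2 = factor(c("first", "second", "third", "fourth"),
              levels = c("first", "second", "third", "fourth")),
  c3 = c(0, 0.7, 0.3, 0)
)

p <- ggplot(shares, aes(x = c2, y = c3, fill = c2)) +
  geom_bar(stat = "identity") +
  # drop = FALSE keeps a slot for every level, even if a row were missing
  scale_x_discrete(drop = FALSE) +
  scale_fill_manual(values = c("green", "blue", "red", "brown")) +
  theme(axis.title = element_blank(), legend.position = "none")
p
```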
