Farbod Farbod - 11 months ago 53
R Question

How to convert a ggplot2 bar plot into a violin plot

I have searched several topics and found the main idea of plotting a violin plot, but when I combine those scripts with mine (shown below), the result is not acceptable. It seems that drawing a violin plot from scratch is simpler than converting a bar plot to a violin plot.

Q: I have a bar plot script and I am trying to convert it to a violin plot (same as this). Would you please help me with this? Thank you in advance.

library(ggplot2)

dat <- data.frame(
  FunctionClass = factor(c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "Y", "Z"), levels=c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "Y", "Z")),
  legend = c("A: RNA processing and modification", "B: Chromatin structure and dynamics", "C: Energy production and conversion", "D: Cell cycle control, cell division, chromosome partitioning", "E: Amino acid transport and metabolism", "F: Nucleotide transport and metabolism", "G: Carbohydrate transport and metabolism", "H: Coenzyme transport and metabolism", "I: Lipid transport and metabolism", "J: Translation, ribosomal structure and biogenesis", "K: Transcription", "L: Replication, recombination and repair", "M: Cell wall/membrane/envelope biogenesis", "N: Cell motility", "O: Posttranslational modification, protein turnover, chaperones", "P: Inorganic ion transport and metabolism", "Q: Secondary metabolites biosynthesis, transport and catabolism", "R: General function prediction only", "S: Function unknown", "T: Signal transduction mechanisms", "U: Intracellular trafficking, secretion, and vesicular transport", "V: Defense mechanisms", "W: Extracellular structures", "Y: Nuclear structure", "Z: Cytoskeleton"),
  Frequency = c(360,391,897,1558,1168,448,1030,536,732,1292,2221,2098,789,117,1744,732,437,5162,1251,2191,603,216,2,14,739)
)

# One bar per function class, filled according to its legend entry
ggplot(data=dat, aes(x=FunctionClass, y=Frequency, fill=legend)) +
  geom_bar(stat="identity", position=position_dodge(), colour="black")

Answer Source

I agree with @KonradRudolph that flipping the plot and showing the labels on the plot, rather than in a legend, is a better way to go here. See below for an example. I don't think you need to color the bars, but I've left the coloring in the example below. If the various x-values fall into a few natural categories, it might make more sense to color by those categories. You can also label the bars with counts and percentages, and I've included an example of that as well.


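library(ggplot2)

# Create a new label column for the x axis by stripping the "A: " style prefix
dat$x = gsub("^.: ", "", dat$legend)
# Fix the level order so the bars keep the original ordering
dat$x = factor(dat$x, levels=dat$x)

ggplot(data=dat, aes(x=x, y=Frequency, fill=x)) +
  geom_bar(stat="identity", colour="black", show.legend=FALSE) +
  # Label bars: count and percentage above 700, count only above 300, otherwise nothing
  geom_text(aes(label=ifelse(Frequency>700, paste0(Frequency, " (",sprintf("%1.1f", Frequency/sum(Frequency)*100),"%)"),
                 ifelse(Frequency>300, Frequency, "")), y=0.5*Frequency),
    colour="white", size=2.5) +
  # No colour aesthetic is mapped, so this scale has no visible effect
  scale_colour_gradientn(colours=rainbow(36)) +
  coord_flip() + theme_bw() + labs(x="")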
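
If you want to try coloring by broader categories instead, here is a minimal sketch. The grouping of classes into four categories in `groups` is my assumption, used only for illustration, so adjust it to whatever categories are natural for your data. The small `dat` below is a stand-in that reuses a few of the classes and counts from the question so that the snippet runs on its own; `groups`, `lookup` and `Category` are names I made up.

```r
library(ggplot2)

# Stand-in for the question's `dat`: a subset of its classes and counts
dat <- data.frame(
  FunctionClass = c("J", "K", "L", "T", "O", "C", "E", "R", "S"),
  Frequency     = c(1292, 2221, 2098, 2191, 1744, 897, 1168, 5162, 1251)
)

# Assumed grouping of classes into broad categories (illustrative only)
groups <- list(
  "Information storage and processing" = c("A", "B", "J", "K", "L"),
  "Cellular processes and signaling"   = c("D", "M", "N", "O", "T", "U", "V", "W", "Y", "Z"),
  "Metabolism"                         = c("C", "E", "F", "G", "H", "I", "P", "Q"),
  "Poorly characterized"               = c("R", "S")
)

# Build a class -> category lookup vector, then attach the category to each row
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))
dat$Category <- factor(unname(lookup[as.character(dat$FunctionClass)]),
                       levels = names(groups))

# Fill by category; the legend now has four entries instead of one per class
p <- ggplot(dat, aes(x = FunctionClass, y = Frequency, fill = Category)) +
  geom_col(colour = "black") +
  coord_flip() + theme_bw() + labs(x = "")
print(p)
```

With only a handful of fill colors the legend becomes useful again, so here it is left switched on.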

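
As for the violin plot itself: a violin shows the distribution of many observations within each group, whereas `dat` holds a single count per class, so there is no distribution to draw from it. If you have the underlying per-observation values, something along these lines should work. The `raw` data frame and its `Value` column are hypothetical, simulated purely so that the sketch runs; they are not derived from the question's data.

```r
library(ggplot2)

set.seed(1)  # make the simulated data reproducible

# Hypothetical raw data: 200 simulated measurements for each of three classes
raw <- data.frame(
  FunctionClass = rep(c("J", "K", "L"), each = 200),
  Value         = c(rnorm(200, mean = 5),
                    rnorm(200, mean = 6, sd = 1.5),
                    rnorm(200, mean = 4))
)

# One violin per class; each violin is a density estimate of that class's values
p <- ggplot(raw, aes(x = FunctionClass, y = Value, fill = FunctionClass)) +
  geom_violin(colour = "black", show.legend = FALSE) +
  theme_bw()
print(p)
```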