Farbod Farbod - 5 months ago 58
R Question

How to draw Clusters of Orthologous Groups (COG) bar plot using ggplot2 in R?

I am new to R. I have the data below and want to draw a bar plot like this one using the ggplot2 package in R (on Linux or in RStudio).

As you can see, the X axis shows the letters A to Z (function class) and the Y axis shows the frequencies. The important point is that each bar has its own unique color.

In addition, there is a legend (a "color key") that describes each class using the same color as its bar. I am not sure whether this is a feature of the ggplot2 package or not.

I have checked some online help pages, but I do not know how to get my data into ggplot2 and assign a unique color to each class.

My data sample:

A 5 RNA processing and modification
B 2 Chromatin structure and dynamics
C 18 Energy production and conversion
D 26 Cell cycle control, cell division, chromosome partitioning
E 15 Amino acid transport and metabolism
F 5 Nucleotide transport and metabolism
G 13 Carbohydrate transport and metabolism
H 6 Coenzyme transport and metabolism
I 15 Lipid transport and metabolism
J 20 Translation, ribosomal structure and biogenesis
K 24 Transcription
L 28 Replication, recombination and repair
M 18 Cell wall/membrane/envelope biogenesis
N 1 Cell motility
O 29 Posttranslational modification, protein turnover, chaperones
P 19 Inorganic ion transport and metabolism
Q 16 Secondary metabolites biosynthesis, transport and catabolism
R 85 General function prediction only
S 20 Function unknown
T 32 Signal transduction mechanisms
U 14 Intracellular trafficking, secretion, and vesicular transport
V 6 Defense mechanisms
Z 14 Cytoskeleton


Are the numbers next to the letters A-Z the heights of the bars? If so, you're better off using a bar plot than a histogram:

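library(ggplot2)

# Simulate some data: 10 groups with Poisson-distributed counts
obs = rpois(10, 5)
group = factor(1:10)  # a factor, so each group gets its own discrete color
df = data.frame(obs, group)
# stat = 'identity' uses the obs values directly as the bar heights
ggplot(data = df, aes(x = group, y = obs, color = group, fill = group)) +
  geom_bar(stat = 'identity')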

In general, to get colored bars, make sure your data frame has a grouping variable. It should be a factor (or character) column; a numeric column gives a continuous color scale. Then, in aes, map color and/or fill to that column name. ggplot2 adds the matching legend automatically.
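
Applied to the data in the question, the same idea could look roughly like the sketch below. The column names class, count, description and legend are my own choices, not anything ggplot2 requires, and the regular expression assumes every line has the form "letter, space, number, space, description" as in the sample. The legend text is built by pasting the letter and its description together and storing the result as a factor, so the legend keeps the A to Z order.

```r
library(ggplot2)

# Data lines copied from the question: class letter, count, description
lines <- c(
  "A 5 RNA processing and modification",
  "B 2 Chromatin structure and dynamics",
  "C 18 Energy production and conversion",
  "D 26 Cell cycle control, cell division, chromosome partitioning",
  "E 15 Amino acid transport and metabolism",
  "F 5 Nucleotide transport and metabolism",
  "G 13 Carbohydrate transport and metabolism",
  "H 6 Coenzyme transport and metabolism",
  "I 15 Lipid transport and metabolism",
  "J 20 Translation, ribosomal structure and biogenesis",
  "K 24 Transcription",
  "L 28 Replication, recombination and repair",
  "M 18 Cell wall/membrane/envelope biogenesis",
  "N 1 Cell motility",
  "O 29 Posttranslational modification, protein turnover, chaperones",
  "P 19 Inorganic ion transport and metabolism",
  "Q 16 Secondary metabolites biosynthesis, transport and catabolism",
  "R 85 General function prediction only",
  "S 20 Function unknown",
  "T 32 Signal transduction mechanisms",
  "U 14 Intracellular trafficking, secretion, and vesicular transport",
  "V 6 Defense mechanisms",
  "Z 14 Cytoskeleton"
)

# Split each line into letter, count and description (assumed layout, see above)
parts <- regmatches(lines, regexec("^([A-Z]) ([0-9]+) (.*)$", lines))
cog <- data.frame(
  class = sapply(parts, `[`, 2),
  count = as.integer(sapply(parts, `[`, 3)),
  description = sapply(parts, `[`, 4),
  stringsAsFactors = FALSE
)

# Legend text such as "A: RNA processing and modification";
# a factor with explicit levels keeps the legend in the original order
label <- paste0(cog$class, ": ", cog$description)
cog$legend <- factor(label, levels = label)

# One bar per class, bar height taken from count, one fill color per class
p <- ggplot(cog, aes(x = class, y = count, fill = legend)) +
  geom_col() +
  labs(x = "Function class", y = "Frequency", fill = NULL) +
  guides(fill = guide_legend(ncol = 1))
print(p)
```

geom_col() is shorthand for geom_bar(stat = 'identity'). If your data is in a file instead, you would need to read it into a data frame with the same three columns first.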