Jerry.Shad - 1 year ago
R Question

How to generate grouped bar plot or pie chart from list of csv files?

I have a list of data.frames that need to be classified. I manipulated the list and exported the results as csv files to the default folder. To make the exported data more informative, I would like to generate a grouped bar plot or a pie chart for each data.frame object. As a beginner I am still learning the ggplot2 package, so I have little idea how to do this. Can anyone suggest how to generate an informative grouped bar plot for a list of files? Thanks in advance :)

Reproducible data:

savedDF <- list(
  bar.saved = data.frame(start=sample(100, 15), stop=sample(150, 15), score=sample(36, 15)),
  cat.saved = data.frame(start=sample(100, 20), stop=sample(100,20), score=sample(45,20)),
  foo.saved = data.frame(start=sample(125, 24), stop=sample(140, 24), score=sample(32, 24))
)

dropedDF <- list(
  bar.droped = data.frame(start=sample(60, 12), stop=sample(90,12), score=sample(35,12)),
  cat.droped = data.frame(start=sample(75, 18), stop=sample(84,18), score=sample(28,18)),
  foo.droped = data.frame(start=sample(54, 14), stop=sample(72,14), score=sample(25,14))
)

I get the list of csv files from this pipeline:

comb <- do.call("rbind", c(savedDF, dropedDF))
cn <- c("letter", "saved","seq")
DF <- cbind(read.table(text = chartr("_", ".", rownames(comb)), sep = ".", col.names = cn), comb)
DF <- transform(DF, updown = ifelse(score>= 12, "stringent", "weak"))
by(DF, DF[c("letter", "saved", "updown")],
function(x) write.csv(x[-(1:3)],
sprintf("%s_%s_%s.csv", x$letter[1], x$updown[1], x$saved[1])))

To better understand the exported data, I think a grouped bar plot and a pie chart for each data.frame object would be much more informative.

In the desired plot, I want to see the number of features in each csv file for each data.frame object. Can anyone give me ideas for this task?

How can I do this easily with the ggplot2 package? Is there a more efficient way to get this done? Thanks a lot.

Answer Source

If I understand correctly, this may work for you as a rough solution. Please comment to let me know if this is acceptable. In the future, it would be a good idea to provide a rough sketch along with your data to show what you're trying to achieve.


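library(dplyr)
library(ggplot2)

# count features per letter/saved/updown combination,
# then compute each letter's share within its saved/updown group
plot_data <- DF %>%
  group_by(letter, saved, updown) %>%
  tally() %>%
  group_by(saved, updown) %>%
  mutate(percentage = n/sum(n))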

ggplot(plot_data, aes(x = saved, y = n, fill = saved)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ letter + updown, ncol = 2)


You can always change the facet_wrap(~ letter + updown, ncol = 2) to an explicit facet_grid(letter ~ updown) if you wish.
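To illustrate the facet_grid variant, here is a minimal sketch. It does not use the question's DF: the data frame below is a made-up stand-in with the same letter, saved and updown columns, and the seed, sample size and the name p_grid are my own assumptions.

```r
library(dplyr)
library(ggplot2)

set.seed(123)  # assumption: fixed seed so the toy counts are repeatable

# Assumption: toy stand-in for DF, one row per feature
DF <- data.frame(
  letter = sample(c("bar", "cat", "foo"), 120, replace = TRUE),
  saved  = sample(c("saved", "droped"), 120, replace = TRUE),
  score  = sample(45, 120, replace = TRUE)
)
DF$updown <- ifelse(DF$score >= 12, "stringent", "weak")

# one row per letter/saved/updown combination, counts in column n
plot_data <- DF %>%
  count(letter, saved, updown)

# rows of the grid are letters, columns are stringent/weak
p_grid <- ggplot(plot_data, aes(x = saved, y = n, fill = saved)) +
  geom_bar(stat = "identity") +
  facet_grid(letter ~ updown)

print(p_grid)
```

With facet_grid the panels are laid out as a fixed letter-by-updown matrix, which can be easier to scan than the wrapped layout when every combination is present.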

Or you could view it this way:

ggplot(plot_data, aes(x = letter, y = n)) +
  geom_bar(stat = "identity") +
  facet_wrap(~updown+saved, ncol = 2)
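Since the question asks for a grouped bar plot, another option (not part of the code above) is to place the bars side by side with position = "dodge" instead of splitting saved into separate facets. A minimal sketch, again assuming a made-up stand-in for DF; the seed, sample size and the name p_dodge are hypothetical.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the toy counts are repeatable

# Assumption: toy stand-in for DF, one row per feature
DF <- data.frame(
  letter = sample(c("bar", "cat", "foo"), 90, replace = TRUE),
  saved  = sample(c("saved", "droped"), 90, replace = TRUE),
  score  = sample(45, 90, replace = TRUE)
)
DF$updown <- ifelse(DF$score >= 12, "stringent", "weak")

plot_data <- DF %>%
  count(letter, saved, updown)

# saved/droped bars sit next to each other within each letter
p_dodge <- ggplot(plot_data, aes(x = letter, y = n, fill = saved)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~ updown)

print(p_dodge)
```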


For a pie (cleaning up and labeling is up to you):

ggplot(plot_data, aes(x = 1, y = percentage, fill = letter)) +
  geom_bar(stat = "identity", width =1) +
  facet_wrap(~updown+saved, ncol = 2) +
  coord_polar(theta = "y")
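As one possible way to do the cleaning up and labeling, the sketch below adds percentage labels with geom_text and drops the axes with theme_void. It is only an illustration: the data frame is a made-up stand-in for DF, and the seed, label format and the name p_pie are my own assumptions.

```r
library(dplyr)
library(ggplot2)

set.seed(7)  # assumption: fixed seed so the toy counts are repeatable

# Assumption: toy stand-in for DF, one row per feature
DF <- data.frame(
  letter = sample(c("bar", "cat", "foo"), 120, replace = TRUE),
  saved  = sample(c("saved", "droped"), 120, replace = TRUE),
  score  = sample(45, 120, replace = TRUE)
)
DF$updown <- ifelse(DF$score >= 12, "stringent", "weak")

# share of each letter within every saved/updown group
plot_data <- DF %>%
  count(letter, saved, updown) %>%
  group_by(saved, updown) %>%
  mutate(percentage = n / sum(n)) %>%
  ungroup()

p_pie <- ggplot(plot_data, aes(x = 1, y = percentage, fill = letter)) +
  geom_bar(stat = "identity", width = 1) +
  # put each label in the middle of its slice
  geom_text(aes(label = sprintf("%.0f%%", 100 * percentage)),
            position = position_stack(vjust = 0.5)) +
  facet_wrap(~ updown + saved, ncol = 2) +
  coord_polar(theta = "y") +
  theme_void()

print(p_pie)
```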

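If only the exported csv files are at hand rather than DF, the counts can probably be rebuilt from the files themselves. The sketch below assumes the files follow the letter_updown_saved.csv naming used in the question's sprintf call and sit together in one folder. Here a temporary folder with three toy files stands in for the real one, and out_dir, file_counts and p_files are hypothetical names.

```r
library(ggplot2)

# Assumption: toy files in a temporary folder stand in for the real export folder
out_dir <- file.path(tempdir(), "csv_demo")
dir.create(out_dir, showWarnings = FALSE)
set.seed(42)
for (nm in c("bar_stringent_saved", "bar_weak_saved", "cat_stringent_droped")) {
  k <- sample(5:10, 1)  # random number of rows per toy file
  toy <- data.frame(start = sample(100, k), stop = sample(100, k), score = sample(40, k))
  write.csv(toy, file.path(out_dir, paste0(nm, ".csv")), row.names = FALSE)
}

files <- list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)

# split "letter_updown_saved" back into three columns
parts <- do.call(rbind, strsplit(tools::file_path_sans_ext(basename(files)), "_"))

file_counts <- data.frame(
  file   = basename(files),
  letter = parts[, 1],
  updown = parts[, 2],
  saved  = parts[, 3],
  # number of rows (features) in each file
  n      = vapply(files, function(f) nrow(read.csv(f)), integer(1), USE.NAMES = FALSE)
)

# one horizontal bar per csv file
p_files <- ggplot(file_counts, aes(x = file, y = n, fill = saved)) +
  geom_bar(stat = "identity") +
  coord_flip()

print(p_files)
```

For the real data, out_dir would point at the folder the pipeline wrote to and the loop that creates the toy files would be dropped.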
