Pixelements - 3 months ago
R Question

R Barplot with ggplot2 - two categories with different numeric values

I want to create a specific barplot with ggplot. Here is what I've got so far:

ggplot(only_savings, aes(DivisionName, Total_CR)) +
  geom_bar(stat = "summary", fun.y = "sum")


(Image: Total_CR on Y with 1 bar)

As you can see, there are 2 divisions: Electrification Products and Power Grids. On the y-axis we have the numeric savings, summed up (Total_CR, the total cost reduction). But I would like to split each bar into 2 parts: Repetitive_Savings and MDF_Savings. So it would look like this:

(Image: Total_CR on Y with divided Bars)

And here is the data:
(Ok, I can't post a screenshot, so I'll paste some rows)

DivisionName              Repetitive_Savings  MDF_Savings  Total_CR
Power Grids               86.571656           0            86.571656
Power Grids               183.461221          0            183.461221
Power Grids               2326.963118         0            2326.963118
Electrification Products  1249.323277         0            1249.323277
Electrification Products  6.849336            0            6.849336
Electrification Products  3.808845            0            3.808846


DivisionName is a factor; the other 3 columns are numeric values. How can I get the barplots that I've sketched in Paint?

Answer

Read in data

I changed your example a little by setting MDF_Savings to 500, since values of 0 wouldn't show up in the plot.

only_savings <- read.table(header = TRUE, text = "
DivisionName                Repetitive_Savings       MDF_Savings    Total_CR
'Power Grids'                 86.571656                500              86.571656
'Power Grids'                 183.461221               500              183.461221
'Power Grids'                 2326.963118              500              2326.963118
'Electrification Products'    1249.323277              500              1249.323277
'Electrification Products'    6.849336                 500              6.849336
'Electrification Products'    3.808845                 500              3.808846
")

Reshape

ggplot requires data to be in long form, or 'tidy' form, which means that each observation should be a separate row, with an additional column telling us whether that row belongs to Repetitive or MDF. One easy way to do that is with the tidyr package.

We'll have to filter out all the Total_CR rows afterwards, though, since they don't need to be plotted.

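library(tidyr)
# Stack every column except DivisionName into key/value pairs
pd <- gather(only_savings, 'key', 'value', -DivisionName)
# Drop the totals; only the two savings types are plotted
pd <- pd[pd$key != 'Total_CR', ]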
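As a side note, newer versions of tidyr (1.0.0 and later) offer pivot_longer() for the same reshaping step. The following is only a sketch of an alternative, not what the answer above uses: it assumes a recent tidyr, and the name pd2 is made up here for illustration. Dropping Total_CR before reshaping means no filtering is needed afterwards.

```r
library(tidyr)

# Same example data as in the answer (MDF_Savings set to 500).
only_savings <- read.table(header = TRUE, text = "
DivisionName                Repetitive_Savings  MDF_Savings  Total_CR
'Power Grids'               86.571656           500          86.571656
'Power Grids'               183.461221          500          183.461221
'Power Grids'               2326.963118         500          2326.963118
'Electrification Products'  1249.323277         500          1249.323277
'Electrification Products'  6.849336            500          6.849336
'Electrification Products'  3.808845            500          3.808846
")

# Keep only the columns we want to plot, so Total_CR never reaches the long table.
keep <- only_savings[, c("DivisionName", "Repetitive_Savings", "MDF_Savings")]

# pd2 is a hypothetical name; it has the same columns as pd (DivisionName, key, value).
pd2 <- pivot_longer(
  keep,
  cols = -DivisionName,   # reshape everything except the division label
  names_to = "key",       # former column names go here
  values_to = "value"     # the numbers go here
)
```

The row order differs from what gather() gives, but that doesn't matter for the plot, since ggplot groups by DivisionName and key anyway.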

Create the plot

Now all that is left to do is to assign a fill colour to key.

library(ggplot2)
# ggplot2 >= 3.3.0 calls this argument `fun`; older versions use `fun.y`
ggplot(pd, aes(DivisionName,  value, fill = key)) +
  geom_bar(stat = "summary", fun = "sum")

Note that we can also write it as follows, because stacking the individual observations gives the same bar heights as summing them first.

ggplot(pd, aes(DivisionName,  value, fill = key)) +
  geom_bar(stat = "identity")
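If you want to convince yourself of that, you can compute the sums explicitly and plot those instead. This is just a sketch: the long-form data is typed in by hand so that the snippet runs on its own, and the names sums and p are made up for illustration. geom_col() is the ggplot2 shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Long-form data equivalent to pd above, written out by hand.
pd <- data.frame(
  DivisionName = rep(rep(c("Power Grids", "Electrification Products"), each = 3), times = 2),
  key = rep(c("Repetitive_Savings", "MDF_Savings"), each = 6),
  value = c(86.571656, 183.461221, 2326.963118,
            1249.323277, 6.849336, 3.808845,
            rep(500, 6))
)

# One row per division and savings type, holding the summed value.
sums <- aggregate(value ~ DivisionName + key, data = pd, FUN = sum)

# Each bar segment is now a single pre-summed row, so no summary stat is needed.
p <- ggplot(sums, aes(DivisionName, value, fill = key)) +
  geom_col()
p
```

The segment heights should match the two plots above; the only difference is that each segment is drawn as one rectangle rather than as several stacked ones.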

Result

(Image: the resulting stacked barplot, one bar per division, filled by key)