Peanut Peanut - 5 days ago 4
R Question

Stacked barplot using R base: how to add values inside each stacked bar

I found a plot in a stats book, which I want to reproduce with the base package.

The plot looks like this:

[image of the target plot]

So far I have the plot, but I am having trouble adding a centred label to each part of the bar.

My code looks like this:

data <- sample( 5, 10 , replace = TRUE )

colors <- c('yellow','violet','green','pink','red')

relative.frequencies <- as.matrix( prop.table( table( data ) ) )

bc <- barplot( relative.frequencies, horiz = TRUE, axes = FALSE, col = colors )

Answer

For your given example, we can do the following (other readers can skip this part and jump to the next):

set.seed(0)  ## `set.seed` for reproducibility
dat <- sample( 5, 10 , replace = TRUE )
colors <- c('yellow','violet','green','pink','red')  ## one colour per possible value of `dat`
h <- as.matrix( prop.table( table( dat ) ) )
## compute the x-location of the centre of each bar segment
H <- apply(h, 2L, cumsum) - h / 2
## add text to barplot
bc <- barplot(h, horiz = TRUE, axes = FALSE, col = colors )
text(H, bc, labels = paste0(100 * h, "%"))


For all readers

I will now build a complete example to illustrate the idea.

Step 1: generate a toy matrix of percentages to experiment with

## a function to generate an `n * p` matrix `h`, with `h > 0` and `colSums(h) = 1`
sim <- function (n, p) {
  set.seed(0)
  ## a positive random matrix with `n` rows and `p` columns
  h <- matrix(runif(n * p), nrow = n)
  ## rescale the columns of `h` so that `colSums(h)` is 1
  h <- h / rep(colSums(h), each = n)
  ## for neatness, round `h` to 2 decimals
  h <- round(h, 2L)
  ## after rounding, `colSums(h)` may no longer be exactly 1,
  ## so reset the last row (`drop = FALSE` keeps a matrix when `n = 2`)
  h[n, ] <- 1 - colSums(h[-n, , drop = FALSE])
  ## return the toy matrix
  h
}

h <- sim(4, 3)
#     [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.13 0.07 0.40
#[3,] 0.18 0.30 0.04
#[4,] 0.26 0.32 0.14
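
A quick sanity check on the toy matrix can catch rounding slips before plotting. The sketch below retypes the printed values of h as a literal (an assumption, made only so the snippet runs on its own) and confirms that every column sums to 1 within floating-point tolerance.

```r
## `h` retyped from the printed output above so the snippet is self-contained
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)

## compare with a tolerance rather than `==`, since these are doubles
cols_ok <- isTRUE(all.equal(colSums(h), rep(1, ncol(h))))
cols_ok
```

Using all.equal instead of an exact comparison avoids false alarms caused by floating-point rounding in the column sums.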

Step 2: understand a stacked bar chart and get the "mid-height" of each stacked segment

In a stacked bar chart, the top edge of each segment is the cumulative sum down the corresponding column of h:

H <- apply(h, 2L, cumsum)
#     [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.56 0.38 0.82
#[3,] 0.74 0.68 0.86
#[4,] 1.00 1.00 1.00

Subtracting h / 2 from each top edge shifts it back to the centre of the segment:

H <- H - h / 2
#      [,1]  [,2] [,3]
#[1,] 0.215 0.155 0.21
#[2,] 0.495 0.345 0.62
#[3,] 0.650 0.530 0.84
#[4,] 0.870 0.840 0.93
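
The two operations in this step can be wrapped into one function. The following is a minimal sketch: segment_centres is a hypothetical helper name, not a base R function, and h is retyped from the output above so that the snippet runs on its own.

```r
## hypothetical helper (not part of base R): centre of every stacked segment
segment_centres <- function(h) {
  h <- as.matrix(h)
  tops <- apply(h, 2L, cumsum)           # top edge of each segment
  ## apply() returns a plain vector when `h` has a single row; restore the shape
  tops <- matrix(tops, nrow = nrow(h))
  tops - h / 2                           # shift back by half of each segment
}

h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)
centres <- segment_centres(h)

## each centre should lie between the bottom and the top of its segment
tops    <- apply(h, 2L, cumsum)
bottoms <- tops - h
inside  <- all(centres > bottoms & centres < tops)
```

The final check is only a safeguard: if a centre falls outside its own segment, the cumulative sum was probably taken along the wrong dimension.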

Step 3: produce a bar chart with the values printed inside the bars

For a vertical bar chart, H above gives the y coordinate of the centre of each segment. The x coordinates of the bars are returned (invisibly) by barplot. Because text pairs coordinates element by element and H is stored column by column, each element of x must be repeated nrow(H) times:

x <- barplot(h, col = 1 + 1:nrow(h), yaxt = "n")
text(rep(x, each = nrow(H)), H, labels = paste0(100 * h, "%"))

vertical barchart
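
To see why rep(..., each = nrow(H)) is needed, it can help to lay the coordinate pairs out as a table. The sketch below is illustrative only: the x values are typed in by hand as an assumption (in practice, use the vector returned by barplot), and h is retyped from Step 1.

```r
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)
H <- apply(h, 2L, cumsum) - h / 2

## assumed bar midpoints for illustration; normally `x <- barplot(h, ...)`
x <- c(0.7, 1.9, 3.1)

## as.vector() unrolls a matrix column by column, so the first nrow(H)
## entries all belong to bar 1, the next nrow(H) to bar 2, and so on
coords <- data.frame(bar   = rep(seq_len(ncol(h)), each = nrow(h)),
                     x     = rep(x, each = nrow(H)),
                     y     = as.vector(H),
                     label = paste0(100 * as.vector(h), "%"))
coords
```

Without each, the x values would cycle through the bars on every row, and the labels of one bar would be scattered across all of them.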

For a horizontal bar chart, the roles swap: H above gives the x coordinate of the centre of each segment, and the y coordinates of the bars are returned (invisibly) by barplot. Again, each element of y must be repeated nrow(H) times when calling text:

y <- barplot(h, col = 1 + 1:nrow(h), xaxt = "n", horiz = TRUE)
text(H, rep(y, each = nrow(H)), labels = paste0(100 * h, "%"))

Horizontal bar-chart
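
Very thin segments may not have room for a label. One possible refinement, sketched here under the assumption that anything below 5% is too thin (min_share is an illustrative name and value, not something from the question), is to blank those labels out before calling text:

```r
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)

## assumed threshold: segments below this share get an empty label
min_share <- 0.05

## ifelse() keeps the shape of `h`, so `labels` lines up with `H` element by element
labels <- ifelse(h < min_share, "", sprintf("%.0f%%", 100 * h))
labels
```

The resulting labels matrix can be passed as the labels argument of text in place of paste0(100 * h, "%"), for either orientation.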