Peanut - 9 months ago 60

R Question

I found a plot in a stats book, which I want to reproduce with the base package.

The plot looks like this:

So far I have the plot, but I am having trouble adding a centred label to each part of the bar.

My code looks like this:

```
data <- sample( 5, 10 , replace = TRUE )
colors <- c('yellow','violet','green','pink','red')
relative.frequencies <- as.matrix( prop.table( table( data ) ) )
bc <- barplot( relative.frequencies, horiz = TRUE, axes = FALSE, col = colors )
```

Answer Source

For your given example, we can do the following (**other readers can skip this part and jump to the next**):

```
set.seed(0) ## `set.seed` for reproducibility
dat <- sample( 5, 10 , replace = TRUE )
colors <- c('yellow','violet','green','pink','red') ## one colour per possible value of `dat`
h <- as.matrix( prop.table( table( dat ) ) )
## compute x-location of the centre of each bar
H <- apply(h, 2L, cumsum) - h / 2
## add text to barplot
bc <- barplot(h, horiz = TRUE, axes = FALSE, col = colors )
text(H, bc, labels = paste0(100 * h, "%"))
```

**I will now construct a more comprehensive example to illustrate the idea.**

**Step 1: generate a toy matrix of percentages to experiment with**

```
## a function to generate an `n * p` matrix `h`, with `h > 0` and `colSums(h) = 1`
sim <- function (n, p) {
  set.seed(0)
  ## a positive random matrix of `n` rows and `p` columns
  h <- matrix(runif(n * p), nrow = n)
  ## rescale columns of `h` so that `colSums(h)` is 1
  h <- h / rep(colSums(h), each = n)
  ## for neatness we round `h` to 2 decimals
  h <- round(h, 2L)
  ## but then `colSums(h)` may no longer be exactly 1;
  ## no worry, we simply reset the last row
  ## (`drop = FALSE` keeps a matrix even when `n` is 2)
  h[n, ] <- 1 - colSums(h[-n, , drop = FALSE])
  ## now return this toy matrix
  h
}
h <- sim(4, 3)
#     [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.13 0.07 0.40
#[3,] 0.18 0.30 0.04
#[4,] 0.26 0.32 0.14
```
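As a quick sanity check on the toy matrix, the sketch below hard-codes the values printed above (so it runs on its own) and confirms the two properties the generator promises. Comparing with `all.equal` instead of `==` is my own choice here, since sums of decimals are not always exact in floating point.

```r
## the toy matrix from Step 1, typed in column by column
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)

## every entry should be strictly positive
all(h > 0)

## every column should sum to 1, compared with a tolerance rather than `==`
isTRUE(all.equal(colSums(h), rep(1, ncol(h))))
```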

**Step 2: understand a stacked bar-chart and get "mid-height" of each stacked bar**

For a stacked bar-chart, the top of each stacked bar is the cumulative sum down each column of `h`:

```
H <- apply(h, 2L, cumsum)
# [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.56 0.38 0.82
#[3,] 0.74 0.68 0.86
#[4,] 1.00 1.00 1.00
```

We now shift back by `h / 2` to get the mid / centre of each stacked bar:

```
H <- H - h / 2
# [,1] [,2] [,3]
#[1,] 0.215 0.155 0.21
#[2,] 0.495 0.345 0.62
#[3,] 0.650 0.530 0.84
#[4,] 0.870 0.840 0.93
```
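The two steps above can be folded into one small helper. This is only a sketch: the name `mid_heights` is made up for illustration, and the `dim<-` line is an extra guard I added for a one-row matrix, where `apply` would otherwise return a plain vector.

```r
## hypothetical helper: centre of every stacked bar, column by column
mid_heights <- function(h) {
  ## top edge of each stacked bar: cumulative sum down each column
  top <- apply(h, 2L, cumsum)
  ## guard for a one-row `h`, where `apply` returns a vector
  dim(top) <- dim(h)
  ## step back by half of each bar's own height
  top - h / 2
}

## the toy matrix from Step 1, typed in column by column
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)
H <- mid_heights(h)
```

The same centres can be read as "bottom edge plus half the height", since the bottom edge of each bar is its top edge minus `h`.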

**Step 3: produce a bar-chart with the numbers filled in**

For a vertical bar-chart, `H` above gives the `y` coordinate of the centre of each stacked bar. The `x` coordinate is returned by `barplot` (invisibly). Be aware that we need to **replicate** each element of `x` `nrow(H)` times when using `text`:

```
x <- barplot(h, col = 1 + 1:nrow(h), yaxt = "n")
text(rep(x, each = nrow(H)), H, labels = paste0(100 * h, "%"))
```

For a horizontal bar-chart, `H` above gives the `x` coordinate of the centre of each stacked bar. The `y` coordinate is returned by `barplot` (invisibly). Be aware that we need to **replicate** each element of `y` `nrow(H)` times when using `text`:

```
y <- barplot(h, col = 1 + 1:nrow(h), xaxt = "n", horiz = TRUE)
text(H, rep(y, each = nrow(H)), labels = paste0(100 * h, "%"))
```
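If you need both orientations, the two snippets can be wrapped into one function. The sketch below is one possible way to do it, not the only one: `label_stacked_bars` is a hypothetical name, and rounding the labels to whole percentages is my own addition so that they stay short when `h` has not been rounded beforehand. The demo data are fixed by hand (not sampled) and mimic the single-bar case from the question.

```r
## hypothetical helper: draw a stacked bar-chart of `h` and label every bar
label_stacked_bars <- function(h, horiz = FALSE, ...) {
  ## centre of each stacked bar along the value axis
  mid <- apply(h, 2L, cumsum) - h / 2
  ## bar positions along the other axis, returned (invisibly) by `barplot`
  pos <- barplot(h, horiz = horiz, ...)
  ## replicate each position `nrow(h)` times, one per stacked bar
  pos <- rep(pos, each = nrow(h))
  ## whole-percent labels (assumption: one label per stacked bar is wanted)
  lab <- paste0(round(100 * h), "%")
  if (horiz) text(mid, pos, labels = lab) else text(pos, mid, labels = lab)
  invisible(list(pos = pos, mid = mid))
}

## a single horizontal bar, as in the question, with hand-picked data
dat <- c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5)
h1 <- as.matrix(prop.table(table(dat)))
res <- label_stacked_bars(h1, horiz = TRUE, axes = FALSE,
                          col = c('yellow','violet','green','pink','red'))
```

The function returns the label coordinates invisibly, so you can reuse them, for example to add a second line of text under each percentage.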