in code veritas - 3 months ago 14

R Question

I'm trying to replicate a problem from Gelman's Multilevel/Hierarchical models book.

He says:

Given N, p, se -- the vectors of population sizes, estimated proportions of Yes responses and standard errors -- we can compute the weighted average and its 95% confidence interval in R

He provides this code:

```
w.avg <- sum(N*p)/sum(N)
se.w.avg <- sqrt (sum ((N*se/sum(N))^2))
int.95 <- w.avg + c(-2,2)*se.w.avg
```

I don't understand how to construct the vectors N, se and p.

He says N should be a vector: N1, N2, N3, ... are the total numbers of adults in each country (say, France, Germany, Italy), and Ntot is the total number in the European Union. The standard error of the weighted average is

`√( ((N1/Ntot)·N1error)² + ((N2/Ntot)·N2error)² + ((N3/Ntot)·N3error)² )`

I'm struggling to construct the vectors N, se and p.

I know how to construct a simple proportion vector:

```
y <- 700
n <- 1000
estimate <- y/n
se <- sqrt (estimate*(1-estimate)/n)
```

And how to construct a vector of discrete quantities:

```
y <- rep (c(0,1,2,3,4), c(600,300,50,30,20))
n <- length(y)
estimate <- mean(y)
se <- sd(y)/sqrt(n)
```

I'm confused as to how to construct a vector of multiple discrete proportions, each with their own SEs and confidence intervals?

Answer

R operates on numeric vectors element by element, so the code you wrote to calculate a proportion and standard error for one example also works for several examples at once. For example:

```
y <- c(700, 500)
n <- c(1000, 1000)
estimate <- y/n
se <- sqrt (estimate*(1-estimate)/n)
```

`estimate` has values `0.7` and `0.5`, since `y/n` calculates using corresponding pairs of numbers, `700/1000` and `500/1000`. Elements correspond by position: the first element in `y` goes with the first in `n`, and so on.

`se` likewise has two elements, computed from corresponding elements of the `estimate` and `n` vectors.
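
Putting this together for the setting in the question, here is a minimal sketch that builds `N`, `p` and `se` for three countries and then runs the book's weighted-average code. The populations, Yes counts and sample sizes are made-up numbers chosen only for illustration, not real survey data.

```r
# Hypothetical inputs for three countries (illustrative numbers, not real data)
N <- c(France = 50e6, Germany = 68e6, Italy = 49e6)  # adult population sizes (assumed)
y <- c(550, 610, 480)     # Yes responses in each country's sample (assumed)
n <- c(1000, 1000, 1000)  # sample size in each country (assumed)

# Element-wise arithmetic gives one proportion and one standard error per country
p  <- y / n
se <- sqrt(p * (1 - p) / n)

# Weighted average and approximate 95% interval, following the code quoted in the question
w.avg    <- sum(N * p) / sum(N)
se.w.avg <- sqrt(sum((N * se / sum(N))^2))
int.95   <- w.avg + c(-2, 2) * se.w.avg

print(w.avg)
print(int.95)
```

Each of `N`, `p` and `se` has one element per country, in the same order, which is all the weighted-average code needs.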