in code veritas - 4 months ago
R Question

How to compute a weighted average in R using vectors of population sizes, proportions and errors

I'm trying to replicate a problem from Gelman's Multilevel/Hierarchical models book.

He says:

Given N, p, se -- the vectors of population sizes, estimated proportions of Yes responses and standard errors -- we can compute the weighted average and its 95% confidence interval in R

He provides this code:

w.avg <- sum(N*p)/sum(N)
se.w.avg <- sqrt(sum((N*se/sum(N))^2))
int.95 <- w.avg + c(-2,2)*se.w.avg

I don't understand how to construct the vectors N, se and p.

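He says N should be a vector N1, N2, N3, ..., where each element is the total number of adults in a country (say, France, Germany, Italy), and Ntot is the total number in the European Union. The standard error of the weighted average is

$$\sqrt{\left(\frac{N_1}{N_{tot}}\,se_1\right)^2 + \left(\frac{N_2}{N_{tot}}\,se_2\right)^2 + \left(\frac{N_3}{N_{tot}}\,se_3\right)^2}$$

where se_1, se_2, se_3 are the standard errors of the estimated proportions for each country.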

I'm struggling to construct the vectors N, se and p.

I know how to construct a simple proportion vector:

y <- 700
n <- 1000
estimate <- y/n
se <- sqrt (estimate*(1-estimate)/n)

And how to construct a vector of discrete quantities:

y <- rep (c(0,1,2,3,4), c(600,300,50,30,20))
n <- length(y)
estimate <- mean(y)
se <- sd(y)/sqrt(n)

I'm confused about how to construct vectors holding multiple proportions, each with its own SE and confidence interval.


R's arithmetic is vectorized: operations on numeric vectors are applied element by element, so the code you use to calculate the proportion and standard error for one example also works for several examples at once. For example:

y <- c(700, 500)
n <- c(1000, 1000)
estimate <- y/n
se <- sqrt (estimate*(1-estimate)/n)

estimate has values 0.7 and 0.5, since y/n calculates using corresponding pairs of numbers 700/1000 and 500/1000. Elements correspond by position -- first element in y goes with first in n and so on.

se likewise has two elements (about 0.0145 and 0.0158), each computed from the corresponding elements of the estimate and n vectors.
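
Putting this together with the weighted-average code from the question, a minimal sketch might look like the following. All the numbers are made-up illustration values, not figures from Gelman's book or from any real survey. The names y and n (Yes counts and sample sizes per country) are assumptions carried over from the single-proportion example. N is assumed to hold the adult population of each country.

```r
# Hypothetical survey results for three countries (illustrative numbers only)
y <- c(700, 500, 600)        # number of Yes responses in each sample
n <- c(1000, 1000, 1000)     # sample size in each country
N <- c(50e6, 70e6, 48e6)     # adult population of each country (made up)

# Element-wise arithmetic gives one value per country
p  <- y / n                  # estimated proportion of Yes responses
se <- sqrt(p * (1 - p) / n)  # standard error of each proportion

# Weighted average and approximate 95% interval, as in the question's code
w.avg    <- sum(N * p) / sum(N)
se.w.avg <- sqrt(sum((N * se / sum(N))^2))
int.95   <- w.avg + c(-2, 2) * se.w.avg
```

Each of N, p and se has one element per country, in the same order, so the sums in the last three lines pair each country's population with its own proportion and standard error.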