sweetmusicality - 1 month ago
R Question

Vectorizing prop.test over dataframe in R

This is a basic question, but I am getting the following error:


Error in prop.test: 'x' and 'n' must have the same length


with this code

cv_MH$pval <- (prop.test(x = c(cv_MH$search, cv_MH$against), n = c(size, size2)))$p.value


where size and size2 are constant numbers that are large (>200,000).

This is what cv_MH looks like:

search against
45 23
384 274
657 883


Basically, I'm trying to add another variable to cv_MH that holds the p-value for each row.

Thanks.

Answer

I think you need to repeat the counts (n) for each value in x. What about this?

cv_MH$pval <- prop.test(x = c(cv_MH$search, cv_MH$against), 
                        n = c(rep(size, length(cv_MH$search)),
                              rep(size2, length(cv_MH$against))))$p.value

x is the number of successes (events of interest) and n is the total number of trials. x must have the same length as n, as your error message says.
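
One caveat: when x and n each hold more than two values, prop.test runs a single joint test that all the proportions are equal, so the call above returns one p-value that gets recycled into every row. If the goal is one p-value per row (search vs. against), a minimal sketch could look like the following. The values of size and size2 here are made up for illustration, since the question only says they are constants above 200,000.

```r
# Example data from the question
cv_MH <- data.frame(search  = c(45, 384, 657),
                    against = c(23, 274, 883))

# Assumed group sizes (hypothetical values; the question only says > 200,000)
size  <- 250000
size2 <- 300000

# Run one two-sample test per row: search/size versus against/size2.
# mapply walks the two columns in parallel and collects the p-values.
cv_MH$pval <- mapply(
  function(s, a) prop.test(x = c(s, a), n = c(size, size2))$p.value,
  cv_MH$search, cv_MH$against
)

print(cv_MH)
```

Each call gets an x and an n of length 2, so the length check in prop.test passes and pval ends up with one entry per row.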