HNSKD HNSKD - 3 months ago 9
R Question

Why does the estimated odds ratio from fisher.test() differ from the calculated odds ratio?

Here is my contingency table:

X
#      Yes   No
# Pre    5  685
# Post  17 1351


Fisher Test

fisher.test(X)

# Fisher's Exact Test for Count Data

# data: X
# p-value = 0.3662
# alternative hypothesis: true odds ratio is not equal to 1
# 95 percent confidence interval:
# 0.1666371 1.6474344
# sample estimates:
# odds ratio
# 0.5802157


Calculated Odds Ratio

P1<-5/(5+685)
P2<-17/(17+1351)
(P1/(1-P1))/(P2/(1-P2))
# [1] 0.5800773


Why are the values different? How does the fisher.test function in R calculate the estimated odds ratio?

Answer

Looking at the code underlying fisher.test, I see

ESTIMATE <- c(`odds ratio` = mle(x))

Immediately above this is

mle <- function(x) {
    if (x == lo)
        return(0)
    if (x == hi)
        return(Inf)
    mu <- mnhyper(1)
    if (mu > x)
        uniroot(function(t) mnhyper(t) - x, c(0, 1))$root
    else if (mu < x)
        1/uniroot(function(t) mnhyper(1/t) - x,
                  c(.Machine$double.eps, 1))$root
    else 1
}

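Without exploring all the details of the code above the mle definition, it looks like fisher.test does not calculate the odds ratio directly from the cell counts. Instead, it solves numerically (with uniroot) for the odds ratio at which the mean given by mnhyper (another function defined inside fisher.test[1]) equals the observed count x. The Value section of ?fisher.test describes this estimate as the conditional maximum likelihood estimate rather than the unconditional MLE (the sample odds ratio), which is why the two numbers differ slightly. For the full derivation, I would need to read the references in ?fisher.test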

[1] fisher.test defines several helper functions, such as dnhyper, mnhyper, and pnhyper, which appear to be the density, mean, and distribution function of a non-central hypergeometric distribution.
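
To make the difference concrete, here is a rough sketch that tries to reproduce the estimate by hand. It assumes the estimate is the odds ratio at which the mean of the non-central hypergeometric distribution (conditional on the table margins) equals the observed top-left count, which is what the mle and mnhyper code appears to do. The names cond_mean and cond_mle are my own and do not exist in fisher.test, and the root is searched on the log scale purely for convenience.

```r
# Contingency table from the question
X <- matrix(c(5, 17, 685, 1351), nrow = 2,
            dimnames = list(c("Pre", "Post"), c("Yes", "No")))

m <- sum(X[, 1])   # total of first column ("Yes")
n <- sum(X[, 2])   # total of second column ("No")
k <- sum(X[1, ])   # total of first row ("Pre")
x <- X[1, 1]       # observed top-left count

# Possible values of the top-left cell given the fixed margins
support <- max(0, k - n):min(k, m)
logdc <- dhyper(support, m, n, k, log = TRUE)

# Mean of the non-central hypergeometric distribution at odds ratio psi
# (hypothetical helper, modelled on what mnhyper seems to compute)
cond_mean <- function(psi) {
  d <- logdc + log(psi) * support
  d <- exp(d - max(d))            # rescale to avoid overflow
  sum(support * d / sum(d))
}

# Conditional estimate: the psi whose mean matches the observed count
log_root <- uniroot(function(lp) cond_mean(exp(lp)) - x,
                    interval = c(-20, 20), tol = 1e-10)$root
cond_mle <- exp(log_root)

# Unconditional estimate: the sample (cross-product) odds ratio,
# i.e. the hand calculation from the question
sample_or <- (X[1, 1] * X[2, 2]) / (X[1, 2] * X[2, 1])

print(c(conditional = cond_mle,
        sample = sample_or,
        fisher.test = unname(fisher.test(X)$estimate)))
```

The conditional value should agree with fisher.test(X)$estimate only to about three or four decimal places, because fisher.test calls uniroot with its default tolerance.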