cjena - 11 months ago 72

R Question

I'm trying to calculate the p-value of an F-statistic in R.

The formula R uses in the lm() function is equivalent to the following (e.g. assume x=100, df1=2, df2=40):

`pf(100, 2, 40, lower.tail=F)`

[1] 2.735111e-16

which should be equal to

`1-pf(100, 2, 40)`

[1] 2.220446e-16

It is not the same! There's no big difference, but where does it come from?

If I calculate (x=5, df1=2, df2=40):

`pf(5, 2, 40, lower.tail=F)`

[1] 0.01152922

`1-pf(5, 2, 40)`

[1] 0.01152922

it is exactly the same. Question is...what is happening here? Have I missed something?

Answer Source

As the comments note, this is a floating point precision issue. In fact, neither pair in your examples is precisely equal when evaluated:

```
> pf(5, 2, 40, lower.tail=F) - (1-pf(5, 2, 40))
[1] 6.245005e-17
> pf(100, 2, 40, lower.tail=F) - (1-pf(100, 2, 40))
[1] 5.146652e-17
```

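It's just that the difference only shows up in your output for the much smaller number. R prints 7 significant digits by default, so a discrepancy of order 1e-17 is invisible next to 0.0115 but is a sizeable fraction of 2.7e-16. The `lower.tail=F` result is the more accurate of the two: `1-pf(100, 2, 40)` subtracts a value that has already been rounded to just below 1, and the `2.220446e-16` you got matches the machine epsilon, `.Machine$double.eps`.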
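To see where the rounding comes from, here is a minimal sketch that compares both computations against a closed form. It relies on the fact that for `df1 = 2` the upper tail of the F distribution is `(1 + 2*x/df2)^(-df2/2)`; the variable names (`upper`, `naive`, `exact`) are only illustrative, and the exact digits printed may vary slightly by platform.

```r
x   <- 100
df1 <- 2
df2 <- 40

# Upper tail computed directly by pf()
upper <- pf(x, df1, df2, lower.tail = FALSE)

# Upper tail as 1 minus the lower tail; the lower tail is stored as a
# double just below 1, so most of the tail's digits are already lost
naive <- 1 - pf(x, df1, df2)

# Closed form for the special case df1 = 2 (an independent check)
exact <- (1 + 2 * x / df2)^(-df2 / 2)

# Show more digits than the default 7
print(c(upper = upper, naive = naive, exact = exact), digits = 10)

# Doubles in [0.5, 1) are spaced 2^-53 apart, so `naive` can only be
# a whole multiple of 2^-53
naive / 2^-53
.Machine$double.eps
```

If the tail probability really is needed as a difference from 1, comparing with `all.equal()` rather than `==` sidesteps this kind of rounding noise for moderate values, but for very small p-values the `lower.tail = FALSE` form is the one to trust.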