Twifty - 6 months ago 29

PHP Question

The question of how to compare floating point numbers has been answered here. This question is different in that I'm asking about the formulas. The two highest voted answers have a slightly different solution to the problem:

`if (abs(($a-$b)/$b) < $epsilon) { … }`

and

`if (abs($a-$b) < $epsilon) { … }`

Why does the first answer contain the division? Doesn't it cause inaccurate results? For example (using easy numbers), let both $a and $b equal 0.01, and assume $a - $b results in 0.0001, with an $epsilon of 0.001.

`((((0.01 - 0.01) == 0.0001) / 0.01 == 0.01) < 0.001) : false`

whereas

`(((0.01 - 0.01) == 0.0001) < 0.001) : true`

My math may be a little rusty, but what am I missing?

When should I use the one formula over the other?

Answer

This probably allows checking epsilon with a relative error rather than an absolute error.

Compare these two cases:

```
function areEqual(float $a, float $b) : bool {
return abs(($a - $b) / $b) < 0.00001;
}
areEqual(10000, 10000.01);
areEqual(0.0000001, 0);
```

Fact about the example values above: Our epsilon here is

`0.00001`

for convenience ‐ the smallest epsilon possible is much smaller than these values anyway, so let's ignore this fact. Our algorithm assumes that`$a`

and`$b`

are both similar, so it does not matter whether we divide by`$a`

or`$b`

. Actually,`10000`

should be much larger than that (a very enormous exponent), and`0.0000001`

can be much smaller, but for the sake of convenience, let's assume these are the values that may cause problems.

Now you can already see the difference.

For the large numbers: If the compared floats are extremely large, epsilon may be too small. The float internally can only store a definite number of digits for precision, while the exponent can be way greater than that. As a result, the source of floating point error, i.e. the final digits of floats, would appear at somewhere that can be higher than the unit digits. In other words, for extremely large floats, the absolute error can be greater than `1`

, much less our epsilon of `0.00001`

.

For the small numbers: This is even more obvious. Both numbers are smaller than the epsilon already. Even if you compare them with 0, while the relative error is infinitely large, you still think that they are equal. For this case, you either multiply up both operands, or you decrease the epsilon. They are actually the same, but in terms of implementation, it is more convenient to divide the difference with one of the operands, which will multiply up for small numbers (`/ 0.0001`

is equivalent to `* 10000`

) or divide down for large numbers (`/ 10000`

while the difference is hopefully way smaller than `10000`

)

There is another name for this check. While `abs($a - $b)`

is called the absolute error, we usually use the relative error, which is absolute error **÷** approximate value. Since the values can be negative as well, we `abs`

the whole thing `($a - $b) / $b`

instead. Our "epsilon", `0.00001`

, in this case, means that our tolerate relative error is `0.00001`

, i.e. 0.001% error.

Keep in mind that this is

As the best practice, avoid adding, subtracting or multiplying big numbers with small numbers. They will increase the chance of errors. When developing (especially simplifying) your algorithms, always put into consideration that their might be an error in your floats. This may increase work load to a stupid extent, but as long as you are aware of it, this kind of worry sometimes saves you from getting kicked out of teams.