VaioIsBorn - 3 months ago 23

C Question

I know, I've read about the difference between double precision and single precision, etc. But they should give the same results in most cases, right?

I was solving a problem in a programming contest, and there were calculations with floating point numbers that were not really big, so I decided to use float instead of double. I checked it, and I was getting the correct results. But when I submitted the solution, it said only 1 of 10 tests was correct. I checked again and again, until I found that using float is not the same as using double. I used double for the calculations and double for the output, and the program gave the SAME results, but this time it passed all 10 tests.

I repeat, the output was the SAME, the results were the SAME, but putting float didn't work - only double. The values were not so big too, and the program gave the same results on the same tests both with float and double, but the online judge accepted only the double-provided solution.

Why? What is the difference?

Answer

Huge difference.

As the name implies, a `double` has 2x the precision of `float`^{[1]}. In general, a `double` has 15 decimal digits of precision, while `float` only has 7.

Here's how the number of digits is calculated:

`double` has 52 mantissa bits + 1 hidden bit: log(2^{53}) ÷ log(10) = 15.95 digits

`float` has 23 mantissa bits + 1 hidden bit: log(2^{24}) ÷ log(10) = 7.22 digits
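The same arithmetic can be sketched in C. This is only an illustration: `decimal_digits` and `LOG10_OF_2` are names made up here, and the constant is hard-coded so the snippet does not depend on `<math.h>`. The standard macros `FLT_MANT_DIG` and `DBL_MANT_DIG` from `<float.h>` already count the hidden bit, so on an IEEE platform they are 24 and 53.

```c
#include <float.h>

/* log10(2), hard-coded so this sketch needs no math library. */
#define LOG10_OF_2 0.30102999566398120

/* Hypothetical helper: decimal digits carried by a binary mantissa
 * of `mantissa_bits` bits, i.e. log(2^bits) / log(10). */
double decimal_digits(int mantissa_bits) {
    return mantissa_bits * LOG10_OF_2;
}
```

Calling `decimal_digits(FLT_MANT_DIG)` gives roughly 7.22 and `decimal_digits(DBL_MANT_DIG)` roughly 15.95 on such a platform, matching the figures above.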

With fewer digits of precision, rounding errors accumulate and surface much sooner, e.g.

```
float a = 1.f / 81;
float b = 0;
for (int i = 0; i < 729; ++i)
    b += a;
printf("%.7g\n", b); // prints 9.000023
```

while

```
double a = 1.0 / 81;
double b = 0;
for (int i = 0; i < 729; ++i)
    b += a;
printf("%.15g\n", b); // prints 8.99999999999996
```
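One possible reason the float and double programs in the question looked identical is that both results were printed with only a few decimals. That is a guess about the asker's setup, not something the question confirms. The sketch below wraps the two loops above in functions and compares their formatted output; `sum_float`, `sum_double` and `prints_same` are names invented for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Add `step` to a float accumulator `n` times (same loop as above). */
float sum_float(int n, float step) {
    float b = 0;
    for (int i = 0; i < n; ++i)
        b += step;
    return b;
}

/* Same loop with a double accumulator. */
double sum_double(int n, double step) {
    double b = 0;
    for (int i = 0; i < n; ++i)
        b += step;
    return b;
}

/* Returns 1 if f and d produce the same text when printed with
 * `decimals` digits after the decimal point, 0 otherwise. */
int prints_same(int decimals, float f, double d) {
    char a[64], b[64];
    snprintf(a, sizeof a, "%.*f", decimals, (double)f);
    snprintf(b, sizeof b, "%.*f", decimals, d);
    return strcmp(a, b) == 0;
}
```

With `f = sum_float(729, 1.f / 81)` and `d = sum_double(729, 1.0 / 81)`, `prints_same(2, f, d)` reports a match because both print as 9.00, while `prints_same(7, f, d)` does not. A judge that compares more digits, or feeds in other inputs, would see the difference.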

Also, the maximum value of `float` is only about `3.4e38`, while that of `double` is about `1.8e308`, so `float` can hit Infinity much more easily than `double` for something as simple as computing 60!.

Maybe their test cases contain such huge numbers, which would cause your program to fail.

Of course, sometimes even `double` isn't accurate enough, hence we have `long double`^{[1]} (the above example gives 9.000000000000000066 on Mac), but all these floating point types suffer from round-off errors, so if precision is very important (e.g. money processing) you should use `int` or a fraction class.
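As a rough illustration of the money case, the sketch below adds ten payments of 0.10 once as a `double` and once as a whole number of cents. The helper names are made up for this example, and `int64_t` is just one reasonable choice of integer type.

```c
#include <stdint.h>

/* Add `amount` (e.g. 0.10) to a double accumulator `times` times. */
double add_as_double(double amount, int times) {
    double total = 0.0;
    for (int i = 0; i < times; ++i)
        total += amount;
    return total;
}

/* Add the same amount expressed as whole cents; integer addition
 * is exact as long as the total does not overflow. */
int64_t add_as_cents(int64_t cents, int times) {
    int64_t total = 0;
    for (int i = 0; i < times; ++i)
        total += cents;
    return total;
}
```

With IEEE doubles, `add_as_double(0.10, 10)` does not compare equal to `1.0`, because 0.10 has no exact binary representation, whereas `add_as_cents(10, 10)` is exactly 100.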

BTW, don't use `+=` to sum lots of floating point numbers, as the errors accumulate quickly. If you're using Python, use `math.fsum`. Otherwise, try to implement the Kahan summation algorithm.
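A minimal sketch of Kahan summation in C follows, next to the naive loop for comparison. The function names are chosen for this example. It assumes the compiler does not reassociate floating point arithmetic; options such as `-ffast-math` can optimize the compensation term away.

```c
#include <stddef.h>

/* Plain running sum: each addition rounds, and the errors pile up. */
float naive_sum(const float *x, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += x[i];
    return sum;
}

/* Kahan (compensated) summation: `c` carries the low-order bits
 * that were lost in the previous addition. */
float kahan_sum(const float *x, size_t n) {
    float sum = 0.0f;
    float c = 0.0f;                /* running compensation */
    for (size_t i = 0; i < n; ++i) {
        float y = x[i] - c;        /* apply the correction to the input */
        float t = sum + y;         /* low bits of y may be lost here */
        c = (t - sum) - y;         /* recover what was lost (negated) */
        sum = t;
    }
    return sum;
}
```

For an array holding 729 copies of `1.f / 81`, `naive_sum` reproduces the drift from the float example above, while `kahan_sum` stays within a few units in the last place of 9.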

^{[1]: The C and C++ standards do not specify the representation of float, double and long double. It is possible that all three are implemented as IEEE double-precision. Nevertheless, on most compilers and architectures (gcc, MSVC; x86, x64, ARM), float is indeed an IEEE single-precision floating point number (binary32), and double is an IEEE double-precision floating point number (binary64).}