plasmacel - 3 months ago 8

C++ Question

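Is the addition `x + x` interchangeable by the multiplication `2 * x` in IEEE 754 (IEC 559) floating-point standard, or more generally speaking is there any guarantee that `case_add` and `case_mul` always give exactly the same result?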

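    #include <cstddef>  // std::size_t
    #include <limits>

    // Computes n * x by repeated addition (n - 1 additions, each one rounded).
    template <typename T>
    T case_add(T x, std::size_t n)
    {
        static_assert(std::numeric_limits<T>::is_iec559, "invalid type");
        T result(x);
        for (std::size_t i = 1; i < n; ++i)
        {
            result += x;
        }
        return result;
    }

    // Computes n * x with a single multiplication (one rounding).
    template <typename T>
    T case_mul(T x, std::size_t n)
    {
        static_assert(std::numeric_limits<T>::is_iec559, "invalid type");
        return x * static_cast<T>(n);
    }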

Answer

> Is the addition `x + x` interchangeable by the multiplication `2 * x` in IEEE 754 (IEC 559) floating-point standard

Yes. The two are mathematically identical, and the result `2x` is exactly representable in floating point (barring overflow, where both overflow in the same way), so they will give the same result.
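As a quick sanity check of the doubling claim, here is a minimal sketch. The helper name `doubling_matches` is hypothetical (it is not from the question), and the sketch assumes the compiler evaluates `T` arithmetic in `T`'s own precision, without `-ffast-math` or x87-style excess precision.

```cpp
#include <limits>

// Hypothetical helper (not from the question): returns true when x + x and
// 2 * x compare equal. Note that it returns false for NaN, because NaN
// never compares equal to anything, including itself.
template <typename T>
bool doubling_matches(T x)
{
    static_assert(std::numeric_limits<T>::is_iec559, "invalid type");
    const T sum = x + x;         // addition form
    const T product = T(2) * x;  // multiplication form
    return sum == product;
}
```

Calling it on ordinary values, on the smallest subnormal, or on the largest finite value (where both forms overflow to infinity) should return `true` in each case.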

> or more generally speaking is there any guarantee that `case_add` and `case_mul` always give exactly the same result?

Not generally, no. From what I can tell, it seems to hold for `n <= 5`:

- `n=3`: as `x+x` is exact (i.e. involves no rounding), `(x+x)+x` only involves one rounding at the final step.
- `n=4` (assuming you're using the default rounding mode):
  - if the last bit of `x` is 0, then `x+x+x` is exact, and so the results are equal by the same argument as `n=3`.
  - if the last 2 bits are `01`, then the exact value of `x+x+x` will have last 2 bits of `1|1` (where `|` indicates the final bit in the format), which will be rounded up to `0|0`. The next addition will give an exact result `|01`, so the result will be rounded down, cancelling out the previous error.
  - if the last 2 bits are `11`, then the exact value of `x+x+x` will have last 2 bits of `0|1`, which will be rounded down to `0|0`. The next addition will give an exact result `|11`, so the result will be rounded up, again cancelling out the previous error.
- `n=5`: since `x+x+x+x` is exact, it holds for the same reason as `n=3`.
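The `n <= 5` cases above can be spot-checked empirically. The sketch below repeats the question's two functions so that it is self-contained and adds a hypothetical helper, `count_mismatches` (the name, the seed, and the sampling range are illustrative assumptions). It samples only `[1, 2)`, on the assumption that scaling by a power of two does not change the rounding pattern for normal values, and it assumes the default round-to-nearest mode with no excess precision. A random test like this can only support the claim, not prove it.

```cpp
#include <cstddef>
#include <limits>
#include <random>

// Repeated addition, as in the question.
template <typename T>
T case_add(T x, std::size_t n)
{
    static_assert(std::numeric_limits<T>::is_iec559, "invalid type");
    T result(x);
    for (std::size_t i = 1; i < n; ++i)
    {
        result += x;
    }
    return result;
}

// Single multiplication, as in the question.
template <typename T>
T case_mul(T x, std::size_t n)
{
    static_assert(std::numeric_limits<T>::is_iec559, "invalid type");
    return x * static_cast<T>(n);
}

// Hypothetical helper (not from the question): counts how many of `samples`
// pseudo-random doubles in [1, 2) make case_add and case_mul disagree.
std::size_t count_mismatches(std::size_t n, std::size_t samples, unsigned seed = 12345u)
{
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> dist(1.0, 2.0);
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < samples; ++i)
    {
        const double x = dist(rng);
        if (case_add(x, n) != case_mul(x, n))
        {
            ++mismatches;
        }
    }
    return mismatches;
}
```

For `n` from 1 to 5, `count_mismatches(n, 20000)` is expected to return 0. Note that `n = 0` is a separate edge case: `case_add` returns `x` while `case_mul` returns 0.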

For `n=6` it fails, e.g. take `x` to be `1.0000000000000002` (the next `double` after `1.0`), in which case `6x` is `6.000000000000002` and `x+x+x+x+x+x` is `6.000000000000001`.
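The `n=6` counterexample can be reproduced with a minimal sketch. The helper names `add_six`, `mul_six`, and `next_after_one` are hypothetical, and the sketch assumes `double` is IEEE 754 binary64 evaluated in the default round-to-nearest mode without excess precision.

```cpp
#include <cmath>

// Hypothetical helpers (names are illustrative, not from the question).

// Sums six copies of x, one addition at a time.
double add_six(double x)
{
    double result = x;
    for (int i = 1; i < 6; ++i)
    {
        result += x;  // each += rounds to the nearest double
    }
    return result;
}

// Computes 6x with a single rounding.
double mul_six(double x)
{
    return x * 6.0;
}

// The next double after 1.0, i.e. 1.0000000000000002.
double next_after_one()
{
    return std::nextafter(1.0, 2.0);
}
```

With `x = next_after_one()`, `mul_six(x)` and `add_six(x)` should differ by exactly one unit in the last place near 6.0, which is `2^-50`, while for `x = 1.0` the two agree.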