Mehrdad - 9 months ago 50

C Question

Why does Clang optimize away the loop in this code

    #include <time.h>
    #include <stdio.h>

    static size_t const N = 1 << 27;
    static double arr[N] = { /* initialize to zero */ };

    int main()
    {
        clock_t const start = clock();
        for (int i = 0; i < N; ++i) { arr[i] *= 1.0; }
        printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
    }

but not the loop in this code?

    #include <time.h>
    #include <stdio.h>

    static size_t const N = 1 << 27;
    static double arr[N] = { /* initialize to zero */ };

    int main()
    {
        clock_t const start = clock();
        for (int i = 0; i < N; ++i) { arr[i] += 0.0; }
        printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
    }

(Tagging as both C and C++ because I would like to know if the answer is different for each.)

Answer

The IEEE 754-2008 Standard for Floating-Point Arithmetic and the ISO/IEC 10967 Language Independent Arithmetic (LIA) Standard, Part 1 answer why this is so.

## IEEE 754 § 6.3 The sign bit

When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.

When neither the inputs nor result are NaN, the sign of a product or quotient is the exclusive OR of the operands’ signs; the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs; and the sign of the result of conversions, the quantize operation, the roundTo-Integral operations, and the roundToIntegralExact (see 5.3.1) is the sign of the first or only operand. These rules shall apply even when operands or results are zero or infinite.

When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.

We see that `x+0.0` produces `x`, EXCEPT when `x` is `-0.0`: in that case we have a sum of two operands with opposite signs whose sum is zero, and §6.3 paragraph 3 rules that this addition produces `+0.0` in the default rounding mode (Round-to-Nearest, Ties-to-Even).

Since `+0.0` is not *bitwise* identical to the original `-0.0`, and `-0.0` is a legitimate value that may occur as input, the compiler is obliged to put in the code that will transform potential negative zeros to `+0.0`.

The summary: Under the default rounding mode, in `x+0.0`, if `x`

- **is not** `-0.0`, then `x` itself is an acceptable output value.
- **is** `-0.0`, then the output value *must be* `+0.0`, which is not bitwise identical to `-0.0`.

No such problem occurs with `x*1.0`. If `x`:

- is a (sub)normal number, `x*1.0 == x` always.
- is `+/- infinity`, then the result is `+/- infinity` of the same sign.
- is `NaN`, then according to

  ### IEEE 754 § 6.2.3 NaN Propagation

  An operation that propagates a NaN operand to its result and has a single NaN as an input should produce a NaN with the payload of the input NaN if representable in the destination format.

  which means that the exponent and mantissa (though not the sign) of `NaN*1.0` are *recommended* to be unchanged from the input `NaN`. The sign is unspecified in accordance with §6.3p1 above, but an implementation may specify it to be identical to the source `NaN`.
- is `+/- 0`, then the result is a `0` with its sign bit XORed with the sign bit of `1.0`, in agreement with §6.3p2. Since the sign bit of `1.0` is `0`, the output value is unchanged from the input. Thus, `x*1.0 == x` even when `x` is a (negative) zero.

The IEEE 754-2008 Standard has the following interesting quote:

## IEEE 754 § 10.4 Literal meaning and value-changing optimizations

[...]

The following value-changing transformations, among others, preserve the literal meaning of the source code:

- Applying the identity property 0 + x when x is not zero and is not a signaling NaN and the result has the same exponent as x.
- Applying the identity property 1 × x when x is not a signaling NaN and the result has the same exponent as x.
- Changing the payload or sign bit of a quiet NaN.
- [...]

Since all NaNs and all infinities share the same exponent, and the correctly rounded result of `x+0.0` and `x*1.0` for finite `x` has exactly the same magnitude as `x`, their exponent is the same.

Signaling NaNs are floating-point trap values; they are special NaN values whose use as a floating-point operand raises an invalid operation exception (delivered as SIGFPE when the exception is trapped). If a loop that triggers an exception were optimized out, the software would no longer behave the same.

However, as user2357112 *points out in the comments*, the C11 Standard explicitly leaves undefined the behaviour of signaling NaNs (`sNaN`), so the compiler is allowed to assume they do not occur, and thus that the exceptions that they raise also do not occur. The C++11 standard omits describing a behaviour for signaling NaNs, and thus also leaves it undefined.

Clang and GCC, even at `-O3`, remain IEEE-754 compliant. This means they must keep to the above rules of the IEEE-754 standard. `x+0.0` is **not bit-identical** to `x` for all `x` under those rules, but `x*1.0` *may be chosen to be so*: namely, when we

- Obey the recommendation to pass unchanged the payload of `x` when it is a NaN.
- Leave the sign bit of a NaN result unchanged by `* 1.0`.
- Obey the order to XOR the sign bit during a quotient/product, when `x` is *not* a NaN.

To enable the IEEE-754-unsafe optimization `(x+0.0) -> x`, the flag `-ffast-math` needs to be passed to Clang or GCC.

Source (Stackoverflow)