I'm comparing the outputs of a signal processing library that uses floating-point math and was built for AArch64 (ARMv8) with, e.g., GCC 4.9.
The results differ depending on the optimization level. Unoptimized builds (-O0) produce bit-exact results with respect to an ARMv7 reference. On ARMv7, -O2 builds did not introduce any deviations in the floating-point calculations.
This is not the case for ARMv8: optimized builds produce a different result.
Are there compiler switches that preserve bit-exactness with non-optimized builds?
Tests have been performed on a DragonBoard 410c (Cortex-A53).
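Depending on the options used for your ARMv7-A builds (if you were using -mfpu=vfpv4 or equivalent, this answer is probably wrong), the most likely difference you are seeing is the generation of FMA (fused multiply-add) operations. An FMA rounds only once, after the addition, so its result can differ in the last bits from a multiply and an add that are rounded separately.
To avoid this, use -ffp-contract=off. The GCC documentation for this option says: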
-ffp-contract=off disables floating-point expression contraction.
-ffp-contract=fast enables floating-point expression contraction such as forming of fused multiply-add operations if the target has native support for them.
-ffp-contract=on enables floating-point expression contraction if allowed by the language standard. This is currently not implemented and treated equal to -ffp-contract=off.
The default is -ffp-contract=fast.