plasmacel - 4 months ago 37

C++ Question

It seems if the floating-point representation has radix 2 (i.e.

`FLT_RADIX == 2`

`std::ldexp(1, x)`

`std::exp2(x)`

`2`

`x`

Does the standard define or mention any expected behavioral and/or performance difference between them? What is the practical experience over different compilers?

Answer

`exp2(x)`

and `ldexp(x,i)`

perform two different operations. The former computes 2^{x}, where `x`

is a *floating-point* number, while the latter computes x*2^{i}, where `i`

is an *integer*. For integer values of `x`

, `exp2(x)`

and `ldexp(1,int(x))`

would be equivalent, provided the conversion of `x`

to integer doesn't overflow.

The question about the relative efficiency of these two functions doesn't have a clear-cut answer. It will depend on the capabilities of the hardware platform and the details of the library implementation. While conceptually, `ldexpf()`

looks like simple manipulation of the exponent part of a floating-point operand, it is actually a bit more complicated than that, once one considers overflow and gradual underflow via denormals. The latter case involves the rounding of the significand (mantissa) part of the floating-point number.

As `ldexp()`

is generally an infrequently used function, it is in my experience fairly common that less of an optimization effort is applied to it by math library writers than to other math functions.

On some platforms, `ldexp()`

, or a faster (custom) version of it, will be used as a building block in the software implementation of `exp2()`

. The following code provides an exemplary implementation of this approach for `float`

arguments:

```
#include <cmath>
/* Compute exponential base 2. Maximum ulp error = 0.86770 */
float my_exp2f (float a)
{
const float cvt = 12582912.0f; // 0x1.8p23
const float large = 1.70141184e38f; // 0x1.0p127
float f, r;
int i;
// exp2(a) = exp2(i + f); i = rint (a)
r = (a + cvt) - cvt;
f = a - r;
i = (int)r;
// approximate exp2(f) on interval [-0.5,+0.5]
r = 1.53720379e-4f; // 0x1.426000p-13f
r = fmaf (r, f, 1.33903872e-3f); // 0x1.5f055ep-10f
r = fmaf (r, f, 9.61817801e-3f); // 0x1.3b2b20p-07f
r = fmaf (r, f, 5.55036031e-2f); // 0x1.c6af7ep-05f
r = fmaf (r, f, 2.40226522e-1f); // 0x1.ebfbe2p-03f
r = fmaf (r, f, 6.93147182e-1f); // 0x1.62e430p-01f
r = fmaf (r, f, 1.00000000e+0f); // 0x1.000000p+00f
// exp2(a) = 2**i * exp2(f);
r = ldexpf (r, i);
if (!(fabsf (a) < 150.0f)) {
r = a + a; // handle NaNs
if (a < 0.0f) r = 0.0f;
if (a > 0.0f) r = large * large; // + INF
}
return r;
}
```

Most real-life implementations of `exp2()`

do not invoke `ldexp()`

, but a custom version, for example when fast bit-wise transfer between integer and floating-point data is supported, here represented by internal functions `__float_as_int()`

and `__int_as_float()`

that re-interpret an IEEE-754 `binary32`

as an `int32`

and vice versa:

```
/* For a in [0.5, 4), compute a * 2**i, -250 < i < 250 */
float fast_ldexpf (float a, int i)
{
int ia = (i << 23) + __float_as_int (a); // scale by 2**i
a = __int_as_float (ia);
if ((unsigned int)(i + 125) > 250) { // |i| > 125
i = (i ^ (125 << 23)) - i; // ((i < 0) ? -125 : 125) << 23
a = __int_as_float (ia - i); // scale by 2**(+/-125)
a = a * __int_as_float ((127 << 23) + i); // scale by 2**(+/-(i%125))
}
return a;
}
```

On other platforms, the hardware provides a single-precision version of `exp2()`

as a fast hardware instruction. Internal to the processor these are typically implemented by a table lookup with linear or quadratic interpolation. On such hardware platforms, `ldexp(float)`

may be implemented in terms of `exp2(float)`

, for example:

```
float my_ldexpf (float x, int i)
{
float r, fi, fh, fq, t;
fi = (float)i;
/* NaN, Inf, zero require argument pass-through per ISO standard */
if (!(fabsf (x) <= 3.40282347e+38f) || (x == 0.0f) || (i == 0)) {
r = x;
} else if (abs (i) <= 126) {
r = x * exp2f (fi);
} else if (abs (i) <= 252) {
fh = (float)(i / 2);
r = x * exp2f (fh) * exp2f (fi - fh);
} else {
fq = (float)(i / 4);
t = exp2f (fq);
r = x * t * t * t * exp2f (fi - 3.0f * fq);
}
return r;
```

}

Lastly, there are platforms that basically provide both `exp2()`

and `ldexp()`

functionality in hardware, such as the x87 instructions `F2XM1`

and `FSCALE`

on x86 processors.