John Am - 1 year ago 67
C Question

Computation without floats to multiply a long integer (32 bit) with 0.0000000004656f

I'm trying to eliminate all floating point computations in an embedded application and I need to scale/multiply a signed long 32 bit integer with `0.0000000004656f` (1/2147483648).

The context is

``````( pulse[i] * ( triosc[i] * 0.0000000004656f ) )
``````

Both `pulse[i]` and `triosc[i]` are signed long 32 bit integers.

So I need my `triosc[i]` value to be constrained between `0.0f` and `1.0f` without using floating arithmetic.

EDIT:

``````saw_x2[i] = (long)( pulse[i] * (triosc[i] * 0.0000000004656f) );
sine_osc[i] = (long)( ((triangle2[i] * (saw_x2[i] * 0.0000000004656f))) *
(pulse[i] * 0.0000000004656f) ) << 2;
return (sine_osc[i]);
``````

The fixed point values in `pulse[i]` and `triosc[i]` are signed quantities expressed in units of 2^-31. The mathematical values are `pulse[i]` / 2^31 and `triosc[i]` / 2^31. While you can add these values as long as you do not overflow, multiplying them requires an adjustment: the raw product must be divided by 2^31. This is what `pulse[i] * (triosc[i] * 0.0000000004656f)` does approximately, but the floating point constant is not precise enough. It would be more precise to write `pulse[i] * (triosc[i] / 2147483648.F)`, but the result would still lose precision because `float` has only 24 significant bits (23 stored bits of mantissa).
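
To make the precision argument concrete, here is a small sketch. The helper names `fits_in_float` and `constant_is_inexact` are made up for illustration and do not come from the question. It checks whether a 32-bit integer survives conversion to `float`, and whether the truncated literal equals the exact scale factor 1/2^31.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if v is exactly representable as a float.
   The volatile forces the value to be stored with float precision;
   double can hold every int32_t exactly, so the comparison is exact. */
int fits_in_float(int32_t v)
{
    volatile float f = (float)v;
    return (double)f == (double)v;
}

/* Hypothetical helper: returns 1 if the literal from the question
   differs from the exact value 1/2^31. */
int constant_is_inexact(void)
{
    volatile float approx = 0.0000000004656f;
    volatile float exact = 1.0f / 2147483648.0f; /* power of two, exactly representable */
    return approx != exact;
}
```

With these helpers, 16777216 (2^24) fits in a `float` but 16777217 does not, so the low bits of large 32-bit samples are lost as soon as they pass through `float`.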

Performing the multiplication in integer arithmetic with a 64 bit intermediary step is actually more precise.

It can be done this way:

``````((int64_t)pulse[i] * triosc[i]) >> 31
``````

or equivalently:

``````((long long)pulse[i] * triosc[i]) >> 31
``````
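
The 64-bit multiply-and-shift can be wrapped in a small helper. This is only a sketch: the name `q31_mul` is an assumption, and the code relies on `>>` being an arithmetic shift for negative values, which is implementation-defined in C but is what common compilers do.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two fixed-point values expressed in
   units of 2^-31 and return the product in the same units.
   Assumes >> on a negative int64_t is an arithmetic shift.
   The single case a == b == INT32_MIN does not fit in the result. */
int32_t q31_mul(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * b;  /* widen first so the product cannot overflow */
    return (int32_t)(product >> 31);   /* rescale from units of 2^-62 back to 2^-31 */
}
```

For example, `q31_mul(0x40000000, 0x40000000)` multiplies 0.5 by 0.5 and yields `0x20000000`, which represents 0.25.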

EDIT

You really should use types from `<stdint.h>` to avoid making assumptions about the size of `long`. It is 32 bits on your current system, but it may be 64 on the next hardware. Here is how you can rewrite the expressions:

``````int32_t saw_x2[SIZE];
int32_t pulse[SIZE];
int32_t triosc[SIZE];
int32_t triangle2[SIZE];
int32_t sine_osc[SIZE];

...

saw_x2[i] = (int32_t)(((int64_t)pulse[i] * triosc[i]) >> 31);
int64_t temp = ((int64_t)triangle2[i] * saw_x2[i]) >> 31;
sine_osc[i] = (int32_t)(((temp * pulse[i]) >> 31) << 2);
return sine_osc[i];
``````

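Note however that if any of these values becomes negative, right shifting is not guaranteed to produce the correct result, because the result of `>>` on a negative signed value is implementation-defined in C. Dividing by `2147483648` is the portable method, although it may produce less efficient code and it rounds toward zero instead of toward negative infinity: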

``````saw_x2[i] = (int32_t)((int64_t)pulse[i] * triosc[i] / 2147483648);
int64_t temp = (int64_t)triangle2[i] * saw_x2[i] / 2147483648;
sine_osc[i] = (int32_t)((temp * pulse[i] / 2147483648) * 4);
return sine_osc[i];
``````
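
The difference between dividing and shifting only shows up for negative products that are not exact multiples of 2^31. The sketch below uses two hypothetical helpers, `q31_mul_trunc` and `q31_mul_floor`, to show both roundings without ever shifting a negative value.

```c
#include <stdint.h>

/* Hypothetical helper: fixed-point multiply using division,
   which truncates toward zero. */
int32_t q31_mul_trunc(int32_t a, int32_t b)
{
    return (int32_t)((int64_t)a * b / 2147483648LL);
}

/* Hypothetical helper: fixed-point multiply that rounds toward negative
   infinity, which is what an arithmetic right shift by 31 would give. */
int32_t q31_mul_floor(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * b;
    int64_t q = product / 2147483648LL;
    if (product % 2147483648LL < 0)
        q -= 1;  /* division rounded up toward zero, so step back down */
    return (int32_t)q;
}
```

For the smallest negative product, `q31_mul_trunc(-1, 1)` gives 0 while `q31_mul_floor(-1, 1)` gives -1. For audio-style signals this one-unit difference is usually negligible, but it matters if the output must match a shift-based implementation bit for bit.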

Also, since you multiply by 4 in the last step, you would get 2 more bits of precision by dividing by 2^29 instead:

``````sine_osc[i] = (int32_t)(temp * pulse[i] / 536870912);
``````
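
A quick way to see the two extra bits is to compare both forms on the same inputs. The function names below are hypothetical, and the sketch assumes `temp` stays within the 32-bit range so that `temp * pulse` cannot overflow 64 bits.

```c
#include <stdint.h>

/* Divide by 2^31, then multiply by 4: the two lowest bits of the
   result are always zero. */
int32_t scale_then_times4(int64_t temp, int32_t pulse)
{
    return (int32_t)(temp * pulse / 2147483648LL * 4);
}

/* Divide by 2^29 in one step: same overall scale, low bits preserved. */
int32_t scale_by_2_29(int64_t temp, int32_t pulse)
{
    return (int32_t)(temp * pulse / 536870912LL);
}
```

With `temp = 3` and `pulse = 536870912` (2^29), the first form returns 0 because the intermediate quotient truncates to zero, while the second returns 3.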