Apo Apo - 1 year ago 158
C++ Question

float(0.0) vs 0.0f in CUDA

I want to write code valid for both float and double precision.
I am doing something like this:

typedef float real;
//typedef double real;

__global__ void foo(real a, real *b){
b[0] = real(0.5)*a;
}

int main(){
real a = 1.0f;
real *b;
cudaMalloc(&b, sizeof(real)); // allocate device storage for b
return 0;
}

This got me thinking: I don't want to lose precision by writing the constant as 0.5f when using double precision, but I also don't want 0.5 to be promoted to double when using single precision!

So I ended up writing the constant as real(0.5), as in the example.
In single precision mode, if I disassemble the function 'foo', I see no promotion to double with real(0.5), whereas with a plain 0.5 the promotion does happen.

You can check using:

$ nvcc test.cu -arch=sm_52 -lineinfo --source-in-ptx -g -G -O0 ; cuobjdump -sass a.out | grep "foo" -A 35

When using real(0.5) or 0.5f, I see:

/*0078*/ FMUL R0, R0, 0.5; /* 0x3868004000070000 */

When writing just 0.5, I see:

/*0078*/ F2F.F64.F32 R4, R0; /* 0x5ca8000000070b04 */
/*0088*/ DMUL R4, R4, 0.5; /* 0x3880004000070404 */
/*0090*/ F2F.F32.F64 R0, R4; /* 0x5ca8000000470e00 */

This may sound too obvious, but since I do not know what real(0.5) is actually doing, I cannot tell whether the compiler is just playing along in this particular case. The disassembled code appears identical for real(0.5) and 0.5f!

So the question remains:

What is real(0.5) (AKA float(0.5)) exactly doing?

Is there any difference between float(0.5) and 0.5f? (OR double(0.5) and 0.5)

I suppose this applies to C/C++ as well.

Answer Source

What is real(0.5) (AKA float(0.5)) exactly doing?

real(0.5) is a function-style cast, and in this case it is equivalent to a static_cast:

static_cast<real>(0.5) //exactly the same thing

This means a is multiplied by a value of type real (float in this case), so both operands are float and nothing needs to be promoted to double, as would be required for a double * float multiplication.
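The type rules described above can be checked on the host with plain C++. The following is only a sketch: it assumes the single-precision typedef is active, and scale_cast, scale_plain and check_scale are hypothetical stand-ins for the body of foo, not part of the original code.

```cpp
#include <cassert>
#include <type_traits>

typedef float real;  // assumption: single-precision build; use double for the other mode

// Hypothetical host-side stand-ins for the body of foo.
inline real scale_cast(real a)  { return real(0.5) * a; }  // constant is converted to real first
inline real scale_plain(real a) { return 0.5 * a; }        // 0.5 is a double literal

// Both spellings of the cast produce a value of type real.
static_assert(std::is_same<decltype(real(0.5)), real>::value,
              "function-style cast yields real");
static_assert(std::is_same<decltype(static_cast<real>(0.5)), real>::value,
              "static_cast yields real");

// real * real stays real, so no double arithmetic is requested.
static_assert(std::is_same<decltype(real(0.5) * real(1)), real>::value,
              "product stays in real");

// double * real is always double, whichever typedef is active.
static_assert(std::is_same<decltype(0.5 * real(1)), double>::value,
              "plain 0.5 forces double arithmetic");

// Both versions return the same value here; only the arithmetic type differs.
inline void check_scale() {
    assert(scale_cast(real(3)) == real(1.5));
    assert(scale_plain(real(3)) == real(1.5));
}
```

The static_asserts only describe the types the language requires. Whether the compiler actually emits double-precision instructions for scale_plain is a separate question that the disassembly in the question answers for that particular build.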

Is there any difference between float(0.5) and 0.5f? (OR double(0.5) and 0.5)

One could argue that the conversion of 0.5 to float could happen at run time, but this isn't realistic for any modern compiler. The conversion is folded at compile time, so it should cost nothing, and the OP's disassembly already shows that.

Other than that, using float(0.5f) makes no difference whatsoever from just using 0.5f, and the same goes for double(0.5) and 0.5.
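For the constant in question this can be verified at compile time. The sketch below assumes a 32-bit float; float_bits and check_bits are hypothetical helper names. It only covers 0.5, which is exactly representable in binary, and makes no claim about other literals.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Every spelling of the constant compares equal at compile time.
static_assert(float(0.5) == 0.5f, "double literal converted to float");
static_assert(float(0.5f) == 0.5f, "casting a float to float changes nothing");
static_assert(double(0.5) == 0.5, "casting a double to double changes nothing");

// Assumption: float is 32 bits wide, as on common platforms.
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: copy the object representation of a float into an integer.
inline std::uint32_t float_bits(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// The two spellings are also identical bit for bit.
inline void check_bits() {
    assert(float_bits(float(0.5)) == float_bits(0.5f));
}
```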
