cukier9a7b5 - 1 year ago
C++ Question

C/C++ "inline" keyword in CUDA device-side code

I am a total "newbie" when it comes to CUDA, so pardon me if my question is trivial.

Does nvcc understand the meaning of the inline C keyword?

I know about
, and similar nvcc "macros", so I am not asking how to write
CUDA device-side code.

I also know that my code is "split" between nvcc and the host C/C++ compiler (I am using the Visual Studio IDE).
Does that mean the inline keyword is ignored by nvcc when it "stands next to"


P.S. I have searched the CUDA programming guide. I could not find anything useful under the inline entry, and similar "tags" did not help either.

Answer

CUDA is a programming language in the C++ family. Therefore, the CUDA documentation generally does not duplicate standard C++ documentation; it merely points out differences and extensions. If you can't find a description of the use of the inline specifier with functions in the CUDA documentation, that is a good indication that it is processed in the standard C++ fashion.

Interpolating between the various parts of your question, it seems you are mostly concerned with how the use of inline affects the actual inlining of functions in the generated code.

The ISO C++11 standard specifies inline as a function specifier in section 7.1.2. Besides provisions about linkage and duplicate definitions, it states the following about the actual inlining of functions with the inline specifier:

The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call;

So inline is merely a suggestion to the compiler, which it is free to ignore. Since the CUDA compiler inlines functions aggressively in device code by default (for performance reasons), the use of inline seems quite redundant for device code, but programmers are free to use it.
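To make this concrete, here is a minimal sketch of a device helper marked inline. The names square and squareAll are made up for illustration and are not part of CUDA. Whether the call is actually inlined is up to the compiler, with or without the keyword.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: 'inline' is accepted here exactly as in standard C++.
// It is only a hint; the compiler may inline the call with or without it.
__device__ inline float square(float x)
{
    return x * x;
}

// Hypothetical kernel that calls the helper once per element.
__global__ void squareAll(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = square(data[i]);
    }
}

int main()
{
    const int n = 8;
    float host[n];
    for (int i = 0; i < n; ++i) {
        host[i] = static_cast<float>(i);
    }

    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Copy in, run one block of n threads, copy back (the copy back
    // waits for the kernel to finish).
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    squareAll<<<1, n>>>(dev, n);
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    for (int i = 0; i < n; ++i) {
        std::printf("%g ", host[i]);
    }
    std::printf("\n");
    return 0;
}
```

One way to see what the compiler decided is to generate PTX with nvcc -ptx and check whether a call to square remains in the kernel body.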

The inlining heuristics used by the CUDA compiler may prevent inlining of a particular function that a programmer would like to have inlined under all circumstances. For this purpose, CUDA provides the non-standard __forceinline__ function attribute. This specifier affects both device code and host code, as nvcc translates it into the equivalent host-compiler-specific attribute for host code, such as __forceinline for MSVC. This can be verified by dumping and inspecting the intermediate C++ files that nvcc sends to the host compiler.
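As a rough sketch of how __forceinline__ can be used on a function compiled for both host and device: the helper names blend and midpoint are hypothetical, and the #ifndef fallback is an assumption added only so that the snippet also builds with a plain host compiler. Under nvcc the qualifiers are already defined and the fallback is skipped.

```cpp
// Fallback for building without nvcc (an assumption made for this sketch;
// under nvcc, __CUDACC__ is defined and these names already exist).
#ifndef __CUDACC__
#define __host__
#define __device__
#define __forceinline__ inline
#endif

// Hypothetical helper: request inlining regardless of the compiler's
// heuristics, in both the host and the device compilation.
__host__ __device__ __forceinline__ float blend(float a, float b, float t)
{
    return a + t * (b - a);
}

// Hypothetical helper with plain 'inline': only a hint, as in standard C++.
__host__ __device__ inline float midpoint(float a, float b)
{
    return blend(a, b, 0.5f);
}
```

Building with nvcc --keep leaves the intermediate files in place, so the host-side translation of __forceinline__ can be inspected there.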
