C++ Question

TensorFlow CPU and CUDA code sharing

I am writing an op in C++ and CUDA for TensorFlow that shares custom function code between the two implementations. Usually, when sharing code between CPU and CUDA implementations, one defines a macro that inserts the __device__ specifier into the function signature when compiling for CUDA. Is there a built-in way to share code in this manner in TensorFlow?

How does one define utility functions (usually inlined) that can run on both the CPU and the GPU?

Answer Source

It turns out that the following macros, which TensorFlow inherits from the Eigen library it bundles, do what I describe:

namespace tensorflow {

// EIGEN_DEVICE_FUNC expands to __host__ __device__ when compiling with
// nvcc and to nothing otherwise; EIGEN_STRONG_INLINE forces inlining.
// foo() can therefore be called from both CPU code and CUDA kernels.
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE
void foo() {
    //
}

}  // namespace tensorflow