VisionIncision - 6 months ago
C++ Question

TensorFlow CPU and CUDA code sharing

I am writing an Op in C++ and CUDA for TensorFlow that shares custom function code between the two implementations. Usually, when sharing code between CPU and CUDA implementations, one would define a macro that inserts the `__host__ __device__` qualifiers into the function signature when compiling for CUDA. Is there a built-in way to share code in this manner in TensorFlow?

How does one define utility functions (usually inlined) that can run on both the CPU and the GPU?

Answer Source

It turns out that the qualifier macros TensorFlow pulls in from Eigen will do what I describe: `EIGEN_DEVICE_FUNC` expands to `__host__ __device__` when compiling with nvcc and to nothing in a CPU-only build, and `EIGEN_STRONG_INLINE` forces inlining.

namespace tensorflow {
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void foo() {
  // shared CPU/GPU implementation
}
}  // namespace tensorflow