D Liebman - 22 days ago
Python Question

TensorFlow: making an op for GPU usage, demo app doesn't work

I'm using TensorFlow, and after creating the following files I get the error below. I suspect I'm supplying the wrong kind of input to the op, but I don't know how to change it to the proper representation.

dijkstra.py :

self.maze = tf.Variable(tf.zeros([64], dtype=tf.int32), name="grid")

print self.maze
if True:
    self.grid_module = tf.load_op_library('d_grid_gpu.so')
    with tf.Session('') as sess:
        sess.run(tf.initialize_all_variables())
        self.output = self.grid_module.grid_gpu(
            self.maze
        ).eval()


d_grid_gpu.cc :

#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

REGISTER_OP("GridGpu").Input("grid: int32").Output("prev: int32");

void run(int* in);

class DGridGpuOp : public OpKernel {
 public:
  explicit DGridGpuOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    Tensor* prev_tensor = NULL;

    Tensor grid_tensor = context->input(0);
    auto grid = grid_tensor.flat<int32>();

    OP_REQUIRES_OK(context, context->allocate_output(
                                0, TensorShape({64}), &prev_tensor));

    auto prev = prev_tensor->template flat<int32>();

    run(grid.data());
  }
};

REGISTER_KERNEL_BUILDER(Name("GridGpu").Device(DEVICE_GPU), DGridGpuOp);


d_grid_gpu.cu.cc :

#if GOOGLE_CUDA
#define EIGEN_USE_GPU
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"

#include <stdio.h>
#define SIZE 10

__global__ void VectorAdd(int* in, int n) {
  int i = threadIdx.x;

  if (i < n)
    in[i] = in[i] + i;
}

void run(int* in) {
  VectorAdd<<<1, SIZE>>>(in, SIZE);

  /*
  // these lines cause the segfault
  // for (int i = 0; i < SIZE; i++) {
  //   printf("%i, ", in[i]);
  // }
  */
}

#endif
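
For checking the op once it runs, the effect of the VectorAdd kernel above can be mirrored on the CPU. The following is a minimal pure-Python sketch, assuming the 64-element zero grid from dijkstra.py and the single block of SIZE = 10 threads from d_grid_gpu.cu.cc. The function name vector_add_reference is hypothetical and not part of the original files. Note that, as posted, the kernel updates the input buffer in place and nothing is ever written to the "prev" output, so this reference describes the input buffer after the launch, not the op's return value.

```python
SIZE = 10  # mirrors "#define SIZE 10" in d_grid_gpu.cu.cc


def vector_add_reference(grid, n=SIZE):
    """CPU mirror of the VectorAdd kernel: element i gets i added, for i < n."""
    result = list(grid)  # work on a copy; the CUDA kernel updates in place
    # One block of n threads is launched, so only indices 0..n-1 are touched.
    # min() guards short inputs here; the kernel itself only checks i < n.
    for i in range(min(n, len(result))):
        result[i] = result[i] + i
    return result


expected = vector_add_reference([0] * 64)
print(expected[:12])
```

Only the first SIZE entries change; the remaining entries of the 64-element grid stay as they were.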


build script :

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')


nvcc -std=c++11 -c -o d_grid_gpu.cu.o d_grid_gpu.cu.cc \
-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC --expt-relaxed-constexpr

g++ -std=c++11 -shared -o d_grid_gpu.so d_grid_gpu.cc \
d_grid_gpu.cu.o -I $TF_INC -fPIC -lcudart -D_GLIBCXX_USE_CXX11_ABI=0 -L /usr/lib/x86_64-linux-gnu/


Edit: I removed the old output.

I tried the 'add_one' op (from the TF how-to page) and I think I got it to work, which leads me to believe my installation is OK. This example compiles as well. I just cannot get the registration right, I guess, or something else is wrong. Any help would be welcome.

Edit: I reinstalled TensorFlow and now the error is a little different:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
simple dijkstra for tensorflow
<tensorflow.python.ops.variables.Variable object at 0x7fdec57c1b50>
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 850M
major: 5 minor: 0 memoryClockRate (GHz) 0.9015
pciBusID 0000:0a:00.0
Total memory: 3.95GiB
Free memory: 3.64GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 850M, pci bus id: 0000:0a:00.0)
Traceback (most recent call last):
  File "test_op.py", line 45, in <module>
    d.eval()
  File "/home/dave/workspace/awesome-tf/test_gpu/dijkstra.py", line 57, in eval
    self.maze
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 559, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3761, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 915, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 985, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value grid
     [[Node: grid/read = Identity[T=DT_INT32, _class=["loc:@grid"], _device="/job:localhost/replica:0/task:0/cpu:0"](grid)]]

Caused by op u'grid/read', defined at:
  File "test_op.py", line 45, in <module>
    d.eval()
  File "/home/dave/workspace/awesome-tf/test_gpu/dijkstra.py", line 50, in eval
    self.maze = tf.Variable(tf.zeros([64], dtype=tf.int32), name="grid")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 215, in __init__
    dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 327, in _init_from_args
    self._snapshot = array_ops.identity(self._variable, name="read")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1128, in identity
    result = _op_def_lib.apply_op("Identity", input=input, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
    self._traceback = _extract_stack()

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value grid
     [[Node: grid/read = Identity[T=DT_INT32, _class=["loc:@grid"], _device="/job:localhost/replica:0/task:0/cpu:0"](grid)]]


Sometimes this is my output instead:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
simple dijkstra for tensorflow
<tensorflow.python.ops.variables.Variable object at 0x7fba5d0dafd0>
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 850M
major: 5 minor: 0 memoryClockRate (GHz) 0.9015
pciBusID 0000:0a:00.0
Total memory: 3.95GiB
Free memory: 3.67GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 850M, pci bus id: 0000:0a:00.0)
Segmentation fault (core dumped)


This happens when I call initialize_all_variables().

Answer

You probably want to use tf.initialize_all_variables to initialize the variables, and to run it in the same session that later evaluates the op. For instance:

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
