Domenic Quirl - 1 month ago
Python Question

TensorFlow fails to convert string to number

I'm building a neural network that is supposed to classify input words in some way. Without going into much detail on the network itself, I was looking for a way to convert my input words to an integer format so that I can use TensorFlow's tf.nn.embedding_lookup(...) for input encoding.

I noticed that tf.string_to_number() exists, so I tried using that, but it failed. At first I thought it was related to what I'm doing in my network, but even when doing something like

import tensorflow as tf
s = tf.string_to_number("TEST", out_type=tf.int32)
sess = tf.InteractiveSession()
sess.run(s)


in a Python console, I get the same error:

tensorflow.python.framework.errors.InvalidArgumentError:
StringToNumberOp could not correctly convert string: TEST


I also tried creating a tf.constant("TEST", dtype=tf.string) first and passing that to tf.string_to_number(). I also ran this test code on a web server to make sure it wasn't related to my setup, but got the same result.

Can anyone tell me what I'm missing here? Thanks in advance!

Answer

Can anyone tell me what I'm missing here?

You are missing the purpose of string_to_number: it converts a number that is represented as a string into a numeric type, as in tf.string_to_number('1'). It is not a "one-hot encoder" for strings (how would it be able to figure out the size of the vocabulary in the first place?).
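As a rough analogy in plain Python (this is not TensorFlow code, and parse_int is a hypothetical helper made up for illustration), parsing a numeric string succeeds while parsing an arbitrary word fails, which is essentially what the error above is reporting:

```python
def parse_int(text):
    """Return int(text), or None if text does not spell out a number."""
    try:
        return int(text)
    except ValueError:
        # "TEST" contains no digits, so there is no number to recover.
        return None


print(parse_int("1"))     # a numeric string parses fine
print(parse_int("TEST"))  # an ordinary word cannot be parsed
```

The conversion only reads digits that are already present in the string; it does not assign ids to words.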

TensorFlow itself ships a nice tutorial, word2vec_basic.py, that shows how to train embedding models. It goes through everything, starting with reading the data and ending with a full embedding that uses the lookup op.
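What the lookup op needs is an integer id per word, which usually comes from a vocabulary built over the training text. The following is a minimal sketch of that idea in plain Python, not the tutorial's actual code; build_vocab, encode, the "<UNK>" token, and the sample corpus are all assumptions chosen for illustration:

```python
from collections import Counter


def build_vocab(words, max_size):
    """Map the most frequent words to integer ids; id 0 is kept for unknown words."""
    vocab = {"<UNK>": 0}
    for word, _count in Counter(words).most_common(max_size - 1):
        vocab[word] = len(vocab)
    return vocab


def encode(words, vocab):
    """Replace each word by its id, falling back to 0 for out-of-vocabulary words."""
    return [vocab.get(word, 0) for word in words]


# Hypothetical toy corpus; a real one would be read from a file.
corpus = "the quick brown fox jumps over the lazy dog the fox".split()
vocab = build_vocab(corpus, max_size=5)
ids = encode(["the", "fox", "TEST"], vocab)
print(ids)
```

A list of ids like this (rather than the raw strings) is what would be fed, as an integer tensor, to tf.nn.embedding_lookup. The vocabulary size fixed here also determines the number of rows in the embedding matrix.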