DaveTheAl - 22 days ago
Python Question

How to use sampled_softmax_loss in Tensorflow

I am quite a beginner with TensorFlow. I have built simple models, but haven't tried anything like a multi-layer LSTM yet, so any kind of feedback is greatly appreciated :)

I am currently trying to recode a character-level model as built by sherjilozair from the ground up, simply because I wanted to know how to use tensorflow (I had previously built my own really small DL-library as assigned by cs231n). Now I am currently struggling to build a simple 2-layer LSTM model, and am not sure what is wrong. Here is the code I've written so far:

class Model():
    def __init__(self, batch_size, seq_length, lstm_size, num_layers, grad_clip, vocab_size):
        self.lr = tf.Variable(0.0, trainable=False)

        # Define input and output
        self.input_data = tf.placeholder(tf.float32, [batch_size, seq_length])
        self.output_data = tf.placeholder(tf.float32, [batch_size, seq_length])  # although int would be better for character level..

        # Define the model
        cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=lstm_size)  # can choose if basic or otherwise later on...
        self.cell = cell = rnn_cell.MultiRNNCell([cell] * num_layers)
        self.initial_state = cell.zero_state(batch_size, tf.float32)

        with tf.variable_scope("lstm"):
            softmax_w = tf.get_variable("softmax_w", [lstm_size, vocab_size])
            softmax_b = tf.get_variable("softmax_b", [vocab_size])

        #_, enc_state = rnn.rnn(cell, encoder_inputs, dtype=dtype)
        #outputs, states = rnn_decoder(decoder_inputs, enc_state, cell)

        outputs, states = seq2seq.basic_rnn_seq2seq(
            [self.input_data],
            [self.output_data],
            cell,
            scope='lstm'
        )

        # see how attention helps improving this model state...

        # was told that we should actually use sampled softmax loss
        self.loss = tf.nn.sampled_softmax_loss(
            softmax_w,
            softmax_b,
            outputs,
            self.output_data,
            batch_size,
            vocab_size
        )


I am currently having issues with tf.nn.sampled_softmax_loss. I've come a long way with debugging, but I don't understand TensorFlow's input conventions. Do I have to pass a list of tensors every time?

I get the following error:

Traceback (most recent call last):
  File "Model.py", line 76, in <module>
    vocab_size=82
  File "Model.py", line 52, in __init__
    vocab_size
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 1104, in sampled_softmax_loss
    name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 913, in _compute_sampled_logits
    array_ops.expand_dims(inputs, 1),
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 506, in expand_dims
    return _op_def_lib.apply_op("ExpandDims", input=input, dim=dim, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 411, in apply_op
    as_ref=input_arg.is_ref)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 566, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.py", line 179, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.py", line 162, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 332, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 269, in _AssertCompatible
    raise TypeError("List of Tensors when single Tensor expected")
TypeError: List of Tensors when single Tensor expected


I am not sure what I am doing wrong, whether with the inputs, the creation of the variables, or something else. The problem, as I said, seems to be in the sampled_softmax_loss function, but I am really not sure. I am calling the class with the following parameters (just as placeholders, to test whether the model is 'runnable'):

Model = Model(batch_size=32,
              seq_length=128,
              lstm_size=512,
              num_layers=2,
              grad_clip=5,
              vocab_size=82
              )


Also, if I have made any other mistakes, please let me know in the comments! This is my first seq2seq model in TensorFlow, so any advice is greatly appreciated!

Answer

This particular error comes from passing outputs, which is a list, where tf.nn.sampled_softmax_loss expects a single tensor.

The seq2seq.basic_rnn_seq2seq function returns a list of tensors, each of size [batch_size x output_size], as its first output. Assuming each of your outputs is one-dimensional (output_size is 1), you want to join the list into a single tensor of size [seq_len x batch_size x 1]. Note that tf.concat joins along an existing dimension, so to get the new leading dimension you need to stack the list (tf.pack in older releases, tf.stack in newer ones). Then tf.squeeze the last dimension (giving [seq_len x batch_size]) and tf.transpose so that the output has size [batch_size x seq_len], the same as self.output_data.
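
The shape manipulation described above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses NumPy arrays as stand-ins for the tensors (np.stack, np.squeeze and np.transpose play the roles of tf.stack, or tf.pack in older releases, tf.squeeze and tf.transpose), and the sizes seq_len=4 and batch_size=3 are made up. Stacking is used because the list has to gain a new leading dimension.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
seq_len, batch_size = 4, 3

# Stand-in for the list returned by the seq2seq call:
# one [batch_size x 1] array per time step.
outputs = [np.full((batch_size, 1), float(t)) for t in range(seq_len)]

# Stack along a new leading axis: [seq_len x batch_size x 1].
stacked = np.stack(outputs, axis=0)

# Drop the trailing size-1 axis: [seq_len x batch_size].
squeezed = np.squeeze(stacked, axis=2)

# Swap the two axes: [batch_size x seq_len].
result = np.transpose(squeezed)

print(stacked.shape, squeezed.shape, result.shape)
# prints: (4, 3, 1) (4, 3) (3, 4)
```

Each row of result then holds one sequence of the batch, which matches the [batch_size x seq_len] layout of self.output_data.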

To debug the problem, print the size of each tensor in the list using print(output.get_shape()).
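
A small helper along these lines might make the check easier, since it also shows whether a value is a list or a single tensor. It is only a sketch: describe_shapes and FakeTensor are hypothetical names invented for this example, and FakeTensor merely imitates the get_shape() method so the snippet runs without TensorFlow. With real tensors you would pass outputs directly. The shape (32, 512) is an assumed example value.

```python
def describe_shapes(name, value):
    """Return one line per tensor, flagging whether `value` is a list."""
    if isinstance(value, (list, tuple)):
        lines = ["%s is a %s of %d items" % (name, type(value).__name__, len(value))]
        for i, item in enumerate(value):
            lines.append("  %s[%d]: %s" % (name, i, item.get_shape()))
        return lines
    return ["%s: %s" % (name, value.get_shape())]


class FakeTensor(object):
    """Hypothetical stand-in exposing only get_shape(), for the demo."""

    def __init__(self, shape):
        self._shape = tuple(shape)

    def get_shape(self):
        return self._shape


# Assumed example: a one-element list, as a seq2seq call might return.
outputs = [FakeTensor((32, 512))]
for line in describe_shapes("outputs", outputs):
    print(line)
```

If the first line reports a list, the value has to be combined into one tensor before it is passed to a function that expects a single tensor.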
