DaveTheAl - 1 year ago

Python Question

I am quite a beginner with tensorflow. I have built simple models, but haven't tried out something like a multi-layer LSTM yet, so any kind of feedback is greatly appreciated :)

I am currently trying to recode a character-level model as built by sherjilozair from the ground up, simply because I wanted to know how to use tensorflow (I had previously built my own really small DL-library as assigned by cs231n). Now I am currently struggling to build a simple 2-layer LSTM model, and am not sure what is wrong. Here is the code I've written so far:

    class Model():
        def __init__(self, batch_size, seq_length, lstm_size, num_layers, grad_clip, vocab_size):
            self.lr = tf.Variable(0.0, trainable=False)

            #Define input and output
            self.input_data = tf.placeholder(tf.float32, [batch_size, seq_length])
            self.output_data = tf.placeholder(tf.float32, [batch_size, seq_length]) #although int would be better for character level..

            #Define the model
            cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=lstm_size) #can choose if basic or otherwise later on...
            self.cell = cell = rnn_cell.MultiRNNCell([cell] * num_layers)
            self.initial_state = cell.zero_state(batch_size, tf.float32)

            with tf.variable_scope("lstm"):
                softmax_w = tf.get_variable("softmax_w", [lstm_size, vocab_size])
                softmax_b = tf.get_variable("softmax_b", [vocab_size])

            #_, enc_state = rnn.rnn(cell, encoder_inputs, dtype=dtype)
            #outputs, states = rnn_decoder(decoder_inputs, enc_state, cell)
            outputs, states = seq2seq.basic_rnn_seq2seq(
                [self.input_data],
                [self.output_data],
                cell,
                scope='lstm'
            )

            #see how attention helps improving this model state...
            #was told that we should actually use samples softmax loss
            self.loss = tf.nn.sampled_softmax_loss(
                softmax_w,
                softmax_b,
                outputs,
                self.output_data,
                batch_size,
                vocab_size
            )

I am currently running into problems with `tf.nn.sampled_softmax_loss`. I've come a long way with debugging, but I still don't understand TensorFlow's input conventions. Do I have to pass a list of tensors every time?

I get the following error:

    Traceback (most recent call last):
      File "Model.py", line 76, in <module>
        vocab_size=82
      File "Model.py", line 52, in __init__
        vocab_size
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 1104, in sampled_softmax_loss
        name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 913, in _compute_sampled_logits
        array_ops.expand_dims(inputs, 1),
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 506, in expand_dims
        return _op_def_lib.apply_op("ExpandDims", input=input, dim=dim, name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 411, in apply_op
        as_ref=input_arg.is_ref)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 566, in convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.py", line 179, in _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.py", line 162, in constant
        tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 332, in make_tensor_proto
        _AssertCompatible(values, dtype)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 269, in _AssertCompatible
        raise TypeError("List of Tensors when single Tensor expected")
    TypeError: List of Tensors when single Tensor expected

I am not sure what I am doing wrong, with either the inputs, or the generation of the variables etc. The problem - as said - seems to be in the sampled_softmax_loss function, but I am really not sure.. I am calling the class with the following parameters (just as placeholders, just to test if the model is 'runnable'):

    Model = Model(batch_size=32,
                  seq_length=128,
                  lstm_size=512,
                  num_layers=2,
                  grad_clip=5,
                  vocab_size=82
                  )

Also, if I have made any other mistakes etc. please let me know in the comments! This is my first model with seq2seq models in tensorflow, so any advice is greatly appreciated!

Answer

This particular error comes from passing `outputs`, which is a list, when `tf.nn.sampled_softmax_loss` expects a single tensor.

The `seq2seq.basic_rnn_seq2seq` function returns a list of tensors of size `[batch_size x output_size]` as its first output. Assuming each of your outputs is one-dimensional, you want to join the output list into a single tensor of size `[seq_len x batch_size x 1]` (`tf.stack`, called `tf.pack` in older releases, adds the new leading dimension; `tf.concat` on its own only joins along an existing dimension), `tf.squeeze` the last dimension (resulting in `[seq_len x batch_size]`), and `tf.transpose` to make `output` have size `[batch_size x seq_len]`, the same as `self.output_data`.
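The reshaping steps can be sketched with NumPy standing in for TensorFlow, on the assumption that `np.stack`, `np.squeeze`, and `np.transpose` follow the same shape rules as `tf.stack`, `tf.squeeze`, and `tf.transpose`. The sizes and the dummy `outputs` list are made up for illustration and are not taken from the question's model.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
seq_len, batch_size = 4, 3

# Stand-in for the per-step outputs: a Python list holding one
# [batch_size x 1] array per time step (the one-dimensional case).
outputs = [np.full((batch_size, 1), float(t)) for t in range(seq_len)]

# Stack along a new leading axis: [seq_len x batch_size x 1].
stacked = np.stack(outputs, axis=0)

# Drop the trailing singleton axis: [seq_len x batch_size].
squeezed = np.squeeze(stacked, axis=2)

# Swap the two remaining axes: [batch_size x seq_len].
output = np.transpose(squeezed)

print(stacked.shape, squeezed.shape, output.shape)
```

Each row of `output` then holds one batch element's values across all time steps, which is the layout of `self.output_data`.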

To debug the problem, print the tensor sizes using `print(output.get_shape())`.
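A small helper along these lines can make the list-versus-tensor distinction visible while debugging. This is only a minimal sketch: `describe_shapes` is a hypothetical name, and NumPy arrays with their `.shape` attribute stand in for TensorFlow tensors, where `get_shape()` plays the same role.

```python
import numpy as np

def describe_shapes(value):
    """Summarize either one array's shape or the shapes in a list of arrays."""
    if isinstance(value, (list, tuple)):
        shapes = [tuple(v.shape) for v in value]
        return "list of %d items: %s" % (len(value), shapes)
    return "single tensor: %s" % (tuple(value.shape),)

# Illustrative data: three time steps of [32 x 1] values.
outputs = [np.zeros((32, 1)) for _ in range(3)]

print(describe_shapes(outputs))            # a list, not yet usable as one tensor
print(describe_shapes(np.stack(outputs)))  # one array after stacking
```

Seeing "list of ..." for a value that is about to be passed to a loss function is a hint that it still needs to be combined into a single tensor first.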