Seja Nair - 28 days ago
Python Question

Tensorflow RNN with varying length sentences

I was trying to use an RNN (LSTM) for sequential prediction, and I ran into an issue. For example:

sent_1 = 'I am flying to Dubai'
sent_2 = 'I was traveling from US to Dubai'

What I am trying to do here is predict the next word given the previous ones, with a simple RNN language model.

But the num_steps parameter (the number of time steps the hidden state is unrolled over) has to stay the same for every batch in TensorFlow. Basically, batching the sentences together is not possible, as they vary in length.
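To make the problem concrete, here is a minimal sketch (plain Python, with a hypothetical `"<pad>"` token) of why a fixed num_steps forces padding before the two example sentences can share a batch:

```python
# Two tokenized sentences of different lengths cannot form a
# rectangular batch directly; they must first be padded to a
# common num_steps. "<pad>" is a hypothetical padding token.
sent_1 = "I am flying to Dubai".split()          # 5 tokens
sent_2 = "I was traveling from US to Dubai".split()  # 7 tokens

num_steps = max(len(sent_1), len(sent_2))  # 7

def pad(tokens, length, pad_token="<pad>"):
    """Right-pad a token list to a fixed length."""
    return tokens + [pad_token] * (length - len(tokens))

batch = [pad(sent_1, num_steps), pad(sent_2, num_steps)]
# Every row now has exactly num_steps tokens.
```

The padding itself is easy; the real question in this post is how to keep the RNN from treating the pad positions as real words.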

inputs = [tf.squeeze(input_, [1])
          for input_ in tf.split(1, num_steps, inputs)]
outputs, states = rnn.rnn(cell, inputs, initial_state=self._initial_state)

Here, num_steps would need to change for every sentence in my case. I have tried several hacks, but nothing seems to work.


You can use the ideas of bucketing and padding, which are described here:

Also, the rnn function which creates the RNN network accepts the parameter sequence_length.

As an example, you can create buckets of sentences of similar size, pad each one with the necessary number of zeros (or placeholders which stand for a zero word), and afterwards feed them along with seq_length set to the true (unpadded) length of each sentence, so the RNN stops updating state after the last real word.
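A minimal sketch of that bucketing-and-padding step in plain Python (the bucket sizes, the pad id 0, and the helper name are my own choices, not part of any TensorFlow API):

```python
def bucket_and_pad(sentences, bucket_sizes, pad_id=0):
    """Assign each tokenized sentence (a list of word ids) to the
    smallest bucket that fits it, right-pad it to the bucket size,
    and record its true (unpadded) length for sequence_length."""
    buckets = {size: [] for size in bucket_sizes}
    lengths = {size: [] for size in bucket_sizes}
    for tokens in sentences:
        for size in sorted(bucket_sizes):
            if len(tokens) <= size:
                buckets[size].append(tokens + [pad_id] * (size - len(tokens)))
                lengths[size].append(len(tokens))
                break
    return buckets, lengths

# Three sentences of lengths 3, 5, and 1, with buckets of size 4 and 8.
buckets, lengths = bucket_and_pad([[1, 2, 3], [4, 5, 6, 7, 8], [9]], [4, 8])
```

Each bucket then yields rectangular batches of one fixed num_steps, and the recorded lengths are what you feed as sequence_length.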

seq_length = tf.placeholder(tf.int32)
outputs, states = rnn.rnn(cell, inputs, initial_state=initial_state,
                          sequence_length=seq_length)

sess = tf.Session()
feed = {
    seq_length: 20,
    # other feeds
}
sess.run(outputs, feed_dict=feed)
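When you build each padded batch, the true lengths can be computed once and fed alongside the inputs. A sketch in NumPy (the pad id 0 and the helper name are assumptions; this relies on sentences being right-padded, so the pad id never appears mid-sentence):

```python
import numpy as np

def true_lengths(padded_batch, pad_id=0):
    """Return the unpadded length of each row in a right-padded id matrix."""
    mask = padded_batch != pad_id  # True wherever a real token sits
    return mask.sum(axis=1)

batch = np.array([[3, 7, 9, 0, 0],
                  [4, 1, 2, 6, 5]])
lengths = true_lengths(batch)  # one length per sentence: 3 and 5
```

These per-sentence lengths are what you would feed for seq_length instead of a single hard-coded number.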

Take a look at this reddit thread as well: