I'm using TensorFlow 1.0 and its CTC loss [1].
When training, I sometimes get the "No valid path found." warning, which harms learning. It is not due to a high learning rate, as sometimes reported by other TensorFlow users.
After analyzing it a bit, I found the pattern that causes this warning:
import tensorflow as tf
import numpy as np

def createGraph():
    tinputs = tf.placeholder(tf.float32, [100, 1, 65])  # max 100 time steps, 1 batch element, 64+1 classes
    tlabels = tf.SparseTensor(tf.placeholder(tf.int64, shape=[None, 2]), tf.placeholder(tf.int32, [None]), tf.placeholder(tf.int64, [2]))  # labels
    tseqLen = tf.placeholder(tf.int32, [None])  # list of sequence lengths in batch
    tloss = tf.reduce_mean(tf.nn.ctc_loss(labels=tlabels, inputs=tinputs, sequence_length=tseqLen, ctc_merge_repeated=True))  # ctc loss
    return (tinputs, tlabels, tseqLen, tloss)

def getNextBatch(nc):  # next batch with given number of chars in label
    indices = [[0, i] for i in range(nc)]
    values = [i % 65 for i in range(nc)]
    values[0] = 0
    values[1] = 0  # TODO: (un)comment this to trigger the warning
    shape = [1, nc]
    labels = tf.SparseTensorValue(indices, values, shape)
    seqLen = [nc]
    inputs = np.random.rand(100, 1, 65)
    return (labels, inputs, seqLen)

(tinputs, tlabels, tseqLen, tloss) = createGraph()

sess = tf.Session()
sess.run(tf.global_variables_initializer())

nc = 3  # number of chars in label
print('next batch with 1 element has label len=' + str(nc))
(labels, inputs, seqLen) = getNextBatch(nc)
res = sess.run([tloss], {tlabels: labels, tinputs: inputs, tseqLen: seqLen})
Searching the TensorFlow sources for the warning message leads to this check in the CTC loss calculator:

// It is possible that no valid path is found if the activations for the
// targets are zero.
if (log_p_z_x == kLogZero) {
  LOG(WARNING) << "No valid path found.";
  dy_b = y;
  return;
}
OK, got it, that's not a bug; it's just how CTC works. Take an example for which the warning occurs: the input sequence has length 2 and the labelling is "aa" (also length 2).
Now the shortest path which yields "aa" is a -> blank -> a (length 3): without the inserted blank, the two repeated characters would be merged into one. For the labelling "ab", in contrast, the shortest path is a -> b (length 2). That shows why repeated labels, as in "aa", require a longer input sequence; it is simply how CTC encodes repeated labels, by inserting blanks between them.
Repeated labels therefore decrease the maximum length of a labelling that can be encoded for a fixed input size.
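Putting that rule into code: the minimum number of time steps CTC needs for a labelling is its length plus one extra step for every pair of adjacent repeated characters. Here is a minimal sketch to check batch elements before feeding them (the helper name minRequiredInputLen is mine, not part of the TensorFlow API):

def minRequiredInputLen(label):
    # one time step per character, plus one blank between each
    # pair of adjacent repeated characters ("aa" -> a, blank, a)
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats

assert minRequiredInputLen([0, 2]) == 2     # "ab": shortest path a -> b
assert minRequiredInputLen([0, 0]) == 3     # "aa": shortest path a -> blank -> a
assert minRequiredInputLen([0, 0, 2]) == 4  # label from the script above (nc=3, values[0]=values[1]=0)

With this check, the script above triggers the warning exactly when seqLen[0] < minRequiredInputLen(values), so affected batch elements can be filtered out (or their input sequences made longer) before calling ctc_loss.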