clickListener - 1 year ago
Python Question

cross entropy is nan

I am deploying my conv-deconv net. My problem is that the cross entropy is always NaN during training, so the solver doesn't update the weights. I checked my code all day but couldn't find where I went wrong. The following is my architecture:
[Image: network architecture]
Here is my cross entropy function:

ys_reshape = tf.reshape(ys,[-1,1])
prediction = tf.reshape(relu4,[-1,1])
cross_entropy = tf.reduce_mean(-(ys_reshape*tf.log(prediction)))
train_step = tf.train.AdamOptimizer(0.01).minimize(cross_entropy)

where the shape of ys is [1,500,500,1], ys_reshape is [250000,1], relu4 is [1,500,500,1] and prediction is [250000,1]. The values of the label matrix ys are in {0,1}, so this is a two-class dense prediction.

If I print train_step, it displays None. Can anyone help me?

Answer Source

You did a great job of narrowing the problem down to the right couple of lines of code.

So your predicted probability is directly the output of ReLU4?

There are two problems with that.

First: it can be greater than one.

Second: it can be exactly zero (anywhere the input to ReLU4 is negative, its output will be zero).

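log(0) -> -inf, and wherever the label is 0 as well you get 0 * -inf -> NaN, which turns the whole mean into NaN.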
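To see how a single zero prediction poisons the whole loss, here is a minimal sketch in plain Python (no TensorFlow). The helper names ieee_log and naive_cross_entropy are made up for illustration; ieee_log mimics the floating-point behaviour where log(0) evaluates to -inf instead of raising, which is an assumption about how the log op in the question behaves.

```python
import math

def ieee_log(x):
    # math.log raises on 0, so return -inf by hand to mimic
    # IEEE-754 style array libraries (assumed behaviour of the log op).
    return -math.inf if x == 0.0 else math.log(x)

def naive_cross_entropy(labels, predictions):
    # Same shape as the loss in the question: mean of -(y * log(p)).
    terms = [-(y * ieee_log(p)) for y, p in zip(labels, predictions)]
    return sum(terms) / len(terms)

# One pixel where both the label and the ReLU output are exactly 0.
loss_with_zero = naive_cross_entropy([0.0, 1.0], [0.0, 0.5])

# Same labels, but the first prediction is small and positive.
loss_without_zero = naive_cross_entropy([0.0, 1.0], [0.2, 0.5])

print(loss_with_zero, loss_without_zero)
```

A single 0 * -inf term is enough: NaN propagates through the sum, so the mean over all 250000 pixels is NaN too.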

The usual approach to this is to treat the linear activations (no ReLU) as the log-odds of each class, so that a sigmoid maps them to probabilities strictly between 0 and 1.
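As a rough illustration of the log-odds idea, the sketch below compares what ReLU and a sigmoid do to the same raw activations. It uses plain Python, and the example activation values and helper names (relu, sigmoid, log_odds) are made up for the demonstration.

```python
import math

def relu(x):
    return max(x, 0.0)

def sigmoid(x):
    # Maps a log-odds value to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def log_odds(p):
    # Inverse of the sigmoid: probability back to log-odds.
    return math.log(p / (1.0 - p))

activations = [-3.0, 0.0, 2.5]  # hypothetical pre-ReLU values

# ReLU output is not a valid probability: it hits 0 and exceeds 1.
relu_outputs = [relu(a) for a in activations]

# Sigmoid output always stays strictly between 0 and 1 for these inputs.
probabilities = [sigmoid(a) for a in activations]

# Going back through log_odds recovers the original activations.
recovered = [log_odds(p) for p in probabilities]

print(relu_outputs, probabilities, recovered)
```

Read this way, the network is free to output any real number, and the interpretation as a probability happens inside the loss.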

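A naive implementation (taking the sigmoid and then the log yourself) breaks easily because of numerical issues: the sigmoid rounds to exactly 0 or 1 for large activations, and the log then blows up.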

Since you have a single output per pixel (a binary label), you should use tf.nn.sigmoid_cross_entropy_with_logits, applied to the activations before ReLU4.
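For intuition about what the library function protects you from, here is a sketch in plain Python of the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)), where x is the logit and z the label. As far as I recall this is the formulation given in the TensorFlow documentation, but treat it as an illustration and not as the library's actual code. The function names naive_loss and stable_loss are made up.

```python
import math

def naive_loss(logit, label):
    # Sigmoid first, then log: breaks once the sigmoid rounds to 0 or 1.
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def stable_loss(logit, label):
    # Algebraically the same loss, rearranged so no log(0) can occur.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# A confident but wrong prediction: large positive logit, label 0.
try:
    naive_value = naive_loss(40.0, 0.0)
    naive_failed = False
except ValueError:
    # math.log(0.0) raises here; an array library would give inf or NaN.
    naive_failed = True

stable_value = stable_loss(40.0, 0.0)
stable_at_zero = stable_loss(0.0, 1.0)

print(naive_failed, stable_value, stable_at_zero)
```

In the question's graph, the equivalent change would be to pass the activations from before ReLU4 as logits and ys_reshape as labels to tf.nn.sigmoid_cross_entropy_with_logits, and take tf.reduce_mean of the result.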