clickListener - 11 months ago 103

Python Question

I am deploying my conv-deconv net. My problem is that the cross entropy is always NaN during training, so the solver doesn't update the weights. I checked my code all day but couldn't find where I went wrong. The following is my architecture:

Here is my cross entropy function:

`ys_reshape = tf.reshape(ys,[-1,1])`

`prediction = tf.reshape(relu4,[-1,1])`

`cross_entropy = tf.reduce_mean(-(ys_reshape*tf.log(prediction)))`

`train_step = tf.train.AdamOptimizer(0.01).minimize(cross_entropy)`

Here the shape of `ys` is [1,500,500,1], `ys_reshape` is [250000,1], `relu4` is [1,500,500,1] and `prediction` is [250000,1]. The values in the label matrix `ys` are in {0,1}, so this is a two-category dense prediction.

If I print `train_step` out, it displays None. Can anyone help me?

Answer Source

You did a great job of narrowing the problem down to the right couple of lines of code.

So your predicted probability is directly the output of `ReLU4`?

There are two problems with that.

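First: it can be greater than one.

Second: it can be exactly zero (anywhere the input to `ReLU4` is negative, its output will be zero).

`log(0)` is `-inf`, and wherever the label is also zero the product `0 * -inf` is NaN, which turns the whole mean into NaN.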
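To make the failure concrete, here is a minimal sketch in plain Python rather than TensorFlow. The helper names `ieee_log` and `naive_term` are made up for illustration, and the sketch assumes the log follows floating-point conventions (log of 0 is `-inf`) instead of raising an error the way `math.log` does.

```python
import math

def ieee_log(p):
    """Log that returns -inf at 0 (floating-point convention),
    instead of raising ValueError like math.log does."""
    return -math.inf if p == 0.0 else math.log(p)

def naive_term(y, p):
    """One element of -(ys * log(prediction)) from the question."""
    return -(y * ieee_log(p))

# Label 1, prediction 0: the term is +inf.
print(naive_term(1.0, 0.0))

# Label 0, prediction 0: 0 * -inf is NaN.
print(naive_term(0.0, 0.0))

# A single NaN element is enough to make the mean NaN.
terms = [naive_term(0.0, 0.0), naive_term(1.0, 0.7)]
print(sum(terms) / len(terms))
```

With a ReLU output used as the prediction, many of the 250000 elements are likely to be exactly zero, so this case is hit almost immediately.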

The usual approach is to treat the linear activations (no ReLU) as the log-odds of each class.

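A naive implementation that applies a sigmoid and then takes the log is numerically fragile: the probability can round to exactly 0 or 1, and the log blows up.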

Since you have a single class, you should use `tf.nn.sigmoid_cross_entropy_with_logits`, which takes those linear activations as its `logits` argument.
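As a rough illustration of why the logits-based version is safer, here is a plain-Python sketch of the stable formula given in the TensorFlow documentation for this op, `max(x, 0) - x * z + log(1 + exp(-abs(x)))`, next to a naive sigmoid-then-log version. The function names are hypothetical and this is not the library's actual implementation, only the same arithmetic on scalars.

```python
import math

def sigmoid_xent_from_logit(label, logit):
    """Stable per-element sigmoid cross-entropy computed from a logit x
    and label z: max(x, 0) - x*z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def naive_sigmoid_xent(label, logit):
    """Naive version: squash to a probability first, then take logs."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def mean_sigmoid_xent(labels, logits):
    """Mean of the stable loss over all elements."""
    total = sum(sigmoid_xent_from_logit(z, x) for z, x in zip(labels, logits))
    return total / len(labels)

# For a large logit the sigmoid rounds to exactly 1.0, so the naive
# version ends up taking log(0), which math.log rejects.
try:
    naive_sigmoid_xent(0.0, 40.0)
    naive_failed = False
except ValueError:
    naive_failed = True
print(naive_failed)

# The stable version stays finite for the same input.
print(sigmoid_xent_from_logit(0.0, 40.0))
print(mean_sigmoid_xent([0.0, 1.0, 1.0], [-3.0, 0.0, 2.5]))
```

In the question's graph, this would mean passing the pre-ReLU activations (not `relu4`) as `logits` and `ys_reshape` as `labels` to `tf.nn.sigmoid_cross_entropy_with_logits`, then taking `tf.reduce_mean` of the result.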