Vineet Kaushik - 5 months ago 59

Python Question

I am trying to use a deep neural network architecture to classify against a binary label value - -1 and +1. Here is my code to do it in `tensorflow`:

```
import tensorflow as tf
import numpy as np
from preprocess import create_feature_sets_and_labels

train_x,train_y,test_x,test_y = create_feature_sets_and_labels()

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float')

n_nodes_hl1 = 500
n_nodes_hl2 = 500
n_nodes_hl3 = 500

n_classes = 1
batch_size = 100

def neural_network_model(data):
    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([5, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}

    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3)

    output = tf.transpose(tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases']))
    return output

def train_neural_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(prediction, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])

                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                              y: batch_y})
                epoch_loss += c
                i+=batch_size

            print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)

        # correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        # accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        print (test_x.shape)
        accuracy = tf.nn.l2_loss(prediction-y,name="squared_error_test_cost")/test_x.shape[0]
        print('Accuracy:', accuracy.eval({x: test_x, y: test_y}))

train_neural_network(x)
```

This is the output I get when I run this:

```
('Epoch', 0, 'completed out of', 10, 'loss:', -8400.2424869537354)
('Epoch', 1, 'completed out of', 10, 'loss:', -78980.956665039062)
('Epoch', 2, 'completed out of', 10, 'loss:', -152401.86713409424)
('Epoch', 3, 'completed out of', 10, 'loss:', -184913.46441650391)
('Epoch', 4, 'completed out of', 10, 'loss:', -165563.44775390625)
('Epoch', 5, 'completed out of', 10, 'loss:', -360394.44857788086)
('Epoch', 6, 'completed out of', 10, 'loss:', -475697.51550292969)
('Epoch', 7, 'completed out of', 10, 'loss:', -588638.92993164062)
('Epoch', 8, 'completed out of', 10, 'loss:', -745006.15966796875)
('Epoch', 9, 'completed out of', 10, 'loss:', -900172.41955566406)
(805, 5)
('Accuracy:', 5.8077128e+09)
```

I don't understand if the values I am getting are correct as there is a real dearth of non-MNIST binary classification examples. The accuracy is nothing like what I expected. I was expecting a percentage instead of that large value.

I am also somewhat unsure of the theory behind machine learning which is why I can't tell the correctness of my approach using tensorflow.

Can someone please tell me if my approach towards binary classification is correct?

Also is the accuracy part of my code correct?

Answer

From this:

> a binary label value - -1 and +1

. . . I am assuming your values in `train_y` and `test_y` are actually -1.0 and +1.0.

This is not going to work very well with your chosen loss function `sigmoid_cross_entropy_with_logits`, which assumes labels of 0.0 and +1.0. The negative `y` values are causing mayhem! However, the loss function choice is good for binary classification. I suggest changing your `y` values to 0 and 1.
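To make the effect of the -1 labels concrete, here is a minimal pure-Python sketch (no TensorFlow required). It assumes the numerically stable per-example expression given in the TensorFlow documentation for this loss, `max(x, 0) - x * z + log(1 + exp(-abs(x)))`, where `x` is the logit and `z` the label. The helper names `sigmoid_cross_entropy` and `to_zero_one` and the sample values are made up for illustration.

```python
import math

def sigmoid_cross_entropy(logit, label):
    # Per-example loss in the stable form documented for
    # sigmoid_cross_entropy_with_logits: max(x, 0) - x*z + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def to_zero_one(labels):
    # Hypothetical helper: map labels from {-1, +1} to {0, 1}
    return [(value + 1.0) / 2.0 for value in labels]

raw_labels = [-1.0, 1.0, -1.0]      # illustrative labels in the -1/+1 encoding
fixed_labels = to_zero_one(raw_labels)

# A confident "negative" logit scored against a -1 label versus a 0 label
loss_with_minus_one = sigmoid_cross_entropy(-10.0, -1.0)
loss_with_zero = sigmoid_cross_entropy(-10.0, 0.0)

print(fixed_labels)
print(loss_with_minus_one, loss_with_zero)
```

With a label of -1 the `- x * z` term has no lower bound, so the optimizer can keep pushing the loss further below zero, which is consistent with the increasingly negative epoch losses in the question. With a label of 0 the same logit gives a small positive loss. In the question's code, the same remapping would be applied to `train_y` and `test_y` before they are fed to the graph.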

In addition, technically the output of your network is not the final prediction. The loss function `sigmoid_cross_entropy_with_logits` is designed to work with a network that has a sigmoid transfer function in the output layer, although you have got it right that the loss function is applied *before* the sigmoid. So your training code appears correct.

I'm not 100% sure about the `tf.transpose` though. Personally, I would see what happens if you remove it, i.e.:

```
output = tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases'])
```

Either way, this is the "logit" output, but not your prediction. The value of `output` can get high for very confident predictions, which probably explains your very high values later due to missing the sigmoid function. So add a prediction tensor (this represents the probability/confidence that the example is in the positive class):

```
prediction = tf.sigmoid(output)
```
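As a rough illustration of why the missing sigmoid can inflate the reported number, the sketch below compares a squared error computed on raw logits with one computed on sigmoid outputs. It is pure Python, the logits and labels are made up, and `l2_loss` here is a stand-in that follows the documented definition of `tf.nn.l2_loss`, `sum(t ** 2) / 2`.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def l2_loss(values):
    # Stand-in for tf.nn.l2_loss: half the sum of squares
    return sum(v * v for v in values) / 2.0

logits = [250.0, -180.0, 90.0]   # made-up raw network outputs
labels = [1.0, 0.0, 1.0]         # labels in the 0/1 encoding

# Error measured on unbounded logits versus on probabilities in [0, 1]
raw_error = l2_loss([o - t for o, t in zip(logits, labels)])
probabilities = [sigmoid(o) for o in logits]
prob_error = l2_loss([p - t for p, t in zip(probabilities, labels)])

print(raw_error, prob_error)
```

The logits are unbounded, so the first value grows with the square of their magnitude, while the sigmoid outputs always lie between 0 and 1 and keep the second value small for confident, correct predictions.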

You can use that to calculate accuracy. Your accuracy calculation should not be based on L2 error, but on the proportion of correct predictions, which is closer to the code you had commented out (which appears to be from a multiclass classification). For a true/false comparison in binary classification, you need to threshold the predictions and compare them with the true labels. Something like this:

```
predicted_class = tf.greater(prediction,0.5)
correct = tf.equal(predicted_class, tf.equal(y,1.0))
accuracy = tf.reduce_mean( tf.cast(correct, 'float') )
```
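The same thresholding logic can be checked by hand without a TensorFlow session. The following is a small pure-Python sketch; `binary_accuracy` is a hypothetical helper and the logits and labels are invented for the example, with labels assumed to be already in the 0/1 encoding.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def binary_accuracy(logits, labels, threshold=0.5):
    # Fraction of examples where the thresholded probability matches the label
    correct = 0
    for logit, label in zip(logits, labels):
        predicted_positive = sigmoid(logit) > threshold
        actual_positive = (label == 1.0)
        if predicted_positive == actual_positive:
            correct += 1
    return float(correct) / len(labels)

logits = [4.2, -3.1, 0.7, -0.2]   # made-up network outputs
labels = [1.0, 0.0, 0.0, 1.0]     # made-up true labels (0/1 encoding)

accuracy = binary_accuracy(logits, labels)
print(accuracy)  # 0.5: the first two examples are right, the last two are wrong
```

Because a sigmoid output above 0.5 corresponds exactly to a logit above 0, thresholding the raw logit at 0 would give the same classes; going through the sigmoid just keeps the intermediate value interpretable as a probability.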

The accuracy value should be between 0.0 and 1.0. If you want it as a percentage, just multiply by 100.