w4nderlust - 1 year ago 303

Python Question

I'm trying to implement a max margin loss in TensorFlow.

The idea is that I have some positive examples, I sample some negative examples for each of them, and I want to compute something like

`loss = sum_{i=1..B} sum_{j=1..N} max(0, 1 - score(pos_i) + score(neg_ij))`

where B is the size of my batch and N is the number of negative samples I want to use.

I'm new to TensorFlow and I'm finding this tricky to implement.

My model computes a vector of scores of dimension `B * (N + 1)`.

The ideal would be to get values like `[1, 0, 0, 1, 0, 0]`.

What I could come up with is the following, using while loops and conditions:

```
# Function for computing max margin inner loop
def max_margin_inner(i, batch_examples_t, j, scores, loss):
    idx_pos = tf.mul(i, batch_examples_t)
    score_pos = tf.gather(scores, idx_pos)
    idx_neg = tf.add_n([tf.mul(i, batch_examples_t), j, 1])
    score_neg = tf.gather(scores, idx_neg)
    loss = tf.add(loss, tf.maximum(0.0, 1.0 - score_pos + score_neg))
    tf.add(j, 1)
    return [i, batch_examples_t, j, scores, loss]

# Function for computing max margin outer loop
def max_margin_outer(i, batch_examples_t, scores, loss):
    j = tf.constant(0)
    pos_idx = tf.mul(i, batch_examples_t)
    length = tf.gather(tf.shape(scores), 0)
    neg_smp_t = tf.constant(num_negative_samples)
    cond = lambda i, b, j, bi, lo: tf.logical_and(
        tf.less(j, neg_smp_t),
        tf.less(pos_idx, length))
    tf.while_loop(cond, max_margin_inner, [i, batch_examples_t, j, scores, loss])
    tf.add(i, 1)
    return [i, batch_examples_t, scores, loss]

# compute the loss
with tf.name_scope('max_margin'):
    loss = tf.Variable(0.0, name="loss")
    i = tf.constant(0)
    batch_examples_t = tf.constant(batch_examples)
    condition = lambda i, b, bi, lo: tf.less(i, b)
    max_margin = tf.while_loop(
        condition,
        max_margin_outer,
        [i, batch_examples_t, scores, loss])
```

The code has two loops, one for the outer sum and one for the inner sum. The problem I'm facing is that the loss variable keeps accumulating across iterations without ever being reset, so it doesn't work at all.

Moreover, it doesn't seem in line with the TensorFlow way of implementing things. I guess there are better, more vectorized ways to implement it; I hope someone can suggest options or point me to examples.

Answer

First we need to clean the input:

- we want an array of positive scores, of shape `[B, 1]`
- we want a matrix of negative scores, of shape `[B, N]`

```
import tensorflow as tf
B = 2
N = 2
scores = tf.constant([0.5, 0.2, -0.1, 1., -0.5, 0.3]) # shape B * (N+1)
scores = tf.reshape(scores, [B, N+1])
scores = tf.transpose(scores)
scores_pos = tf.slice(scores, [0, 0], [1, B])
scores_pos = tf.transpose(scores_pos) # shape [B, 1]
scores_neg = tf.slice(scores, [1, 0], [N, B])
scores_neg = tf.transpose(scores_neg) # shape [B, N]
```
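For readers who want to check the shapes without building a graph, the same split can be sketched in NumPy. This is only an illustration, not part of the original answer. It assumes the flat vector is laid out as B consecutive blocks of N + 1 scores with the positive score first in each block, which is what the reshape above implies.

```python
import numpy as np

B = 2  # batch size
N = 2  # negative samples per positive example

# Flat scores, assumed layout: [pos_0, neg_00, neg_01, pos_1, neg_10, neg_11]
scores = np.array([0.5, 0.2, -0.1, 1.0, -0.5, 0.3])

# One row per positive example: column 0 is the positive, columns 1..N the negatives
scores = scores.reshape(B, N + 1)

scores_pos = scores[:, :1]  # shape (B, 1); the slice :1 keeps the column axis
scores_neg = scores[:, 1:]  # shape (B, N)

print(scores_pos.shape, scores_neg.shape)
```

Slicing the columns directly avoids the two transposes. TensorFlow tensors accept the same basic slicing syntax, so a similar shortcut should work there as well.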

Okay, that was a bit messy, but we needed to clean up the input. Maybe you can make your code return `scores_pos` and `scores_neg` directly.

Now we only have to compute the loss matrix, i.e. the individual losses for every (positive, negative) pair, and then take its sum.

```
loss_matrix = tf.maximum(0., 1. - scores_pos + scores_neg) # we could also use tf.nn.relu here
loss = tf.reduce_sum(loss_matrix)
```
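The subtraction works because broadcasting stretches the `[B, 1]` positives across the N columns of the negatives, so each positive is paired only with its own negatives. A minimal NumPy sketch, assuming the same example scores as above, compares the vectorized result with the explicit double loop the question was trying to write:

```python
import numpy as np

scores_pos = np.array([[0.5], [1.0]])              # shape (B, 1)
scores_neg = np.array([[0.2, -0.1], [-0.5, 0.3]])  # shape (B, N)

# Broadcasting pairs every positive with each of its own negatives
loss_matrix = np.maximum(0.0, 1.0 - scores_pos + scores_neg)  # shape (B, N)
vectorized_loss = loss_matrix.sum()

# Reference: explicit double sum over examples (i) and negatives (j)
loop_loss = 0.0
for i in range(scores_neg.shape[0]):
    for j in range(scores_neg.shape[1]):
        loop_loss += max(0.0, 1.0 - scores_pos[i, 0] + scores_neg[i, j])

print(loss_matrix)
print(vectorized_loss, loop_loss)
```

Both totals should agree, at roughly 1.4 for these scores. The pair (1.0, -0.5) already satisfies the margin and contributes nothing.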