саша - 1 year ago 157

Python Question

I wanted to output the log-probability during learning of the word and doc vectors in gensim. I have taken a look at the implementation of the score function in the "slow plain numpy" version.

```python
def score_cbow_pair(model, word, word2_indices, l1):
    l2a = model.syn1[word.point]  # 2d matrix, codelen x layer1_size
    sgn = (-1.0)**word.code  # ch function, 0-> 1, 1 -> -1
    lprob = -log(1.0 + exp(-sgn*dot(l1, l2a.T)))
    return sum(lprob)
```

The score function should make use of the parameters learned during hierarchical softmax training. However, the calculation of the log-probability is supposed to involve a sigmoid function (see "word2vec Parameter Learning Explained", equation (45)).

So does gensim really calculate the log-probability in `lprob`?

I would have calculated the log-probability as follows:

`-log(1.0/(1.0+exp(-sgn*dot(l1, l2a.T))))`

Is this equation not used because it explodes for values close to zero, or is it wrong in general?

Answer Source

I had overlooked that the logarithm of the sigmoid function can be rewritten, using `log(a/b) = log(a) - log(b)` and `log(1) = 0`:

`log(1.0/(1.0+exp(-sgn*dot(l1, l2a.T)))) = log(1) - log(1.0+exp(-sgn*dot(l1, l2a.T))) = -log(1.0+exp(-sgn*dot(l1, l2a.T)))`

So the code does compute the log-likelihood.
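For anyone who wants to check the identity numerically, here is a minimal sketch using NumPy. The function names are my own and are not part of gensim, and `x` stands in for `sgn*dot(l1, l2a.T)`. It also touches on the "explodes" part of the question: as far as I can tell, the rewrite is purely algebraic, so both forms overflow in the same way for large negative arguments. The third variant, based on `np.logaddexp`, is one common way to avoid that; it is an alternative I am suggesting, not what gensim does.

```python
import numpy as np

def log_sigmoid_naive(x):
    # Direct form: log(1 / (1 + exp(-x)))
    return np.log(1.0 / (1.0 + np.exp(-x)))

def log_sigmoid_rewritten(x):
    # Rewritten form, as in score_cbow_pair: -log(1 + exp(-x))
    return -np.log(1.0 + np.exp(-x))

def log_sigmoid_stable(x):
    # log(1 + exp(-x)) equals logaddexp(0, -x), which NumPy evaluates
    # without forming exp(-x) explicitly (illustrative alternative)
    return -np.logaddexp(0.0, -x)

# Hypothetical stand-in values for sgn*dot(l1, l2a.T)
x = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])

# All three forms agree for moderate arguments
assert np.allclose(log_sigmoid_naive(x), log_sigmoid_rewritten(x))
assert np.allclose(log_sigmoid_rewritten(x), log_sigmoid_stable(x))

# For a large negative argument, exp(-x) overflows in the first two forms
extreme = np.array([-1000.0])
with np.errstate(over="ignore", divide="ignore"):
    naive_extreme = log_sigmoid_naive(extreme)
    rewritten_extreme = log_sigmoid_rewritten(extreme)
stable_extreme = log_sigmoid_stable(extreme)

print(naive_extreme, rewritten_extreme, stable_extreme)
```

With the extreme input, the first two functions return negative infinity, while the `logaddexp` variant returns a finite value close to the argument itself.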