alvas - 4 months ago 75

Python Question

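From Udacity's deep learning class, the softmax of y_i is simply the exponential of y_i divided by the sum of the exponentials of the whole Y vector:

`S(y_i) = e^(y_i) / sum_j e^(y_j)`

Where `S(y_i)` is the softmax of `y_i`, `e` is the exponential, and `j` runs over the entries of the vector Y.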

I've tried the following:

```python
import numpy as np

def softmax(x):
    """Compute softmax values for each set of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

scores = [3.0, 1.0, 0.2]
print(softmax(scores))
```

which returns:

`[ 0.8360188 0.11314284 0.05083836]`

And the suggested solution was:

```python
def softmax(x):
    """Compute softmax values for each set of scores in x."""
    return np.exp(x) / np.sum(np.exp(x), axis=0)
```

And it outputs the same result.
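One way to compare the two implementations on the scores from the question is to run them side by side. This is only a minimal sketch: the names `softmax_shifted` and `softmax_plain` are hypothetical labels for the two functions above, and the `np.asarray` conversion is an added assumption so that plain Python lists are handled explicitly.

```python
import numpy as np

def softmax_shifted(x):
    """Version from the question: subtract the max before exponentiating."""
    x = np.asarray(x, dtype=float)
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def softmax_plain(x):
    """Suggested solution: exponentiate the scores directly."""
    x = np.asarray(x, dtype=float)
    return np.exp(x) / np.sum(np.exp(x), axis=0)

scores = [3.0, 1.0, 0.2]

# Compare element-wise within floating-point tolerance.
same = np.allclose(softmax_shifted(scores), softmax_plain(scores))
print(same)
```

For this 1-D input the comparison should come out true, and each result sums to 1.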

Answer

They're both correct, but yours has a term that is mathematically unnecessary.

You start with

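e ^ (x - max(x)) / sum(e ^ (x - max(x)))

By using the fact that a^(b - c) = (a^b)/(a^c) we have

= e ^ x / (e ^ max(x) * sum(e ^ x / e ^ max(x)))

Since e ^ max(x) is the same constant in every term of the sum, it factors out of the sum and cancels:

= e ^ x / (e ^ max(x) * sum(e ^ x) / e ^ max(x))

= e ^ x / sum(e ^ x)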

Which is what the suggested solution computes. You could replace max(x) with any constant and it would cancel out in the same way.
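The cancellation can also be checked numerically. The sketch below assumes a hypothetical helper, `softmax_with_shift`, that subtracts an arbitrary constant `c` before exponentiating; it is an illustration of the algebra above rather than code from the course.

```python
import numpy as np

def softmax_with_shift(x, c):
    """Softmax computed after subtracting a constant c (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    e_x = np.exp(x - c)
    return e_x / e_x.sum()

scores = [3.0, 1.0, 0.2]

# c = 0 means no shift at all, i.e. plain e^x / sum(e^x).
reference = softmax_with_shift(scores, 0.0)

# Try the max (as in the question) and two arbitrary constants.
for c in (np.max(scores), 1.5, -2.0):
    shifted = softmax_with_shift(scores, c)
    print(c, np.allclose(shifted, reference))  # each comparison should print True
```

The algebra holds for any constant, but in floating point the choice can still matter: subtracting max(x) keeps every exponent at or below zero, so `np.exp` cannot overflow for large scores.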