alvas - 28 days ago 18
Python Question

# Softmax function - python

According to Udacity's deep learning class, the softmax of `y_i` is simply its exponential divided by the sum of the exponentials over the whole Y vector:

$$S(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}}$$

where `S(y_i)` is the softmax function of `y_i`, `e` is the exponential, and `j` indexes the columns of the input vector Y.

I've tried the following:

```python
import numpy as np

def softmax(x):
    """Compute softmax values for each set of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

scores = [3.0, 1.0, 0.2]
print(softmax(scores))
```

which returns:

```
[ 0.8360188   0.11314284  0.05083836]
```

And the suggested solution was:

```python
def softmax(x):
    """Compute softmax values for each set of scores in x."""
    return np.exp(x) / np.sum(np.exp(x), axis=0)
```

It produces the same output as the first implementation, even though the first implementation explicitly takes the difference between each column and the max before dividing by the sum.

Can someone show mathematically why? Is one correct and the other one wrong?

Are the implementations similar in terms of code and time complexity? Which is more efficient?

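They're both correct. The `np.max(x)` term in yours cancels out mathematically, so it does not change the result; what it does buy you is protection against overflow when the scores are large.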
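To see why the two versions agree, write `m` for the maximum of the Y vector. The factor `e^{-m}` appears in both the numerator and the denominator and cancels:

$$\frac{e^{y_i - m}}{\sum_j e^{y_j - m}} = \frac{e^{-m}\, e^{y_i}}{e^{-m} \sum_j e^{y_j}} = \frac{e^{y_i}}{\sum_j e^{y_j}} = S(y_i)$$

The following is a minimal sketch comparing the two on 1-D inputs. The names `softmax_naive` and `softmax_shifted` are made up for this illustration (they just mirror the two snippets in the question), and the large test scores are chosen arbitrarily to trigger overflow.

```python
import numpy as np

def softmax_naive(x):
    """Softmax without the max shift (mirrors the suggested solution)."""
    x = np.asarray(x, dtype=float)
    return np.exp(x) / np.sum(np.exp(x), axis=0)

def softmax_shifted(x):
    """Softmax with the max subtracted first (mirrors the question's version)."""
    x = np.asarray(x, dtype=float)
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

small = [3.0, 1.0, 0.2]
# For moderate scores the two agree to floating-point precision.
print(np.allclose(softmax_naive(small), softmax_shifted(small)))  # True

large = [1000.0, 1001.0, 1002.0]  # arbitrary large scores for illustration
# np.exp(1000.0) overflows to inf, so the naive version yields nan (inf / inf).
with np.errstate(over="ignore", invalid="ignore"):
    print(softmax_naive(large))
# The shifted version only exponentiates values <= 0, so it stays finite.
print(softmax_shifted(large))
```

On time complexity, both are linear in the length of `x`. The shifted version adds one extra pass to find the maximum, while the suggested version evaluates `np.exp(x)` twice, so neither is meaningfully cheaper for typical inputs.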