
Implementation Of Softmax Function Returns Nan For High Inputs

I am trying to implement softmax at the end of a CNN. The output I get is nan and zeros. I am giving softmax high input values, around 10-20k. I'm giving an array of X=[2345,3456,6

Solution 1:

By definition, softmax takes the exponential of each element in the array and divides it by the sum of the exponentials of all elements:

import numpy as np

a = [1,3,5]
for i in a:
    print(np.exp(i)/np.sum(np.exp(a)))

0.015876239976466765
0.11731042782619837
0.8668133321973349

However, if the numbers are too big, the exponentials overflow (floating-point numbers cannot represent values that large):

a = [2345,3456,6543]
for i in a:
    print(np.exp(i)/np.sum(np.exp(a)))

__main__:2: RuntimeWarning: invalid value encountered in double_scalars
nan
nan
nan

To avoid this, first shift the values so that the highest value in the array becomes zero, then compute the softmax. For example, to compute the softmax of [1, 3, 5], use [1-5, 3-5, 5-5], which is [-4, -2, 0]. You may also choose to implement it in a vectorized way (as you intended to do in the question):

def softmax(x):
    f = np.exp(x - np.max(x))  # shift values
    return f / f.sum(axis=0)

softmax([1,3,5])
# prints: array([0.01587624, 0.11731043, 0.86681333])

softmax([2345,3456,6543,-6789,-9234])
# prints: array([0., 0., 1., 0., 0.])

For detailed information, check out the cs231n course page. The section "Practical issues: Numeric stability" covers exactly what I'm trying to explain.
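
Since the question is about the last layer of a CNN, the scores often arrive as a batch with one row per sample. Here is a minimal sketch of the same max-shift trick applied per row. It assumes a 2D array of shape (batch, classes); the name softmax_rows is made up for this example and is not from any library:

```python
import numpy as np

def softmax_rows(scores):
    """Numerically stable softmax over the last axis (hypothetical helper)."""
    scores = np.asarray(scores, dtype=np.float64)
    # Subtract each row's own maximum so the largest exponent is exp(0) = 1.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    # keepdims=True keeps the row sums broadcastable against exps.
    return exps / exps.sum(axis=-1, keepdims=True)

batch = np.array([[1.0, 3.0, 5.0],
                  [2345.0, 3456.0, 6543.0]])
probs = softmax_rows(batch)
print(probs)
```

Taking the maximum per row (rather than one global maximum) matters here: with a global shift, a row of small scores next to a row of huge ones could underflow to all zeros and then divide by zero.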

Solution 2:

When applying softmax to large numbers, you can try max normalization:

import numpy as np

def softmax (x):
    B=np.exp(x)
    C=np.sum(np.exp(x))
    return B/C

arr = np.array([1,2,3,4,5])

softmax(arr)
# array([0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])

softmax(arr - max(arr))
# array([0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])

As you can see, this does not affect the result of softmax. Applying this on your softmax:

def softmax(x):
    B = np.exp(x - max(x))
    C = np.sum(B)
    return B/C
op_arr = np.array([2345,3456,6543,-6789,-9234])
softmax(op_arr)
# array([0., 0., 1., 0., 0.])

Solution 3:

When I run the same code, I get:

RuntimeWarning: overflow encountered in exp
RuntimeWarning: overflow encountered in exp
RuntimeWarning: invalid value encountered in true_divide

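This is not very surprising, since e^(6543) is around 0.39 * 10^2842, far beyond the largest value a 64-bit float can hold (about 1.8 * 10^308). The exponential overflows to inf, and the division inf/inf that follows yields nan.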
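
The point where exp stops fitting can be checked directly. This is a small sketch assuming NumPy's default float64; the variable names are only for illustration, and the warnings are silenced on purpose so the values can be inspected:

```python
import numpy as np

# Largest finite float64 and the largest argument exp() can take before overflow.
float_max = np.finfo(np.float64).max
threshold = np.log(float_max)  # roughly 709.78

with np.errstate(over="ignore", invalid="ignore"):
    big = np.exp(np.float64(6543))  # far above the threshold, overflows to inf
    ratio = big / big               # inf / inf is nan

print(threshold, big, ratio)
```

Any input above roughly 709 already overflows, so values in the thousands have no chance without some form of shifting or rescaling.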

Suggestion: normalize your data before giving it to softmax. Could you divide it by 1000 first, so that instead of inputs in [-20000, 20000] you would have floats in [-20, 20]?
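
One caveat worth checking before rescaling: dividing the inputs by a constant is not equivalent to subtracting the maximum. Shifting leaves the softmax result mathematically unchanged, while scaling produces a different (flatter) distribution. A rough sketch comparing the two, where stable_softmax is a name chosen for this example:

```python
import numpy as np

def stable_softmax(x):
    """Softmax with the maximum subtracted first (illustrative helper)."""
    x = np.asarray(x, dtype=np.float64)
    exps = np.exp(x - x.max())
    return exps / exps.sum()

x = np.array([2345.0, 3456.0, 6543.0])

shifted = stable_softmax(x)          # softmax of the original scores
scaled = stable_softmax(x / 1000.0)  # softmax of the scores divided by 1000

print(shifted)
print(scaled)
```

Both outputs are valid probability vectors, but they are not the same one, so whether rescaling is acceptable depends on what the network is supposed to output.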
