RockJake28 - 1 year ago
Python Question

Normalise between 0 and 1 ignoring NaN

For a list of numbers ranging from x to y that may contain NaN, how can I normalise between 0 and 1, ignoring the NaN values (they stay as NaN)?

Typically I would use MinMaxScaler from sklearn.preprocessing, but it cannot handle NaN and recommends imputing the values based on the mean, median, etc. It doesn't offer the option to ignore all the NaN values.

Answer Source

Consider the pd.Series s:

import numpy as np
import pandas as pd
s = pd.Series(np.random.choice([3, 4, 5, 6, np.nan], 100))
s.hist()


Option 1
Min Max Scaling

new = s.sub(s.min()).div((s.max() - s.min()))
new.hist()
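
Since the question mentions a plain list rather than a Series, here is a rough NumPy sketch of the same min-max idea. np.nanmin and np.nanmax skip NaN when finding the bounds, and the NaN entries simply propagate through the arithmetic. The helper name minmax_ignore_nan is made up for illustration, and mapping a constant input to 0 is an assumed convention, not something from the question.

```python
import numpy as np

def minmax_ignore_nan(values):
    """Scale values to [0, 1], leaving NaN entries as NaN (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    lo = np.nanmin(arr)  # minimum over the non-NaN entries
    hi = np.nanmax(arr)  # maximum over the non-NaN entries
    if hi == lo:
        # All non-NaN values are identical; assumed convention: map them to 0
        return np.where(np.isnan(arr), np.nan, 0.0)
    # NaN - lo is still NaN, so missing entries stay missing
    return (arr - lo) / (hi - lo)

scaled = minmax_ignore_nan([3, 4, np.nan, 6, 5])
print(scaled)
```

The pandas one-liner above behaves the same way because Series.min() and Series.max() skip NaN by default.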


NOT WHAT OP ASKED FOR
I put these in because I wanted to

Option 2
sigmoid

sigmoid = lambda x: 1 / (1 + np.exp(-x))

new = sigmoid(s.sub(s.mean()))
new.hist()


Option 3
tanh (hyperbolic tangent)

new = np.tanh(s.sub(s.mean())).add(1).div(2)
new.hist()
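
A side note on how Options 2 and 3 relate: (tanh(x) + 1) / 2 is algebraically the same as sigmoid(2x), so the tanh version is just a steeper sigmoid. A quick numerical check, as a sketch (the sample values in x are arbitrary and only for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Arbitrary sample values, including a NaN to show it passes through both forms
x = np.array([-2.0, -0.5, 0.0, 1.5, np.nan])

via_tanh = (np.tanh(x) + 1) / 2
via_sigmoid = sigmoid(2 * x)

# equal_nan=True treats the NaN positions as matching
same = np.allclose(via_tanh, via_sigmoid, equal_nan=True)
print(same)
```

Both forms leave NaN as NaN and map everything else into the open interval (0, 1), so unlike min-max scaling the observed minimum and maximum do not land exactly on 0 and 1.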
