RockJake28 - 2 months ago
Python Question

Normalise between 0 and 1 ignoring NaN

For a list of numbers ranging from x to y that may contain NaN, how can I normalise between 0 and 1, ignoring the NaN values (they stay as NaN)?

Typically I would use MinMaxScaler from sklearn.preprocessing, but this cannot handle NaN and recommends imputing the values based on the mean, median, etc. It doesn't offer the option to ignore all the NaN values.

Answer

Consider the pd.Series s:

import numpy as np
import pandas as pd

s = pd.Series(np.random.choice([3, 4, 5, 6, np.nan], 100))
s.hist()


Option 1
Min Max Scaling

# s.min() and s.max() skip NaN by default, so NaN entries stay NaN
new = s.sub(s.min()).div(s.max() - s.min())
new.hist()
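
As a minimal sketch of the same idea wrapped in a reusable function: the name min_max_ignore_nan and the handling of constant or all-NaN input are my own assumptions, not part of the original answer. It relies only on pandas skipping NaN in min() and max() by default.

```python
import numpy as np
import pandas as pd


def min_max_ignore_nan(values):
    """Scale values to [0, 1]; NaN is ignored when finding min/max and stays NaN.

    Hypothetical helper for illustration only.
    """
    s = pd.Series(values, dtype="float64")
    lo, hi = s.min(), s.max()  # skipna=True is the pandas default
    span = hi - lo
    if span == 0 or np.isnan(span):
        # Assumption: constant input maps to 0.0; NaN * 0.0 is still NaN
        return s * 0.0
    return (s - lo) / span


scaled = min_max_ignore_nan([3, 4, np.nan, 6, 5])
print(scaled.tolist())  # 3 maps to 0.0, 6 maps to 1.0, the NaN entry stays NaN
```

The explicit check for a zero span avoids a division by zero when every non-NaN value is identical; how that case should be treated depends on your data.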


NOT WHAT OP ASKED FOR
I included the next two options because I wanted to.

Option 2
sigmoid

sigmoid = lambda x: 1 / (1 + np.exp(-x))

new = sigmoid(s.sub(s.mean()))
new.hist()
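
To see how this differs from min-max scaling, here is a small check on a fixed sample (the five values are made up for illustration). It suggests that the NaN positions are left untouched and that the results fall strictly inside (0, 1) rather than reaching the end points.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value
s = pd.Series([3, 4, np.nan, 6, 5], dtype="float64")

# Centre on the mean (pandas skips NaN), then apply the logistic function
centred = s.sub(s.mean())
squashed = 1 / (1 + np.exp(-centred))

print(squashed.isna().tolist())        # NaN stays in the same position
print(squashed.min(), squashed.max())  # both strictly between 0 and 1
```

Because the data are centred on the mean first, values equally far above and below the mean map to results that sum to 1.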


Option 3
tanh (hyperbolic tangent)

new = np.tanh(s.sub(s.mean())).add(1).div(2)
new.hist()
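
It may help to know that the rescaled tanh is the sigmoid from Option 2 with its input doubled, since (tanh(x) + 1) / 2 = 1 / (1 + exp(-2x)). The sketch below checks this numerically on made-up centred values; the array x is an assumption for illustration.

```python
import numpy as np

# Hypothetical centred values, including a NaN
x = np.array([-1.5, -0.5, np.nan, 0.5, 1.5])

via_tanh = (np.tanh(x) + 1) / 2
via_sigmoid = 1 / (1 + np.exp(-2 * x))

# equal_nan=True treats the NaN entries as matching each other
print(np.allclose(via_tanh, via_sigmoid, equal_nan=True))  # True
```

In practice this means Option 3 gives a steeper curve than Option 2 on the same centred data, so values are pushed closer to 0 and 1.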

