xxx222 xxx222 - 1 year ago 47
Python Question

How to ignore NaN data points in a numpy array and generate normalized data in Python?

Say I have a numpy array that contains some float('nan') values. I don't want to impute those values yet; I want to normalize the rest of the data first and keep the NaN entries at their original positions. Is there any way I can do that?

Previously I used a library function for this, but that function does not seem to accept any array that contains NaN as input.

Answer Source

You can mask your array using the numpy.ma.array function and subsequently apply any numpy operation:

import numpy as np

a = np.random.rand(10)            # Generate random data.
a = np.where(a > 0.8, np.nan, a)  # Set all data larger than 0.8 to NaN

a = np.ma.array(a, mask=np.isnan(a))  # Use a mask to mark the NaNs

a_norm  = a / np.sum(a) # The sum function ignores the masked values.
a_norm2 = a / np.std(a) # The std function ignores the masked values.
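
If a plain ndarray with the NaNs back in their original positions is what you need at the end, the same masking idea can be wrapped in a small helper. This is only a sketch: the name normalize_keep_nan is hypothetical, and it uses z-score scaling (subtract the mean, divide by the standard deviation) as one possible choice of normalization rather than the sum or std division shown above.

```python
import numpy as np

def normalize_keep_nan(values):
    """Z-score the non-NaN entries; NaN entries stay NaN at the same positions.

    Hypothetical helper for illustration; z-scoring is an assumed choice
    of normalization, not the only one.
    """
    values = np.asarray(values, dtype=float)
    # Mask the NaNs so they are excluded from the statistics.
    masked = np.ma.array(values, mask=np.isnan(values))
    # mean() and std() on a masked array only use the unmasked entries.
    scaled = (masked - masked.mean()) / masked.std()
    # Convert back to a plain ndarray, putting NaN where the mask was set.
    return scaled.filled(np.nan)

data = np.array([1.0, np.nan, 2.0, 3.0, np.nan])
result = normalize_keep_nan(data)
print(result)
```

The call to filled(np.nan) is what restores the NaN placeholders, so the output has the same shape as the input and can be imputed later.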

You can still access your raw data: