xxx222 - 6 months ago 11

Python Question

Say I have a NumPy array that contains some `float('nan')` values. I don't want to impute the missing data yet; I want to normalize the array first and keep the NaNs in their original positions. Is there any way to do that?

Previously I used `normalize` from `sklearn.preprocessing`.

Answer

You can mask your array using the `numpy.ma.array` function and subsequently apply any `numpy` operation:

```
import numpy as np
a = np.random.rand(10) # Generate random data.
a = np.where(a > 0.8, np.nan, a) # Set all data larger than 0.8 to NaN
a = np.ma.array(a, mask=np.isnan(a)) # Use a mask to mark the NaNs
a_norm = a / np.sum(a) # The sum function ignores the masked values.
a_norm2 = a / np.std(a) # The std function ignores the masked values.
```
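The results above are still masked arrays. If you want a plain array back with NaN in the original positions, one option is `filled(np.nan)`. A minimal sketch, using small fixed values that I made up for illustration instead of random data:

```python
import numpy as np

# Illustrative data (not from the question); NaN marks the missing entry.
raw = np.array([1.0, np.nan, 3.0, 4.0])
a = np.ma.array(raw, mask=np.isnan(raw))

a_norm = a / a.sum()            # the sum skips the masked entry: 1 + 3 + 4 = 8
result = a_norm.filled(np.nan)  # plain ndarray, NaN restored at index 1
print(result)
```

The valid entries are divided by 8 and the NaN stays where it was.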

You can still access your raw data:

```
print(a.data)  # the underlying array, with the NaNs still in place
```
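Since the question mentions `normalize` from scikit-learn, which by default scales each row to unit L2 norm, here is a rough sketch of the same idea with masked arrays. The helper name `l2_normalize_rows` and the sample data are my own, not part of NumPy or scikit-learn, and I am assuming a 2-D input:

```python
import numpy as np

def l2_normalize_rows(x):
    """Hypothetical helper: scale each row to unit L2 norm, ignoring NaNs."""
    m = np.ma.masked_invalid(x)  # masks NaN (and also inf) entries
    # Per-row L2 norm over the unmasked entries only.
    norms = np.sqrt((m ** 2).sum(axis=1, keepdims=True))
    # Put NaN back at the masked positions and return a plain ndarray.
    return (m / norms).filled(np.nan)

x = np.array([[3.0, np.nan, 4.0],
              [np.nan, 2.0, 0.0]])
out = l2_normalize_rows(x)
print(out)
```

Note that a row whose valid entries are all zero has norm 0; the masked division masks those entries, so they come back as NaN instead of 0. Handle that case separately if it matters for your data.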