user3601754 - 4 months ago
Python Question

Python - Scipy linear regression with nan values

I would like to obtain the slope of a linear regression on my data, but Y contains some NaN values, which breaks the linregress function.
For example:

from scipy import stats
import numpy as np

X = np.array([0, 1, 2, 3, 4, 5])
Y = np.array([np.nan, 4, 5, 10, 2, 5])  # np.nan: the np.NaN alias was removed in NumPy 2.0
stats.linregress(X,Y)


But I obtain: (nan, nan, nan, nan, nan)
So I tried masking the invalid values, as you can see:

import numpy.ma as ma
stats.linregress(X, ma.masked_invalid(Y))


But the result is the same. I don't see what I have to do.

Answer

Try the following:

# Build the mask once, from the original Y, then apply it to both arrays
mask = np.logical_not(np.isnan(Y))
X = X[mask]
Y = Y[mask]

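upd: as Warren noticed, assigning Y first and then rebuilding the mask from the new Y does not work, because the NaNs are already gone and the mask no longer lines up with X. Build the mask once from the original Y and index both arrays with it; you can also pass X[np.logical_not(np.isnan(Y))] and Y[np.logical_not(np.isnan(Y))] directly to linregress without reassigning anything. Alternatively, see Warren's answer, which uses np.isfinite.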
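
For reference, here is a minimal sketch of the np.isfinite variant mentioned above, using the arrays from the question. The names mask and result are just illustrative, and checking X as well as Y is an extra precaution that the original answer does not include.

```python
import numpy as np
from scipy import stats

X = np.array([0, 1, 2, 3, 4, 5])
Y = np.array([np.nan, 4, 5, 10, 2, 5])

# Keep only the positions where both X and Y are finite (drops NaN and inf).
# The mask is computed once, before either array is modified.
mask = np.isfinite(X) & np.isfinite(Y)

# Fit on the filtered pairs; X and Y themselves are left untouched.
result = stats.linregress(X[mask], Y[mask])
print(result.slope, result.intercept)
```

With this data the fit uses the five remaining points and should give a slope of about -0.1 and an intercept of about 5.5.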
