krakenwagon - 4 months ago 72

Python Question

I am trying to calculate the moving average in a large numpy array that contains NaNs. Currently I am using:

`import numpy as np`

def moving_average(a,n=5):

ret = np.cumsum(a,dtype=float)

ret[n:] = ret[n:]-ret[:-n]

return ret[-1:]/n

When calculating with a masked array:

`x = np.array([1.,3,np.nan,7,8,1,2,4,np.nan,np.nan,4,4,np.nan,1,3,6,3])`

mx = np.ma.masked_array(x,np.isnan(x))

y = moving_average(mx).filled(np.nan)

print y

>>> array([3.8,3.8,3.6,nan,nan,nan,2,2.4,nan,nan,nan,2.8,2.6])

The result I am looking for (below) should ideally have NaNs only in the place where the original array, x, had NaNs and the averaging should be done over the number of non-NaN elements in the grouping (I need some way to change the size of n in the function.)

`y = array([4.75,4.75,nan,4.4,3.75,2.33,3.33,4,nan,nan,3,3.5,nan,3.25,4,4.5,3])`

I could loop over the entire array and check index by index but the array I am using is very large and that would take a long time. Is there a numpythonic way to do this?

Answer

If I understand correctly, you want to create a moving average and then populate the resulting elements as `nan`

if their index in the original array was `nan`

.

```
import numpy as np
>>> inc = 5 #the moving avg increment
>>> x = np.array([1.,3,np.nan,7,8,1,2,4,np.nan,np.nan,4,4,np.nan,1,3,6,3])
>>> mov_avg = np.array([np.nanmean(x[idx:idx+inc]) for idx in range(len(x))])
# Determine indices in x that are nans
>>> nan_idxs = np.where(np.isnan(x))[0]
# Populate output array with nans
>>> mov_avg[nan_idxs] = np.nan
>>> mov_avg
array([ 4.75, 4.75, nan, 4.4, 3.75, 2.33333333, 3.33333333, 4., nan, nan, 3., 3.5, nan, 3.25, 4., 4.5, 3.])
```