user27478 - 2 months ago
Python Question

How to resample a TimeSeries in pandas with a fill_value?

I have a TimeSeries of integers that I would like to downsample using resample(). The problem is that I have some periods with missing data that are converted to NaN. Since pandas does not support integer NA values, the integers are converted to floats.

Is it possible to resample a TimeSeries using a fill_value for missing data, like I can with reindex(fill_value=0)? I don't want my integers cast into floats.
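For reference, here is a minimal sketch of the reindex(fill_value=0) behaviour I mean. The target index new_index is made up purely for illustration:

```python
import pandas as pd

# Integer series with a gap (no observation in February).
s = pd.Series(
    [1, 2, 4],
    index=pd.to_datetime(["2013-01-01", "2013-01-02", "2013-03-01"]),
    dtype="int64",
)

# Hypothetical target index that adds a date missing from s.
new_index = pd.to_datetime(
    ["2013-01-01", "2013-01-02", "2013-02-01", "2013-03-01"]
)

# Without fill_value the new slot becomes NaN, forcing a float dtype.
as_float = s.reindex(new_index)

# With fill_value=0 the new slot is filled with 0 and the dtype stays integer.
filled = s.reindex(new_index, fill_value=0)

print(as_float.dtype)
print(filled.dtype)
```

This is the dtype-preserving behaviour I would like to get from resample() as well.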

>>> from datetime import datetime
>>> from pandas import Series
>>> dates = (datetime(2013, 1, 1), datetime(2013, 1, 2), datetime(2013, 3, 1))
>>> s = Series([1, 2, 4], index=dates)
>>> s
2013-01-01 1
2013-01-02 2
2013-03-01 4
dtype: int64
>>> s.resample('M', how='sum')
2013-01-31 3
2013-02-28 NaN
2013-03-31 4
Freq: M, dtype: float64

# Desired output (doesn't work)
>>> s.resample('M', how='sum', fill_value=0)
2013-01-31 3
2013-02-28 0
2013-03-31 4
Freq: M, dtype: int64

Answer

You can define your own aggregation function that returns 0 for empty bins, which avoids the NaN and keeps the integer dtype:

In [36]: def _sum(x):
   ....:     if len(x) == 0: return 0
   ....:     else: return sum(x)
   ....:     

In [37]: s.resample('M', how=_sum)
Out[37]: 
2013-01-31    3   
2013-02-28    0   
2013-03-31    4   
Freq: M, dtype: int64
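
Note that the how= keyword was deprecated in later pandas releases in favour of calling the aggregation on the resampler object. The sketch below assumes a reasonably recent pandas (to my knowledge 0.22 or later), where sum() over an empty bin returns 0 by default, so no custom function should be needed. It passes a MonthEnd offset object rather than the 'M' string only because the string alias has changed between versions; the variable names are illustrative.

```python
import pandas as pd

# Same data as in the question, with the integer dtype made explicit.
s = pd.Series(
    [1, 2, 4],
    index=pd.to_datetime(["2013-01-01", "2013-01-02", "2013-03-01"]),
    dtype="int64",
)

# Month-end resampling; an offset object avoids depending on the
# 'M' / 'ME' string alias, which differs across pandas versions.
month_end = pd.offsets.MonthEnd()

# Assumption: sum() uses min_count=0, so the empty February bin yields 0.
monthly = s.resample(month_end).sum()

# Fallback that does not rely on that default: request NaN for empty
# bins explicitly, then fill with 0 and cast back to integers.
monthly_fallback = (
    s.resample(month_end).sum(min_count=1).fillna(0).astype("int64")
)

print(monthly)
print(monthly_fallback)
```

If you need a different fill value than 0, the fallback form is the one to adapt, since fillna() accepts any value before the cast.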