abutremutante abutremutante - 1 month ago 22
Python Question

Python - Statsmodels.tsa.seasonal_decompose - missing values in head and tail of dataframe

I have the following dataframe, that I'm calling "sales_df":

Value
Date
2004-01-01 0
2004-02-01 173
2004-03-01 225
2004-04-01 230
2004-05-01 349
2004-06-01 258
2004-07-01 270
2004-08-01 223
... ...
2015-06-01 218
2015-07-01 215
2015-08-01 233
2015-09-01 258
2015-10-01 252
2015-11-01 256
2015-12-01 188
2016-01-01 70


I want to separate its trend from its seasonal component and for that I use statsmodels.tsa.seasonal_decompose through the following code:

decomp=sm.tsa.seasonal_decompose(sales_df.Value)
df=pd.concat([sales_df,decomp.trend],axis=1)
df.columns=['sales','trend']


This is getting me this:

sales trend
Date
2004-01-01 0 NaN
2004-02-01 173 NaN
2004-03-01 225 NaN
2004-04-01 230 NaN
2004-05-01 349 NaN
2004-06-01 258 NaN
2004-07-01 270 236.708333
2004-08-01 223 248.208333
2004-09-01 243 251.250000
... ... ...
2015-05-01 270 214.416667
2015-06-01 218 215.583333
2015-07-01 215 212.791667
2015-08-01 233 NaN
2015-09-01 258 NaN
2015-10-01 252 NaN
2015-11-01 256 NaN
2015-12-01 188 NaN
2016-01-01 70 NaN


Note that there are 6 NaN's in the start and in the end of the Trend's series.
So I ask, is that right? Why is that happening?

Answer Source

This is expected as seasonal_decompose uses a symmetric moving average by default if the filt argument is not specified (as you did). The frequency is inferred from the time series. https://searchcode.com/codesearch/view/86129185/