THM THM - 6 months ago 26
Python Question

How to use the mean method on a pandas TimeSeries with Decimal type values?

I need to store Python Decimal values in a pandas TimeSeries/DataFrame object. Pandas raises an error when I call "groupby" followed by "mean" on the TimeSeries/DataFrame. The following code, based on floats, works well:

import numpy as np
from pandas import TimeSeries, date_range

# Build a key function that reads the named attribute (year, month, day) from each timestamp
by = lambda x: lambda y: getattr(y, x)
rng = date_range('1/1/2000', periods=40, freq='4h')
rnd = np.random.randn(len(rng))
ts = TimeSeries(rnd, index=rng)
ts.groupby([by('year'), by('month'), by('day')]).mean()

2000  1  1    0.512422
         2    0.447235
         3    0.290151
         4   -0.227240
         5    0.078815
         6    0.396150
         7   -0.507316


But I get an error if I do the same using Decimal values instead of floats:

from decimal import Decimal

rnd = [Decimal(x) for x in rnd]
ts = TimeSeries(rnd, index=rng, dtype=Decimal)
ts.groupby([by('year'), by('month'), by('day')]).mean()  # Crash!

Traceback (most recent call last):
  File "C:\Users\TM\Documents\Python\tm.py", line 100, in <module>
    print ts.groupby([by('year'), by('month'), by('day')]).mean()
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 293, in mean
    return self._cython_agg_general('mean')
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 365, in _cython_agg_general
    raise GroupByError('No numeric types to aggregate')
pandas.core.groupby.GroupByError: No numeric types to aggregate


The error message is "GroupByError('No numeric types to aggregate')". Is there any way to use the standard aggregations like sum, mean, and quantile on a TimeSeries or DataFrame containing Decimal values?

Why doesn't it work, and is there an equally fast alternative if it is not possible?

EDIT: I just realized that most of the other functions (min, max, median, etc.) work fine, but not the mean function that I desperately need :-(.

Answer
import numpy as np
ts.groupby([by('year'), by('month'), by('day')]).apply(np.mean)
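
For readers on a recent pandas, where TimeSeries no longer exists, here is a minimal self-contained sketch of the same idea using pd.Series. It is an illustration under assumptions, not the original poster's code: the helper name decimal_mean and the sample values are made up for this example, and the mean is computed in plain Python so the arithmetic stays in Decimal rather than relying on how np.mean treats object-dtype data in a given version. Expect it to be slower than the Cython-backed float aggregation, since apply calls a Python function once per group.

```python
from decimal import Decimal

import pandas as pd


def decimal_mean(values):
    """Hypothetical helper: arithmetic mean using Decimal arithmetic only."""
    items = list(values)
    # Start the sum from Decimal(0) so no int/float sneaks into the result.
    return sum(items, Decimal(0)) / len(items)


# Two observations per day over two days (made-up sample data).
index = pd.to_datetime([
    "2000-01-01 00:00",
    "2000-01-01 12:00",
    "2000-01-02 00:00",
    "2000-01-02 12:00",
])
ts = pd.Series(
    [Decimal("1.5"), Decimal("2.5"), Decimal("10"), Decimal("20")],
    index=index,
)

# Group by (year, month, day) and apply the pure-Python mean to each group.
daily_mean = ts.groupby(
    [ts.index.year, ts.index.month, ts.index.day]
).apply(decimal_mean)

print(daily_mean)
```

The same pattern should extend to other aggregations that the built-in fast path rejects for object data: pass any function that accepts the group's values and returns a single Decimal.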