Thanos Thanos - 4 months ago 6
Python Question

Quarter out of datetime64

I have a

pd.DataFrame
that looks like this:

In [119]: df1
Out[119]:
DATES
0 2014-01-01
1 2014-01-24
2 2014-03-11
3 2014-04-09
4 2014-04-21
5 2014-05-02
6 2014-05-13
7 2014-06-11
8 2014-06-21
9 2014-07-22
10 2014-08-04

In [120]: df1.dtypes
Out[120]:
DATES datetime64[ns]
dtype: object


and I want to calculate the quarter each one of the entries belongs to. What I've tried so far is:

df1['QUARTER'] = df1['DATES'].map(lambda x: '2014Q1' if (x.year == 2014 & (x.month == 1 | x.month == 2 | x.month == 3)) else np.nan)


and then I get:

In [124]: df1
Out[124]:
DATES QUARTER
0 2014-01-01 NaN
1 2014-01-24 NaN
2 2014-03-11 NaN
3 2014-04-09 NaN
4 2014-04-21 NaN
5 2014-05-02 NaN
6 2014-05-13 NaN
7 2014-06-11 NaN
8 2014-06-21 NaN
9 2014-07-22 NaN
10 2014-08-04 NaN


Finally, I've tried:

df1['QUARTER'] = df1['DATES'].map(lambda x: x.year + '-Q' + x.quarter)


and then I get an error:

TypeError: unsupported operand type(s) for +: 'int' and 'str'


Any ideas are appreciated, thanks!

Answer
In [30]: df['QUARTER'] = pd.PeriodIndex(df['DATES'], freq='Q')

In [31]: df
Out[31]: 
        DATES QUARTER
0  2014-01-01  2014Q1
1  2014-01-24  2014Q1
2  2014-03-11  2014Q1
3  2014-04-09  2014Q2
4  2014-04-21  2014Q2
5  2014-05-02  2014Q2
6  2014-05-13  2014Q2
7  2014-06-11  2014Q2
8  2014-06-21  2014Q2
9  2014-07-22  2014Q3
10 2014-08-04  2014Q3

The values in df['QUARTER'] are Periods. If you'd like strings, then use

df['QUARTER'] = pd.PeriodIndex(df['DATES'], freq='Q').format()