piRSquared piRSquared - 6 months ago 26
Python Question

Create monthly time series with arbitrary start dates

Using

pandas
it is easy to create a monthly series of dates.

import pandas as pd

pd.date_range('2012-04-23', '2013-01-23', freq='BM')

DatetimeIndex(['2012-04-30', '2012-05-31', '2012-06-29', '2012-07-31',
'2012-08-31', '2012-09-28', '2012-10-31', '2012-11-30',
'2012-12-31'],
dtype='datetime64[ns]', freq='BM')


Notice that the dates in the
DatetimeIndex
are month ends. I know that it should be considering I chose
freq='BM'
but I don't believe I had a choice that would have accomplished my goal.

I'm often in need for producing a monthly series of dates starting from the last business day going back in time every month.

I'd like to see this instead:

DatetimeIndex(['2012-04-23', '2012-05-23', '2012-06-23', '2012-07-23',
'2012-08-23', '2012-09-23', '2012-10-23', '2012-11-23',
'2012-12-23'],
dtype='datetime64[ns]', freq=None)


or another more complicated example might be to get months from '2012-01-30' to '2012-04-30'. I'd expect to see:

DatetimeIndex(['2012-01-30', '2012-02-29', '2012-03-30', '2012-04-30'],
dtype='datetime64[ns]', freq=None)

Answer

You may be looking for something like this:

from pandas.tseries.offsets import Day, BDay
pd.date_range(start = '2012-01-01', periods = 6, freq = 'MS') + Day(22) + BDay(0)
Out[12]: 
DatetimeIndex(['2012-01-23', '2012-02-23', '2012-03-23', '2012-04-23',
               '2012-05-23', '2012-06-25'],
              dtype='datetime64[ns]', freq=None)

Day(22) adds an offset of 22 days and BDay is responsible for business day offset (BDay(0) takes the nearest business day).

It's a bit more difficult with dates starting at 30th. So I had to write a function for this. (For clarity of code it doesn't allow a custom freq parameter.)

def my_business_date_range(day, **kwargs):
    assert(isinstance(day, int) & (day > 0) & (day < 32))
    rng0 = pd.date_range(freq = 'MS', **kwargs)
    rng1 = rng0 + pd.tseries.offsets.Day(day-1) + pd.tseries.offsets.BDay(0)
    # Correcting overflows:
    overflow_idx, = np.nonzero(rng0.month != rng1.month)
    if overflow_idx.size > 0:
        # rng1 is not mutable
        tmp = rng1.tolist()        
        bme = pd.tseries.offsets.BusinessMonthEnd(-1)
        for i in overflow_idx:
            tmp[i] = bme(rng1[i])
        rng1 = pd.DatetimeIndex(tmp)
    return rng1

my_business_date_range(30, start= '2012-01-01', periods = 6)
Out[13]: 
DatetimeIndex(['2012-01-30', '2012-02-29', '2012-03-30', '2012-04-30',
               '2012-05-30', '2012-06-29'],
              dtype='datetime64[ns]', freq=None)

Pandas has also an experimental CustomBusinessMonth and the like but I couldn't make it work.

Comments