DataAddicted - 2 months ago
Python Question

Group DataFrame by period of time with aggregation

I am using pandas to structure and process data. This is my DataFrame:

[image: screenshot of the DataFrame]

I grouped the datetimes by minute and aggregated them to get the sum of the 'bitrate' values per minute.
This is the code that produced this DataFrame:

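import datetime

import numpy as np

def aggregate_data(data):
    def delete_seconds(time):
        # Parse the timestamp string and truncate it to the minute
        return datetime.datetime.strptime(time, '%Y-%m-%d %H:%M:%S').replace(second=0)

    data['new_time'] = data['beginning_time'].apply(delete_seconds)
    # Sum the 'bitrate' values that fall within the same minute
    df = data[['new_time', 'bitrate']].groupby(['new_time']).aggregate(np.sum)
    return df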


Now I want to do something similar with 5-minute buckets: group my datetimes into 5-minute intervals and take the mean.
Something like this (this doesn't work, of course!):

df.groupby([df.index.map(lambda t: t.5minute)]).aggregate(np.mean)


Any ideas? Thanks!

Answer

Use resample.

df.resample('5Min').sum()

This assumes your index is properly set as a DatetimeIndex.
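
If the timestamps are still strings in a column, as in the question, the index has to be built first. Here is a minimal sketch, assuming made-up sample rows and the column names 'beginning_time' and 'bitrate' from the question. It uses mean() rather than sum(), since the question asks for an average per bucket.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    'beginning_time': ['2016-01-01 10:00:12', '2016-01-01 10:03:40',
                       '2016-01-01 10:06:05', '2016-01-01 10:09:59'],
    'bitrate': [10.0, 30.0, 50.0, 70.0],
})

# Parse the strings and use them as the index, so resample can bucket on time.
data['beginning_time'] = pd.to_datetime(data['beginning_time'])
df = data.set_index('beginning_time')

# Mean bitrate per 5-minute bucket.
result = df[['bitrate']].resample('5min').mean()
print(result)
```

With this index in place there is no need to strip the seconds by hand, because resample assigns each row to its bucket directly.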

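You can also use a time grouper, as resampling is really just a groupby operation on time buckets. pd.TimeGrouper was deprecated and later removed from pandas; pd.Grouper with a freq argument does the same job.

df.groupby(pd.Grouper(freq='5Min')).sum()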
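
To see that the two forms bucket the data the same way, here is a small sketch on made-up data. It uses pd.Grouper(freq=...), the spelling that current pandas versions accept for time-based grouping. The values and timestamps are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex.
idx = pd.to_datetime(['2016-01-01 10:00:00', '2016-01-01 10:04:00',
                      '2016-01-01 10:05:00', '2016-01-01 10:11:00'])
df = pd.DataFrame({'bitrate': [1.0, 2.0, 3.0, 4.0]}, index=idx)

# Same 5-minute buckets, expressed two ways.
by_resample = df.resample('5min').sum()
by_groupby = df.groupby(pd.Grouper(freq='5min')).sum()

print(by_resample)
print(by_groupby)
```

The groupby form is handy when you also need to group by another column at the same time, since pd.Grouper can be passed in a list alongside other keys.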