9blue 9blue - 2 months ago 5
Python Question

Pandas: group a DataFrame by datetime into fixed periods, ignoring the date part

I want to group the rows based on a variable time interval.
However, when grouping, I want to ignore the date part and group only on the time of day.

Say I want to group every 5 minutes.

timestampe val
0 2016-08-11 11:03:00 0.1
1 2016-08-13 11:06:00 0.3
2 2016-08-09 11:04:00 0.5
3 2016-08-05 11:35:00 0.7
4 2016-08-19 11:09:00 0.8
5 2016-08-21 12:37:00 0.9

into

timestampe val
0 2016-08-11 11:03:00 0.1
2 2016-08-09 11:04:00 0.5

timestampe val
1 2016-08-13 11:06:00 0.3
4 2016-08-19 11:09:00 0.8

timestampe val
3 2016-08-05 11:35:00 0.7

timestampe val
5 2016-08-21 12:37:00 0.9


Notice that as long as the times fall within the same 5-minute interval, the rows are grouped together, regardless of the date.

Answer

This assumes the day is split into fixed 5-minute windows counted from midnight. The group key is the number of minutes since midnight, floor-divided by 5.

df.groupby(df.timestampe.dt.hour.mul(60) \
             .add(df.timestampe.dt.minute) // 5) \
  .apply(pd.DataFrame.reset_index)
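
To see where the group keys come from, here is a minimal sketch that rebuilds the sample frame from the question and computes the key on its own. The variable names `minutes_since_midnight`, `bucket` and `keys` are mine, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (column name kept as "timestampe").
df = pd.DataFrame({
    "timestampe": pd.to_datetime([
        "2016-08-11 11:03:00",
        "2016-08-13 11:06:00",
        "2016-08-09 11:04:00",
        "2016-08-05 11:35:00",
        "2016-08-19 11:09:00",
        "2016-08-21 12:37:00",
    ]),
    "val": [0.1, 0.3, 0.5, 0.7, 0.8, 0.9],
})

# Minutes elapsed since midnight; the date part plays no role here.
minutes_since_midnight = df.timestampe.dt.hour * 60 + df.timestampe.dt.minute

# Floor division by 5 maps every minute to the index of its 5-minute window.
# For example 11:03 -> (11 * 60 + 3) // 5 = 663 // 5 = 132.
bucket = minutes_since_midnight // 5
keys = bucket.tolist()
print(keys)
```

Rows 0 and 2 share key 132 and rows 1 and 4 share key 133, which is why they end up in the same group even though their dates differ.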

# Python 3 syntax: print is a function
for name, group in df.groupby(df.timestampe.dt.hour.mul(60).add(df.timestampe.dt.minute) // 5):
    print(name)
    print(group)
    print()

132
           timestampe  val
0 2016-08-11 11:03:00  0.1
2 2016-08-09 11:04:00  0.5

133
           timestampe  val
1 2016-08-13 11:06:00  0.3
4 2016-08-19 11:09:00  0.8

139
           timestampe  val
3 2016-08-05 11:35:00  0.7

151
           timestampe  val
5 2016-08-21 12:37:00  0.9
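
Since the question asks for a variable interval, the same idea can be wrapped in a small helper. This is only a sketch under my own assumptions: the function name `time_of_day_bucket` and its `minutes` parameter are hypothetical, and it swaps the integer key for a `Timedelta` (the window's start time since midnight), which is a different but equivalent labelling.

```python
import pandas as pd

def time_of_day_bucket(ts, minutes):
    """Return, for each timestamp, the start of its window as time since midnight."""
    # Subtracting the midnight of the same day drops the date part.
    since_midnight = ts - ts.dt.normalize()
    # Round down to a multiple of the window length.
    return since_midnight.dt.floor(f"{minutes}min")

# Sample data copied from the question.
df = pd.DataFrame({
    "timestampe": pd.to_datetime([
        "2016-08-11 11:03:00",
        "2016-08-13 11:06:00",
        "2016-08-09 11:04:00",
        "2016-08-05 11:35:00",
        "2016-08-19 11:09:00",
        "2016-08-21 12:37:00",
    ]),
    "val": [0.1, 0.3, 0.5, 0.7, 0.8, 0.9],
})

buckets_5 = time_of_day_bucket(df["timestampe"], 5)
buckets_15 = time_of_day_bucket(df["timestampe"], 15)

# Each group is labelled by the start of its window, e.g. 0 days 11:00:00.
for start, group in df.groupby(buckets_5):
    print(start)
    print(group)
    print()
```

With `minutes=5` this should reproduce the four groups above; a wider window such as `minutes=15` merges the 11:03, 11:04, 11:06 and 11:09 rows into one group.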