Nathalie HB - 6 months ago
Python Question

How to resample a DataFrame by dropping every second row (dropping the 30-minute measurements)

I have a time series of measurements in a DataFrame that is timestamped every 30 minutes (yyyy/mm/dd 00:30:00, yyyy/mm/dd 01:00:00, etc.). I just want to do a simple resampling by dropping the half-hourly measurements and keeping only the hourly ones, which could be done by dropping every second row. Any advice on how to do this?

Answer

To drop every other row, keeping the first, use df.iloc[::2].

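To keep every other row starting with the second (dropping the first, third, and so on), use df.iloc[1::2]. With timestamps that begin at 00:30, as in the question, this is the variant that keeps the hourly rows.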
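As a rough illustration of the two slices, here is a minimal sketch. The frame `df`, its `value` column, and the dates are made up for the example; the only thing taken from the question is that the first timestamp falls on a half hour.

```python
import pandas as pd

# Hypothetical data: six half-hourly readings starting at 00:30,
# mirroring the layout described in the question.
idx = pd.date_range("2000-01-01 00:30", periods=6, freq="30min")
df = pd.DataFrame({"value": range(6)}, index=idx)

# Positions 0, 2, 4 -> 00:30, 01:30, 02:30 (the half-hour marks)
half_hourly = df.iloc[::2]

# Positions 1, 3, 5 -> 01:00, 02:00, 03:00 (the on-the-hour marks)
hourly = df.iloc[1::2]

print(hourly)
```

Which slice keeps the hourly rows depends only on whether the first row sits on the hour or on the half hour, so it is worth checking df.index[0] before choosing.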


Or, since the time series, ts, has a DatetimeIndex, you could use ts.index.minute == 0 to select rows whose minutes equal 0:

In [146]: ts = pd.Series(1, index=pd.date_range('2000-1-1', periods=10, freq='30T'))

In [147]: ts
Out[147]: 
2000-01-01 00:00:00    1
2000-01-01 00:30:00    1
2000-01-01 01:00:00    1
2000-01-01 01:30:00    1
2000-01-01 02:00:00    1
2000-01-01 02:30:00    1
2000-01-01 03:00:00    1
2000-01-01 03:30:00    1
2000-01-01 04:00:00    1
2000-01-01 04:30:00    1
Freq: 30T, dtype: int64

In [148]: ts.loc[ts.index.minute == 0]
Out[148]: 
2000-01-01 00:00:00    1
2000-01-01 01:00:00    1
2000-01-01 02:00:00    1
2000-01-01 03:00:00    1
2000-01-01 04:00:00    1
Freq: 60T, dtype: int64
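
One practical difference between the two approaches is how they behave when a measurement is missing. The sketch below uses made-up data (the frame `df`, the `value` column, and the removed 01:30 reading are assumptions for illustration) to compare positional slicing with the minute-based mask on an index that has a gap.

```python
import pandas as pd

# Hypothetical data: half-hourly readings starting at 00:30.
idx = pd.date_range("2000-01-01 00:30", periods=6, freq="30min")
df = pd.DataFrame({"value": range(6)}, index=idx)

# Simulate a missing measurement by removing the 01:30 row.
gappy = df[df.index != pd.Timestamp("2000-01-01 01:30")]

# Positional slicing is thrown off by the gap: it picks 01:00 and 02:30.
by_position = gappy.iloc[1::2]

# The minute-based mask still selects only on-the-hour rows:
# 01:00, 02:00 and 03:00.
by_minute = gappy[gappy.index.minute == 0]

print(by_position)
print(by_minute)
```

If the data is known to be complete, both give the same rows; the mask is the safer choice when gaps are possible.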