Cyrine Ezzahra - 3 months ago
Python Question

Iterating through a range of timestamps in Python

I have a dataframe df:

TIMESTAMP equipement1 equipement2
2016-05-10 13:20:00 0.000000 0.000000
2016-05-10 14:40:00 0.400000 0.500000
2016-05-10 15:20:00 0.500000 0.500000


I am trying to iterate through the timestamps in steps of 5 minutes.
I tried:
pd.date_range(start, end, freq='5 minutes')


But I get a problem with the timestamp format:

"ValueError: Could not evaluate 5 minutes"


Any ideas on how to resolve this problem?

Thank you

Answer

First, make sure your TIMESTAMP column is a datetime instead of a string (e.g. df['TIMESTAMP'] = pd.to_datetime(df.TIMESTAMP)).

Next, use this column as the index of the dataframe. To make this permanent, use df.set_index('TIMESTAMP', inplace=True).
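
As a minimal sketch of these two preparation steps, assuming the data from the question is rebuilt by hand with TIMESTAMP stored as strings (the variable name indexed is only illustrative):

```python
import pandas as pd

# Sample data mirroring the question; TIMESTAMP starts out as plain strings.
df = pd.DataFrame({
    "TIMESTAMP": ["2016-05-10 13:20:00", "2016-05-10 14:40:00", "2016-05-10 15:20:00"],
    "equipement1": [0.0, 0.4, 0.5],
    "equipement2": [0.0, 0.5, 0.5],
})

# Step 1: parse the strings into datetime values.
df["TIMESTAMP"] = pd.to_datetime(df["TIMESTAMP"])

# Step 2: use the column as the index. Reassigning the result is an
# alternative to inplace=True and leaves the original df untouched.
indexed = df.set_index("TIMESTAMP")

print(type(indexed.index).__name__)
```

If the index is reported as a DatetimeIndex, time-based operations such as resampling should be available on it.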

Now you can resample at any given frequency (e.g. 30min) and use different methods of aggregation such as sum, mean (the default in older pandas versions), a lambda function, etc.

Optionally, you can add .fillna(0) to replace the NaNs with zeros.

>>> df.set_index('TIMESTAMP').resample('30min').sum(min_count=1)

                     equipement1  equipement2
TIMESTAMP                                    
2016-05-10 13:00:00          0.0          0.0
2016-05-10 13:30:00          NaN          NaN
2016-05-10 14:00:00          NaN          NaN
2016-05-10 14:30:00          0.4          0.5
2016-05-10 15:00:00          0.5          0.5
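
For the 5-minute step asked about in the question, the same idea can be sketched as follows. This assumes a recent pandas version in which the frequency is written as the alias '5min' (the string '5 minutes' is what produced the ValueError in the question). The names steps and five_min are only illustrative, and min_count=1 is my own choice to keep empty bins as NaN before filling them.

```python
import pandas as pd

# Same data as in the question, already indexed by a DatetimeIndex.
df = pd.DataFrame(
    {"equipement1": [0.0, 0.4, 0.5], "equipement2": [0.0, 0.5, 0.5]},
    index=pd.to_datetime(
        ["2016-05-10 13:20:00", "2016-05-10 14:40:00", "2016-05-10 15:20:00"]
    ),
)
df.index.name = "TIMESTAMP"

# Build the range of timestamps with the '5min' alias.
steps = pd.date_range(df.index.min(), df.index.max(), freq="5min")

# Iterate over the timestamps (only the first three are printed here).
for ts in steps[:3]:
    print(ts)

# Resample onto a 5-minute grid: bins without data become NaN
# (because of min_count=1) and are then replaced by zeros.
five_min = df.resample("5min").sum(min_count=1).fillna(0)
print(five_min.head())
```

Whether to fill the gaps with zeros or leave them as NaN depends on what a missing reading means for your equipment.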