SlowLearner SlowLearner - 4 months ago 17
Python Question

Setting timezone for timestamp data in pandas dataframe

I have a frequently changing pandas dataframe of data that looks like this:

date name time timezone
0 2016-08-01 aaa 0900 Asia/Tokyo
1 2016-08-04 bbb 1200 Europe/Berlin
2 2016-08-05 ccc 1400 Europe/London


The date, time and timezone refer to a delivery date that is often for an overseas location, with the name being the name of the client company.

The plan is to take this data and create a
datetime_local
column that incorporates the time zone shown in the
timezone
column of dataframe. Then from that I want to add a column
datetime_london
that contains that date and time, but expressed in terms of the time and date in London.

I have got most of the way there but when calling
tz_localize
I end up with a
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
which suggests to me that I'm not treating the column with the timezone correctly.

Any suggestions as to how to proceed?

mydf = pd.DataFrame(data={'date':['2016-08-01','2016-08-04','2016-08-05'],
'time':['0900','1200','1400'],
'timezone':['Asia/Tokyo','Europe/Berlin','Europe/London'],
'name':['aaa','bbb','ccc']}
)
print(mydf)
mydf["datetime"] = mydf["date"].map(str) + " " + mydf["time"]
mydf.datetime = pd.to_datetime(mydf.datetime)
mydf.index = mydf.datetime
print(mydf)
mydf["datetime_local"] = mydf.datetime
mydf.datetime_local.tz_localize(mydf.timezone)

Max Max
Answer
import pandas as pd

def convert_to_local_time(row):
    return pd.to_datetime(row.datetime).tz_localize(row.timezone)

def convert_to_london_time(row):
    return pd.to_datetime(row.datetime_local).tz_convert('Europe/London')

mydf = pd.DataFrame(data={'date':['2016-08-01','2016-08-04','2016-08-05'],
                          'time':['0900','1200','1400'],
                          'timezone':['Asia/Tokyo','Europe/Berlin','Europe/    London'],
                          'name':['aaa','bbb','ccc']}
)
print(mydf)

Output:

         date name  time       timezone
0  2016-08-01  aaa  0900     Asia/Tokyo
1  2016-08-04  bbb  1200  Europe/Berlin
2  2016-08-05  ccc  1400  Europe/London

Add datetime_local column

mydf["datetime"] = mydf["date"].map(str) + " " + mydf["time"]
mydf['datetime_local'] = mydf.apply(convert_to_local_time, axis=1)
print(mydf)

Output:

         date name  time       timezone         datetime  \
0  2016-08-01  aaa  0900     Asia/Tokyo  2016-08-01 0900   
1  2016-08-04  bbb  1200  Europe/Berlin  2016-08-04 1200   
2  2016-08-05  ccc  1400  Europe/London  2016-08-05 1400   

              datetime_local  
0  2016-08-01 09:00:00+09:00  
1  2016-08-04 12:00:00+02:00  
2  2016-08-05 14:00:00+01:00  

Add datetime_london column

mydf['datetime_london'] = mydf.apply(convert_to_london_time, axis=1)
print('After adding datetime_london:')
print(mydf)

Output:

         date name  time       timezone         datetime  \
0  2016-08-01  aaa  0900     Asia/Tokyo  2016-08-01 0900   
1  2016-08-04  bbb  1200  Europe/Berlin  2016-08-04 1200   
2  2016-08-05  ccc  1400  Europe/London  2016-08-05 1400   

              datetime_local           datetime_london  
0  2016-08-01 09:00:00+09:00 2016-08-01 01:00:00+01:00  
1  2016-08-04 12:00:00+02:00 2016-08-04 11:00:00+01:00  
2  2016-08-05 14:00:00+01:00 2016-08-05 14:00:00+01:00  
Comments