Arnold Klein Arnold Klein - 2 months ago 8
JSON Question

Pandas DataFrame with date and time from JSON format

I'm importing data from

.json
file with pandas
DataFrame
and the result is a bit broken:

>> print df
summary response_date
8.0 {u'$date': u'2009-02-19T10:54:00.000+0000'}
11.0 {u'$date': u'2009-02-24T11:23:45.000+0000'}
14.0 {u'$date': u'2009-03-03T17:55:07.000+0000'}
16.0 {u'$date': u'2009-03-10T12:23:04.000+0000'}
19.0 {u'$date': u'2009-03-17T17:19:55.000+0000'}
13.0 {u'$date': u'2009-03-25T15:10:52.000+0000'}
22.0 {u'$date': u'2009-04-02T16:57:31.000+0100'}
15.0 {u'$date': u'2009-04-08T22:29:09.000+0100'}
20.0 {u'$date': u'2009-04-16T18:14:20.000+0100'}
13.0 {u'$date': u'2009-04-29T10:47:06.000+0100'}
15.0 {u'$date': u'2009-05-06T13:45:45.000+0100'}
20.0 {u'$date': u'2009-05-26T10:41:52.000+0100'}


How to get rid of 'date' and other mess to create a normal column with date and time. To convert from ISO8601 format I normally use:

df.response_date = pd.to_datetime(df.response_date)

Answer

You can use DataFrame constructor with converting column response_date to list by values if type is dict:

print (type(df.ix[0,'response_date']))
<class 'dict'>

df.response_date = pd.DataFrame(df.response_date.values.tolist())
df.response_date = pd.to_datetime(df.response_date)
print (df)
   summary       response_date
0      8.0 2009-02-19 10:54:00
1     11.0 2009-02-24 11:23:45
2     14.0 2009-03-03 17:55:07

If type is string, use split and strip:

print (type(df.ix[0,'response_date']))
<class 'str'>

df.response_date = df.response_date.str.split().str[1].str.strip("'u}")
df.response_date = pd.to_datetime(df.response_date)

print (df)
   summary       response_date
0      8.0 2009-02-19 10:54:00
1     11.0 2009-02-24 11:23:45
2     14.0 2009-03-03 17:55:07
Comments