cyril cyril - 3 months ago 15
Python Question

Pandas: set a default datetime for None values

I have a Pandas dataframe with columns that contain dates as strings (in SQL-like format). However, many cells contain None values. I'm trying to convert these columns to Pandas dates using to_datetime and set a "default" value for cells that contain None. Example code below:

>>> import pandas as pd
>>> d = {'a': [1, 2, 3],
...      'd1': ['2016-01-01', '2015-10-02', None],
...      'd2': [None, '2016-04-03', None]}
>>> df = pd.DataFrame(d)
>>> print(df)
   a          d1          d2
0  1  2016-01-01        None
1  2  2015-10-02  2016-04-03
2  3        None        None


>>> date_cols = ['d1', 'd2']
>>> df[date_cols] = df[date_cols].apply(pd.to_datetime)
>>> print(df)
   a         d1         d2
0  1 2016-01-01        NaT
1  2 2015-10-02 2016-04-03
2  3        NaT        NaT


Converting the valid strings to dates is simple enough; I just want to replace each NaT with default_date. This is what I'd like the final dataframe to look like:

>>> default_date = '2015-01-01'
>>> print(df)
   a         d1         d2
0  1 2016-01-01 2015-01-01
1  2 2015-10-02 2016-04-03
2  3 2015-01-01 2015-01-01

Answer

Use fillna on the converted columns to replace each NaT with the default date:

df[date_cols] = df[date_cols].fillna(pd.to_datetime('2015-01-01'))
df

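For anyone who wants to run the whole thing end to end, here is a minimal sketch that chains the conversion from the question with the fillna call from the answer. It reuses the question's sample data, date_cols and default_date; building the default with pd.Timestamp is my own choice here and should be equivalent to pd.to_datetime('2015-01-01') for a single date string.

```python
import pandas as pd

# Sample data from the question: date strings with some None values.
df = pd.DataFrame({
    'a': [1, 2, 3],
    'd1': ['2016-01-01', '2015-10-02', None],
    'd2': [None, '2016-04-03', None],
})

date_cols = ['d1', 'd2']
default_date = pd.Timestamp('2015-01-01')

# Convert each column to datetime (None becomes NaT),
# then replace every NaT with the default date.
df[date_cols] = df[date_cols].apply(pd.to_datetime).fillna(default_date)

print(df)
```

Converting before filling keeps both columns as datetime dtype, so no second to_datetime pass is needed afterwards.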