sdaza sdaza - 1 month ago 11
Python Question

force a format when parsing dates in pandas from an Excel file

I am trying to parse dates from an Excel file following a given format

%d/%m/%y
. I am not able to set the format, I am always getting
%m/%d/%Y
.

The source format is
%d/%m/%y
so that the first date, for instance, should be first of June 2016. Any ideas on how to do it?

import pandas as pd

url = 'https://www.dropbox.com/s/8gqmq3jx27unsta/example_dates.xlsx?dl=1'

file = pd.ExcelFile(url, parse_date=True,
date_parser = (lambda x: pd.to_datetime(x, format ='%d/%m/%y')))

df = file.parse(0)

df
date variable
0 2016-01-06 1
1 2016-06-07 2
2 2016-12-10 3
3 2016-12-29 4


Something like this might work but it doesn't:

df = file.parse(0, converters={'date' : lambda x: pd.to_datetime(x, dayfirst=True)}

date variable
0 2016-01-06 1
1 2016-06-07 2
2 2016-12-10 3
3 2016-12-29 4

Answer Source

You can use converters:

>>> file.parse(0, parse_dates=True, index_col=0, date_parser=lambda x: pd.to_datetime(x).strftime("%d/%m/%Y"))

     variable
date    
2016-06-01  1
2016-07-06  2
2016-10-12  3
2016-12-29  4