Frangipanes - 4 months ago
Python Question

Python datetime switching between US and UK date formats

I'm using matplotlib to plot some data imported from CSV files. These files have the following format:

Date,Time,A,B
25/07/2016,13:04:31,5,25550
25/07/2016,13:05:01,0,25568
....
01/08/2016,19:06:43,0,68425


The dates are formatted as they would be in the UK, i.e. %d/%m/%Y. The end result is to have two plots: one of how A changes with time, and one of how B changes with time. I'm importing the data from the CSV like so:

import matplotlib
matplotlib.use('Agg')
from matplotlib.mlab import csv2rec
import matplotlib.pyplot as plt
from datetime import datetime
import sys
...

def analyze_log(file, y):
    data = csv2rec(open(file, 'rb'))

    fig = plt.figure()

    date_vec = [datetime.strptime(str(x), '%Y-%m-%d').date() for x in data['date']]
    print date_vec[0]
    print date_vec[len(date_vec)-1]

    time_vec = [datetime.strptime(str(x), '%Y-%m-%d %X').time() for x in data['time']]
    print time_vec[0]
    print time_vec[len(time_vec)-1]

    datetime_vec = [datetime.combine(d, t) for d, t in zip(date_vec, time_vec)]
    print datetime_vec[0]
    print datetime_vec[len(datetime_vec)-1]

    y_vec = data[y]
    plt.plot(datetime_vec, y_vec)

    ...
    # formatters, axis headers, etc.
    ...
    return plt


And all was working fine before 01 August. However, since then, matplotlib is trying to plot my 01/08/2016 data points as 2016-01-08 (08 Jan)!

I get a plotting error because it tries to plot from January to July:

RuntimeError: RRuleLocator estimated to generate 4879 ticks from 2016-01-08 09:11:00+00:00 to 2016-07-29 16:22:34+00:00: exceeds Locator.MAXTICKS * 2 (2000)

What am I doing wrong here? The results of the print statements in the code above are:

2016-07-25
2016-01-08 #!!!!
13:04:31
19:06:43
2016-07-25 13:04:31
2016-01-08 19:06:43 #!!!!

Answer

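Matplotlib's csv2rec function already parses your dates for you, and it tries to guess the format of each value. A date such as 25/07/2016 can only be read day-first, because 25 is not a valid month, but 01/08/2016 is ambiguous and is read month-first by default, which is why it turns into 8 January. The function has two options that influence the parsing; dayfirst should help here: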

dayfirst: default is False so that MM-DD-YY has precedence over DD-MM-YY.

yearfirst: default is False so that MM-DD-YY has precedence over YY-MM-DD.
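To illustrate why only the August rows went wrong, here is a minimal sketch that uses only the standard library, not csv2rec itself. The helper parse_guess is hypothetical. It mimics the precedence described above by trying month-first and falling back to day-first, unless dayfirst is set.

```python
from datetime import date, datetime

def parse_guess(text, dayfirst=False):
    """Hypothetical helper: try one field order, then fall back to the other."""
    month_first, day_first = '%m/%d/%Y', '%d/%m/%Y'
    formats = [day_first, month_first] if dayfirst else [month_first, day_first]
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # e.g. month 25 is invalid, so try the other order
    raise ValueError('unrecognised date: %r' % text)

print(parse_guess('25/07/2016'))                 # 2016-07-25 (only one valid reading)
print(parse_guess('01/08/2016'))                 # 2016-01-08 (ambiguous, month-first wins)
print(parse_guess('01/08/2016', dayfirst=True))  # 2016-08-01
```

With csv2rec itself, the equivalent change should be to pass the keyword argument when loading the file, for example csv2rec(open(file, 'rb'), dayfirst=True).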

See http://labix.org/python-dateutil#head-b95ce2094d189a89f80f5ae52a05b4ab7b41af47 for further information.
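If the format is known in advance, an alternative that avoids guessing altogether is to parse with an explicit format string. The sketch below swaps csv2rec for the standard-library csv module and is written for Python 3. The function name load_log and the inline sample data are illustrative assumptions; the column names come from the CSV header shown in the question.

```python
import csv
import io
from datetime import datetime

def load_log(lines, column):
    """Return (datetimes, values) for one column, parsing dates as UK day-first."""
    stamps, values = [], []
    for row in csv.DictReader(lines):
        # Combine the Date and Time columns and parse them with a fixed format.
        stamp = datetime.strptime(row['Date'] + ' ' + row['Time'],
                                  '%d/%m/%Y %H:%M:%S')
        stamps.append(stamp)
        values.append(int(row[column]))
    return stamps, values

# Two rows taken from the question, standing in for the real file.
sample = io.StringIO(
    'Date,Time,A,B\n'
    '25/07/2016,13:04:31,5,25550\n'
    '01/08/2016,19:06:43,0,68425\n'
)
stamps, values = load_log(sample, 'B')
print(stamps[-1])  # 2016-08-01 19:06:43
print(values)      # [25550, 68425]
```

The returned lists can be passed straight to plt.plot(stamps, values) in place of datetime_vec and y_vec.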
