Emmanuel Gonzales II - 1 year ago
Python Question

Pandas Converter on Data strptime()

I am trying to plot these points, but I get the error below. Do I need another converter for the date data? The x-axis should be the date, and the y-axis should be the time value. Thank you.

TypeError: strptime() argument 1 must be str, not Timestamp

import pandas as pd

df = pd.read_csv('file.csv', sep=',', parse_dates=[0], header=None,
                 names=['Date', 'Time'])

print(df.head())
        Date         Time
0 2015-01-02  02:29:45 PM
1 2015-01-02  05:16:15 PM
2 2015-01-02  05:48:46 PM
3 2015-01-02  03:18:34 PM
4 2015-01-02  05:22:55 PM

date = df['Date']
time = df['Time']

import datetime

from matplotlib import pyplot as plt
from matplotlib.dates import date2num

def date_to_days(date):
    return date2num(datetime.datetime.strptime(date, '%Y-%m-%d'))

def time_to_hours(time):
    [hh, mm, ss] = [int(x) for x in time.split(':')]
    seconds = datetime.timedelta(hours=hh, minutes=mm, seconds=ss).seconds
    hours = seconds / float(3600)
    return hours

if __name__ == '__main__':

    start_date = '2015-01-01'
    end_date = '2015-01-31'

    dates = date
    times = time

    days = [date_to_days(d) for d in dates]
    hours = [time_to_hours(t) for t in times]

    plt.plot_date(days, hours, ydate=False)
    plt.axis([date_to_days(start_date), date_to_days(end_date), 0, 24])
    plt.ylabel('Time (hours)')

Answer

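datetime.strptime() parses strings into datetime.datetime objects, so it makes no sense to apply it to a pandas Timestamp object (pandas.tslib.Timestamp in older pandas releases, pandas.Timestamp in current ones). That is exactly what [date_to_days(d) for d in dates] passes in: because read_csv() was called with parse_dates=[0], the Date column already contains parsed timestamps rather than strings.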
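The failure can be reproduced without pandas. The sketch below is only an illustration: it uses a plain datetime.datetime where the question has a pandas Timestamp (Timestamp is a subclass of datetime.datetime), and the variable names are made up for the example.

```python
import datetime

# Stand-in for one value from the already-parsed Date column (assumption:
# a plain datetime is used here in place of a pandas Timestamp).
already_parsed = datetime.datetime(2015, 1, 2)

error = None
try:
    # strptime() only accepts a string as its first argument
    datetime.datetime.strptime(already_parsed, '%Y-%m-%d')
except TypeError as exc:
    error = exc

print(type(error).__name__, error)

# Parsing is only needed when the value is still a string
from_text = datetime.datetime.strptime('2015-01-02', '%Y-%m-%d')
print(from_text == already_parsed)
```

Since the value is already a datetime, the second parse is redundant for the DataFrame column; it is only needed for literal strings.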

It should be possible to pass the pandas timestamp directly to date2num():

def date_to_days(date):
    return date2num(date)

>>> days = [date_to_days(d) for d in dates]
>>> days
[735600.0, 735600.0, 735600.0, 735600.0, 735600.0]

Later in your code you call date_to_days() on the date strings start_date and end_date. You could instead define them up front as datetime objects, which avoids parsing the strings:

start_date = datetime.datetime(2015, 1, 1)
end_date = datetime.datetime(2015, 1, 31)

This works with the revised function shown above. In fact, the date_to_days() function is no longer required; just call date2num() directly:

days = [date2num(d) for d in dates]


plt.axis([date2num(start_date), date2num(end_date), 0, 24])
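One more thing to watch, which the fix above does not address: the Time column printed in the question carries an AM/PM suffix, so time.split(':') would leave '45 PM' as the last piece and int() would reject it. A minimal sketch of an alternative time_to_hours(), assuming every value looks like '02:29:45 PM' and that the locale uses the English AM/PM markers:

```python
import datetime

def time_to_hours(time_text):
    """Convert a 12-hour clock string such as '02:29:45 PM' to fractional hours."""
    # %I is the hour on a 12-hour clock, %p the AM/PM marker
    # (assumption: the input always follows this exact layout).
    parsed = datetime.datetime.strptime(time_text, '%I:%M:%S %p')
    return parsed.hour + parsed.minute / 60 + parsed.second / 3600

# Sample values copied from the df.head() output in the question
sample_times = ['02:29:45 PM', '05:16:15 PM', '05:48:46 PM']
hours = [time_to_hours(t) for t in sample_times]
print(hours)
```

Here strptime() is the right tool, because the Time column was not listed in parse_dates and therefore still holds strings.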