Christian Staudt Christian Staudt - 3 months ago 23
Python Question

Simple way to plot time series with real dates using pandas

Starting from the following CSV data, loaded into a pandas data frame...

Buchung;Betrag;Saldo
27.06.2016;-1.000,00;42.374,95
02.06.2016;500,00;43.374,95
01.06.2016;-1.000,00;42.874,95
13.05.2016;-500,00;43.874,95
02.05.2016;500,00;44.374,95
04.04.2016;500,00;43.874,95
02.03.2016;500,00;43.374,95
10.02.2016;1.000,00;42.874,95
02.02.2016;500,00;41.874,95
01.02.2016;1.000,00;41.374,95
04.01.2016;300,00;40.374,95
30.12.2015;234,54;40.074,95
02.12.2015;300,00;39.840,41
02.11.2015;300,00;39.540,41
08.10.2015;1.000,00;39.240,41
02.10.2015;300,00;38.240,41
02.09.2015;300,00;37.940,41
31.08.2015;2.000,00;37.640,41


... I would like an intuitive way to plot the time series given by the dates in column "Buchung" and the monetary values in column "Saldo".

I tried

seaborn.tsplot(data=data, time="Buchung", value="Saldo")


which yields

ValueError: could not convert string to float: '31.08.2015'


What is an easy way to read the dates and values and plot the time series? I assume this is such a common problem that there must be a three line solution.

Answer

You need to convert your date column into the correct format:

data['Buchung'] = pd.to_datetime(data['Buchung'], format='%d.%m.%Y')

Now your plot will work.


Though you didn't ask, I think you will also run into a similar problem because your numbers (in 'Betrag' and 'Saldo') seem to be string as well. So I recommend you convert them to numeric before plotting. Here is how you can do that by simple string manipulation:

data["Saldo"] = data["Saldo"].str.replace('.', '').str.replace(',', '.')
data["Betrag"] = data["Betrag"].str.replace('.', '').str.replace(',', '.')

Or set the locale:

import locale
# The data appears to be in a European format, German locale might
# fit. Try this on Windows machine:
locale.setlocale(locale.LC_ALL, 'de')
data['Betrag'] = data['Betrag'].apply(locale.atof)
data['Saldo'] = data['Saldo'].apply(locale.atof)
# This will reset the locale to system default
locale.setlocale(locale.LC_ALL, '')

On an Ubuntu machine, follow this answer. If the above code does not work on a Windows machine, try locale.locale_alias to list all available locales and pick the name from that.


Output

Using matplotlib since I cannot install Seaborn on the machine I am working from.

from matplotlib import pyplot as plt

plt.plot(df['Buchung'], df['Saldo'], '-')
_ = plt.xticks(rotation=45)

The Plot

Note: this has been produced using the locale method. Hence the month names are German.

Comments