famargar famargar - 1 year ago 134
Python Question

How to extract data from a pandas plot?

I have a DataFrame that I am plotting with pandas in IPython. I import the usual modules, then plot one column of the DataFrame:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

traydata_A[('x_TmId', 'Trays')].plot()
plt.xlabel('Hour of the day')
plt.ylabel('Number of picked/despatched trays')


and would like to get back the data that was actually plotted, using

ax = plt.gca()
line = ax.lines[0]


The end result is

IndexError Traceback (most recent call last)
<ipython-input-220-d211b85302a5> in <module>()
1 ax = plt.gca()
----> 2 line = ax.lines[0]

IndexError: list index out of range


What am I doing wrong? I am sure I have a deep misunderstanding of how pandas connects to matplotlib!

Answer

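You have to use the axes returned by the pandas plot function. In your code, ax = plt.gca() returns a different axes than the one pandas drew on, so it contains no lines and ax.lines[0] raises the IndexError. With %matplotlib inline this typically happens when plt.gca() runs in a later notebook cell, because the inline backend closes the figure once the cell that drew it has finished. Either run the code in the same context (the same cell), or save the axes returned by pandas in an intermediate variable. Full example: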

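import pandas as pd

s = pd.Series(data=[5850000, 6000000, 5700000, 13100000, 16331452], name='data')
ax = s.plot()  # keep the Axes that pandas actually drew on
# x/y pairs of the first plotted line, as an (N, 2) array
print(ax.get_lines()[0].get_xydata())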
[[  0.00000000e+00   5.85000000e+06]
 [  1.00000000e+00   6.00000000e+06]
 [  2.00000000e+00   5.70000000e+06]
 [  3.00000000e+00   1.31000000e+07]
 [  4.00000000e+00   1.63314520e+07]]
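
The same idea should carry over to a DataFrame with several columns, where each column becomes one line on the axes. The following is only a sketch: the column names and values are made up to stand in for the tray data in the question, and the Agg backend is selected just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import pandas as pd

# Hypothetical data standing in for the question's tray counts per hour
df = pd.DataFrame(
    {"picked": [3, 5, 2, 8], "despatched": [1, 4, 6, 7]},
    index=[6, 7, 8, 9],
)

# Keep the Axes returned by pandas instead of asking pyplot for one later
ax = df.plot()

# One Line2D per column; pandas uses the column name as the line label
plotted = {line.get_label(): line.get_xydata() for line in ax.get_lines()}

for label, xy in plotted.items():
    print(label, xy[:, 1])  # second column of each array holds the y values
```

Collecting the arrays by label avoids relying on the order of ax.lines when more than one column is plotted.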

From the documentation of matplotlib.pyplot.gca:

Get the current Axes instance on the current figure matching the given keyword args, or create one.

[...]

If the current axes doesn’t exist, or isn’t a polar one, the appropriate axes will be created and then returned.
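
The "or create one" behaviour can be reproduced outside a notebook. This is a minimal sketch, assuming that plt.close("all") is a fair stand-in for what the inline backend does at the end of a cell; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([1, 2, 3], name="data")
pandas_ax = s.plot()  # the Axes pandas drew on

# Stand-in for the inline backend closing the figure after a cell has run
plt.close("all")

# No current figure is left, so gca() creates a new figure with empty axes
new_ax = plt.gca()

same_axes = new_ax is pandas_ax
n_lines = len(new_ax.lines)
print(same_axes, n_lines)
```

Because the new axes has no lines, indexing new_ax.lines[0] would fail in the same way as in the question.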
