Stanislav Jirák - 3 years ago
Python Question

Overflow error in Python with pandas

I'm following this tutorial for data analysis with pandas, but when I run the following code

import datetime
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

sp500 ='%5EGSPC', start = datetime.datetime(2015, 10, 15),
end = datetime.datetime(2016, 10, 15))

df = pd.read_csv('sp500.csv', index_col = 'Date', parse_dates=True)

df['H-L'] = df['High'] - df.Low
df['100MA'] = pd.rolling_mean(df['Close'], 100)
df['Difference'] = df['Close'].diff()

threedee = plt.figure().gca(projection='3d')
threedee.scatter(df.index, df['H-L'], df['Close'])

It produces the following error in both Jupyter Notebook and PyCharm:

OverflowError Traceback (most recent call last)
C:\Program Files\Anaconda2\lib\site-packages\IPython\core\ in __call__(self, obj)
305 pass
306 else:
--> 307 return printer(obj)
308 # Finally look for special method names
309 method = get_real_method(obj, self.print_method)

C:\Program Files\Anaconda2\lib\site-packages\IPython\core\ in <lambda>(fig)
226 if 'png' in formats:
--> 227 png_formatter.for_type(Figure, lambda fig: print_figure(fig, 'png', **kwargs))
228 if 'retina' in formats or 'png2x' in formats:
229 png_formatter.for_type(Figure, lambda fig: retina_figure(fig, **kwargs))

C:\Program Files\Anaconda2\lib\site-packages\IPython\core\ in print_figure(fig, fmt, bbox_inches, **kwargs)
118 bytes_io = BytesIO()
--> 119 fig.canvas.print_figure(bytes_io, **kw)
120 data = bytes_io.getvalue()
121 if fmt == 'svg':

with many more frames from various other paths, and so on.
What's wrong? It isn't too much data to load, is it?

Answer Source

Have you tried replacing this line

threedee.scatter(df.index, df['H-L'], df['Close'])

with the following?

threedee.scatter(range(len(df.index)), df['H-L'], df['Close'])

You are plotting timestamps as values. It is possible that matplotlib doesn't understand what numerical values the timestamps carry.

Edit: unfortunately, this workaround turns the x-axis ticks into a numeric range. But we can always set the tick labels manually:

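threedee.scatter(range(len(df.index)), df['H-L'], df['Close'])

# Read the numeric tick positions and map each one back to its date;
# positions that fall outside the data get an empty label.
old_xticks = threedee.get_xticks()
new_xticks = [df.index[int(t)].strftime("%Y-%m-%d")
              if 0 <= t < len(df.index) else '' for t in old_xticks]
threedee.set_xticks(old_xticks)
threedee.set_xticklabels(new_xticks)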
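
Another option, instead of plotting row positions, is to convert the dates to matplotlib's own numeric date values with `matplotlib.dates.date2num` and let a `DateFormatter` turn them back into date labels. The following is only a minimal sketch under some assumptions: the small DataFrame is a made-up stand-in for `sp500.csv`, the names `x_numeric` and `ax` are illustrative, and the `Agg` backend is chosen only so that the snippet runs without a display.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers '3d' on older matplotlib)

# Hypothetical stand-in for sp500.csv: five days of made-up prices.
dates = pd.date_range(datetime.datetime(2015, 10, 15), periods=5, freq="D")
df = pd.DataFrame(
    {
        "High": [10.0, 11.0, 12.0, 11.5, 12.5],
        "Low": [9.0, 9.5, 11.0, 10.5, 11.0],
        "Close": [9.5, 10.5, 11.5, 11.0, 12.0],
    },
    index=dates,
)
df["H-L"] = df["High"] - df["Low"]

# Convert the DatetimeIndex to matplotlib's float day numbers.
x_numeric = mdates.date2num(df.index.to_pydatetime())

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x_numeric, df["H-L"].values, df["Close"].values)

# Ask matplotlib to show the float day numbers as dates on the x axis.
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
```

With this approach the x values keep their real spacing in days, so weekends and holidays show up as gaps, whereas `range(len(df.index))` spaces all rows evenly.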
