relima relima - 1 year ago 80
Python Question

Threading to quickly create a large number of charts

I have been trying to find ways to make the following piece of code perform faster:

def do_chart(target="IMG_BACK", xlabel="xlabel", ylabel="ylabel", title="title", ydata=pylab.arange(1961, 2031, 1)):
    print "here"
    for i in range(70):
        MYRAMDICT[i] = cStringIO.StringIO()
        xdata = pylab.arange(1961, 2031, 1)
        pylab.figure(num=None, figsize=(10.24, 5.12), dpi=1, facecolor='w', edgecolor='k')
        pylab.plot(xdata, ydata, linewidth=3.0)
        pylab.xlabel(xlabel); pylab.ylabel(ylabel); pylab.title(i)
        pylab.savefig(MYRAMDICT[i], format='png')

This function (please ignore the pylab commands; they are here just for illustration) creates a dictionary (MYRAMDICT), which is populated with cStringIO objects that store the charts in memory. These charts are later presented dynamically to the user.

Would somebody please help me to make use of threading so that I can use all of my cores and make this function perform faster? Or point me towards ideas to improve it?

Answer Source

From the description, you'd be far better off using multiprocessing than threading: CPython's global interpreter lock lets only one thread execute Python bytecode at a time, so threads generally won't spread CPU-bound work across cores. You have an "embarrassingly parallel" problem and no disk I/O constraints (you're writing to memory). Of course, passing large objects back and forth between the processes will get expensive, but returning a string representing a .png shouldn't be too bad.

It can be done quite simply:

import multiprocessing
import cStringIO

import matplotlib.pyplot as plt
import numpy as np

import itertools

def main():
    """Generates 1000 random plots and saves them as .png's in RAM"""
    pool = multiprocessing.Pool()
    same_title = itertools.repeat('Plot %i')
    fig_files = pool.map(plot, itertools.izip(xrange(1000), same_title))

def plot(args):
    """Make a random plot"""
    # Unfortunately, pool.map (and imap) only support a single argument to
    # the function, so you'll have to unpack a tuple of arguments...
    i, titlestring = args

    outfile = cStringIO.StringIO()

    x = np.cumsum(np.random.random(100) - 0.5)

    fig = plt.figure()
    plt.plot(x)
    # Set the title before saving so that it appears in the image
    plt.title(titlestring % i)
    fig.savefig(outfile, format='png', bbox_inches='tight')
    # Close the figure so that figures don't pile up in memory
    plt.close(fig)

    # cStringIO files aren't picklable, so we'll return the string instead...
    return outfile.getvalue()

if __name__ == '__main__':
    main()


Without using multiprocessing, this takes ~250 secs on my machine. With multiprocessing (8 cores), it takes ~40 secs.
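
If you want to try this kind of serial-versus-pool comparison on your own machine, here is a minimal sketch. It assumes Python 3 (the code above targets Python 2) and swaps the matplotlib plotting for a stand-in CPU-bound loop so that it runs with the standard library alone. The names `render`, `timed` and `main`, and the workload sizes, are hypothetical and not part of the answer above; the timings it prints will depend entirely on your machine.

```python
import io
import multiprocessing
import time


def render(args):
    """Stand-in for the plot() worker: burn some CPU, then return bytes."""
    i, title_template = args  # pool.map passes a single argument, so unpack it
    total = 0
    for k in range(100000):  # hypothetical CPU-bound workload
        total += (k * i) % 7
    buf = io.BytesIO()
    buf.write((title_template % i).encode("ascii"))
    buf.write(b":" + str(total).encode("ascii"))
    return buf.getvalue()  # hand back plain bytes rather than the file object


def timed(label, func):
    """Call func(), print the elapsed wall-clock time, and return its result."""
    start = time.perf_counter()
    result = func()
    print("%s: %.2f s" % (label, time.perf_counter() - start))
    return result


def main(n_jobs=50):
    jobs = [(i, "Plot %i") for i in range(n_jobs)]

    def run_serial():
        return [render(job) for job in jobs]

    def run_pool():
        # Pool start-up and shutdown are included in the measured time
        with multiprocessing.Pool() as pool:
            return pool.map(render, jobs)

    serial = timed("serial", run_serial)
    parallel = timed("pool.map", run_pool)
    # pool.map returns results in input order, so both runs should agree
    assert serial == parallel


if __name__ == "__main__":
    main()
```

On Python 3.3 or later, `Pool.starmap` accepts the argument tuples directly, so the manual unpacking inside the worker is no longer needed there.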

Hope that helps a bit...
