kate88 - 1 year ago
Python Question

fitting a cumulative line to histogram with matplotlib

I created the histogram below:

[Image: sample histogram]

and was wondering whether, instead of plotting the whole graph (in blue), I could plot just the top edge (in black)?

Or could I just fit a line that matches the top of the distribution?

my code is:

plt.hist(histogramData, bins=200, normed=True, cumulative=True, edgecolor='b', facecolor='None')

I tried removing 'edgecolor' and 'facecolor' but it does not seem to work...

Thanks for your help!

Answer

I think pylab's histogram code uses numpy's np.histogram() function, which yields the counts and the bin edges. If you combine that with the standard plot() command, you are done (just remember to apply np.cumsum() to the counts from np.histogram() for the cumulative look).

Edit: Regarding the comment, I quote from the numpy.histogram() documentation:


hist : array

The values of the histogram. See normed and weights for a description of the possible semantics.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).
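
To make the length relationship concrete, here is a minimal sketch. The `sample` array and the choice of 5 bins are made up for illustration and are not part of the question:

```python
import numpy as np

# Hypothetical sample data, only for illustration
sample = np.array([0.5, 1.5, 1.7, 2.2, 3.9, 4.1, 4.8])

hist, bin_edges = np.histogram(sample, bins=5)

# One count per bin, but one more edge than there are bins
print(len(hist), len(bin_edges))  # prints "5 6"

# Dropping the last edge leaves the left edge of each bin;
# dropping the first edge leaves the right edge of each bin
left_edges = bin_edges[:-1]
right_edges = bin_edges[1:]
```

Either slice has the same length as hist, which is why one edge has to be dropped before the counts can be plotted against the edges.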

Thus, to plot your data in the desired way:

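import numpy as np
import matplotlib.pyplot as plt
hist, bins = np.histogram(data, bins=200)  # data is your histogramData; bins holds the 201 edges
plt.plot(bins[:-1], np.cumsum(hist))  # running total of the counts at each bin's left edge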

Or, if you want to be more precise, you could even put the data values at the bin centers:

offset = bins[1:] - bins[:-1]  # width of each bin
plt.plot(bins[:-1] + offset / 2, np.cumsum(hist))  # left edge plus half a width = bin center
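
The plt.hist() call in the question also passes normed=True, so the cumulative curve there ends at 1 rather than at the total count. A rough, self-contained sketch of the same idea with that scaling is below. The random data standing in for histogramData, the axis labels, and the output file name cumulative.png are all assumptions made for illustration:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to get an interactive window
import matplotlib.pyplot as plt

# Hypothetical stand-in for histogramData from the question
rng = np.random.default_rng(0)
histogramData = rng.normal(size=1000)

hist, bins = np.histogram(histogramData, bins=200)

# Running total of the counts, scaled so the last value is 1
# (this mirrors cumulative=True combined with normed=True)
cumulative = np.cumsum(hist) / hist.sum()

# Each running total includes everything up to the right edge of its bin
plt.plot(bins[1:], cumulative, color="k")
plt.xlabel("value")
plt.ylabel("cumulative fraction")
plt.savefig("cumulative.png")  # hypothetical output file name
```

Plotting against bins[1:] puts each point at the right edge of its bin, which is where the running total is actually reached. Using bins[:-1] instead shifts the whole curve left by one bin width, which is barely visible with 200 bins.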