user1506145 - 1 year ago
Python Question

Hist in matplotlib: Bins are not centered and proportions not correct on the axis

Take a look at this example:

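import matplotlib.pyplot as plt
l = [3,3,3,2,1,4,4,5,5,5,5,5,5,5,5,5]
plt.hist(l, normed=True)  # plotting call assumed; it is missing from the post as captured
plt.show()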

The output is posted as a picture. I have two questions:

a) Why are only the 4 and 5 bins centered on their values? Shouldn't the others be centered as well? Is there a trick to center them?

b) Why are the bins not normalised to proportions? I want the y values of all the bins to sum to one.

Note that my real example contains many more values in the list, but they are all discrete.


Answer Source

You should adjust the keyword arguments of the plt.hist function. There are many of them, and the documentation answers most questions like these.

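a.) You can pass the keywords bins=range(1,7) and align='left'. Passing a sequence as bins sets the bin edges, here [1,2), [2,3), [3,4), [4,5), [5,6] (only the last bin includes its right edge), so each integer value gets its own bin. align='left' then centers each bar on the left edge of its bin, which is the integer value itself.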
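As a rough check of the binning described in a), here is a small sketch using numpy.histogram, which follows the same binning rules as plt.hist. The variable names are only illustrative, and the list is the one from the question.

```python
import numpy as np

l = [3, 3, 3, 2, 1, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5]

# Passing a sequence as bins gives the bin edges 1, 2, ..., 6,
# so every integer value from 1 to 5 lands in its own bin.
counts, edges = np.histogram(l, bins=range(1, 7))

# With align='left', plt.hist centers each bar on the left edge of its bin,
# so the bar positions coincide with the integer values themselves.
bar_centers = edges[:-1]

print(edges)        # bin edges
print(counts)       # number of values per bin
print(bar_centers)  # where align='left' would center the bars
```

The counts come out as one entry per distinct value, which is what you want for discrete data.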

b.) Check your bin widths. From the matplotlib.pyplot.hist documentation for the normed keyword:

If True, the first element of the return tuple will be the counts normalized to form a probability density, i.e., n/(len(x)*dbin). In a probability density, the integral of the histogram should be 1; you can verify that with a trapezoidal integration of the probability density function.

This means that the area under your bins sums to one, not the heights. By default, hist splits the data range (here 1 to 5) into 10 equal bins, each 0.4 wide, so the heights sum to 1/0.4 = 2.5. With bins of width 1, such as bins=range(1,7), each height equals the proportion of values in that bin and the heights sum to one. Note that rwidth only sets how wide the bars are drawn relative to their bins and does not affect the normalisation; rwidth=1 simply draws full-width bars. The result is a good looking plot:

plt.hist(l, bins=range(1,7), align='left', rwidth=1, density=True)  # density was called normed before Matplotlib 2.1
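
To see the normalisation numerically, here is a minimal sketch that again uses numpy.histogram in place of plt.hist (an assumption made so that it runs without a display; the binning and density rules are the same). It compares the default 10 bins with unit-width bins.

```python
import numpy as np

l = [3, 3, 3, 2, 1, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5]

# Default binning: 10 equal-width bins between min(l) and max(l), each 0.4 wide.
heights, edges = np.histogram(l, bins=10, density=True)
area = np.sum(heights * np.diff(edges))  # area under the bars
height_sum = heights.sum()               # sum of the bar heights

# Unit-width bins: each height is the proportion of values in that bin.
heights_unit, edges_unit = np.histogram(l, bins=range(1, 7), density=True)
area_unit = np.sum(heights_unit * np.diff(edges_unit))
height_sum_unit = heights_unit.sum()

print(area, height_sum)            # area is 1, heights sum to about 2.5
print(area_unit, height_sum_unit)  # both are 1 when the bin width is 1
```

The area is one in both cases; the heights only sum to one when every bin has width 1.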