Isura Nirmal - 4 years ago
Python Question

Plot a histogram with normal curve and name the bins in seaborn

[Image: target plot, a histogram with named bins and an overlaid normal curve]

Hi all, I am trying to draw the following type of plot using seaborn with a different data set. The problem is that when I use a histogram, I cannot name the bins (like 2-2.5, 2.5-3, etc.), even though it provides kernel density curves. Bar plots don't have a function to draw a normal curve like the one in the picture. The image seems to have been made with the SPSS statistical package, which I have little knowledge of.

The following is the closest I can get (code attached):

import pandas as pd
import seaborn as sns

df = pd.DataFrame({'cat': ['1-1.5', '1.5-2', '2-2.5', '2.5-3', '3-3.5', '3.5-4', '4-4.5', '4.5-5'], 'val': [0, 0, 1, 7, 7, 33, 17, 10]})
ax = sns.barplot(y='val', x='cat', data=df)
ax.set(xlabel='Categories', ylabel='Frequency')

[Image: the resulting bar plot]

Answer Source

The problem is that you don't have the original data, only data that has already been binned. One could reverse this binning to recover an array of approximate raw data, then histogram it again with sns.distplot, which by default shows a KDE curve as well. (Note that sns.distplot is deprecated as of seaborn 0.11; sns.histplot with kde=True is its replacement there.)

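import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

cat = ['1-1.5', '1.5-2', '2-2.5', '2.5-3', '3-3.5', '3.5-4', '4-4.5', '4.5-5']
val = [0, 0, 1, 7, 7, 33, 17, 10]

# Reverse the binning: create val[i] data points inside bin i.
# Assumption: the points are spread uniformly between the bin edges.
data = []
for i in range(len(cat)):
    lo, hi = (float(s) for s in cat[i].split("-"))
    data.extend(np.random.uniform(lo, hi, val[i]))

# Bin edges 1, 1.5, ..., 5 matching the original categories
bins = np.arange(1, 5.5, 0.5)

ax = sns.distplot(data, bins=bins, hist_kws=dict(edgecolor="k"))
ax.set(xlabel='Categories', ylabel='Frequency')
plt.show()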

[Image: the resulting histogram with a KDE curve]
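The question asks for a normal curve and named bins, whereas distplot draws a KDE and numeric ticks. Below is a minimal sketch of one way to get both directly from the binned counts, without regenerating raw data. It assumes equal-width bins and treats every count as sitting at its bin midpoint, so the fitted mean and standard deviation are only approximations. The helper names (bin_edges, binned_mean_std, normal_counts, plot_binned_with_normal) are made up for illustration, and the drawing uses plain matplotlib rather than a seaborn function.

```python
import math

cat = ['1-1.5', '1.5-2', '2-2.5', '2.5-3', '3-3.5', '3.5-4', '4-4.5', '4.5-5']
val = [0, 0, 1, 7, 7, 33, 17, 10]


def bin_edges(label):
    """Parse a label such as '2-2.5' into the pair (2.0, 2.5)."""
    lo, hi = label.split("-")
    return float(lo), float(hi)


def binned_mean_std(labels, counts):
    """Estimate mean and sample std, placing every count at its bin midpoint."""
    mids = [sum(bin_edges(lab)) / 2 for lab in labels]
    n = sum(counts)
    mean = sum(m * c for m, c in zip(mids, counts)) / n
    var = sum(c * (m - mean) ** 2 for m, c in zip(mids, counts)) / (n - 1)
    return mean, math.sqrt(var)


def normal_counts(x, mean, std, n, width):
    """Normal density at x, scaled by n * width so it is comparable to bar heights."""
    z = (x - mean) / std
    return n * width * math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def plot_binned_with_normal(labels, counts):
    """Draw bars with named bins and overlay the fitted normal curve."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    edges = [bin_edges(lab) for lab in labels]
    mids = [(lo + hi) / 2 for lo, hi in edges]
    width = edges[0][1] - edges[0][0]  # assumption: all bins have this width
    mean, std = binned_mean_std(labels, counts)

    fig, ax = plt.subplots()
    ax.bar(mids, counts, width=width, edgecolor="k")
    ax.set_xticks(mids)
    ax.set_xticklabels(labels)  # name each bin by its range

    # Evaluate the scaled normal curve on a fine grid across all bins
    start, stop = edges[0][0], edges[-1][1]
    xs = [start + i * (stop - start) / 200 for i in range(201)]
    ys = [normal_counts(x, mean, std, sum(counts), width) for x in xs]
    ax.plot(xs, ys, color="k")
    ax.set(xlabel="Categories", ylabel="Frequency")
    return ax
```

Calling plot_binned_with_normal(cat, val) and then matplotlib.pyplot.show() should draw the bars with their range labels and the curve on the same frequency scale. Because the curve is scaled by the number of observations times the bin width, it stays comparable to the bar heights instead of being a density.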

Use the bw keyword argument of the KDE function to set the smoothness of the curve. For example, sns.distplot(data, bins=bins, kde_kws=dict(bw=0.5), hist_kws=dict(edgecolor="k")) with bw=0.5 produces:

[Image: the same plot with bw=0.5]

Also try bw=0.1, bw=0.25, bw=0.35 and bw=2 to see the differences.
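To see why a larger bandwidth gives a smoother curve, a Gaussian KDE can be written out by hand. The sketch below shows the general technique only and is not seaborn's implementation: kde_at is a hypothetical helper, and bw here is the standard deviation of each Gaussian bump, which may not match how a given seaborn version interprets its bw argument (newer releases replace bw with bw_method and bw_adjust).

```python
import math


def kde_at(x, samples, bw):
    """Average of Gaussian bumps with standard deviation bw, one per sample."""
    norm = len(samples) * bw * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bw) ** 2) for s in samples) / norm


# Two well-separated points make the effect of the bandwidth easy to see.
samples = [2.0, 4.0]
for bw in (0.1, 0.5, 2.0):
    at_sample = kde_at(2.0, samples, bw)
    between = kde_at(3.0, samples, bw)
    print(f"bw={bw}: density at 2.0 = {at_sample:.3f}, at 3.0 = {between:.3f}")
```

With a small bandwidth the estimate is a pair of narrow spikes with almost no density between the points; with a large one the bumps merge into a single broad curve.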
