user6433228 user6433228 - 3 months ago 21
Python Question

Plot a Data Set According to Counts of Categories of a Variable in Python

I have a dataset with 14 columns, of which I am only supposed to use 4 (travelling class, gender, age, and fare price), and I have split it into train and test data sets. I need to create a vertical bar chart from the train data set showing the distribution of passengers by travelling class (the classes are 1, 2, and 3). I am not allowed to use NumPy, pandas, SciPy, or scikit-learn.

I am very new to Python, and I know how to plot very simple graphs, but when it comes to more complicated graphs, I get a bit lost.

This is my code (I know there is a lot wrong):

travelling_class = defaultdict(list)
for row in data:
    travelling_class[row[0]]

travelling_class = {key: len(val) for key, val in travelling_class.items()}

keys = travelling_class()
vals = [travelling_class[key] for key in keys]
ind = range(min(travelling_class.keys()), max(travelling_class.keys()) + 1)
width = 0.6

plt.xticks([i + width/2 for i in ind], ind, ha='center')
plt.xlabel('Travelling Class')
plt.ylabel('Counts of Passengers')
plt.title('Number of Passengers per Travelling Class')
plt.ylim(0, 1000)
plt.bar(keys, vals, width)
plt.show()


Any help would be greatly appreciated. Thank you in advance.

Answer

Use plt.hist, which plots a histogram of the values you pass it (see the matplotlib.pyplot.hist documentation for more info).

Example:

import matplotlib.pyplot as plt

classes = [1, 2, 1, 1, 3, 3]

plt.hist(classes)
plt.show()

And this is the result:

[Image: Histogram]
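
Since the question asks for a vertical bar chart with one bar per class, another option is to count the classes yourself and pass the counts to plt.bar. Here is a rough sketch of that idea, not a definitive solution. It uses collections.Counter from the standard library, so it should stay within the "no NumPy/pandas" restriction. The function names (count_classes, plot_class_counts), the class_index parameter, and the sample rows are all made up for illustration, and it assumes the travelling class is the first item in each row, as in the question's code.

```python
from collections import Counter


def count_classes(rows, class_index=0):
    """Count how many rows fall into each travelling class.

    Assumes each row is a sequence whose class_index-th item is the class
    (1, 2, or 3), possibly still a string if it came straight from a file.
    """
    return Counter(int(row[class_index]) for row in rows)


def plot_class_counts(counts):
    """Draw a vertical bar chart with one bar per travelling class."""
    # Imported here so the counting helper can be used without matplotlib.
    import matplotlib.pyplot as plt

    classes = sorted(counts)                 # x positions, e.g. [1, 2, 3]
    totals = [counts[c] for c in classes]    # bar heights, in the same order

    plt.bar(classes, totals, width=0.6, align='center')
    plt.xticks(classes)                      # one tick per class
    plt.xlabel('Travelling Class')
    plt.ylabel('Count of Passengers')
    plt.title('Number of Passengers per Travelling Class')
    plt.show()


# Made-up sample rows: (class, gender, age, fare). Not the real train set.
sample_rows = [(1, 'f', 29, 211.3), (3, 'm', 22, 7.25), (3, 'f', 26, 7.9),
               (2, 'm', 40, 13.0), (1, 'm', 54, 51.9), (3, 'm', 35, 8.05)]
sample_counts = count_classes(sample_rows)
```

Calling plot_class_counts(sample_counts) would then draw the chart; with the real data you would pass the train rows to count_classes instead of sample_rows. Counting first also sidesteps the binning that plt.hist applies by default, which can put the bars for discrete values like 1, 2, and 3 in slightly odd positions.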