user6433228 - 1 month ago 12
Python Question

# Plot a Data Set According to Counts of Categories of a Variable in Python

I have a dataset with 14 columns, of which I may use only 4 (travelling class, gender, age, and fare price), and I have split it into train and test data sets. I need to create a vertical bar chart from the train data set showing the distribution of passengers by travelling class (the classes are 1, 2, and 3). I am not allowed to use NumPy, Pandas, SciPy, or scikit-learn.

I am very new to Python, and I know how to plot very simple graphs, but when it comes to more complicated graphs, I get a bit lost.

This is my code (I know there is a lot wrong):

```
travelling_class = defaultdict(list)
for row in data:
    travelling_class[row[0]]

travelling_class = {key: len(val) for key, val in travelling_class.items()}

keys = travelling_class()
vals = [travelling_class[key] for key in keys]
ind  = range(min(travelling_class.keys()), max(travelling_class.keys()) + 1)
width = 0.6

plt.xticks([i + width/2 for i in ind], ind, ha='center')
plt.xlabel('Travelling Class')
plt.ylabel('Counts of Passengers')
plt.title('Number of Passengers per Travelling Class')
plt.ylim(0, 1000)
plt.bar(keys, vals, width)
plt.show()
```

Any help would be greatly appreciated. Thank you in advance.

Use `plt.hist`, which plots a histogram of the values you pass to it (more info in the Matplotlib documentation).

Example:

```
import matplotlib.pyplot as plt

classes = [1, 2, 1, 1, 3, 3]

plt.hist(classes)
plt.show()
```

The result is a histogram with narrow bars near 1, 2, and 3, of heights 3, 1, and 2 respectively.
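
If one bar per class is preferred over a histogram, the counts can also be computed with `collections.Counter` from the standard library and passed to `plt.bar`. The following is only a minimal sketch, assuming the travelling class is the first item of each row (as `row[0]` in the question suggests). The names `count_classes`, `plot_class_counts`, and `sample_rows` are hypothetical, and the sample rows are made-up values for illustration, not the real data set.

```python
from collections import Counter


def count_classes(rows, class_index=0):
    """Count how many rows fall into each travelling class.

    Assumes the class is stored at position `class_index` of every row.
    """
    return Counter(row[class_index] for row in rows)


def plot_class_counts(counts):
    """Draw a vertical bar chart with one bar per travelling class."""
    # Imported here so the counting step also works without Matplotlib.
    import matplotlib.pyplot as plt

    classes = sorted(counts)                 # class labels in ascending order
    heights = [counts[c] for c in classes]   # number of passengers per class

    # align='center' puts each bar directly over its class label.
    plt.bar(classes, heights, width=0.6, align='center')
    plt.xticks(classes)
    plt.xlabel('Travelling Class')
    plt.ylabel('Count of Passengers')
    plt.title('Number of Passengers per Travelling Class')
    plt.show()


# Made-up rows in the assumed order: (travelling class, gender, age, fare price)
sample_rows = [
    (1, 'female', 29, 80.0),
    (3, 'male', 22, 7.5),
    (3, 'female', 35, 8.0),
    (2, 'male', 40, 13.0),
    (1, 'male', 54, 52.0),
    (3, 'male', 19, 7.0),
]

counts = count_classes(sample_rows)
print(sorted(counts.items()))
```

Calling `plot_class_counts(counts)` then opens the chart. With the real data, the train rows would be passed to `count_classes` in place of `sample_rows`.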