Cytrix - 11 months ago 71

Python Question

How do I make a stacked bar chart from a Pandas groupby on a large data set?

The groupby has two levels. One is numeric (the desired x-axis) and one is categorical (I want these to be the different boxes in a stacked bar chart). I sum the values in each group, and that sum becomes my y-axis.

So I construct the following grouped DataFrame.

```
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame()
data['x_axis'] = [1,1,2,2]
data['category'] = ['a','b','a','b']
data['y_value'] = [10,15,20,30]
data = data.groupby(['x_axis','category']).sum()
data.reset_index(inplace=True)
data.plot.bar(x='x_axis', y='y_value', stacked=True)
plt.show()
```

This results in the following:

```
numeric_x_axis  category  sum_value
1               a         10
                b         15
2               a         20
                b         30
```

Therefore the desired chart would be a stacked bar chart with an x-axis of (1, 2) and stacked bars for a and b, with the summed value as the y-axis value.

However, the chart appears with repeated x_axis values instead of one stacked bar per value.
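A minimal sketch of why the x values repeat, using the same toy data. The name `long_form` is only illustrative. After `reset_index`, the frame still has one row per (x_axis, category) pair, and `DataFrame.plot.bar` draws one bar position per row, so with a single `y` column there is nothing to stack:

```python
import pandas as pd

data = pd.DataFrame({
    "x_axis": [1, 1, 2, 2],
    "category": ["a", "b", "a", "b"],
    "y_value": [10, 15, 20, 30],
})

# Sum per group, then flatten the index back into ordinary columns.
long_form = data.groupby(["x_axis", "category"]).sum().reset_index()

# One row per (x_axis, category) pair, so each x_axis value appears twice.
print(long_form["x_axis"].tolist())
print(len(long_form))
```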

Answer Source

Are you sure you want to use `groupby`? Based on your description, it seems as though you would be better served by `pivot`.

```
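import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame()
data['x_axis'] = [1,1,2,2]
data['category'] = ['a','b','a','b']
data['y_value'] = [10,15,20,30]
# Keyword arguments: recent pandas versions no longer accept these positionally.
pivoted_data = data.pivot(index='x_axis', columns='category')
# One bar per x_axis value, one stacked segment per category.
pivoted_data.plot(kind='bar', stacked=True)
plt.show()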
```

Note, the pivoted dataframe looks like

```
In [2]: pivoted_data
Out[2]:
         y_value
category       a   b
x_axis
1             10  15
2             20  30
```
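
If the real data set has several rows for the same (x_axis, category) pair, which the question's summing step suggests but the toy data does not show, `pivot` raises a `ValueError` about duplicate entries. A hedged sketch of two alternatives that aggregate first is below. The extra duplicate row and the helper name `plot_stacked` are assumptions made for illustration:

```python
import pandas as pd

# Toy data from the question plus one duplicated (x_axis, category) pair
# (an assumption, to mimic a larger data set where groups need summing).
data = pd.DataFrame({
    "x_axis": [1, 1, 2, 2, 2],
    "category": ["a", "b", "a", "b", "b"],
    "y_value": [10, 15, 20, 30, 5],
})

# Option 1: keep the groupby, then move 'category' from the index to columns.
wide = data.groupby(["x_axis", "category"])["y_value"].sum().unstack("category")

# Option 2: pivot_table does the aggregation and the reshape in one call.
wide_pt = data.pivot_table(index="x_axis", columns="category",
                           values="y_value", aggfunc="sum")

print(wide)


def plot_stacked(frame):
    """Draw one stacked bar per index value (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    ax = frame.plot(kind="bar", stacked=True)
    plt.show()
    return ax
```

Calling `plot_stacked(wide)` should then give one bar per x_axis value with a and b stacked. Either option produces the same wide table, so the choice is mostly a matter of taste.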