vkontori - 1 month ago
Python Question

pandas: generate and plot average

I have a pandas dataframe like:

In [61]: df = DataFrame(np.random.rand(3,4), index=['art','mcf','mesa'],
columns=['pol1','pol2','pol3','pol4'])

In [62]: df
Out[62]:
          pol1      pol2      pol3      pol4
art   0.661592  0.479202  0.700451  0.345085
mcf   0.235517  0.665981  0.778774  0.610344
mesa  0.838396  0.035648  0.424047  0.866920


and I want to generate a row with the average for the policies across benchmarks and then plot it.

Currently, the way I do this is:

df = df.T
df['average'] = df.apply(average, axis=1)
df = df.T
df.plot(kind='bar')


Is there an elegant way to avoid the double transposition?

I tried:

df.append(DataFrame(df.apply(average)).T)
df.plot(kind='bar')


This will append the correct values but does not update the index properly and the graph is messed up.

A clarification: the code with the double transposition already produces the plot I want (image not shown). It shows both the benchmarks and the average of the policies, not just the average. I was just curious whether I can do it better.

Note that the legend is usually messed up. To fix it:

ax = df.plot(kind='bar')
patches, labels = ax.get_legend_handles_labels()  # handles of the plotted bars
ax.legend(patches, list(df.columns), loc='best')

bmu
Answer

You can simply use the DataFrame's mean method and then plot the result. By default it averages each column over the rows, so there is no need for transposition.

In [14]: df.mean()
Out[14]: 
pol1    0.578502
pol2    0.393610
pol3    0.634424
pol4    0.607450

In [15]: df.mean().plot(kind='bar')
Out[15]: <matplotlib.axes.AxesSubplot at 0x4a327d0>

[Image: policies.png, bar chart of the per-policy means]
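
For reference, here is a self-contained sketch of the same idea. It assumes the values printed in the question instead of random data, so that the means are reproducible, and it assumes matplotlib is installed for the plotting call. The names means and ax are just illustrative.

```python
import pandas as pd

# Values copied from the question (instead of np.random.rand) so the
# result is reproducible.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# mean() averages over the rows by default (axis=0), giving one value
# per policy column, so no transposition is needed.
means = df.mean()

# Series.plot returns the matplotlib Axes holding one bar per policy.
ax = means.plot(kind="bar")
```

In a plain script (outside IPython with inline plotting) you would typically still need matplotlib.pyplot.show() or ax.figure.savefig(...) to see the figure.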

Update

If you want to plot the bars for every benchmark together with the mean, you can append the mean as an extra row:

In [95]: average = df.mean()

In [96]: average.name = 'average'

In [97]: df = df.append(average)

In [98]: df
Out[98]: 
             pol1      pol2      pol3      pol4
art      0.661592  0.479202  0.700451  0.345085
mcf      0.235517  0.665981  0.778774  0.610344
mesa     0.838396  0.035648  0.424047  0.866920
average  0.578502  0.393610  0.634424  0.607450

In [99]: df.plot(kind='bar')
Out[99]: <matplotlib.axes.AxesSubplot at 0x52f4390>

[Image: second plot, grouped bars for art, mcf, mesa and average]
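
A note for readers on newer pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the append call above fails there. A minimal sketch of two equivalent ways to add the average row follows. It assumes the values printed in the question, and the names with_avg and with_avg_concat are only illustrative.

```python
import pandas as pd

# Values copied from the question so the result is reproducible.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# Option 1: label-based assignment adds a row named 'average'.
# Work on a copy so the original frame stays untouched.
with_avg = df.copy()
with_avg.loc["average"] = df.mean()

# Option 2: turn the mean Series into a one-row frame and concatenate.
average = df.mean().rename("average")
with_avg_concat = pd.concat([df, average.to_frame().T])
```

Either frame can then be passed to .plot(kind='bar') exactly as above. The .loc form is shorter, while pd.concat leaves the original frame unmodified without needing a copy.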

If your layout doesn't fit into the figure, tight_layout will adjust the matplotlib subplot parameters so that it does.
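
A small sketch of how tight_layout could be applied to the bar plot. It assumes the values printed in the question and that matplotlib is installed. The axis label is a hypothetical addition, included only so there is something extra for the layout to accommodate.

```python
import pandas as pd

# Values copied from the question so the example is reproducible.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# One group of bars per benchmark, one bar per policy in each group.
ax = df.plot(kind="bar")

# Hypothetical label, not from the original post.
ax.set_xlabel("benchmark")

# Recompute the subplot padding so tick labels and axis labels fit
# inside the figure area.
ax.figure.tight_layout()
```

Calling matplotlib.pyplot.tight_layout() instead acts on the current figure, which is the same figure here.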