CF84 - 12 days ago 5
Python Question

Pandas: compute the mean of a column grouped by another column

Say I have a dataframe like this:

            gender     height      weight  C
2000-01-01    male  42.849980  157.500553  1
2000-01-02    male  49.607315  177.340407  1
2000-01-03    male  56.293531  171.524640  1
2000-01-04  female  48.421077  144.251986  2
2000-01-05    male  46.556882  152.526206  2
2000-01-06  female  68.448851  168.272968  1
2000-01-07    male  70.757698  136.431469  2
2000-01-08  female  58.909500  176.499753  3
2000-01-09  female  76.435631  174.094104  3
2000-01-10    male  45.306120  177.540920  2


How could I compute the mean of the height column, grouped by column C? This would yield 3 different values: the mean of those heights with C=1, that of those with C=2, and so forth.

So far I tried this but to no avail:

df['height'].mean(groupby='C')


-> returns TypeError: mean() got an unexpected keyword argument 'groupby'

Answer

Your syntax is incorrect: mean has no groupby argument. Instead, call groupby on the dataframe with the grouping column, select the column you want to average, and then call mean on it:

In [11]:
df.groupby('C')['height'].mean()

Out[11]:
C
1    54.299919
2    52.760444
3    67.672566
Name: height, dtype: float64
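
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the dataframe from the question by hand; the DatetimeIndex built with pd.date_range is an assumption based on how the dates are printed, and the names mean_height and per_row are only illustrative.

```python
import pandas as pd

# Rebuild the example data from the question.
# Assumption: the dates form the index of the dataframe.
df = pd.DataFrame(
    {
        "gender": ["male", "male", "male", "female", "male",
                   "female", "male", "female", "female", "male"],
        "height": [42.849980, 49.607315, 56.293531, 48.421077, 46.556882,
                   68.448851, 70.757698, 58.909500, 76.435631, 45.306120],
        "weight": [157.500553, 177.340407, 171.524640, 144.251986, 152.526206,
                   168.272968, 136.431469, 176.499753, 174.094104, 177.540920],
        "C": [1, 1, 1, 2, 2, 1, 2, 3, 3, 2],
    },
    index=pd.date_range("2000-01-01", periods=10),
)

# Group the rows by C, pick the height column, and average within each group.
# The result is a Series indexed by the distinct values of C.
mean_height = df.groupby("C")["height"].mean()
print(mean_height)

# If the group mean is needed on every original row instead of one value
# per group, transform returns a Series aligned with df's index.
per_row = df.groupby("C")["height"].transform("mean")
```

The first form gives one value per group, which is what the question asks for. The transform variant is only worth using if you want to compare each row against its own group's mean.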