user1017373 user1017373 - 3 months ago 9
Python Question

Calculating mean for sub-set of dataframe based on unique row names

I have a dataframe which looks like follows,

df.head()

Sym P1 P2 P3 P4 P5 B1 B2 B3 B4 B5
AA 7.86 8.86 9.86 10.86 11.86 0.7768 1.7768 2.7768 3.7768 4.7768
AA 7.86 8.86 9.86 10.86 11.86 0.8664 1.8664 2.8664 3.8664 4.8664
AA 7.86 8.86 9.86 10.86 11.86 0.874534 1.874534 2.874534 3.874534 4.874534
BB 5.8 6.8 7.8 8.8 9.8 7.42 8.42 9.42 10.42 11.42
BB 5.8 6.8 7.8 8.8 9.8 0.1434 1.1434 2.1434 3.1434 4.1434
CC 0.421 1.421 2.421 3.421 4.421 6.78 7.78 8.78 9.78 10.78
CC 0.421 1.421 2.421 3.421 4.421 8.43 9.43 10.43 11.43 12.43
VV 3.25 4.25 5.25 6.25 7.25 0.97 1.97 2.97 3.97 4.97
VV 3.25 4.25 5.25 6.25 7.25 0.2 1.2 2.2 3.2 4.2
VV 3.25 4.25 5.25 6.25 7.25 0.45 1.45 2.45 3.45 4.45
VV 3.25 4.25 5.25 6.25 7.25 0.78 1.78 2.78 3.78 4.78


And what I am aiming is to get the mean of the second half(Columns Starting with name B1..B5) of the data frame based on the unique values in column 'sym' and make a new dataframe which looks as follows.

Sym P1 P2 P3 P4 P5 B1 B2 B3 B4 B5
AA 7.86 8.86 9.86 10.86 11.86 0.8664 1.8664 2.8664 3.8664 4.8664
BB 5.8 6.8 7.8 8.8 9.8 3.7817 4.7817 5.7817 6.7817 7.7817
CC 0.421 1.421 2.421 3.421 4.421 7.605 8.605 9.605 10.605 11.605
VV 3.25 4.25 5.25 6.25 7.25 0.615 1.615 2.615 3.615 4.615


I tried to used groupby for that to get the unique sym .Would be great if someone could suggest a simple way to proceed
Thank you

Answer

Use filter and groupby

transformed = df.filter(like='B').groupby(df.Sym).transform(np.mean)
df.loc[:, df.columns.str.contains('B')] = transformed
df

enter image description here