SpeedEX505 - 5 months ago
Python Question

Python: Pandas dataframe sum

I am trying to learn how to work with pandas DataFrames.
My DataFrame has 4 columns: A, B, C, D.

For each combination of (A, B, C) there are multiple values of D.
I want to merge these rows and sum the values of D.

I have:

╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 5 ║
║ 1 ║ 1 ║ 2 ║ 3 ║ 3 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 3 ║
╚═══╩═══╩═══╩═══╩═══╝


I want to get:

╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 8 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 5 ║
╚═══╩═══╩═══╩═══╩═══╝


I tried to do it this way:

df=df.groupby(['A','B','C'])['D'].sum()


But it gives me a Series instead.

Answer

If you want to keep the grouping keys as columns after the groupby, you can call reset_index() on the result:

In [185]:
df.groupby(['A','B','C'])['D'].sum().reset_index()

Out[185]:
   A  B  C  D
0  1  2  3  8
1  1  2  4  7
2  1  5  4  2

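Alternatively, pass as_index=False to groupby, which keeps the grouping keys as regular columns and returns a DataFrame directly.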
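
A minimal sketch of the as_index=False variant follows. The sample data is reconstructed from the table in the question, and treating the unnamed first column as the row index is an assumption on my part:

```python
import pandas as pd

# Sample data rebuilt from the question's table. Using the unnamed first
# column as the row index is an assumption; it does not affect the grouping.
df = pd.DataFrame(
    {
        "A": [1, 1, 1, 1, 1, 1],
        "B": [2, 2, 5, 2, 2, 2],
        "C": [3, 3, 4, 4, 4, 4],
        "D": [5, 3, 2, 2, 2, 3],
    },
    index=[1, 1, 2, 2, 3, 3],
)

# as_index=False keeps A, B and C as ordinary columns, so the result
# is a DataFrame instead of a Series with a MultiIndex.
result = df.groupby(["A", "B", "C"], as_index=False)["D"].sum()
print(result)
```

This should give the same three rows as the reset_index version above. Which one to use is mostly a matter of taste: as_index=False states the intent up front, while reset_index is handy when you already have a grouped result.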
