Kurosaki Kurosaki - 1 month ago 8
Python Question

How to combine the rows in a data frame?

It's really annoying that I cannot find a way to combine several rows or columns by taking their means, standard deviations, or some other statistic. Could someone give me an idea? Thanks!

Answer

I think you can group by the index floor-divided by 10 (which gives one key per decade) and aggregate with mean or std:

import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame(np.random.randint(10, size=(5, 5)), index=[1971, 1972, 1981, 1982, 1991])
print(df)
      0  1  2  3  4
1971  5  8  9  5  0
1972  0  1  7  6  9
1981  2  4  5  2  4
1982  2  4  7  7  9
1991  1  7  0  6  9

print (df.index // 10)
Int64Index([197, 197, 198, 198, 199], dtype='int64')

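# group the rows by decade (year // 10) and average each column
df1 = df.groupby([df.index // 10]).mean()
# turn the keys 197, 198, 199 into the labels '1970s', '1980s', '1990s'
df1.index = df1.index.astype(str) + '0s'
print(df1)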
         0    1    2    3    4
1970s  2.5  4.5  8.0  5.5  4.5
1980s  2.0  4.0  6.0  4.5  6.5
1990s  1.0  7.0  0.0  6.0  9.0

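# std() is the sample standard deviation (ddof=1), so a group
# with a single row, like the 1990s here, gives NaN
df1 = df.groupby([df.index // 10]).std()
df1.index = df1.index.astype(str) + '0s'
print(df1)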
              0         1         2         3         4
1970s  3.535534  4.949747  1.414214  0.707107  6.363961
1980s  0.000000  0.000000  1.414214  3.535534  3.535534
1990s       NaN       NaN       NaN       NaN       NaN
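
If you want both statistics at once, here is a minimal sketch using the same sample frame. The names decade, stats and pop_std are only illustrative. It builds the decade labels up front, calls agg with a list of function names to get mean and std side by side, and shows that passing ddof=0 (population standard deviation) returns 0.0 rather than NaN for a decade with a single row. Whether ddof=0 is appropriate depends on your data, so treat that part as an assumption:

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer above.
np.random.seed(1)
df = pd.DataFrame(np.random.randint(10, size=(5, 5)),
                  index=[1971, 1972, 1981, 1982, 1991])

# Decade label for every row, e.g. 1971 -> '1970s'.
decade = (df.index // 10 * 10).astype(str) + 's'

# Mean and sample std in one pass; the result has two-level columns
# of the form (original column, statistic).
stats = df.groupby(decade).agg(['mean', 'std'])
print(stats)

# Population standard deviation (ddof=0): a single-row group
# gives 0.0 instead of NaN.
pop_std = df.groupby(decade).std(ddof=0)
print(pop_std)
```

A single column's statistic can then be picked out with a tuple, for example stats[(0, 'mean')].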