tao.hong - 3 months ago
Python Question

Pandas equivalent rbind operation

Basically, I am looping through a bunch of CSV files and in the end would like to append each dataframe into one. Actually, all I need is an rbind type function. So I did some searching and followed the guide. However, I still could not get the ideal solution.

Sample code is attached below. For instance, the shape of data1 is always 47 by 42. But the shape of data_out_final becomes (47, 42), (47, 84), and (47, 126) after the first three files. Ideally, it should be (141, 42). In addition, I checked the index of data1, which is RangeIndex(start=0, stop=47, step=1). I would appreciate any suggestions!

My pandas version is 0.18.1.


code



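import pandas as pd

appended_data = []
for csv_each in csv_pool:
    data1 = pd.read_csv(csv_each, header=0)
    # do something here (this step produces data2 from data1)
    appended_data.append(data2)
# axis=1 places the frames side by side, which is why the column count grows
data_out_final = pd.concat(appended_data, axis=1)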


If using data_out_final = pd.concat(appended_data, axis=0), the shape of data_out_final becomes (141, 94).
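
To make the shapes above concrete, here is a minimal sketch using small made-up frames (3 rows by 2 columns instead of 47 by 42). The names frames, wide, and tall are illustrative only and do not come from the question. With identical column names, axis=1 puts the frames side by side, while axis=0 stacks them the way rbind does in R.

```python
import numpy as np
import pandas as pd

# Three small frames with identical column names (stand-ins for the CSV data).
frames = [pd.DataFrame(np.zeros((3, 2)), columns=["a", "b"]) for _ in range(3)]

# axis=1 aligns on the row index and adds columns: the row count stays at 3.
wide = pd.concat(frames, axis=1)
print(wide.shape)  # (3, 6)

# axis=0 (the default) stacks rows, like rbind: the column count stays at 2.
tall = pd.concat(frames, axis=0, ignore_index=True)
print(tall.shape)  # (9, 2)
```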

PS: I kind of figured it out. You have to standardize the column names before calling pd.concat.
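
A minimal sketch of that fix, assuming every file holds the same fields in the same order and only the header spelling differs. The frames df_a and df_b and the standard_columns list are made up for illustration. When column names differ, a row-wise concat keeps the union of all names and fills the gaps with NaN, which would explain a column count larger than 42.

```python
import pandas as pd

df_a = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})
df_b = pd.DataFrame({"ID": [3, 4], "Value": [30.0, 40.0]})  # same fields, different header spelling

# Mismatched names: the result carries the union of columns, padded with NaN.
mismatched = pd.concat([df_a, df_b], axis=0, ignore_index=True)
print(mismatched.shape)  # (4, 4)

# Assumption: columns appear in the same order in every frame,
# so they can be renamed by position before stacking.
standard_columns = ["id", "value"]
aligned = []
for df in (df_a, df_b):
    df = df.copy()                 # leave the original frames untouched
    df.columns = standard_columns  # positional rename
    aligned.append(df)

stacked = pd.concat(aligned, axis=0, ignore_index=True)
print(stacked.shape)  # (4, 2)
```

If the column order can differ between files, a positional rename like this would silently mix fields, so an explicit mapping passed to DataFrame.rename would be the safer choice.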

Answer
>>> df1
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398

>>> df2
          a         b
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

>>> pd.concat([df1, df2])
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

Unless I'm misinterpreting the question, pd.concat with its default row-wise axis is what you need.
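
Note that the concatenated result above keeps each frame's original index, so the labels 0 to 9 appear twice. If a fresh consecutive index is wanted, ignore_index=True should handle that. A small sketch with random data (the values are not the ones printed above):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((10, 2)), columns=["a", "b"])
df2 = pd.DataFrame(rng.standard_normal((10, 2)), columns=["a", "b"])

# Default behaviour: the original row labels are kept, so they repeat.
kept = pd.concat([df1, df2])
print(kept.index.is_unique)  # False

# ignore_index=True discards the old labels and numbers the rows 0..19.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered.index.tolist() == list(range(20)))  # True
```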
