bluprince13 bluprince13 - 1 year ago 76
Python Question

Merge two similar dataframes that have the same columns

I'd like to merge df_1 and df_2 to create df_merged, but I want to merge columns that both have in common, rather than have the likes of A_x and A_y created.

index = [np.array(['foo', 'foo', 'qux', 'qux']),
np.array(['one', 'two', 'one', 'two',])]
columns = ["A", "B"]
df_1 = pd.DataFrame(np.random.randn(4, 2), index=index, columns=columns)

index = [np.array(['bar', 'bar', 'baz', 'baz',]),
np.array(['one', 'two', 'one', 'two',])]
columns = ["A", "B"]
df_2 = pd.DataFrame(np.random.randn(4, 2), index=index, columns=columns)

df_merge = pd.merge(df_1, df_2, left_index=True, right_index=True, how='outer')

print df_1
print df_2
print df_merge


df_1

A B
foo one 2.082229 1.575985
two -0.805592 0.444195
qux one 0.368874 0.253556
two 1.017632 -0.471978


df_2

A B
bar one 0.134571 0.415209
two -1.288889 -0.144284
baz one -0.117345 -0.095292
two -0.256708 -0.682542


df_merge - current output

A_x B_x A_y B_y
bar one NaN NaN 0.134571 0.415209
two NaN NaN -1.288889 -0.144284
baz one NaN NaN -0.117345 -0.095292
two NaN NaN -0.256708 -0.682542
foo one 2.082229 1.575985 NaN NaN
two -0.805592 0.444195 NaN NaN
qux one 0.368874 0.253556 NaN NaN
two 1.017632 -0.471978 NaN NaN


df_merge - desired

A B
bar one 0.134571 0.415209
two -1.288889 -0.144284
baz one -0.117345 -0.095292
two -0.256708 -0.682542
foo one 2.082229 1.575985
two -0.805592 0.444195
qux one 0.368874 0.253556
two 1.017632 -0.471978

Answer Source

The simpliest is use concat, by default there is 'outer' join and concatenate pandas objects along a particular axis (here axis=0, default value):

print (pd.concat([df_1,df_2]))

                A         B
foo one -0.329887 -0.966898
    two  0.552272 -1.964264
qux one -0.629764 -0.765578
    two -0.148118  0.904920
bar one  0.305685 -1.269400
    two  1.256213 -0.686447
baz one -2.194461  0.529666
    two -1.487217 -0.520045

And then sort_index if necessary:

print (pd.concat([df_1,df_2]).sort_index())

                A         B
bar one -0.463547 -0.002780
    two -0.421346 -1.730840
baz one -0.086068  1.179000
    two  0.756876 -0.492985
foo one -0.223900 -0.302643
    two  0.460265  0.216632
qux one -0.296815  0.799978
    two -0.420700  1.147312
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download