BKS BKS - 4 months ago 9
Python Question

How to update a dataframe in Pandas Python

I have the following two dataframes in pandas:

DF1:
AuthorID1 AuthorID2 Co-Authored
A1 A2 0
A1 A3 0
A1 A4 0
A2 A3 0

DF2:
AuthorID1 AuthorID2 Co-Authored
A1 A2 5
A2 A3 6


I would like (without looping and comparing) to find the AuthorID1 and AuthorID2 pairs in DF2 that also exist in DF1 and update the Co-Authored values in DF1 accordingly. So the result for the above two tables would be the following:

Resulting Updated DF1:
AuthorID1 AuthorID2 Co-Authored
A1 A2 5
A1 A3 0
A1 A4 0
A2 A3 6


Is there a fast way to do this? I have 7 million rows in DF1, and looping and comparing would take forever.

Answer

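You can use update, but it aligns rows on the index rather than on column values, so set AuthorID1 and AuthorID2 as the index of both dataframes first:

df1 = df1.set_index(['AuthorID1', 'AuthorID2'])
df1.update(df2.set_index(['AuthorID1', 'AuthorID2']))
df1 = df1.reset_index()
print(df1)
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2          5.0
1        A1        A3          0.0
2        A1        A4          0.0
3        A2        A3          6.0

Calling df1.update(df2) directly would match on the default row labels 0 and 1 and overwrite the author IDs in those rows. Depending on the pandas version, Co-Authored may be upcast to float (as here) or stay integer.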

Sample:

import pandas as pd

df1 = pd.DataFrame({'AuthorID1': {0: 'A1', 1: 'A1', 2: 'A1', 3: 'A2'},
                    'AuthorID2': {0: 'A2', 1: 'A3', 2: 'A4', 3: 'A3'},
                    'Co-Authored': {0: 0, 1: 0, 2: 0, 3: 0},
                    'new': {0: 7, 1: 8, 2: 1, 3: 3}})

df2 = pd.DataFrame({'AuthorID1': {0: 'A1', 1: 'A2'},
                    'AuthorID2': {0: 'A2', 1: 'A3'},
                    'Co-Authored': {0: 5, 1: 6}})

print(df1)
  AuthorID1 AuthorID2  Co-Authored  new
0        A1        A2            0    7
1        A1        A3            0    8
2        A1        A4            0    1
3        A2        A3            0    3

print(df2)
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2            5
1        A2        A3            6

# update aligns on the index, so index both frames by the author pair first
df1 = df1.set_index(['AuthorID1', 'AuthorID2'])
df1.update(df2.set_index(['AuthorID1', 'AuthorID2']))
df1 = df1.reset_index()
# Co-Authored may print as 5.0 or 5 depending on the pandas version
print(df1)
  AuthorID1 AuthorID2  Co-Authored  new
0        A1        A2          5.0    7
1        A1        A3          0.0    8
2        A1        A4          0.0    1
3        A2        A3          6.0    3
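
For a frame as large as the 7 million rows mentioned in the question, a left merge on the two key columns is another way to match on the author pair without looping. The sketch below is an alternative, not part of the original answer: the helper name update_pairs and the temporary _new suffix are made up for illustration, and it assumes each (AuthorID1, AuthorID2) pair appears at most once in df2.

```python
import pandas as pd


def update_pairs(df1, df2, keys=('AuthorID1', 'AuthorID2'), col='Co-Authored'):
    """Return a copy of df1 with `col` replaced wherever the key pair exists in df2.

    Hypothetical helper; assumes the key pairs in df2 are unique.
    """
    keys = list(keys)
    new_col = col + '_new'
    # A left merge keeps every row of df1, in df1's order
    merged = df1.merge(df2[keys + [col]], on=keys, how='left',
                       suffixes=('', '_new'))
    # Use the df2 value where a match was found, otherwise keep df1's value
    merged[col] = merged[new_col].fillna(merged[col]).astype(df1[col].dtype)
    merged = merged.drop(columns=new_col)
    # Restore df1's original row labels
    merged.index = df1.index
    return merged


df1 = pd.DataFrame({'AuthorID1': ['A1', 'A1', 'A1', 'A2'],
                    'AuthorID2': ['A2', 'A3', 'A4', 'A3'],
                    'Co-Authored': [0, 0, 0, 0]})
df2 = pd.DataFrame({'AuthorID1': ['A1', 'A2'],
                    'AuthorID2': ['A2', 'A3'],
                    'Co-Authored': [5, 6]})

result = update_pairs(df1, df2)
print(result)
```

The merge works on whole columns at once, so there is no Python-level loop. Whether it beats the indexed update on a given dataset is worth measuring before committing to either.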