runningbirds runningbirds - 27 days ago 6
Python Question

Pandas join on columns with different names

I have two different data frames that I want to perform some sql operations on. Unfortunately, as is the case with the data I'm working with, the spelling is often different.

See the below as an example with what I thought the syntax would look like where userid belongs to df1 and username belongs to df2. Anyone help me out?

# not working - I assume some syntax issue?
pd.merge(df1, df2, on = [['userid'=='username', 'column1']], how = 'left')

Answer

When the names are different, use the xxx_on parameters instead of on=:

pd.merge(df1, df2, left_on=  ['userid', 'column1'],
                   right_on= ['username', 'column1'], 
                   how = 'left')