S Ringne S Ringne - 22 days ago 6
Python Question

Join two dataframes on multiple column python

i have two dataframes with names df1 and df2.

df1=

col1 col2 count
0 1 36 200
1 12 15 200
2 13 17 100


df2=

product_id product_name
0 1 abc
1 2 xyz
2 3 aaaa
3 12 qwert
4 13 sed
5 15 qase
6 36 asdf
7 17 zxcv


The entries in col1 and col2 are product_id from df2.

i want to make a new dataframe 'df3' which has the following columns and entries.

df3=

col1 | col1_name | col2 | col2_name | count
0 1 | abc | 36 | asdf | 200
1 12 | qwert | 15 | qase | 200
2 13 | sed | 17 | zxcv | 100


i.e add a col1_name and col2_name wherever product_id from df2 is equal to col1 & col2 values.

is it possible to do so by

df3 = pd.concat([df1, df2], axis=1)


my knowledge to pandas df and python is beginner level.
Is there a way to do so?
Thanks in advance.

Answer

I think you can use map by dict generated from df2 and then sort columns names by sort_index:

d = df2.set_index('product_id')['product_name'].to_dict()
print (d)
{1: 'abc', 2: 'xyz', 3: 'aaaa', 36: 'asdf', 17: 'zxcv', 12: 'qwert', 13: 'sed', 15: 'qase'}

df1['col1_name'] = df1.col1.map(d)
df1['col2_name'] = df1.col2.map(d)
df1 = df1.sort_index(axis=1)
print (df1)
   col1 col1_name  col2 col2_name  count
0     1       abc    36      asdf    200
1    12     qwert    15      qase    200
2    13       sed    17      zxcv    100

df1 = df1.drop(['col1','col2'], axis=1)
print (df1)
  col1_name col2_name  count
0       abc      asdf    200
1     qwert      qase    200
2       sed      zxcv    100