gabboshow gabboshow - 30 days ago 7
Python Question

merge 2 dataframes based on index

How can I merge 2 dataframe

df1
and
df2
in order to get
df3
that has the rows of
df1
and
df2
that have the same index (and the same values in the columns)?

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A0', 'A5', 'A6', 'A7'],
'B': ['B0', 'B5', 'B6', 'B7'],
'C': ['C0', 'C5', 'C6', 'C7'],
'D': ['D0', 'D5', 'D6', 'D7']},
index=[0, 5, 6, 7])

df3 = pd.DataFrame({'A': ['A0'],
'B': ['B0'],
'C': ['C0'],
'D': ['D0']},
index=[0])

Answer Source

Just merge:

In[111]:
df1.merge(df2)

Out[111]: 
    A   B   C   D
0  A0  B0  C0  D0

The default params for merge is to merge all columns, performing an inner merge so only where all values agree

Looking at the index matching requirement, I'd filter the df prior to the merge:

In[131]:
filtered = df1.loc[df2.index].dropna()
filtered

Out[131]: 
    A   B   C   D
1  A1  B1  C1  D1

and then merge

In[132]:
filtered.merge(df2)
Out[132]: 
    A   B   C   D
0  A0  B0  C0  D0

if the indices do not match at all, say the first row of df2 is 1 instead of 2:

In[133]:
filtered = df1.loc[df2.index].dropna()
filtered
Out[133]: 
    A   B   C   D
1  A1  B1  C1  D1

then merge will return an empty df because the index row value doesn't agree:

In[134]:
filtered.merge(df2)

Out[132]: 
Empty DataFrame
Columns: [A, B, C, D]
Index: []