NamAshena - 2 months ago
Python Question

How to divide two DataFrames with different lengths and duplicated indexes in Python

Here is my code, along with the output I expect. The division of the two DataFrames does not work. What is wrong here?

import pandas as pd
data1 = {'name':['A', 'C', 'D'], 'cond_a':['B','B','B'], 'value':[10,12,14]}
data2 = {'name':['A', 'C', 'D','D','A'], 'cond_a':['G','G','G','G','G'], 'value':[5,6,7,3,2]}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

df1.set_index('name', inplace=True)
df2.set_index('name', inplace=True)


df2['new_col'] = df2['value'] / df1['value']


Expected output:

     cond_a  value  new_col
name
A         G      5     5/10
C         G      6     6/12
D         G      7     7/14
D         G      3     3/14
A         G      2     2/10

Answer

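As long as df1 has a unique index, you can reindex df1['value'] to df2's index when performing the division. This repeats each df1 value once for every matching label in df2 (which may contain duplicates), so the two operands line up row by row: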

df2['new_col'] = df2['value'] / df1['value'].reindex(df2.index)

The resulting output:

     cond_a  value   new_col
name                        
A         G      5  0.500000
C         G      6  0.500000
D         G      7  0.500000
D         G      3  0.214286
A         G      2  0.200000
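
For completeness, here is a self-contained sketch that combines the question's setup with the reindex approach, so it can be run end to end. The name `divisor` and the uniqueness assertion are additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Same data as in the question, with 'name' as the index.
df1 = pd.DataFrame(
    {"name": ["A", "C", "D"], "cond_a": ["B", "B", "B"], "value": [10, 12, 14]}
).set_index("name")
df2 = pd.DataFrame(
    {"name": ["A", "C", "D", "D", "A"], "cond_a": ["G"] * 5, "value": [5, 6, 7, 3, 2]}
).set_index("name")

# reindex needs unique labels on the object being reindexed (df1 here);
# the target index (df2.index) is allowed to contain duplicates.
assert df1.index.is_unique

# Look up the df1 value for every label in df2.index, in df2's row order.
divisor = df1["value"].reindex(df2.index)

# Both Series now share the same index, so the division is row by row.
df2["new_col"] = df2["value"] / divisor
print(df2)
```

If df1 also had duplicated labels, the lookup would be ambiguous and this approach would not apply without first deciding which df1 row each label should use.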