lte__ - 2 years ago
Python Question

Pandas - KeyError on merge

I'm trying to merge 2 dataframes. I'm using the Jupyter notebook and pandas dataframes. My two dfs look like this:

gbdf.dtypes:
product_name object
Quantity float64
Product_id int64
product_group1 int64
product_group1_name object
product_group2 int64
product_group2_name object
packing_unit object
packing_amount int64
dtype: object


trns.dtypes:
Store_id int64
Date object
Price int64
Net price int64
Purchase price int64
Hour int64
product_id int64
Quantity int64
dtype: object


Yet, when I try to run

gbdfprice = gbdf.merge(gbdf, trns, left_on = 'Product_id', right_on = 'product_id')


I get

KeyError: 'product_id'


Any idea why?

Answer Source

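The argument list you used (a left and a right DataFrame) belongs to the top-level pandas function, pd.merge. What you actually called is the DataFrame method, which accepts only the right DataFrame as its first argument. In gbdf.merge(gbdf, trns, ...), gbdf is therefore taken as the right frame and trns is passed as the second positional parameter (how). pandas then looks for 'product_id' in gbdf, which only has 'Product_id', hence the KeyError.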

import pandas as pd

left = pd.DataFrame(...)
right = pd.DataFrame(...)

# Method you have used: called on the left frame, takes only the right frame
combined = left.merge(right, [options...])
# Function you have taken the argument list from: takes both frames
combined = pd.merge(left, right, [options...])

From what I can see in the source, left.merge(right, ...) just imports the top-level merge function and calls merge(self, right, ...).

So, as @ayhan points out, the fix is to remove gbdf from the argument list. Alternatively, replace the gbdf.merge call with pd.merge and leave the argument list unchanged.
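
To make the two fixes concrete, here is a minimal sketch. The sample rows are made up for illustration; only the column names (Product_id, product_name, product_id, Store_id, Price) come from the question, and the real frames have more columns.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
gbdf = pd.DataFrame({
    "Product_id": [1, 2, 3],
    "product_name": ["apple", "bread", "milk"],
})
trns = pd.DataFrame({
    "product_id": [1, 1, 3],
    "Store_id": [10, 11, 10],
    "Price": [5, 6, 9],
})

# Fix 1: DataFrame method. The calling frame is the left side,
# so only the right frame is passed.
by_method = gbdf.merge(trns, left_on="Product_id", right_on="product_id")

# Fix 2: top-level function. Both frames are passed explicitly.
by_function = pd.merge(gbdf, trns, left_on="Product_id", right_on="product_id")

# Both spellings should produce the same merged frame.
assert by_method.equals(by_function)
print(by_method)
```

Either form should work; which one to use is mostly a matter of taste. Because the key columns have different names, both Product_id and product_id appear in the result.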
