lte__ - 1 year ago
Python Question

Pandas - KeyError on merge

I'm trying to merge two pandas DataFrames in a Jupyter notebook. Their dtypes look like this:

product_name object
Quantity float64
Product_id int64
product_group1 int64
product_group1_name object
product_group2 int64
product_group2_name object
packing_unit object
packing_amount int64
dtype: object

Store_id int64
Date object
Price int64
Net price int64
Purchase price int64
Hour int64
product_id int64
Quantity int64
dtype: object

Yet, when I try to run

gbdfprice = gbdf.merge(gbdf, trns, left_on = 'Product_id', right_on = 'product_id')

I get

KeyError: 'product_id'

Any idea why?

Answer Source

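The argument list you used (both the left and the right DataFrame) belongs to the top-level function pd.merge, but what you actually called is the DataFrame method, which takes only the right frame because the left one is the object it is called on. In gbdf.merge(gbdf, trns, ...), gbdf is therefore treated as the right frame, so right_on = 'product_id' is looked up in gbdf, which only has 'Product_id'. Hence the KeyError.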

import pandas as pd

left = pd.DataFrame(...)
right = pd.DataFrame(...)

# Method you have used
combined = left.merge(right, [options...])
# Function you have taken the argument list from
combined = pd.merge(left, right, [options...])
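
For a version you can actually run, here is a minimal sketch of the two call forms side by side. The toy frames and the column names key, a and b are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical toy frames; the column names are assumptions for this sketch.
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})

# Method form: `left` is the object itself, so only `right` is passed.
via_method = left.merge(right, on="key")

# Function form: both frames are passed explicitly.
via_function = pd.merge(left, right, on="key")

# Both forms should produce the same inner join on "key".
print(via_method.equals(via_function))
```

The only difference between the two is where the left frame goes: in front of the dot, or as the first argument.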

From what I can see in the source, left.merge(right, ...) just imports the top-level merge function and runs merge(self, right, ...).

So, as @ayhan points out, the fix is to remove gbdf from the argument list. Alternatively, replace the gbdf.merge call with pd.merge and leave the argument list the same.
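
Here is a small sketch of both fixes using the column names from the question. The frames are tiny stand-ins with made-up values, and the try/except only reproduces the lookup failure in keyword form (gbdf merged with itself); it is not the exact call from the question.

```python
import pandas as pd

# Toy stand-ins for the question's frames (the values are made up).
gbdf = pd.DataFrame({"Product_id": [1, 2], "product_name": ["tea", "milk"]})
trns = pd.DataFrame({"product_id": [1, 1, 2], "Price": [5, 6, 3]})

# Why the error names 'product_id': when gbdf is also the right-hand frame,
# right_on is looked up in gbdf, which only has 'Product_id'.
try:
    gbdf.merge(gbdf, left_on="Product_id", right_on="product_id")
    error = None
except KeyError as exc:
    error = exc

# Fix 1: drop the extra gbdf from the method call.
fixed_method = gbdf.merge(trns, left_on="Product_id", right_on="product_id")

# Fix 2: keep both frames and call the top-level function instead.
fixed_function = pd.merge(gbdf, trns, left_on="Product_id", right_on="product_id")

print(type(error).__name__)
print(fixed_method.equals(fixed_function))
```

Either fix gives the same merged frame; which one to use is a matter of taste.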