Georg Heiler Georg Heiler - 11 months ago 149
Python Question

Pandas append multiple columns for a single one

How can I use pandas to append multiple KPI values per single customer efficiently?

A join of the

df with the
df makes some problems because the country is the index of the pivoted data frame and the nationality is not in the index.

countryKPI = pd.DataFrame({'country':['Austria','Germany', 'Germany', 'Austria'],
customers = pd.DataFrame({'customer':['first','second'],

See the desired result in pink:
enter image description here

Answer Source

You could counter the mismatch in the categories through merge:

df = pd.pivot_table(data=countryKPI, index=['country'], columns=['indicator']) = 'nationality'    
customers.merge(df['value'].reset_index(), on='nationality', how='outer')



countryKPI = pd.DataFrame({'country':['Austria','Germany', 'Germany', 'Austria'],
customers = pd.DataFrame({'customer':['first','second'],

The problem appears to be that you have got CategoricalIndex in your DF resulting from the pivot operation and when you perform reset_index on that complains you of that error.

Simply do reverse engineering as in check the dtypes of countryKPI and customers Dataframes and wherever there is category mentioned, convert those columns to their string representation via astype(str)

Reproducing the Error and Countering it:

Assume the DF to be the above mentioned:

countryKPI['indicator'] = countryKPI['indicator'].astype('category')
countryKPI['country'] = countryKPI['country'].astype('category')
customers['nationality'] = customers['nationality'].astype('category')

country      category
indicator    category
value           int64
dtype: object

customer         object
nationality    category
value             int64
dtype: object

After pivot operation:

df = pd.pivot_table(data=countryKPI, index=['country'], columns=['indicator'])
CategoricalIndex(['Austria', 'Germany'], categories=['Austria', 'Germany'], ordered=False, 
                  name='country', dtype='category')
# ^^ See the categorical index

When you perform reset_index on that:


TypeError: cannot insert an item into a CategoricalIndex that is not already an existing category

To counter that error, simply cast the categorical columns to str type.

countryKPI['indicator'] = countryKPI['indicator'].astype('str')
countryKPI['country'] = countryKPI['country'].astype('str')
customers['nationality'] = customers['nationality'].astype('str')

Now, the reset_index part works and even the merge too.