ysearka - 5 months ago
Python Question

Creating one dataframe from another (using pivot)

I'm having a problem with pandas. I have a dataframe with three columns: 'id1','id2','amount'.

From this, I would like to create another dataframe whose index is 'id1', whose columns are the 'id2' values, and whose cells contain the corresponding 'amount'.

Let's go for an example:

import pandas as pd

df = pd.DataFrame(
    [['first_person', 'first_item', 10],
     ['first_person', 'second_item', 6],
     ['second_person', 'first_item', 18],
     ['second_person', 'second_item', 36]],
    columns=['id1', 'id2', 'amount'])


which yields:

             id1          id2  amount
0   first_person   first_item      10
1   first_person  second_item       6
2  second_person   first_item      18
3  second_person  second_item      36


And from this I would like to create a second dataframe which is:

               first_item  second_item
first_person           10            6
second_person          18           36


Of course, I worked on this for a while before posting, but all I've managed to write is a double 'for' loop, which is far too slow for dataframes of my size. Do you know a more pythonic way to do this (one that would obviously be far more efficient than 'for' loops)?

Answer

I think you can use pivot with rename_axis (new in pandas 0.18.0):

print(df)
             id1          id2  amount
0   first_person   first_item      10
1   first_person  second_item       6
2  second_person   first_item      18
3  second_person  second_item      36

# The outer parentheses let the method chain span several lines.
print(df.pivot(index='id1', columns='id2', values='amount')
        .rename_axis(None)
        .rename_axis(None, axis=1))

               first_item  second_item
first_person           10            6
second_person          18           36
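
One caveat: pivot raises a ValueError if the same ('id1', 'id2') pair appears more than once. If your real data can contain such repeats, pivot_table may be a better fit, since it aggregates them instead. Here is a minimal sketch. The extra fifth row and the choice of aggfunc='sum' are my assumptions for illustration, not something from your data:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['first_person', 'first_item', 10],
        ['first_person', 'second_item', 6],
        ['second_person', 'first_item', 18],
        ['second_person', 'second_item', 36],
        # Hypothetical extra row: repeats an existing (id1, id2) pair.
        ['second_person', 'second_item', 4],
    ],
    columns=['id1', 'id2', 'amount'],
)

# pivot_table combines repeated (id1, id2) pairs with aggfunc;
# summing is an assumption here, pick whatever suits your data.
wide = (
    df.pivot_table(index='id1', columns='id2', values='amount', aggfunc='sum')
      .rename_axis(None)           # drop the 'id1' label on the index
      .rename_axis(None, axis=1)   # drop the 'id2' label on the columns
)
print(wide)
```

With unique pairs this should give the same table as pivot. With the repeated pair above, the two amounts for second_person / second_item are added together.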