Denziloe Denziloe -4 years ago 120
Python Question

Python pandas pivot output differs from documentation -- resulting DataFrame has column name in top left

Here's an excerpt from the pandas pivot docs:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html

>>> df = pd.DataFrame({'foo': ['one','one','one','two','two','two'],
                       'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                       'baz': [1, 2, 3, 4, 5, 6]})
>>> df
   foo bar  baz
0  one   A    1
1  one   B    2
2  one   C    3
3  two   A    4
4  two   B    5
5  two   C    6
>>> df.pivot(index='foo', columns='bar', values='baz')
     A  B  C
one  1  2  3
two  4  5  6


When I run the exact code above (pandas 0.19.2), I instead get the following output:

bar  A  B  C
foo
one  1  2  3
two  4  5  6


My questions are:


  • Do other people get this behaviour?

  • Why does the behaviour differ from the documentation?

  • What actually is the nature of this resulting DataFrame? I am quite new to pandas, so this is probably a stupid question, but I don't think I've seen a name (bar) over the index before, and I can't work out what it is.



Thanks.

Answer Source

I think this is because the docs were generated with an older version of pandas. In recent versions, pivot names the index after the column you pass as index, in this case 'foo'.

In [111]:
pv = df.pivot(index='foo', columns='bar', values='baz')
pv.index

Out[111]:
Index(['one', 'two'], dtype='object', name='foo')

You can see that the index now has its 'name' attribute set to 'foo'.
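The same check works for the 'bar' label in the top left, which is the name of the columns index rather than of the row index. The snippet below is a minimal sketch, assuming a reasonably recent pandas: it rebuilds the DataFrame from the question, reads both names, and clears them on a copy. The variable names (pv, plain, index_name, columns_name) are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6]})

pv = df.pivot(index='foo', columns='bar', values='baz')

# The two labels in the printed output are just the names of the two axes.
index_name = pv.index.name      # name of the row index
columns_name = pv.columns.name  # name of the columns index

# Clearing both names on a copy leaves pv itself unchanged and should
# print without the extra 'bar' / 'foo' labels.
plain = pv.copy()
plain.index.name = None
plain.columns.name = None

print(index_name, columns_name)
print(plain)
```

The names are only labels on the axes; the values in the table are the same either way.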
