Ray Ray - 1 month ago 7
Python Question

Python Pandas convert GroupBy object to DataFrame

Question



There are two questions that look similar but are not the same as this one: here and here. Both call a method of GroupBy, such as count() or aggregate(), which I know returns a DataFrame. What I'm asking is how to convert the GroupBy object itself (class pandas.core.groupby.DataFrameGroupBy) into a DataFrame. I'll illustrate below.

Example



Construct an example DataFrame as follows.

import numpy
import pandas

data_list = []
for name in ["sasha", "asa"]:
    for take in ["one", "two"]:
        row = {"name": name, "take": take, "score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}
        data_list.append(row)
data = pandas.DataFrame(data_list)


The above DataFrame should look like the following (with different numbers, obviously).

    name  ping     score take
0  sasha    72  0.923263  one
1  sasha    14  0.724720  two
2    asa    76  0.774320  one
3    asa    71  0.128721  two


What I want to do is group by the columns "name" and "take" (in that order), so that I get a DataFrame indexed by the MultiIndex built from the columns "name" and "take", like below.

               score  ping
name  take
sasha one   0.923263    72
      two   0.724720    14
asa   one   0.774320    76
      two   0.128721    71


How do I achieve that? If I do grouped = data.groupby(["name", "take"]), then grouped is a pandas.core.groupby.DataFrameGroupBy instance. How do I convert grouped to a DataFrame instance?
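
For reference, the situation described above can be checked with a short sketch. The fixed seed and the names rng, rows, is_groupby, is_frame and group_keys are assumptions added for illustration; they are not part of the original example.

```python
import numpy as np
import pandas as pd
from pandas.core.groupby import DataFrameGroupBy

# Fixed seed so the run is repeatable (an assumption, not in the question).
rng = np.random.default_rng(0)
rows = [
    {"name": name, "take": take, "score": rng.random(), "ping": int(rng.integers(10, 100))}
    for name in ["sasha", "asa"]
    for take in ["one", "two"]
]
data = pd.DataFrame(rows)

grouped = data.groupby(["name", "take"])

# The grouped object describes how rows are split; it is not a DataFrame.
is_groupby = isinstance(grouped, DataFrameGroupBy)
is_frame = isinstance(grouped, pd.DataFrame)
print(is_groupby, is_frame)

# Iterating yields (key, sub-DataFrame) pairs, one pair per group.
group_keys = [key for key, _ in grouped]
print(group_keys)
```

Each group here holds a single row, so the grouping itself adds no information beyond the two key columns.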

Answer

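A GroupBy object only records how the rows are split into groups, so there is no table to convert until you apply a method to it. To get a DataFrame indexed by "name" and "take", you need set_index: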

data = data.set_index(['name','take'])
print(data)
            ping     score
name  take                
sasha one     46  0.509177
      two     77  0.828984
asa   one     51  0.637451
      two     51  0.658616
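
As a self-contained sketch of the above, the following rebuilds the example data, applies set_index, and selects the columns in the order shown in the question. It also shows one way to reach the same frame through the GroupBy object, assuming each (name, take) pair appears exactly once so that taking the first row of each group loses nothing. The fixed seed and the names rng, rows, indexed and via_groupby are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Fixed seed so the run is repeatable (an assumption, not in the question).
rng = np.random.default_rng(0)
rows = [
    {"name": name, "take": take, "score": rng.random(), "ping": int(rng.integers(10, 100))}
    for name in ["sasha", "asa"]
    for take in ["one", "two"]
]
data = pd.DataFrame(rows)

# Move "name" and "take" into a MultiIndex, then order the remaining columns.
indexed = data.set_index(["name", "take"])[["score", "ping"]]
print(indexed)

# Alternative via groupby: valid only while every (name, take) pair is unique.
# sort=False keeps the groups in their order of first appearance.
via_groupby = data.groupby(["name", "take"], sort=False)[["score", "ping"]].first()
print(via_groupby)
```

If a (name, take) pair can repeat, set_index keeps every row, while the groupby route needs an aggregation that says how to combine them.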