user3059024 - 23 days ago
Python Question

Converting a nested python dictionary into a multi-indexed pandas dataframe

How do I convert a nested dictionary into a pandas multi-indexed dataframe?

Here is an example:

dct={'outer':{}}
for i in dct:
    dct[i]={'middle':{}}
    for j in dct[i]:
        dct[i][j]={}
    for j in dct[i]:
        dct[i][j]['inner']=10

print(dct)


which outputs:

{'outer': {'middle': {'inner': 10}}}


I want this in a pandas dataframe which looks something like this:

outer   middle   inner    value
                 inner2   value
        middle2  inner    value
outer2  middle   inner    value
                 inner2   value
        middle2  inner    value


I'm aware that multi-indexing is a good way to do this but I'm not sure how to make the data frame. Can anybody give me some pointers?

Answer

I think you can build one DataFrame per outer key with DataFrame.from_dict inside a dict comprehension, join them with concat, and finally call stack. Note that the output is then a Series with a MultiIndex, not a DataFrame:

import pandas as pd

dct={'outer':{}, 'outer2':{}}
for i in dct:
    dct[i]={'middle':{}, 'middle2':{}}
    for j in dct[i]:
        dct[i][j]={}
    for j in dct[i]:
        dct[i][j]['inner']=10
        dct[i][j]['inner2']=20

print (dct)
{'outer2': {'middle2': {'inner': 10, 'inner2': 20}, 
'middle': {'inner': 10, 'inner2': 20}}, 
'outer': {'middle2': {'inner': 10, 'inner2': 20}, 
'middle': {'inner': 10, 'inner2': 20}}}
print (pd.concat({key:pd.DataFrame.from_dict(dct[key],orient='index') 
                  for key in dct.keys()}))
                inner  inner2
outer  middle      10      20
       middle2     10      20
outer2 middle      10      20
       middle2     10      20

df = pd.concat({key:pd.DataFrame.from_dict(dct[key], orient='index') 
                for key in dct.keys()}).stack()
print (df)
outer   middle   inner     10
                 inner2    20
        middle2  inner     10
                 inner2    20
outer2  middle   inner     10
                 inner2    20
        middle2  inner     10
                 inner2    20
dtype: int64
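
Since the question asks for a DataFrame rather than a Series, one possible follow-up is to name the index levels and convert the stacked result with to_frame. This is only a sketch: the level names ('level_outer', 'level_middle', 'level_inner') and the column name 'value' are illustrative choices, not something from the question, and the dictionary is written as a literal here so the snippet runs on its own.

```python
import pandas as pd

# Same nested dictionary as in the answer, written as a literal.
dct = {
    'outer':  {'middle':  {'inner': 10, 'inner2': 20},
               'middle2': {'inner': 10, 'inner2': 20}},
    'outer2': {'middle':  {'inner': 10, 'inner2': 20},
               'middle2': {'inner': 10, 'inner2': 20}},
}

# One DataFrame per outer key; concat uses the dict keys as the outermost
# index level, and stack moves the columns into a third index level.
s = pd.concat({key: pd.DataFrame.from_dict(value, orient='index')
               for key, value in dct.items()}).stack()

# Hypothetical level names and column name, chosen only for readability.
result = (s.rename_axis(['level_outer', 'level_middle', 'level_inner'])
           .to_frame('value'))
print(result)
```

Individual values can then be looked up with a tuple of keys, for example result.loc[('outer', 'middle', 'inner2'), 'value'].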