George Liu George Liu - 16 days ago 7
Python Question

How to convert a nested dictionary to pandas dataframe?

I have a dictionary "my_dict" in this format:

{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}},
'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}}
}


Currently, I am using
pd.DataFrame.from_dict(my_dict, orient='index')
and get a df like this:

c2 c1
l1 {u'a': 3, u'c': 5, u'b': 4} {u'a': 0, u'c': 2, u'b': 1}
l2 {u'a': 3, u'c': 5, u'b': 4} {u'a': 0, u'c': 2, u'b': 1}


However, what I want is both l1/l2 and c2/c3 as indexes and a/b/c as columns.

Something like this:

a b c
l1 c1 0 1 2
c2 3 4 5
l2 c1 0 1 2
c2 3 4 5


What's the best way to do this?

Answer

Consider a dictionary comprehension to build a dictionary with tuple keys. Then, use pandas' MultiIndex.from_tuples. Below ast is used to rebuild you original dictionary from string (ignore the step on your end).

import pandas as pd
import ast

origDict = ast.literal_eval("""
{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}},
 'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}}
}""")

# DICTIONARY COMPREHENSION
newdict = {(k1, k2):v2 for k1,v1 in origDict.items() \
                       for k2,v2 in origDict[k1].items()}
print(newdict)
# {('l1', 'c2'): {'c': 5, 'a': 3, 'b': 4},
#  ('l2', 'c1'): {'c': 2, 'a': 0, 'b': 1},
#  ('l1', 'c1'): {'c': 2, 'a': 0, 'b': 1},
#  ('l2', 'c2'): {'c': 5, 'a': 3, 'b': 4}}

# DATA FRAME ASSIGNMENT
df = pd.DataFrame([newdict[i] for i in sorted(newdict)],
                  index=pd.MultiIndex.from_tuples([i for i in sorted(newdict.keys())]))    
print(df)
#        a  b  c
# l1 c1  0  1  2
#    c2  3  4  5
# l2 c1  0  1  2
#    c2  3  4  5
Comments