haki haki - 5 months ago 68
Python Question

Convert pandas DataFrame to a nested dict

I'm Looking for a generic way of turning a DataFrame to a nested dictionary

This is a sample data frame

name v1 v2 v3
0 A A1 A11 1
1 A A2 A12 2
2 B B1 B12 3
3 C C1 C11 4
4 B B2 B21 5
5 A A2 A21 6


The number of columns may differ and so does the column names.

like this :

{
'A' : {
'A1' : { 'A11' : 1 }
'A2' : { 'A12' : 2 , 'A21' : 6 }} ,
'B' : {
'B1' : { 'B12' : 3 } } ,
'C' : {
'C1' : { 'C11' : 4}}
}


What is best way to achieve this ?

closest I got was with the
zip
function but haven't managed to make it work for more then one level (two columns).

DSM DSM
Answer

I don't understand why there isn't a B2 in your dict. I'm also not sure what you want to happen in the case of repeated column values (every one except the last, I mean.) Assuming the first is an oversight, we could use recursion:

def recur_dictify(frame):
    if len(frame.columns) == 1:
        if frame.values.size == 1: return frame.values[0][0]
        return frame.values.squeeze()
    grouped = frame.groupby(frame.columns[0])
    d = {k: recur_dictify(g.ix[:,1:]) for k,g in grouped}
    return d

which produces

>>> df
  name  v1   v2  v3
0    A  A1  A11   1
1    A  A2  A12   2
2    B  B1  B12   3
3    C  C1  C11   4
4    B  B2  B21   5
5    A  A2  A21   6
>>> pprint.pprint(recur_dictify(df))
{'A': {'A1': {'A11': 1}, 'A2': {'A12': 2, 'A21': 6}},
 'B': {'B1': {'B12': 3}, 'B2': {'B21': 5}},
 'C': {'C1': {'C11': 4}}}

It might be simpler to use a non-pandas approach, though:

def retro_dictify(frame):
    d = {}
    for row in frame.values:
        here = d
        for elem in row[:-2]:
            if elem not in here:
                here[elem] = {}
            here = here[elem]
        here[row[-2]] = row[-1]
    return d
Comments