steveeweeveewoo - 27 days ago
Python Question

Convert a dict of lists of tuples to a DataFrame

I have a dict of lists of tuples of the form:

{identifier1:[(date1,value1),
(date2,value2)],
identifier2:[(date1,value1),
(date3,value3),
(date4,value4)]
}


I'm trying to parse this into a DataFrame, but the lists have different lengths and the tuples contain duplicate values. The shape I want is three columns (identifier, date and value) with no NaN values. I have tried various combinations, such as using the from_dict method, with very little success.
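
To illustrate the problem, here is a minimal sketch of how from_dict behaves on input of this shape. The string keys and values are stand-ins for the placeholders above (an assumption, since the real data is not shown), and the exact error text may differ between pandas versions.

```python
import pandas as pd

# Stand-in data: strings replace the identifier/date/value placeholders.
d = {'identifier1': [('date1', 'value1'), ('date2', 'value2')],
     'identifier2': [('date1', 'value1'), ('date3', 'value3'), ('date4', 'value4')]}

# The default orient='columns' needs every list to have the same length,
# so ragged input is rejected.
try:
    pd.DataFrame.from_dict(d)
    columns_error = None
except ValueError as exc:
    columns_error = exc

# orient='index' accepts ragged lists, but gives one row per identifier,
# keeps each tuple in a single cell, and pads the shorter row with a missing value.
wide = pd.DataFrame.from_dict(d, orient='index')
print(wide.shape)
print(wide.isna().any().any())
```

Neither orientation gives the long three-column layout, which is why the rows have to be flattened first.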

Answer

You can use a list comprehension with the DataFrame constructor (Python 3.5+, where * unpacking is allowed inside a tuple):

import pandas as pd

d = {'identifier1':[('date1','value1'),('date2','value2')],
     'identifier2':[('date1','value1'),('date3','value3'),('date4','value4')]}

# One row per tuple: prepend the dict key, then unpack (date, value)
L = [(k, *t) for k, v in d.items() for t in v]

df = pd.DataFrame(L, columns=['identifier','date','val'])
print (df)
    identifier   date     val
0  identifier1  date1  value1
1  identifier1  date2  value2
2  identifier2  date1  value1
3  identifier2  date3  value3
4  identifier2  date4  value4
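
If the dates are real date strings rather than placeholders, the same flattening can be followed by a type conversion and a quick sanity check. This is only a sketch: the ISO date strings, float values and short identifiers below are made-up sample data, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data: ISO date strings and floats stand in for date1/value1.
d = {'id1': [('2020-01-01', 1.0), ('2020-01-02', 2.0)],
     'id2': [('2020-01-01', 1.0), ('2020-01-03', 3.0), ('2020-01-04', 4.0)]}

# Flatten to one (identifier, date, value) row per tuple.
rows = [(k, *t) for k, v in d.items() for t in v]
df = pd.DataFrame(rows, columns=['identifier', 'date', 'val'])

# Parse the date strings into a datetime column.
df['date'] = pd.to_datetime(df['date'])

# One row per original tuple, and no missing values anywhere.
n_expected = sum(len(v) for v in d.values())
has_nan = bool(df.isna().any().any())
print(len(df) == n_expected, has_nan)
```

Because every row is built from an existing tuple, no padding is needed, so the lists can have any lengths and repeated (date, value) pairs are kept as separate rows.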

For Python 2 (or Python 3 before 3.5), index the tuple explicitly:

L = [(k, t[0], t[1]) for k, v in d.items() for t in v]

df = pd.DataFrame(L, columns=['identifier','date','val'])
print (df)
    identifier   date     val
0  identifier1  date1  value1
1  identifier1  date2  value2
2  identifier2  date1  value1
3  identifier2  date3  value3
4  identifier2  date4  value4
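
As an alternative that does not depend on tuple unpacking at all, one possible sketch is to build a small frame per identifier and stack them with pd.concat. This is a different technique from the list comprehension above, and the name frames is just illustrative.

```python
import pandas as pd

d = {'identifier1': [('date1', 'value1'), ('date2', 'value2')],
     'identifier2': [('date1', 'value1'), ('date3', 'value3'), ('date4', 'value4')]}

# One two-column frame per identifier; each list of tuples becomes its rows.
frames = {k: pd.DataFrame(v, columns=['date', 'val']) for k, v in d.items()}

# Passing a dict to concat uses its keys as the outer index level,
# which is named 'identifier' here and then moved into a regular column.
df = (pd.concat(frames, names=['identifier'])
        .reset_index(level='identifier')
        .reset_index(drop=True))
print(df)
```

For a dict this small the list comprehension is simpler; the concat form may be more convenient when each per-identifier frame needs its own processing before stacking.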