Darius CM - 3 months ago
Python Question

Python dict to DataFrame Pandas

I need help getting a pandas DataFrame from a dict like this one (2 levels):

{u'instrument': u'EUR_USD',
u'candles': [{u'complete': True,
u'closeMid': 1.26549,
u'highMid': 1.27026,
u'lowMid': 1.25006,
u'volume': 138603,
u'openMid': 1.26864,
u'time': u'2014-09-29T21:00:00.000000Z'},
...
{u'complete': True,
u'closeMid': 1.244995,
u'highMid': 1.25774,
u'lowMid': 1.239455,
u'volume': 167259,
u'openMid': 1.242075,
u'time': u'2014-11-10T22:00:00.000000Z'}
]
}


Column labels and values should be instruments, Complete, CloseMid, HighMid, lowMid, Volume, OpenMid, time.

Answer

Here is a pragmatic solution.

import pandas as pd

d = {u'instrument': u'EUR_USD',
     u'candles': [
        {u'complete': True, u'closeMid': 1.26549, u'highMid': 1.27026, u'lowMid': 1.25006, u'volume': 138603, u'openMid': 1.26864, u'time': u'2014-09-29T21:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.275215, u'highMid': 1.27915, u'lowMid': 1.25838, u'volume': 164677, u'openMid': 1.265485, u'time': u'2014-10-06T21:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.279995, u'highMid': 1.288645, u'lowMid': 1.26249, u'volume': 207189, u'openMid': 1.27537, u'time': u'2014-10-13T21:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.269775, u'highMid': 1.28403, u'lowMid': 1.261385, u'volume': 125266, u'openMid': 1.280145, u'time': u'2014-10-20T21:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.24819, u'highMid': 1.27707, u'lowMid': 1.243775, u'volume': 210030, u'openMid': 1.270125, u'time': u'2014-10-27T21:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.242075, u'highMid': 1.25774, u'lowMid': 1.23582, u'volume': 246530, u'openMid': 1.24841, u'time': u'2014-11-03T22:00:00.000000Z'}, 
        {u'complete': True, u'closeMid': 1.244995, u'highMid': 1.25774, u'lowMid': 1.239455, u'volume': 167259, u'openMid': 1.242075, u'time': u'2014-11-10T22:00:00.000000Z'}
        ]}

df = pd.DataFrame.from_dict(d).join(pd.DataFrame.from_dict(d['candles'])).drop('candles', axis=1)
df
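
As an alternative to the join/drop chain above, the same flattening can probably be done in one call with `pd.json_normalize`. This is only a sketch, assuming pandas 1.0 or later (where `json_normalize` is available at the top level); the data is a trimmed two-candle copy of the question's dict, not the full list.

```python
import pandas as pd

# Trimmed copy of the question's data (two candles only) to keep the sketch short.
d = {
    "instrument": "EUR_USD",
    "candles": [
        {"complete": True, "closeMid": 1.26549, "highMid": 1.27026, "lowMid": 1.25006,
         "volume": 138603, "openMid": 1.26864, "time": "2014-09-29T21:00:00.000000Z"},
        {"complete": True, "closeMid": 1.244995, "highMid": 1.25774, "lowMid": 1.239455,
         "volume": 167259, "openMid": 1.242075, "time": "2014-11-10T22:00:00.000000Z"},
    ],
}

# One row per entry in d["candles"]; the top-level "instrument" value
# is carried along as a metadata column and repeated on every row.
df = pd.json_normalize(d, record_path="candles", meta=["instrument"])
print(df)
```

Here `record_path` names the list to expand into rows and `meta` names the top-level keys to repeat on each row, so no intermediate `candles` column has to be dropped afterwards.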


Edit

The problem here is quite different: the input is a list of instruments, each with a nested interestRate dict. It requires a new answer based on the same principle, but a more complex one.

# Test data
d = {u'instruments': [
        {u'instrument': u'EUR_USD', 
         u'interestRate': {u'EUR': {u'ask': 0.004, u'bid': 0.1}, 
                           u'USD': {u'ask': 0.004, u'bid':0}}},
        {u'instrument': u'EUR_USD2', 
         u'interestRate': {u'EUR': {u'ask': 0.05, u'bid': 0.2}, 
                           u'USD2': {u'ask': 0.6, u'bid':0.1}}}
    ]}

# Creating an empty DataFrame (assumes pandas is imported as pd)
df = pd.DataFrame()

# Iterating over the instruments list
for item in d['instruments']:
    df = pd.concat([df, pd.DataFrame.from_dict(item)
                    .join(pd.DataFrame.from_dict(item['interestRate'], orient='index'))])

# Performing some cleaning to get back a proper interestRate column   
df = df.drop('interestRate', axis=1).reset_index().rename(columns={'index':'interestRate'})

print(df)

  interestRate instrument  bid       ask
0          EUR    EUR_USD  0.1  4.00e-03
1          USD    EUR_USD  0.0  4.00e-03
2          EUR   EUR_USD2  0.2  5.00e-02
3         USD2   EUR_USD2  0.1  6.00e-01
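
For comparison, here is a sketch of a different approach to the same test data: flatten the nested dicts into plain rows in Python first, then build the DataFrame once instead of concatenating inside the loop. It is not the method used above, just one possible variant, and it assumes Python 3.5+ for the `**rates` unpacking; the name `rows` is purely illustrative.

```python
import pandas as pd

# Same test data as above, written without the u'' prefixes.
d = {"instruments": [
    {"instrument": "EUR_USD",
     "interestRate": {"EUR": {"ask": 0.004, "bid": 0.1},
                      "USD": {"ask": 0.004, "bid": 0}}},
    {"instrument": "EUR_USD2",
     "interestRate": {"EUR": {"ask": 0.05, "bid": 0.2},
                      "USD2": {"ask": 0.6, "bid": 0.1}}},
]}

# Build one flat dict per (instrument, currency) pair; the ask/bid
# values of each currency are unpacked into the row with **rates.
rows = [
    {"interestRate": currency, "instrument": item["instrument"], **rates}
    for item in d["instruments"]
    for currency, rates in item["interestRate"].items()
]

# A list of flat dicts converts directly into a DataFrame, one row each.
df = pd.DataFrame(rows)
print(df)
```

Building the row list first avoids growing a DataFrame step by step, which tends to be slower on long instrument lists, and it needs no drop/reset_index/rename cleanup afterwards.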