haimen haimen - 5 months ago 5
Python Question

Error while parsing out a few fields from a dictionary to CSV

b

{u'message': {u'method': u'XXXX',
              u'params': {u'documentURL': u'xxxx',
                          u'A': u'yyyy',
                          u'initialPriority': u'Medium',
                          u'method': u'GET',
                          u'mixedContentType': u'none',
                          u'url': u'xxxx'},
              u'date': u'qqqq',
              u'time': u'wwww',
              u'type': u'Other',
              u'wallTime': u'uuuu'},
 u'webview': u'0'}


I am trying to parse out only a few fields from a very large dictionary into a CSV file. This is what I have tried:

result = []
for i, val in enumerate(b):
    output['a'] = b[i]['message']['params']['A']
    output['date'] = b[i]['message']['date']
    output['time'] = b[i]['message']['time']
    output['passed'] = b[i]['message']['action']['output']['passed']
    result.append(output)

x = pd.DataFrame(json_normalize(result))
x.to_csv('output.csv', encoding='utf-8')


The problem is that the dictionary is not consistently structured: sometimes a field such as passed (b[i]['message']['action']['output']['passed']) is not present, and the lookup raises an error. How can I change this code so that it uses the value when the field is present and writes NULL when it is not?

Also, is there a more efficient way to do this parsing?

Answer

This assumes that 'message' is always present, while the other keys may or may not be:

import pandas as pd

result = []
for k in b:
    # If b is a dictionary of dictionaries, use instead: msg = b[k]['message']
    msg = k['message']  # b is a list of dictionaries
    output = {}  # new dictionary per record, so rows do not overwrite each other
    if 'params' in msg:
        if 'A' in msg['params']:
            output['a'] = msg['params']['A']
        else:
            output['a'] = "NULL"
    else:
        output['a'] = "NULL"
    if 'date' in msg:
        output['date'] = msg['date']
    else:
        output['date'] = "NULL"
    if 'time' in msg:
        output['time'] = msg['time']
    else:
        output['time'] = "NULL"
    if 'action' in msg:
        if 'output' in msg['action']:
            if 'passed' in msg['action']['output']:
                output['passed'] = msg['action']['output']['passed']
            else:
                output['passed'] = "NULL"
        else:
            output['passed'] = "NULL"
    else:
        output['passed'] = "NULL"
    result.append(output)

# Each row is already a flat dictionary, so json_normalize is not needed here.
x = pd.DataFrame(result)
x.to_csv('output.csv', encoding='utf-8')

This approach does not scale, but it will work if you only need to get a few fields out of the dictionary.
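
If more fields are needed, one way to avoid the nested if/else blocks is a small helper that walks a list of keys and falls back to a default as soon as one is missing. The following is only a sketch, assuming Python 3 and that b is a list of dictionaries shaped like the sample in the question; get_nested, FIELDS, extract_rows and rows_to_csv are hypothetical names chosen for illustration.

```python
import csv
import io


def get_nested(d, path, default="NULL"):
    """Follow the keys in `path` through nested dicts; return `default` if any key is missing."""
    for key in path:
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d


# Output column -> key path inside each record (hypothetical mapping, extend as needed).
FIELDS = {
    "a": ("message", "params", "A"),
    "date": ("message", "date"),
    "time": ("message", "time"),
    "passed": ("message", "action", "output", "passed"),
}


def extract_rows(records):
    """Build one flat dictionary per record, filling missing fields with "NULL"."""
    return [
        {col: get_nested(rec, path) for col, path in FIELDS.items()}
        for rec in records
    ]


def rows_to_csv(rows):
    """Serialize the flat rows to CSV text with one column per entry in FIELDS."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(FIELDS))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

With this in place, rows = extract_rows(b) gives a list of flat dictionaries. It can be written out with rows_to_csv(rows) as above, or passed to pd.DataFrame(rows).to_csv('output.csv', encoding='utf-8') if you prefer to stay with pandas. Adding a field then only means adding one entry to FIELDS.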