Shweta Shweta - 5 months ago 13
JSON Question

parsing issues in corrupted json using python

I have a corrupted json, like following:

{'a': 'b', 'c': "abcd'efgh ijklm"}


Before using python json.loads, i need to do 2 things:


  1. replace single quote inside double quote with
    \'

  2. replace all other single quote, except
    \'
    with double quote



I am not sure how to do both steps. Please help.

Answer

Actually you can't use the json.loads function to deserialize your string because the string doesn't contain a valid JSON document. Instead, you should consider to use the ast.literal_eval function.

Demo:

In [25]: import ast

In [26]: with open('inputfile') as f:
   ....:     for d in map(ast.literal_eval, f):
   ....:         print(d)
   ....:         
{'c': "abcd'efgh ijklm", 'a': 'b'}
{'c': "abcdgfgfgg", 'a': 'b'}