samy samy - 6 months ago 9
Python Question

Error occurs while parsing the json file

I'm trying to parse JSON data with the json.load() method, but it gives me an error. I tried different approaches, such as reading the file line by line and converting the data into a dictionary or a list, but none of them work. I also tried the solution mentioned at the following URL, loading-and-parsing-a-json, but it gives me the same error.

import json
data = []
with open('output.txt','r') as f:
    for line in f:
        data.append(json.loads(line))


Error:

ValueError: Extra data: line 1 column 71221 - line 1 column 6783824 (char 71220 - 6783823)


The output.txt file is available at the link below:

Content- output.txt

Answer

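The "Extra data" error means the file holds more than one JSON object on a single line, so json.loads() parses the first object and then fails on whatever follows it. I wrote the following, which breaks your file up into one JSON string per line and then goes back through it and does what you originally intended. There's certainly room for optimization here, but at least it now works as you expected.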

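import json
import re

# Each top-level JSON object in the file starts with this key.
PATTERN = '{"statuses"'

with open('output.txt', 'r') as f:
    file_as_str = f.read()

# Insert a newline before every occurrence of PATTERN except one at the very
# start of the file, so that each object ends up on its own line.
file_as_str = re.sub(r'(?!^)' + re.escape(PATTERN), '\n' + PATTERN, file_as_str)

with open('output.txt', 'w') as f:
    f.write(file_as_str)

data = []

with open('output.txt', 'r') as f:
    for line in f:
        if line.strip():  # skip blank lines
            data.append(json.loads(line))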
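
If you would rather not rewrite the file at all, another option is to decode the objects one after another with json.JSONDecoder.raw_decode(), which returns the parsed value together with the index where it stopped. The following is only a minimal sketch: it assumes the file is a run of JSON objects written back to back, and the helper name iter_json_objects and the sample string are made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found back to back in text.

    Hypothetical helper, not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode() does not accept leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample that mimics several objects concatenated together.
sample = '{"statuses": [1]}{"statuses": [2]}\n{"statuses": []}'
data = list(iter_json_objects(sample))
```

To use it on the real file, read output.txt into a string with f.read() and pass that string to iter_json_objects() in place of sample. Unlike splitting on a text pattern, this does not depend on what key each object starts with.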