Aaron3468 - 6 months ago
JSON Question

Python json.loads not able to parse json string

I'm working on a side project and having trouble getting the json module to parse a JSON object I've stored in a text file. The text file contains a list of newline-separated JSON objects.

Thus far, I have this code, which I've confirmed is retrieving each full JSON line and then feeding it to json.loads():

def load_save_game(file_name):
    save_game = []
    with open(file_name) as f:
        for line in f.readline():
            save_game.append(json.loads(line))
    return save_game


When I run this, I get a rather long traceback:

Traceback (most recent call last):
  File ____, line 70, in <module>
    main()
  File ____, line 66, in main
    view = ViewerWindow(load_save_game('played/20_05_2016 16-04-31.txt'))
  File ____, line 60, in load_save_game
    save_game.append(json.loads(line))
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 1 column 1 (char 0)


I suspect that the problem may be related to encoding, but upon looking into the matter, I've seen that the JSON library can be picky about things such as the capitalization of truth values, dangling commas and single quotes rather than double quotes.

Python 2.7 appears to use the RFC 7159 and ECMA 404 specifications for json, so I used a free online json validator to see if the json was badly formed. It was successfully validated for all standards of json, so python should be quite happy with the input.

I've hosted one line of the json on pastebin, the only difference being that there is no whitespace in the file, which I've also uploaded.

I've looked for solutions online and tried a few different ones, such as capitalizing truth values, replacing all single quotes, and decoding the text from ascii. Hopefully you can help me solve this problem. Thank you very much for your time.

Answer

You are looping over the characters of the first line:

for line in f.readline():

f.readline() returns one string, the first line of the file. You don't need to call readline() at all here, simply iterate over the file object directly:

with open(file_name) as f:
    for line in f:
        if line.strip():
            save_game.append(json.loads(line))

I've added in an extra test to skip empty lines (the end of a file can easily contain one).
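To see the difference in isolation, here is a minimal sketch written for Python 3. The io.StringIO object and its two-record contents are made up for illustration and only stand in for the real save file:

```python
import io
import json

# Hypothetical stand-in for the save file: two newline-separated JSON objects.
f = io.StringIO('{"turn": 1}\n{"turn": 2}\n')

# readline() returns a single string, so looping over it yields characters.
chars = list(f.readline())

# Looping over the file object itself yields one whole line per iteration.
f.seek(0)
records = []
for line in f:
    if line.strip():
        records.append(json.loads(line))
```

Here chars begins with '{', '"', 't', so the first json.loads() call in the original loop received only '{', which is not a complete JSON document.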

You can turn the above into a list comprehension too:

import json

def load_save_game(file_name):
    with open(file_name) as f:
        # One JSON document per non-empty line.
        return [json.loads(line) for line in f if line.strip()]

Note that the above only works if your JSON documents do not by themselves contain newlines. Use json.load(f) (no looping) if you have just one JSON document in the file, or use a different technique to parse multiple JSON documents with newlines in the documents themselves.
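One possible "different technique" is sketched below, assuming the whole file fits in memory. It relies on json.JSONDecoder.raw_decode(), which returns the decoded object together with the index where that document ended. The helper name iter_json_documents is made up for this example:

```python
import json

def iter_json_documents(text):
    """Yield each JSON document found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the parsed object and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

You would call it as list(iter_json_documents(f.read())). Because it tracks positions rather than lines, it does not matter whether a document spans several lines.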

The above works fine on your supplied sample file, with or without the line.strip() call:

>>> len(load_save_game(os.path.expanduser('~/Downloads/20_05_2016_16-04-31.txt')))
301