dacharya64, 4 years ago
JSON Question

Writing to text file is cut off

I'm currently working on some Python code to take certain text elements of a JSON file and write this text to a file in a 'spell name': 'description' format. For instance, taking the name and short description elements of the following JSON file (edited for brevity):

"pk": 0,
"model": "srd20.spell",
"fields": {
"description": "<p>An arrow of acid springs from your hand and speeds to its target. You must succeed on a <a href=\"../combat.html#_ranged-touch-spells-in-combat-95\">ranged touch attack</a> to hit your target. The arrow deals 2d4 points of acid damage with no splash damage. For every three caster levels you possess, the acid, unless neutralized, lasts for another round (to a maximum of 6 additional rounds at 18th level), dealing another 2d4 points of damage in each round.</p>",
"school": "conjuration",
"saving_throw": "none",
"name": "Acid Arrow",
"reference": "prd/spells",
"level": "sorcerer/wizard 2",
"spell_resistance": "no",
"area": "",
"casting_time": "1 standard action",
"effect": "one arrow of acid",
"descriptor": "acid",
"range": "long (400 ft. + 40 ft./level)",
"short_description": "Ranged touch attack; 2d4 damage for 1 round + 1 round/3 levels.",
"components": "V, S, M (rhubarb leaf and an adder's stomach), F (a dart)",
"altname": "acid-arrow",
"duration": "1 round + 1 round per three levels",
"subschool": "creation",
"target": ""

and writing these details to a file like so:

'Acid Arrow': 'Ranged touch attack; 2d4 damage for 1 round + 1 round/3 levels.',

What I have so far seems to almost do the trick. It takes the JSON, runs through each element (each spell on the list) and writes that spell's name and description to the file in the format I want:

import json

fWrite = open('spelllistparsed.txt', 'w')

with open('spells.json') as data_file:
    data = json.load(data_file)

for count, item in enumerate(data, start=0):
    fWrite.write("'" + data[count]["fields"]["name"] + "': " + "'"
                 + data[count]["fields"]["short_description"] + "',\n")

But the problem is when I run the program, it only ends up writing part of the list to the new file. The end result when I run it will show around 30 spells out of the thousand-or-so that are supposed to be there. Based on some trial and error, I've found the following things:

  • This seems to happen no matter which method I use to iterate. I've tried using an integer count instead, and a plain "for spell in data:" loop, but they result in the same thing. I also tried incrementing by 2: every other spell gets written, but it still gets cut off after around 50 or so spells.

  • When I use a different second field for spells (not "short_description", but a different element of the spell, like "description" or "effect"), a different amount of text is written to the file. For short text, all 1000-or-so "spell name : second field" lines are written; for lots of text in the second field, only a few spells are written to the output file. I don't think there is a specific character limit on what is being written to the file, but this does seem to be part of the problem.

  • When I set the second field to always print the same thing (the description of spell 0), it successfully wrote all of the spells to the file as 'spell name': 'description for spell 0', so the problem may be with processing many different descriptions.

  • As far as I know the JSON is valid and this should be possible (ran it through a JSONLint)

  • When running the current code it writes to file the spells Acid Arrow through Baleful Polymorph

Any help would be greatly appreciated! I feel like this should be relatively simple, but it's my first time with Python and my searching for a solution has been fruitless so far. Let me know if there's any other info I should include here that can help.

JSON file of spells: https://github.com/machinalis/django-srd20/blob/master/pathfinder/spells.json

Answer

Python 2 loads JSON strings as unicode objects, which need to be encoded before they can be written to a file object. Your script fails at item 31 because that is apparently the first item that contains a non-ASCII character (in this case, U+2013 "–").
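To check which item triggers the failure, one option is to scan the loaded data for the first non-ASCII character. The following is only a minimal sketch, written for Python 3: the helper name first_non_ascii and the sample data are hypothetical stand-ins, and the real input would be the list returned by json.load on spells.json.

```python
def first_non_ascii(items, field="short_description"):
    """Return (index, character) for the first non-ASCII character, or None."""
    for index, item in enumerate(items):
        for ch in item["fields"][field]:
            if ord(ch) > 127:  # anything outside the 7-bit ASCII range
                return index, ch
    return None


# Hypothetical sample data with the same shape as the entries in spells.json.
sample = [
    {"fields": {"name": "Acid Arrow", "short_description": "Ranged touch attack."}},
    {"fields": {"name": "Example Spell", "short_description": "Lasts 1\u20133 rounds."}},
]

# The second entry contains U+2013 (an en dash), so index 1 is reported.
print(first_non_ascii(sample))
```

Passing a different field name (for example "description") would show why changing the second field changes where the output gets cut off.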

You'll want your code to look something like this:

import json

fWrite = open('spelllistparsed.txt', 'w')

with open('spells.json') as data_file:
    data = json.load(data_file)

# note that you do not need to use enumerate here as you never use the index
for item in data:
    name = item["fields"]["name"].encode("utf-8")
    desc = item["fields"]["short_description"].encode("utf-8")
    fWrite.write("'" + name + "': " + "'" + desc + "',\n")

# always close open files when you're done!
fWrite.close()

Alternatively, you could use Python3, in which case you'd just have to add the keyword argument encoding='utf-8' to the first call to open in your script. And you'd still want to close the file at the end.
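A Python 3 version might look roughly like the sketch below. The function names format_spell and write_spell_list are made up for illustration. Passing encoding='utf-8' when reading spells.json as well is an extra assumption (it presumes the file is UTF-8 encoded), and the with blocks take care of closing both files.

```python
import json


def format_spell(item):
    """Build one "'name': 'short description'," line for a spell entry."""
    fields = item["fields"]
    return "'{}': '{}',\n".format(fields["name"], fields["short_description"])


def write_spell_list(json_path="spells.json", out_path="spelllistparsed.txt"):
    """Read the spell list and write one formatted line per spell."""
    # Assumption: the JSON file is UTF-8 encoded.
    with open(json_path, encoding="utf-8") as data_file:
        data = json.load(data_file)

    # encoding="utf-8" lets non-ASCII characters such as U+2013 be written.
    with open(out_path, "w", encoding="utf-8") as out_file:
        for item in data:
            out_file.write(format_spell(item))
```

Calling write_spell_list() with no arguments uses the file names from the question.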
