rojas rojas - 7 days ago 6
JSON Question

How to append to a JSON file on the fly?

I am parsing several thousand HTML files, each of which comes out as a dict. I then want to combine them into one dict and save it to disk in JSON format.

I don't want to build this huge dict in memory while iterating through the files; I would rather keep writing to the file as I go.

So instead of this:

data = {}
for e, fn in enumerate(os.listdir(path)):
    fp = os.path.join(path, fn)
    d = html_to_dict(fp)
    data[e] = d


I would like this:

with open('out_file.json', 'w') as f:
    for e, fn in enumerate(os.listdir(path)):
        fp = os.path.join(path, fn)
        d = html_to_dict(fp)
        # update the file dict


Any ideas?

Answer

You should be able to do this by writing some of the JSON yourself and just using the json library for the individual records. For example:

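import json
import os

# path and html_to_dict are the names from the question
with open('out_file.json', 'w') as f:
    f.write("{")
    delim = ""  # nothing before the first record
    for e, fn in enumerate(os.listdir(path)):
        fp = os.path.join(path, fn)
        d = html_to_dict(fp)
        # JSON object keys must be quoted strings
        f.write(delim + json.dumps(str(e)) + ":")
        json.dump(d, f)
        delim = ",\n"
    f.write("}")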
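
Keep in mind that JSON object keys are always strings, so the indexes have to be written quoted and will come back as "0", "1", ... rather than ints when you load the file. Here is a minimal sketch of that round trip. It uses an in-memory buffer instead of a real file, and `records` is a made-up stand-in for the dicts that html_to_dict would return:

```python
import io
import json

# Hypothetical stand-ins for the dicts returned by html_to_dict.
records = [{"title": "a"}, {"title": "b"}]

buf = io.StringIO()  # in-memory file, just for the demonstration
buf.write("{")
delim = ""
for e, d in enumerate(records):
    # json.dumps(str(e)) produces a quoted key such as "0"
    buf.write(delim + json.dumps(str(e)) + ":")
    json.dump(d, buf)
    delim = ",\n"
buf.write("}")

loaded = json.loads(buf.getvalue())
keys = sorted(loaded)  # keys come back as strings, not ints
```

If you need integer keys after loading, you would have to convert them yourself, for example with `int(k)` on each key.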

Since the keys here are just sequential indexes, you could write an array instead of an object and save the space required for the keys:

with open('out_file.json', 'w') as f:
    f.write("[")
    delim = ""
    for fn in os.listdir(path):
        fp = os.path.join(path, fn)
        d = html_to_dict(fp)
        f.write(delim)
        json.dump(d, f)
        delim = ",\n"
    f.write("]")
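
If you want to reuse this, the array version can be wrapped in a small function that accepts any iterable, so a generator can feed it one record at a time. This is only a sketch: `write_json_array` and `fake_records` are names I made up, and the generator stands in for calling html_to_dict on each file.

```python
import json
import os
import tempfile


def write_json_array(records, f):
    """Write an iterable of JSON-serializable records to f as one JSON array."""
    f.write("[")
    delim = ""
    for d in records:
        f.write(delim)
        json.dump(d, f)  # only the current record is serialized at a time
        delim = ",\n"
    f.write("]")


def fake_records(n):
    # Hypothetical stand-in for html_to_dict(fp) over a directory listing.
    for i in range(n):
        yield {"file": "page%d.html" % i, "words": i * 10}


# Usage: stream three records to a temporary file, then read them back.
out_path = os.path.join(tempfile.mkdtemp(), "out_file.json")
with open(out_path, "w") as f:
    write_json_array(fake_records(3), f)

with open(out_path) as f:
    result = json.load(f)
```

Note that reading the file back with json.load still builds the whole list in memory, so this only helps on the writing side.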