Cleb - 2 months ago
JSON Question

How to convert all lists to sets when a JSON file is loaded?

I have json files that look like this:

{
    "K1": {
        "p": [
            "A"
        ],
        "s": [
            "B",
            "C"
        ]
    },
    "K2": {
        "p": [
            "A",
            "F"
        ],
        "s": [
            "G",
            "H",
            "J"
        ]
    }
}


I can easily read in this data:

import json

with open('json_lists_to_sets.json') as fi:
    data = json.load(fi)


Then data looks as follows:

{u'K2': {u'p': [u'A', u'F'], u's': [u'G', u'H', u'J']}, u'K1': {u'p': [u'A'], u's': [u'B', u'C']}}


For my further analysis, however, it would be better to use sets instead of lists. I can of course convert the lists to sets after I have read in the data:

for vi in data.values():
    vi['p'] = set(vi['p'])
    vi['s'] = set(vi['s'])


which gives me the desired output:

print data['K2']


yields

{u'p': {u'A', u'F'}, u's': {u'G', u'H', u'J'}}


My question is whether I can convert these lists to sets directly when I read in the data with the json.load command, so something like "convert all lists you find to sets". Does something like this exist?

Answer

Although the json library offers several hooks for customizing decoding, none of them is called when a JSON array is loaded.
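
To illustrate the limitation: object_hook, one of the hooks that json.load and json.loads accept, is called once for every decoded JSON object (dict), never for an array. A hook can therefore only convert lists that are direct values of an object; top-level arrays and arrays nested inside arrays pass through untouched. A minimal sketch, in which the name lists_to_sets_hook and the sample strings are made up for illustration:

```python
import json

def lists_to_sets_hook(obj):
    # obj is one fully decoded JSON object (a dict). Only lists that are
    # direct values of this dict are visible here.
    return {k: set(v) if isinstance(v, list) else v for k, v in obj.items()}

text = '{"K1": {"p": ["A"], "s": ["B", "C"]}}'
data = json.loads(text, object_hook=lists_to_sets_hook)
print(data["K1"]["s"] == {"B", "C"})  # True

# The hook is never called for an array that is not inside an object.
top_level = json.loads('["A", "B"]', object_hook=lists_to_sets_hook)
print(isinstance(top_level, list))  # True
```

This happens to cover data shaped like the file in the question, but it is not a general answer.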

You'll have to recursively update the decoded result afterwards instead:

def to_sets(o):
    if isinstance(o, list):
        return {to_sets(v) for v in o}
    elif isinstance(o, dict):
        return {k: to_sets(v) for k, v in o.items()}
    return o

This handles lists at any nested dictionary depth:

>>> to_sets(data)
{u'K2': {u'p': set([u'A', u'F']), u's': set([u'H', u'J', u'G'])}, u'K1': {u'p': set([u'A']), u's': set([u'C', u'B'])}}
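
To get something close to "convert while loading", the two steps can be wrapped in one function. The following is only a sketch: load_as_sets is a hypothetical helper name, and an in-memory file stands in for the real one so the block runs on its own.

```python
import io
import json

def to_sets(o):
    # Same recursive conversion as above, repeated so this block is self-contained.
    if isinstance(o, list):
        return {to_sets(v) for v in o}
    elif isinstance(o, dict):
        return {k: to_sets(v) for k, v in o.items()}
    return o

def load_as_sets(fp):
    # Hypothetical helper: decode the file first, then convert every list to a set.
    return to_sets(json.load(fp))

# An in-memory file stands in for open('json_lists_to_sets.json').
fi = io.StringIO(u'{"K2": {"p": ["A", "F"], "s": ["G", "H", "J"]}}')
data = load_as_sets(fi)
print(data["K2"]["p"] == {"A", "F"})  # True
```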

Note, however, that lists containing dictionaries can't be handled, because dictionaries are not hashable and so cannot be stored in a set.
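
The failure is easy to reproduce. The sketch below also shows one possible fallback, which is an assumption about what you might want rather than part of the approach above: to_sets_lenient (a made-up name) keeps a list as a list whenever its elements turn out to be unhashable.

```python
def to_sets(o):
    # Strict version from above, repeated so this block is self-contained.
    if isinstance(o, list):
        return {to_sets(v) for v in o}
    elif isinstance(o, dict):
        return {k: to_sets(v) for k, v in o.items()}
    return o

def to_sets_lenient(o):
    # Hypothetical variant: fall back to a list if the elements are unhashable.
    if isinstance(o, list):
        items = [to_sets_lenient(v) for v in o]
        try:
            return set(items)
        except TypeError:
            return items
    elif isinstance(o, dict):
        return {k: to_sets_lenient(v) for k, v in o.items()}
    return o

try:
    to_sets([{"p": ["A"]}])
    failed = False
except TypeError:
    # Raised because a dict cannot be added to a set.
    failed = True
print(failed)  # True

print(to_sets_lenient([{"p": ["A"]}]) == [{"p": {"A"}}])  # True
```

With the fallback, the result mixes lists and sets, so code that consumes it has to be prepared for both.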

If you expect to find lists nested inside other lists, you'd have to switch to frozenset() rather than set(): a plain set is not hashable and so cannot be an element of another set, while a frozenset can:

def to_sets(o):
    if isinstance(o, list):
        return frozenset(to_sets(v) for v in o)
    elif isinstance(o, dict):
        return {k: to_sets(v) for k, v in o.items()}
    return o
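
A short usage sketch for the frozenset version, with sample data invented for illustration. The function is repeated so the block runs on its own.

```python
def to_sets(o):
    if isinstance(o, list):
        return frozenset(to_sets(v) for v in o)
    elif isinstance(o, dict):
        return {k: to_sets(v) for k, v in o.items()}
    return o

# Lists nested inside a list become frozensets inside a frozenset.
nested = to_sets({"pairs": [["A", "B"], ["C"]]})
expected = frozenset([frozenset(["A", "B"]), frozenset(["C"])])
print(nested["pairs"] == expected)  # True

# Membership tests work on the nested frozensets.
print(frozenset(["B", "A"]) in nested["pairs"])  # True
```

Keep in mind that a frozenset is immutable, so elements cannot be added or removed after conversion, and lists containing dictionaries still fail because dictionaries remain unhashable.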