Fxs7576 - 1 month ago
Python Question

Combining the Values from Multiple Keys in Dictionary Python

In Python, I have the following dictionary of sets:

{
1: {'Hello', 'Bye'},
2: {'Bye', 'Do', 'Action'},
3: {'Not', 'But', 'No'},
4: {'No', 'Yes'}
}


My goal is to combine the keys whose sets share a value (in this example, "Bye" and "No"), so that the result looks like this:

{
1: {'Hello', 'Bye', 'Do', 'Action'},
3: {'Not', 'But', 'No', 'Yes'}
}


Is there a way to do this?

Answer

If there are overlapping matches and you want the longest matches:

from collections import defaultdict

d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'}
}
grp = defaultdict(list)

# first group all keys with common words
for k, v in d.items():
    for val in v:
        grp[val].append(k)


# sort the groups by length so the longest matches are merged first
for v in sorted(grp.values(), key=len, reverse=True):
    for val in v[1:]:
        if val not in d:
            continue
        # use the first element as the key and union into its existing values
        d[v[0]] |= d[val]
        del d[val]


print(d)
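
For reference, the grouping step builds an inverted index from each word to the keys whose set contains it. The sketch below uses the question's sample data; the name `shared` is introduced here for illustration only and is not part of the answer's code.

```python
from collections import defaultdict

d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'},
}

# Inverted index: each word maps to the list of keys whose set contains it.
grp = defaultdict(list)
for k, v in d.items():
    for val in v:
        grp[val].append(k)

# Only words that occur under more than one key lead to a merge.
# `shared` is an illustrative name, not from the original answer.
shared = {word: keys for word, keys in grp.items() if len(keys) > 1}
print(shared)
```

On Python 3.7+, where dicts preserve insertion order, this prints {'Bye': [1, 2], 'No': [3, 4]}, which are exactly the two groups that get merged.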

If you don't have overlaps, you can simply merge each group into its first key:

grp = defaultdict(list)

for k, v in d.items():
    for val in v:
        grp[val].append(k)

for v in grp.values():
    for val in v[1:]:
        d[v[0]] |= d[val]
        del d[val]

Or if you want a new dict:

new_d = {}
for v in grp.values():
    if len(v) > 1:
        k = v[0]
        # copy the set so the sets in the original dict are not modified
        new_d[k] = set(d[k])
        for val in v[1:]:
            new_d[k] |= d[val]

All three give you the following (shown as printed by Python 2), although the order of the keys and of the set elements could be different:

{1: set(['Action', 'Do', 'Bye', 'Hello']), 3: set(['Not', 'Yes', 'But', 'No'])}

To combine into the fewest keys:

from collections import defaultdict

d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No', 'Hello'},
    4: {'No', 'Yes', 'Hello'},
}
grp = defaultdict(list)
for k, v in d.items():
    for val in v:
        grp[val].append(k)
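new_d = {}
for v in grp.values():
    if len(v) > 1:
        # merge every key in the group into the smallest key
        k = min(v)
        # start from a copy of d[k] the first time k is used as a target
        new_d.setdefault(k, set(d[k]))
        # loop over all of v, since min(v) is not necessarily v[0]
        for val in v:
            if val != k:
                new_d[k] |= d[val]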
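
If keys should also be merged transitively (key 1 shares a value with key 2, and key 2 with key 3, so all three end up under one key), one option is a union-find (disjoint-set) pass over the keys. The following is a minimal sketch under that assumption, not part of the original answer. The names `merge_overlapping`, `find`, `union` and `first_owner` are hypothetical, and the code assumes the keys can be compared with `<`.

```python
from collections import defaultdict


def merge_overlapping(d):
    """Merge keys whose sets are linked by shared values, directly or transitively.

    Returns a new dict keyed by the smallest key of each merged group.
    The input dict and its sets are not modified.
    """
    parent = {k: k for k in d}

    def find(k):
        # Walk up to the group's representative, shortening the path on the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller key as the representative of the joined group.
            lo, hi = (ra, rb) if ra < rb else (rb, ra)
            parent[hi] = lo

    # Link every key to the first key that was seen holding the same value.
    first_owner = {}
    for k, values in d.items():
        for val in values:
            if val in first_owner:
                union(first_owner[val], k)
            else:
                first_owner[val] = k

    # Collect the union of all sets in each group under its representative.
    merged = defaultdict(set)
    for k, values in d.items():
        merged[find(k)] |= values
    return dict(merged)


d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No', 'Hello'},
    4: {'No', 'Yes', 'Hello'},
}
print(merge_overlapping(d))
```

With this input, every key is connected through 'Hello' or 'Bye', so the result has the single key 1 holding all eight words. With the data from the question, it returns the two keys 1 and 3 as in the expected result.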