Thierry J. Thierry J. - 4 months ago 13
Python Question

Generate categorised data set from list in O(n)

I am trying to generate a data set with the following structure from a list of items with properties

prop1
and
prop2
:

result[p1][p2] => list of item with prop1=p1 and prop2=p2


I have been able to do it in O(n2) with:

result = {
item.prop1: {
item.prop2: [
i
for i in item_list
if i.prop1 == item.prop1 and i.prop2 == item.prop2
]
}
for item in item_list
}


But haven't been able to find a way to do it in less time. Is it possible to achieve this in O(n)?

Answer

You should only have to iterate over the items once:

from collections import defaultdict

result = defaultdict(lambda: defaultdict(list))
for item in item_list:
    result[item.prop1][item.prop2].append(item)