anvd - 11 months ago 40

Python Question

I have this list as an example:

`[(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]`

Now I want to group by the id, so I will use `itemgetter(0)`:

```
import operator, itertools
from decimal import *

test=[(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

for _k, data in itertools.groupby(test, operator.itemgetter(0)):
    print(list(data))
```

I don't know why but I am getting this wrong output:

```
[(148, Decimal('3.0'))]
[(325, Decimal('3.0'))]
[(148, Decimal('2.0'))]
[(183, Decimal('1.0'))]
[(308, Decimal('1.0'))]
[(530, Decimal('1.0'))]
[(594, Decimal('1.0'))]
[(686, Decimal('1.0'))]
[(756, Decimal('1.0'))]
[(806, Decimal('1.0'))]
```

As you can see, the output is not grouped by id. However, the code above works fine if I use `itemgetter(1)`:

```
[(148, Decimal('3.0')), (325, Decimal('3.0'))]
[(148, Decimal('2.0'))]
[(183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]
```

What am I missing here?

Answer Source

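You first need to *sort* the data for *groupby* to work, because it only groups *consecutive elements* that share the key you provide. In your list the two `148` entries are not adjacent, so they end up in separate groups; `itemgetter(1)` only appeared to work because equal values happened to sit next to each other: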

```
import operator, itertools
from decimal import Decimal

test=[(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

# Sorting puts tuples with the same id next to each other before grouping.
for _k, data in itertools.groupby(sorted(test), operator.itemgetter(0)):
    print(list(data))
```
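To make the "consecutive elements" point concrete, here is a minimal sketch using a short, hypothetical list of bare ids (not the full data from the question). It only collects the keys that `groupby` emits, with and without sorting:

```python
from itertools import groupby

# Hypothetical ids, chosen to mirror the start of the question's list.
ids = [148, 325, 148, 183]

# Unsorted: 148 appears in two separate runs, so groupby emits it twice.
unsorted_keys = [k for k, _ in groupby(ids)]

# Sorted: equal ids are adjacent, so each id is emitted exactly once.
sorted_keys = [k for k, _ in groupby(sorted(ids))]

print(unsorted_keys)  # [148, 325, 148, 183]
print(sorted_keys)    # [148, 183, 325]
```
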

But you would be better off grouping with a dict, which takes a single O(n) pass and avoids the unnecessary O(n log n) sort:

```
from collections import defaultdict

# Map each id to the list of tuples that carry it.
d = defaultdict(list)
for t in test:
    d[t[0]].append(t)

for v in d.values():
    print(v)
```

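Both give you the same groupings, just not necessarily in the same order: the sorted version yields the ids in ascending order, while the dict yields them in first-seen order on Python 3.7+ (and in arbitrary order on older versions).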
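As a rough check of that claim, the sketch below runs both approaches on a shortened copy of the list and compares the results. It assumes Python 3.7+ for the dict ordering, and it sorts with `key=by_id` (a name introduced here for illustration) rather than a plain `sorted(test)`, so that tuples sharing an id keep their input order and the two results are directly comparable:

```python
import itertools
import operator
from collections import defaultdict
from decimal import Decimal

# Shortened copy of the question's data.
test = [(148, Decimal('3.0')), (325, Decimal('3.0')),
        (148, Decimal('2.0')), (183, Decimal('1.0'))]

by_id = operator.itemgetter(0)

# Approach 1: sort by id, then group consecutive runs.
# sorted() is stable, so tuples with the same id keep their input order.
via_groupby = {k: list(g)
               for k, g in itertools.groupby(sorted(test, key=by_id), by_id)}

# Approach 2: one pass with a dict, no sort needed.
via_dict = defaultdict(list)
for t in test:
    via_dict[t[0]].append(t)

print(via_groupby == via_dict)  # True: same groups
print(list(via_groupby))        # [148, 183, 325]: ascending ids
print(list(via_dict))           # [148, 325, 183]: first-seen order
```

If you need the groups in id order anyway, sorting just the dict's keys is cheaper than sorting every tuple when ids repeat often.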