pam pam - 2 months ago 21
Python Question

Python: Itertools groupby for unique key value pairs

I'm trying to group the rows of a CSV file by the value in the first column. I tried:

from itertools import groupby
import csv
with open('path/trial.csv', 'rb') as f:
    reader = csv.reader(f)
    things = list(reader)

for key, group in groupby(things, lambda x: x[0]):
    listOfThings = len(",".join([thing[1] for thing in group]).split(","))
    print key + "," + str(listOfThings)


It works when rows with the same value in column 1 are adjacent. If a key shows up again later in the file, it is reported as a separate group and the counts come out wrong.

With

A,1
A,2
A,1
B,0
B,8


I get

A,3
B,2


With

A,1
A,2
B,0
B,8
A,1


I get

A,2
B,2
A,1
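
The split A group matches how itertools.groupby is documented to work: it starts a new group every time the key changes, so it only merges consecutive rows. A minimal sketch of that behavior, written in Python 3 syntax with the sample rows inlined instead of read from the file (the names rows, unsorted_counts and sorted_counts are only for illustration):

```python
from itertools import groupby

# Rows from the second sample, inlined for illustration.
rows = [["A", "1"], ["A", "2"], ["B", "0"], ["B", "8"], ["A", "1"]]


def first_column(row):
    return row[0]


# groupby starts a new group whenever the key changes,
# so the trailing A row becomes its own group.
unsorted_counts = [(key, len(list(group)))
                   for key, group in groupby(rows, key=first_column)]

# Sorting by key first makes equal keys adjacent: one group per key.
sorted_counts = [(key, len(list(group)))
                 for key, group in groupby(sorted(rows, key=first_column),
                                           key=first_column)]

print(unsorted_counts)  # [('A', 2), ('B', 2), ('A', 1)]
print(sorted_counts)    # [('A', 3), ('B', 2)]
```

Sorting fixes the grouping, but the duplicate A,1 row is still counted, so A comes out as 3 rather than 2.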


I'd like the script to count the unique values for each unique key, so that A,1 is counted only once even though it appears twice. The desired output is:

A,2
B,2


Based on Chad Simmon's comment, I changed it to sort the rows by the first column before grouping:

import operator

sortedlist = list(reader)
things = sorted(sortedlist, key=operator.itemgetter(0), reverse=True)


It now gives me

B,2
A,3


I want A,2 instead.
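
One possible way to get there while keeping the sort-then-groupby approach is to collect each group's values into a set, so repeated values collapse before they are counted. This is a sketch in Python 3 syntax under the assumption that every row has at least two columns; the rows are inlined and the name result is only for illustration:

```python
from itertools import groupby
from operator import itemgetter

# Rows from the second sample, inlined for illustration.
rows = [["A", "1"], ["A", "2"], ["B", "0"], ["B", "8"], ["A", "1"]]

# Same sort as in the snippet above: by first column, descending.
things = sorted(rows, key=itemgetter(0), reverse=True)

# A set comprehension keeps each value once per key, so the
# repeated A,1 row contributes a single value.
result = [(key, len({row[1] for row in group}))
          for key, group in groupby(things, key=itemgetter(0))]

print(result)  # [('B', 2), ('A', 2)]
```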

pam pam
Answer

Got it by doing:

from itertools import groupby
import csv

with open('trial.csv', 'rb') as f:
    reader = csv.reader(f)
    # Sort whole rows so identical rows, and rows sharing a key, end up adjacent.
    things = sorted(reader)
    # groupby with no key function merges consecutive identical rows,
    # which leaves one copy of each distinct row.
    things = [row for row, _ in groupby(things)]

# Every remaining row is a distinct key/value pair, so count the values per key.
for key, group in groupby(things, lambda x: x[0]):
    listOfThings = len(",".join([thing[1] for thing in group]).split(","))
    print key + "," + str(listOfThings)
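
For comparison, here is a sketch of an alternative that avoids sorting and groupby altogether by collecting the values for each key into a set. It is written in Python 3 syntax; the helper name count_unique_values is hypothetical, and the sample data is inlined through io.StringIO in place of trial.csv so the snippet runs on its own:

```python
import csv
import io
from collections import defaultdict


def count_unique_values(rows):
    """Map each key (column 0) to the number of distinct values (column 1)."""
    seen = defaultdict(set)
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or short lines
        seen[row[0]].add(row[1])
    return {key: len(values) for key, values in seen.items()}


# Sample data from the question, inlined in place of trial.csv.
sample = "A,1\nA,2\nB,0\nB,8\nA,1\n"
counts = count_unique_values(csv.reader(io.StringIO(sample)))

for key in sorted(counts):
    print(key + "," + str(counts[key]))
# A,2
# B,2
```

Because the sets do the deduplication, the order of rows in the file no longer matters. Only the final sorted() call is needed, and only to print the keys in a stable order.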