MattE - 5 months ago
Python Question

Group by and count/sum with a list of tuples in Python?

I have a list of tuples:

data = [('Team1', 'Mark Owen', 40),
        ('Team1', 'John Doe', 25),
        ('Team2', 'Raj Patel', 40),
        ('Team3', 'Matt Le Blanc', 30),
        ('Team1', 'Rene Russo', 40),
        ('Team1', 'Ronald Regan', 40),
        ('Team3', 'Dean Saunders', 15),
        ('Team2', 'Michael Antonio', 30)]

I would like to group by team (index 0 of each tuple), count the number of persons in each team (index 1), and sum the numbers for each team (index 2), but I cannot quite figure this out. So far I have tried defaultdict(list), which gives me a dict. For example, I've tried this to group by team:

from collections import defaultdict

def create_hrs_totals():
    result = defaultdict(list)
    for k, *v in data:
        result[k] += v
    return dict(result)

But then I am struggling to work with that output to get what I need using a list comp or whatever. The result I am looking for is a new list:

[Team1, 4, 145,
Team2, 2, 70,
Team3, 2, 45]

Is there a better way of doing this?


groupby is a function from itertools, but it's not quite what you want: it only groups consecutive items, so the data would have to be sorted by team first. Instead, let's import defaultdict from collections:

from collections import defaultdict
def data_by_team(data):
    d = defaultdict(lambda: [0,0])
    for team, name, number in data:
        d[team][0] += 1
        d[team][1] += number
    return d

This returns a defaultdict that maps each team name to a two-element list: the number of players and the sum of their numbers. For the data in the question, d['Team1'] is [4, 145].
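To get from that dict to a list like the one asked for, something along these lines should work. This is only a sketch: it assumes a list of (team, count, total) tuples is an acceptable shape for the result, and the names `rows` and `rows_with_groupby` are made up here for illustration. The second function shows how itertools.groupby could be used after sorting, since it only groups adjacent items.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

data = [('Team1', 'Mark Owen', 40),
        ('Team1', 'John Doe', 25),
        ('Team2', 'Raj Patel', 40),
        ('Team3', 'Matt Le Blanc', 30),
        ('Team1', 'Rene Russo', 40),
        ('Team1', 'Ronald Regan', 40),
        ('Team3', 'Dean Saunders', 15),
        ('Team2', 'Michael Antonio', 30)]


def data_by_team(data):
    # Each team maps to [player_count, sum_of_numbers].
    d = defaultdict(lambda: [0, 0])
    for team, name, number in data:
        d[team][0] += 1
        d[team][1] += number
    return d


# Flatten the dict into one (team, count, total) tuple per team,
# sorted by team name (assumed output shape).
totals = data_by_team(data)
rows = [(team, count, total) for team, (count, total) in sorted(totals.items())]
print(rows)
# [('Team1', 4, 145), ('Team2', 2, 70), ('Team3', 2, 45)]


def rows_with_groupby(data):
    # Hypothetical alternative: groupby only groups consecutive items,
    # so the data is sorted by team name first.
    by_team = itemgetter(0)
    result = []
    for team, group in groupby(sorted(data, key=by_team), key=by_team):
        numbers = [number for _, _, number in group]
        result.append((team, len(numbers), sum(numbers)))
    return result
```

The defaultdict version makes a single pass over the data and does not need it sorted. The groupby version sorts first, which costs more for large inputs but gives the same rows.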