Michele Zacco - 8 months ago
Python Question

Iterate through list in python and get highest value for a given element

I have a CSV file structured like this:

business unit; employee id; name; tax code;
1; 50; JOE BLOGGS; 123456789
1; 51; JOE BLOGGS; 123456789
1; 52; JOE BLOGGS; 123456789
3; 53; JOE BLOGGS; 123456789
5; 54; JOE BLOGGS; 123456789

The tax code is unique to each person, while the business unit and employee id may vary.
The row I need for an employee is always the last one, since it is their most recent and therefore active position. How can I loop through this file and append ONLY that last row to the array? (Edit: by "last" I mean the last line or, better, the highest id for that particular person; there might be other employees in the file.)

My code is:

for line in csv:
    l = [i.strip() for i in line.split(';')]
    if l[3] not in d:

This way I get a list containing only the first record. How can I get the last one?
Thank you!


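If I understand correctly, you want the last row of each group of rows that share a tax code. If so, use itertools.groupby. Note that groupby only groups consecutive rows, so all rows for one tax code must be adjacent in the file: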

import csv
import operator
import itertools

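with open(filename, 'r') as f:
    # parse each line into fields so that itemgetter(3) selects the tax code
    reader = csv.reader(f, delimiter=';', skipinitialspace=True)
    header = next(reader, None)  # skip the header row
    for key, group in itertools.groupby(reader, key=operator.itemgetter(3)):
        last_line = list(group)[-1]  # last row of this tax-code group
        print(last_line)

    # Output
    ['5', '54', 'JOE BLOGGS', '123456789']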
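Since itertools.groupby only merges adjacent rows, a plain dictionary keyed by tax code may be a safer fit when rows for different employees are interleaved, and it lets you pick the highest employee id rather than the last line. The following is only a sketch: the sample data, the second employee (JANE DOE) and the helper name latest_rows are made up for illustration, and it assumes the employee id column always holds an integer.

```python
import csv
import io

# Sample data mirroring the question, plus a hypothetical second employee
# (JANE DOE) to show that rows for one person need not be adjacent.
SAMPLE = """business unit; employee id; name; tax code;
1; 50; JOE BLOGGS; 123456789
2; 60; JANE DOE; 987654321
3; 53; JOE BLOGGS; 123456789
5; 54; JOE BLOGGS; 123456789
2; 58; JANE DOE; 987654321
"""


def latest_rows(lines):
    """Return {tax_code: row}, keeping the row with the highest employee id."""
    reader = csv.reader(lines, delimiter=';', skipinitialspace=True)
    next(reader, None)  # skip the header row
    latest = {}
    for row in reader:
        if len(row) < 4:
            continue  # ignore blank or malformed lines
        tax_code = row[3]
        # compare ids as integers so that '100' ranks above '54'
        if tax_code not in latest or int(row[1]) > int(latest[tax_code][1]):
            latest[tax_code] = row
    return latest


result = latest_rows(io.StringIO(SAMPLE))
for row in result.values():
    print(row)
```

To run this against a real file, pass the open file object instead of the io.StringIO wrapper. If "last line in the file" is what you want rather than "highest id", drop the integer comparison and overwrite latest[tax_code] unconditionally.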

Previous answer: read the last line from a csv file.

Way 1: Use csv.reader

skipinitialspace=True is used to get rid of whitespace following the delimiter ;.

import csv

with open(filename, 'r') as f:
    headers = next(f, None)  # skip the header line
    rows = list(csv.reader(f, delimiter=';', skipinitialspace=True))
    last_row = rows[-1]  # the last record in the file
    print(last_row)

    # Output
    ['5', '54', 'JOE BLOGGS', '123456789']

Way 2: Use collections.deque

import csv
import collections

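with open(filename, 'r') as f:
    # A deque with maxlen=1 keeps only the last row. With the default ','
    # delimiter each row is a single field, so [0][0] is the whole line.
    last_line = collections.deque(csv.reader(f), 1)[0][0]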

    # Output
    5; 54; JOE BLOGGS; 123456789
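Way 2 returns the last line as one unparsed string. If you need the individual fields, the same deque idea should work with the delimiter options from Way 1. A minimal sketch, using inline sample data in place of the real file:

```python
import collections
import csv
import io

# Inline stand-in for the real file; replace io.StringIO(SAMPLE) with an
# open file object in practice.
SAMPLE = """business unit; employee id; name; tax code;
1; 50; JOE BLOGGS; 123456789
5; 54; JOE BLOGGS; 123456789
"""

reader = csv.reader(io.StringIO(SAMPLE), delimiter=';', skipinitialspace=True)
# maxlen=1 makes the deque discard every row except the most recent one,
# so the whole file is never held in memory at once
last_row = collections.deque(reader, maxlen=1)[0]
print(last_row)
```

On an empty input the deque is empty and the [0] lookup raises IndexError, so guard for that if the file may have no rows.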