Robert Robert - 3 months ago 10
Python Question

Loop between 2 csv files stops at first match

I have 2 csv files: the source file has only one column with IDs (one ID per line). The 2nd csv file has the IDs plus extra info. I would like to check each ID from file1 against all the records in file2 and, if there is a match, print out the row that matched (from file1) with the corresponding info from file2.

The files look like this.

Source:

0234906006000
0234765306000
0231316005000
0234906006000
0212134006000
0125667806000
3334906006000
1778986006000
0239906006000


Mine:

02349-34-010-000,Adam
05125-07-033-000,Michael
05172-04-042-000,Debora
8071-33-001-000,Matt
2349-38-007, 2349-38-011, 2349-38-012,Ken
0234906006000,Roger
3334906006000,Hummels
0231316005000,Don
0501401028000,Gregg


My code looks like:

import csv
source = csv.reader(open("denver_source.csv"))
mine = csv.reader(open("denver_mine.csv"))
output = {}
for line in source:
    print(line[0])
    for xline in mine:
        if line[0] not in xline: continue
        output[line[0]] = xline
    print(xline)
print("Result", output)


I get a result but I do not get all the matches, only the first match:

0234906006000
['0501401028000', 'Gregg']
0234765306000
['0501401028000', 'Gregg']
0231316005000
['0501401028000', 'Gregg']
0234906006000
['0501401028000', 'Gregg']
0212134006000
['0501401028000', 'Gregg']
0125667806000
['0501401028000', 'Gregg']
3334906006000
['0501401028000', 'Gregg']
1778986006000
['0501401028000', 'Gregg']
0239906006000
['0501401028000', 'Gregg']
Result: {'0234906006000': ['0234906006000', 'Roger']}


Can you please help me understand where I fail in continuing the loop properly in the second file? I apologize if this was posted before but I only found more complex examples.

Jim Jim
Answer

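After your inner loop iterates through the second .csv once, the file is at its end, so no further lines can be read on later passes of the outer loop. Only the first ID is ever compared against file2; for every later ID the inner loop body never runs, and print(xline) just shows the last row left over from the first pass.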
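To see the exhaustion in isolation, here is a minimal sketch. It uses io.StringIO as a hypothetical in-memory stand-in for denver_mine.csv (that substitution is an assumption for the demo, not part of the question's code):

```python
import csv
import io

# Hypothetical in-memory stand-in for denver_mine.csv
mine_file = io.StringIO("0234906006000,Roger\n3334906006000,Hummels\n")
mine = csv.reader(mine_file)

first_pass = list(mine)   # consumes every row
second_pass = list(mine)  # nothing left to read, so this is empty

print(len(first_pass))   # 2
print(len(second_pass))  # 0
```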

One fix is to call seek(0) on the second file after the inner loop finishes, which rewinds it to the beginning before the next ID is checked. It is also advisable to open files with with statements, since they make sure the file is closed once the body has executed:

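import csv

output = {}
with open("denver_source.csv") as cf1, open("denver_mine.csv") as cf2:
    source = csv.reader(cf1)
    mine = csv.reader(cf2)
    for line in source:
        for xline in mine:
            # Stop scanning at the first row of file2 that contains this ID
            if line[0] in xline:
                output[line[0]] = xline
                break
        # Rewind file2 so the next ID is checked against every row again
        cf2.seek(0)
print("Result", output)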

The noteworthy thing here is that you call seek(0) on the file you supply to csv.reader and not the instance of csv.reader. That is, you call cf2.seek(0) and not mine.seek(0); the changes will be reflected on the reader instance and you'll be able to re-iterate as needed.
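A small check of that point, again as a sketch with io.StringIO standing in for the real file (an assumption made so the snippet runs on its own):

```python
import csv
import io

# Hypothetical in-memory stand-in for denver_mine.csv
mine_file = io.StringIO("0234906006000,Roger\n0231316005000,Don\n")
mine = csv.reader(mine_file)

rows_before = list(mine)            # first full pass over the rows
has_seek = hasattr(mine, "seek")    # reader objects expose no seek() method

mine_file.seek(0)                   # rewind the underlying file object instead
rows_after = list(mine)             # the same reader yields the rows again

print(has_seek)                     # False
print(rows_before == rows_after)    # True
```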

Of course, you could refactor this to reopen the second file in a with statement on every iteration of line in source instead of calling seek(0); that's really down to personal preference.
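A rough sketch of that variant follows. The function name match_ids and its two path parameters are hypothetical, introduced only for illustration; skipping blank source lines is also an added assumption, so that line[0] is never read from an empty row:

```python
import csv

def match_ids(source_path, mine_path):
    """Map each ID in source_path to the first row of mine_path containing it."""
    output = {}
    with open(source_path, newline="") as cf1:
        for line in csv.reader(cf1):
            if not line:
                continue  # assumption: ignore blank lines in the source file
            # Reopening file2 here gives a fresh reader for every ID,
            # so no seek(0) is needed.
            with open(mine_path, newline="") as cf2:
                for xline in csv.reader(cf2):
                    if line[0] in xline:
                        output[line[0]] = xline
                        break
    return output
```

It could then be called as match_ids("denver_source.csv", "denver_mine.csv"). Reopening costs an extra open and close per ID, which should not matter for files this small.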