I have downloaded a CSV file from Hotmail, but it has a lot of duplicates in it. These duplicates are complete copies, and I don't know why my phone created them.
I want to get rid of the duplicates.
How can I write a Python script to remove the duplicates?
Windows XP SP 3
CSV file with 400 contacts
A more efficient version of @IcyFlame's solution:

    with open('1.csv', 'r') as in_file, open('2.csv', 'w') as out_file:
        seen = set()  # set for fast O(1) amortized lookup
        for line in in_file:
            if line in seen:
                continue  # skip duplicate
            seen.add(line)
            out_file.write(line)
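One edge case, in case it matters for your export: the loop compares raw lines, so a final line with no trailing newline would not match an earlier line that is otherwise identical. Here is a minimal sketch of a variation that compares lines with the terminator stripped. The name `dedupe_lines` and the sample rows are made up for illustration.

```python
def dedupe_lines(lines):
    """Return lines without exact duplicates, keeping the first occurrence of each."""
    seen = set()  # keys already emitted
    unique = []
    for line in lines:
        # Compare without the line terminator so a last line that lacks
        # a trailing newline still matches an earlier identical line.
        key = line.rstrip('\r\n')
        if key in seen:
            continue  # skip duplicate
        seen.add(key)
        unique.append(key + '\n')
    return unique


# Hypothetical sample data: the third row repeats the first, minus the newline.
rows = ['Ann,ann@example.com\n', 'Bob,bob@example.com\n', 'Ann,ann@example.com']
print(dedupe_lines(rows))
```

Order is preserved because rows are appended the first time they are seen. The set is only used for the membership test.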
To edit the same file in place, you could use this:

    import fileinput

    seen = set()  # set for fast O(1) amortized lookup
    for line in fileinput.FileInput('1.csv', inplace=1):
        if line in seen:
            continue  # skip duplicate
        seen.add(line)
        # standard output is now redirected to the file
        # (Python 2 print statement; the trailing comma suppresses the extra newline)
        print line,
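If you are on Python 3 rather than Python 2, the `print line,` statement will not parse. A rough equivalent, assuming Python 3.2 or later, might look like the sketch below. The function name `dedupe_in_place` is hypothetical, and since the file is overwritten, it is worth trying it on a copy first.

```python
import fileinput


def dedupe_in_place(path):
    """Rewrite the file at `path`, dropping lines that exactly repeat an earlier line."""
    seen = set()  # lines already written back
    # inplace=True redirects standard output into the file being read.
    with fileinput.input(path, inplace=True) as f:
        for line in f:
            if line in seen:
                continue  # skip duplicate
            seen.add(line)
            # The line keeps its own newline, so suppress print's.
            print(line, end='')
```

You would call it as `dedupe_in_place('1.csv')`.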