I'm storing hash codes in a file, one hash per line.
When I have a new hash code, I open the file and check if
the hash code already exists and if it doesn't exist,
I save it to that file.
f = open("hashes.txt", "w")
hashes = f.readlines()
hash_code = "ff071fdf1e060400"
if not hash_code in hashes:
for h in hashes:
I would do this:
class Hashes(object): def __init__(self, filename): self.filename = filename with open(filename, 'rt') as f: # read the file only once self.hashes = set(line.strip() for line in f) def add_hash(self, hash): if hash not in self.hashes: # this is very fast with sets self.hashes.add(hash) with open(self.filename, 'at') as f: print(hash, file=f) # write only one hash hashes = Hashes("hashes.txt") hashes.add_hash("ff071fdf1e060400")
The downside is that all hashes are kept in memory. If there are many millions of hashes, that could start causing problems, but until then it is fine. It is better than fine, if speed is important.