Pyth0nicPenguin Pyth0nicPenguin - 1 month ago 8
Python Question

Generate a lot of random strings

I have a method that will generate 50,000 random strings, save them all to a file, and then run through the file, and delete all duplicates of the strings that occur. Out of those 50,000 random strings, after using

set()
to generate unique ones, on average 63 of them are left.

Function to generate the strings:

def random_strings(size=8, chars=string.ascii_uppercase + string.digits + string.ascii_lowercase):
return ''.join(random.choice(chars) for _ in xrange(size))


Delete duplicates:

with open("dicts/temp_dict.txt", "a+") as data:
created = 0
while created != 50000:
string = random_strings()
data.write(string + "\n")
created += 1
sys.stdout.write("\rCreating password: {} out of 50000".format(created))
sys.stdout.flush()

print "\nRemoving duplicates.."
with open("dicts\\rainbow-dict.txt", "a+") as rewrite:
rewrite.writelines(set(data))


Example of before and after:
https://gist.github.com/Ekultek/a760912b40cb32de5f5b3d2fc580b99f

How can I generate completely random unique strings without duplicates?

Answer

You can use set from the start

created = set()
while len(created) < 50000:
    created.add(random_strings)

And save once outside the loop