Rahul Rahul - 1 month ago 22
Python Question

Encoding a URL to a short, unique file name

I want to save HTML to a file whose name is derived from the URL.

To get a unique name for each URL, I am using uuid.

>>> import uuid
>>> url = "https://www.google.co.in/?gfe_rd=cr&ei=-koUWPf4HqzT8ge2g6HoBg&gws_rd=ssl"
>>> uuidstring = str(uuid.uuid5(uuid.NAMESPACE_DNS, url))


But I want to shorten the name further. Is there any way to turn this string into a shorter one that is still unique?

I tried base64, but I could not get it to work.

>>> uuid.UUID(uuidstring).bytes.encode('base64').rstrip('=\n').replace('/', '_')
AttributeError: 'bytes' object has no attribute 'encode'


Linked question: Convert UUID 32-character hex string into a "YouTube-style" short id and back

Answer

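Use the base64 module, which operates on binary data, and then decode the result as ASCII (this always works, because base64 output contains only ASCII characters). The .encode('base64') call you tried only exists on Python 2 strings; in Python 3, bytes objects have no encode method, which is why you get the AttributeError.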

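import base64
import uuid

url = "https://www.google.co.in/?gfe_rd=cr&ei=-koUWPf4HqzT8ge2g6HoBg&gws_rd=ssl"
uuidstring = str(uuid.uuid5(uuid.NAMESPACE_DNS, url))
# encodebytes() takes bytes and returns bytes; decode to str before using str methods.
# Strip the '=' padding and trailing newline, and replace '/' so the name is a valid file name.
z = base64.encodebytes(uuid.UUID(uuidstring).bytes).decode("ascii").rstrip('=\n').replace('/', '_')
print(z)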

result:

pvEA9qOdX8COYyJf8zgzRA
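
As a possible variation on the code above, here is a minimal sketch that uses base64.urlsafe_b64encode, which emits '-' and '_' in place of '+' and '/', so no manual replace is needed. It also shows the reverse direction asked about in the linked question. The helper names short_name and name_to_uuid are made up for illustration and are not part of any library.

```python
import base64
import uuid


def short_name(url):
    """Hypothetical helper: 22-character, filename-safe name derived from url."""
    raw = uuid.uuid5(uuid.NAMESPACE_DNS, url).bytes  # the UUID as 16 raw bytes
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    return encoded.rstrip("=")  # drop the two '=' padding characters


def name_to_uuid(name):
    """Hypothetical helper: recover the UUID from a name made by short_name."""
    padded = name + "=" * (-len(name) % 4)  # restore the stripped padding
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


url = "https://www.google.co.in/?gfe_rd=cr&ei=-koUWPf4HqzT8ge2g6HoBg&gws_rd=ssl"
name = short_name(url)
print(name, name_to_uuid(name))
```

Note that only the UUID can be recovered from the short name, not the original URL, since uuid5 is a one-way hash. Also, base64 names are case-sensitive, so on a case-insensitive file system two names that differ only in letter case would refer to the same file.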