Rahul - 10 months ago
Python Question

Encoding a URL to a short, unique file name

I want to save HTML to a file whose name is derived from the URL.

To get a unique name for each URL, I am using uuid.

>>> import uuid
>>> url = "https://www.google.co.in/?gfe_rd=cr&ei=-koUWPf4HqzT8ge2g6HoBg&gws_rd=ssl"
>>> uuidstring = str(uuid.uuid5(uuid.NAMESPACE_DNS, url))

However, I want to shorten the name further. Is there a way to turn the string into a shorter one that is still unique?

I tried base64, but I could not get it to work.

>>> uuid.UUID(uuidstring).bytes.encode('base64').rstrip('=\n').replace('/', '_')
AttributeError: 'bytes' object has no attribute 'encode'

Linked question: Convert UUID 32-character hex string into a "YouTube-style" short id and back

Answer Source

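In Python 3, bytes objects have no .encode() method, which is why the attempt in the question fails. Use the base64 module instead: it handles binary data directly. Then decode the result as ASCII (this works because base64 output is pure ASCII):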

import base64
import uuid

url = "https://www.google.co.in/?gfe_rd=cr&ei=-koUWPf4HqzT8ge2g6HoBg&gws_rd=ssl"
uuidstring = str(uuid.uuid5(uuid.NAMESPACE_DNS, url))
# Base64-encode the 16 raw UUID bytes, then strip the padding and the trailing newline.
z = base64.encodebytes(uuid.UUID(uuidstring).bytes).decode("ascii").rstrip('=\n').replace('/', '_')
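
If it helps, here is a minimal sketch of the same idea wrapped in a function. It swaps in base64.urlsafe_b64encode (a different call from the encodebytes used above), which replaces both '+' and '/' with filename-friendly characters and adds no newline. The names url_to_filename and filename_to_uuid are hypothetical helpers made up for this example. Note that only the UUID can be recovered from the short name, not the original URL.

```python
import base64
import uuid


def url_to_filename(url):
    """Return a 22-character, filesystem-safe name derived from the URL."""
    # uuid5 is deterministic: the same URL always yields the same 16 bytes.
    digest = uuid.uuid5(uuid.NAMESPACE_DNS, url).bytes
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    # 16 bytes always encode to 24 characters ending in "==", so drop the padding.
    return encoded.rstrip("=")


def filename_to_uuid(name):
    """Recover the UUID (not the URL) from a name made by url_to_filename."""
    # Re-append the two padding characters that were stripped.
    return uuid.UUID(bytes=base64.urlsafe_b64decode(name + "=="))
```

Calling url_to_filename(url) with the URL from the question gives a 22-character string instead of the 36-character UUID string, and the same URL always maps to the same name.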