mikl - 5 months ago 89x
Python Question

Base 62 conversion in Python

How would you convert an integer to base 62 (like hexadecimal, but with these digits: '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ').

I have been trying to find a good Python library for it, but they all seems to be occupied with converting strings. The Python base64 module only accepts strings and turns a single digit into four characters. I was looking for something akin to what URL shorteners use.

There is no standard module for this, but I have written my own functions to achieve that.

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num, alphabet=ALPHABET):
"""Encode a number in Base X

`num`: The number to encode
`alphabet`: The alphabet to use for encoding
"""
if (num == 0):
return alphabet[0]
arr = []
base = len(alphabet)
while num:
rem = num % base
num = num // base
arr.append(alphabet[rem])
arr.reverse()
return ''.join(arr)

def base62_decode(string, alphabet=ALPHABET):
"""Decode a Base X encoded string into the number

Arguments:
- `string`: The encoded string
- `alphabet`: The alphabet to use for encoding
"""
base = len(alphabet)
strlen = len(string)
num = 0

idx = 0
for char in string:
power = (strlen - (idx + 1))
num += alphabet.index(char) * (base ** power)
idx += 1

return num

Notice the fact that you can give it any Alphabet to use for encoding and decoding.

Hope this helps.

PS - For URL shorteners, I have found that it's better to leave out a few confusing characters like 0Ol1oI etc. Thus I use this alphabet for my URL shortening needs - "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"

Have fun.

Source (Stackoverflow)