Uphill_ What '1 - 9 days ago
Python Question

Is this string Base64? How can I tell which encoding was used?

This is a puzzle to me and I am really annoyed that I cannot solve it! So, if anyone has some free time, I would like to hear some suggestions on how to solve it!

I use software that stores the password in an Oracle database. The password field is of type VARCHAR2(100 CHAR). It seems to me that the software encodes the passwords and stores the encoded string in the database.

My password is '1234' and the encoded string is 'cRDtpNCeBiql5KOQsKVyrA0sAiA='. All the passwords in the database are 28 characters long.

The puzzle that I have assigned to myself is to find the encoding and/or encryption of the string. My first check was for Base64.

So here is my first test in Python (IDLE):

>>> import base64
>>> encoded = 'cRDtpNCeBiql5KOQsKVyrA0sAiA='
>>> decoded = base64.b64decode(encoded)
>>> decoded
'q\x10\xed\xa4\xd0\x9e\x06*\xa5\xe4\xa3\x90\xb0\xa5r\xac\r,\x02 '
>>> print decoded
qíᄂО*ᆬ䣐ᄚᆬrᆲ


,

Here is my second test:

>>> myString = '1234'
>>> encoded = base64.b64encode(myString)
>>> encoded
'MTIzNA=='
>>> decoded = base64.b64decode('MTIzNA==')
>>> decoded
'1234'


So, my first thought is that this is not Base64 encoded. After I checked Wikipedia (https://en.wikipedia.org/wiki/Base64), it seems that Base64-encoded strings are not of fixed size, yet every password in the database is 28 characters long. My second thought is that the string was encrypted and then encoded into Base64, and that is why I get the weird-looking decoded string.

Any ideas?

Answer

It is actually Base64 encoded. However, it is not the password itself that is encoded, but its SHA-1 hash.

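# Old Python 2: the standalone sha module (deprecated since Python 2.6)
from sha import sha
# Hex form of the Base64-decoded value stored in the database
print 'cRDtpNCeBiql5KOQsKVyrA0sAiA='.decode('base64').encode('hex')
# SHA-1 of the plain password; both lines print the same digest
print sha('1234').hexdigest()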

or, for newer versions of Python 2 (2.5 and later), with hashlib:

from hashlib import sha1
print 'cRDtpNCeBiql5KOQsKVyrA0sAiA='.decode('base64').encode('hex')
print sha1('1234').hexdigest()
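
Both snippets above use Python 2 syntax (the print statement and the 'base64' / 'hex' string codecs), which Python 3 no longer supports. A rough Python 3 equivalent of the same check might look like the sketch below; the variable names are just my own choices for illustration.

```python
import base64
import hashlib

# Value copied from the password column in the question
stored = 'cRDtpNCeBiql5KOQsKVyrA0sAiA='

# Base64-decode the stored value and show the raw bytes as hex
digest_from_db = base64.b64decode(stored).hex()

# SHA-1 of the plain password; hashlib needs bytes in Python 3
digest_of_password = hashlib.sha1(b'1234').hexdigest()

print(digest_from_db)
print(digest_of_password)
print(digest_from_db == digest_of_password)
```

If the two hex strings match, the stored value is the Base64 form of the unsalted SHA-1 digest.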

Base64 encodes every 3 bytes as 4 characters. Your string has 27 data characters plus one '=' padding character, so it decodes to 20 bytes (27*3/4, rounded down). When a security-related value is 20 bytes (160 bits) long, it's usually a SHA-1 hash. When it's 16 bytes (128 bits), it's usually MD5.
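
The length arithmetic can be checked with a few lines of Python 3. This is only a sketch: decoded_length is a helper name I made up, and comparing against just MD5 and SHA-1 is a heuristic, since any other 20-byte value would have the same length.

```python
import base64
import hashlib

def decoded_length(b64_text):
    """Bytes encoded by a Base64 string: 3 bytes per 4 data characters."""
    data_chars = len(b64_text.rstrip('='))  # ignore '=' padding
    return data_chars * 3 // 4

encoded = 'cRDtpNCeBiql5KOQsKVyrA0sAiA='
n = decoded_length(encoded)
print(n, 'bytes =', n * 8, 'bits')

# Cross-check against what the decoder actually returns
print(len(base64.b64decode(encoded)))

# Which of the two digests mentioned above has this size?
matches = [name for name in ('md5', 'sha1')
           if hashlib.new(name).digest_size == n]
print(matches)
```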

BTW, it's always a good idea to add a random salt to the mix so that two identical passwords don't stick out in the database. On Linux, the crypt module helps you with that and throws in a few more security measures.
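
To illustrate the salting idea, here is a minimal Python 3 sketch. Note that it does not use the crypt module mentioned above; I swapped in hashlib.pbkdf2_hmac from the standard library because it is available on every platform. The function names, the 16-byte salt, and the iteration count are all assumptions picked for the example, not recommendations from the original software.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative value only

def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                                 salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt_a, hash_a = hash_password('1234')
salt_b, hash_b = hash_password('1234')

print(hash_a != hash_b)                          # same password, different rows
print(verify_password('1234', salt_a, hash_a))   # correct password verifies
```

The salt is stored next to the digest; it is not secret, it only ensures that identical passwords produce different stored values.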

Edit: to answer another comment - it's very easy to get the original from the "encrypted" password. There's a technique that got famous a few years back called Rainbow Tables. There are even online versions of it. Just type in your hash in hex (7110eda4d09e062aa5e4a390b0a572ac0d2c0220) and it'll give you 1234 in a second.
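
To see why an unsalted hash of a short password offers so little protection, here is a small Python 3 sketch. It is not a real rainbow table (those use hash chains to save space); it is just a plain precomputed lookup over every 4-digit string, which is an assumption I chose because the password in the question happens to be 4 digits.

```python
import hashlib

# Hex digest from the question's database value
target = '7110eda4d09e062aa5e4a390b0a572ac0d2c0220'

# Precompute SHA-1 for every 4-digit string: '0000' through '9999'
table = {}
for n in range(10000):
    candidate = '%04d' % n
    table[hashlib.sha1(candidate.encode('ascii')).hexdigest()] = candidate

# Reverse lookup: hash -> original password (None if not in the table)
print(table.get(target))
```

With a per-password random salt, a single precomputed table like this no longer works for every row at once.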