Esad A. Tokat - 3 months ago
C++ Question

Padding of strings while MD5 Hashing

I'm trying to implement the MD5 hash function in C++. For some hexadecimal test inputs that I found on websites, my implementation produces the correct results. However, when I try the same thing with ASCII strings, the hashes are wrong, and I can't work out what else I need to do.

First, I convert the ASCII string into its hexadecimal byte values and append a single 0x80, followed by a run of 0x00 bytes. The last eight bytes hold the length of the original (unpadded) message in hexadecimal.

For example, "test123" is represented in hexadecimal as "0x74, 0x65, 0x73, 0x74, 0x31, 0x32, 0x33", and its length in bytes is 7. As far as I understand it, the byte array to pass to the hash function is then:

const uint8_t test123Array[64] = {
0x74, 0x65, 0x73, 0x74, 0x31, 0x32, 0x33, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x07
};


When I apply the hash function, the result I get is:

e7 54 fa ea 1e d7 69 ba 85 59 62 bf 16 e9 98 48


whereas the result I get from online hash generator websites is:

cc 03 e7 47 a6 af bb cb f8 be 76 68 ac fe be e5

Answer

The length of the data is counted in bits, not bytes, so the length is 56 (0x38), not 7. This length is then encoded as a 64-bit value in little-endian byte order (least significant byte first).

The prepared input should look like this:

const uint8_t test123Array[64] = {
    0x74, 0x65, 0x73, 0x74, 0x31, 0x32, 0x33, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
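To generalize this to messages of any length, the padding rule can be sketched as a small helper. This is only an illustration and not code from the specification: the name `md5_pad` and the use of `std::vector` are my own assumptions. It appends 0x80, then zero bytes until the size is 56 modulo 64, then the bit length as eight little-endian bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption, not from the MD5 spec):
// returns the message followed by MD5 padding, so that the result is a
// whole number of 64-byte blocks.
std::vector<std::uint8_t> md5_pad(const std::string& message) {
    std::vector<std::uint8_t> padded(message.begin(), message.end());

    // Length of the original message in bits, kept as a 64-bit value.
    const std::uint64_t bit_length =
        static_cast<std::uint64_t>(message.size()) * 8;

    // A single 1 bit followed by seven 0 bits.
    padded.push_back(0x80);

    // Zero bytes until the size is congruent to 56 modulo 64.
    while (padded.size() % 64 != 56) {
        padded.push_back(0x00);
    }

    // The bit length, least significant byte first (little-endian).
    for (int i = 0; i < 8; ++i) {
        padded.push_back(static_cast<std::uint8_t>(bit_length >> (8 * i)));
    }
    return padded;
}
```

For "test123" this should yield the 64-byte array shown above, with 0x38 at index 56. A message of 56 bytes or more spills into a second block, because the 0x80 byte and the 8-byte length no longer fit in the first one.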

The specification of MD5 you linked to contains everything you need to know about that. Take a closer look at Section "3.2 Step 2. Append Length" and the Encode function in Section "A.3 md5c.c".
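For orientation, here is a rough sketch of what such an encode step does: it writes each 32-bit word out low-order byte first. This is my own approximation rather than the code from the specification, and the name `encode_le32` and its parameter order are assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the specification's Encode routine: writes
// word_count 32-bit words into output as bytes, low-order byte first.
// output must have room for 4 * word_count bytes.
void encode_le32(std::uint8_t* output, const std::uint32_t* input,
                 std::size_t word_count) {
    for (std::size_t i = 0; i < word_count; ++i) {
        output[4 * i]     = static_cast<std::uint8_t>(input[i] & 0xff);
        output[4 * i + 1] = static_cast<std::uint8_t>((input[i] >> 8) & 0xff);
        output[4 * i + 2] = static_cast<std::uint8_t>((input[i] >> 16) & 0xff);
        output[4 * i + 3] = static_cast<std::uint8_t>((input[i] >> 24) & 0xff);
    }
}
```

The 64-bit bit count is kept as two 32-bit words with the low-order word first, so a count of 56 stored as `{0x38, 0}` encodes to `38 00 00 00 00 00 00 00`. The same byte order applies when the four state words are written out as the final digest.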
