papabiceps - 1 year ago 71

Python Question

This code is an implementation to decode base64 strings. Please explain the bit manipulation part of the code.

`frombase64(s): \\base64_decode`

b64s = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

b64p = "="

ret = ""

s2 = s.replace(b64p, "")

left = 0

for i in range(0, len(s2)):

if left == 0:

left = 6

else:

value1 = b64s.index(s2[i - 1]) & (2 ** left - 1)

value2 = b64s.index(s2[i]) >> (left - 2)

value = (value1 << (8 - left)) | value2

ret += chr(value)

left -= 2

return ret

I'm a bit new to programming.I didn't understand why the operations that are done on the variables value1, value2, value. I want to how those operations are converting base64 encoded string to text. I took this from here.

My question here is "what is happening in the else part of the code?"

Answer Source

If you want to know what code does, have it tell you

```
b64s = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
b64p = "="
ret = ""
s2 = """TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=""".replace(b64p, "")
left = 0
for i in range(0, len(s2)):
if left == 0:
left = 6
else:
value1 = b64s.index(s2[i - 1]) & (2 ** left - 1)
value2 = b64s.index(s2[i]) >> (left - 2)
value = (value1 << (8 - left)) | value2
print(left, chr(value), bin(ord(s2[i-1])).rjust(10), bin(ord(s2[i])).rjust(10), bin(value1).rjust(10), bin(value2).rjust(10), bin(value).rjust(10))
ret += chr(value)
left -= 2
print(ret)
```

NB `(2 ** left - 1)`

could be `((1<<left)-1)`

. Which would a bitmask (left = 3, then 0111 is the resulting binary)

We want to take some rightmost bits from `s2[i-1]`

, combine them with some leftmost bits from `s2[i]`

, of which the remaining bits of `s2[i]`

will be used in the next iteration

In response to your comment.. Each character represents 6 bits instead of 8. So you need to combine 2 characters to represent 8 bits. left -= 2 is because 8-6 is 2. If each character was 4 bits then you'd set left to 4 & subtract 4 making the left == 0 trigger every 2 iterations. Because each pair of characters would map exactly to 1 byte. You're just having a sliding amount of right/left which add up to 8. Every iteration it slides because you need to read each bit only once, so you need to slide the bit index past the bits read. Try to imagine left as a portion of i which indexes the bit value. When left is 0 we need to take all 6 from the current character & 2 from the next, hence why we set left to 6 & continue on to the next character where we'll take 4 from it & 2 from the previous (which we took 6 from last iteration)