Vladislav Ladenkov - 29 days ago
Python Question

Decoding bytes to string in Python

I've got a sequence of bytes:

'\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c'


If I do:

b'\udcd0\udca0\udcd0\udcbe\udcd1'.decode("utf8"),


I receive:

'\\udcd0\\udca0\\udcd0\\udcbe\\udcd1'


I can't decode it because I don't know how it was encoded. At least we can see that it's not utf-8, because the symbols I expect to see have a \x23-like representation. How can I find out the encoding and decode it?

P.S. I expect to see Russian characters there.

Answer

I am able to print your string in this way, but the output is all "invalid characters."

>>> string = u'\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c'
>>> print string
����������������

According to Charbase.com, your first character (u'\udcd0') is an invalid character. So maybe the output is correct.
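
One possible explanation, offered as a guess rather than something the question confirms: code points in the range U+DC80 to U+DCFF are what Python 3 produces when bytes are decoded with the "surrogateescape" error handler, where each undecodable byte 0xNN becomes U+DCNN. If that is how this string was created, encoding it with the same handler should give back the original bytes, which can then be decoded with a candidate encoding. A minimal Python 3 sketch under that assumption, with UTF-8 as the guessed encoding:

```python
# Assumption: the string was produced by Python 3's "surrogateescape" error
# handler, which maps each undecodable byte 0xNN (0x80-0xFF) to U+DCNN.
escaped = '\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c'

# Encoding with the same handler turns every U+DCNN back into the byte 0xNN.
raw = escaped.encode('utf-8', errors='surrogateescape')

# Guess: the recovered bytes are UTF-8. This raises UnicodeDecodeError if not.
text = raw.decode('utf-8')

print(raw)   # b'\xd0\xa0\xd0\xbe\xd1\x81\xd0\xbd\xd0\xb5\xd1\x84\xd1\x82\xd1\x8c'
print(text)  # Роснефть
```

If the UTF-8 guess raised an error for other input, the same raw bytes could be tried against other encodings commonly used for Russian text, such as cp1251 or koi8-r.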
