user2288043 - 1 month ago
Python Question

Unicode to String Python 2

I am trying to convert a plain string containing a \u escape into the corresponding special character so that I can use it in my logic in Python 2.

word = 'Tb\u03b1'
word = unicode('Tb\u03b1')

if word.encode('utf-8') == u'Tb\u03b1'.encode('utf-8'):
    print 'They are equals'

print word.encode('utf-8')
print type(word.encode('utf-8'))
print u'Tb\u03b1'.encode('utf-8')
print type(u'Tb\u03b1'.encode('utf-8'))


I am getting this response

Tb\u03b1
<type 'str'>
Tbα
<type 'str'>


My question is: since I apply unicode() to the word, shouldn't lines 1 and 3 of the output be the same? I would like to get the result shown on line 3, because I need to do some logic based on that special character.

Answer

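The problem is that \u has no special meaning in a non-unicode literal in Python 2, so it stays in your string as a literal backslash followed by u03b1. Calling unicode() on that string only decodes its bytes with the default (ASCII) codec; it does not interpret escapes. To interpret the \u escapes and produce the corresponding Unicode, use the encoding "unicode_escape":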

>>> as_str = "\u03b1"
>>> as_unicode = as_str.decode(encoding="unicode_escape")
>>> print as_unicode
α
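Applied to the word from the question, the same idea might look like the sketch below. The helper name unescape is made up for illustration, and the input is written as b'Tb\\u03b1' (with an escaped backslash) so that it holds the same bytes as the Python 2 literal 'Tb\u03b1' and the snippet also runs unchanged on Python 3. It assumes the input is pure ASCII: "unicode_escape" treats any other byte as Latin-1, so it is not suitable for text that is already UTF-8 encoded.

```python
def unescape(raw_bytes):
    # Hypothetical helper: interpret backslash-u escapes stored literally
    # in a byte string and return the resulting unicode text.
    return raw_bytes.decode('unicode_escape')

# Same 8 bytes as the Python 2 str literal 'Tb\u03b1':
# T, b, backslash, u, 0, 3, b, 1
word = b'Tb\\u03b1'

decoded = unescape(word)

# The decoded value now compares equal to the unicode literal,
# which is the comparison the question was trying to make.
matches = (decoded == u'Tb\u03b1')

# Three characters after decoding: T, b and GREEK SMALL LETTER ALPHA.
decoded_length = len(decoded)
```

After this, decoded.encode('utf-8') gives the same bytes as u'Tb\u03b1'.encode('utf-8'), so the if-test in the question would succeed.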

But you'd be better off if you could figure out a way to avoid being in this situation. Even better, switch to Python 3 where these kinds of things make a lot more sense.