Bender Rodriguez - 4 months ago
Python Question

Python: difference between unicode(variable) and a u'...' literal?

Can someone please tell me how to fix this?

This works:

nOrd = (ord(u'ط'))

But this fails:

s=unicode(s, 'utf-8')
nOrd = (ord((s)))

The error I get is:

TypeError: ord() expected a character, but string of length 2 found


Your second s is simply not the same text as the first example:

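>>> u'ط'
u'\u0637'
>>> u'ط'.encode('utf8')
'\xd8\xb7'
>>> s="‎ط"  # this literal contains an invisible U+200E before the letter
>>> s
'\xe2\x80\x8e\xd8\xb7'
>>> s.decode('utf8')
u'\u200e\u0637'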
You have a U+200E LEFT-TO-RIGHT MARK character in the second example. That makes it two characters, not one.
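Since the extra character is invisible, it can help to list the code point and Unicode name of every character in the string. The following is a minimal sketch, assuming the input arrives as the UTF-8 bytes of the failing example; describe() is a hypothetical helper name, not part of any library:

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [('U+%04X' % ord(ch), unicodedata.name(ch, 'UNNAMED'))
            for ch in text]


# Assumed input: the UTF-8 bytes of the failing string from the question.
raw = b'\xe2\x80\x8e\xd8\xb7'
text = raw.decode('utf-8')

for code, name in describe(text):
    print('%s %s' % (code, name))
```

This should list two entries, the mark and the Arabic letter, which is why ord() complains about a string of length 2.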

Once s has been decoded to unicode, you can remove the mark with lstrip() or replace(); the first only removes it from the start, the second from everywhere in the string:

s = s.lstrip(u'\u200e')
# or
s = s.replace(u'\u200e', u'')
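If other invisible marks might show up as well (for example U+200F RIGHT-TO-LEFT MARK), one option is to drop every character in the Unicode "Cf" (format) category before calling ord(). This is only a sketch under that assumption; strip_format_chars() is a hypothetical helper, and it also removes zero-width joiners, which some scripts rely on, so it may be too aggressive for general text:

```python
import unicodedata


def strip_format_chars(text):
    """Return text without characters in Unicode category Cf (format)."""
    return u''.join(ch for ch in text if unicodedata.category(ch) != 'Cf')


# Assumed input: the UTF-8 bytes of the failing string from the question.
raw = b'\xe2\x80\x8e\xd8\xb7'
cleaned = strip_format_chars(raw.decode('utf-8'))

# Exactly one character is left, so ord() accepts it.
n_ord = ord(cleaned)
print(hex(n_ord))
```

For the single known mark, the lstrip() or replace() call above is simpler and touches nothing else.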