Jim K - 4 months ago
Python Question

Raw unicode literal that is valid in Python 2 and Python 3?

Apparently the ur"" syntax has been disabled in Python 3. However, I need it! "Why?", you may ask. Well, I need the u prefix because it is a unicode string and my code needs to work on Python 2. As for the r prefix, maybe it's not essential, but the markup format I'm using requires a lot of backslashes and it would help avoid mistakes.

Here is an example that does what I want in Python 2 but is illegal in Python 3:

tamil_letter_ma = u"\u0bae"
marked_text = ur"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma


After coming across this problem, I found http://bugs.python.org/issue15096 and noticed this quote:


It's easy to overcome the limitation.


Would anyone care to offer an idea about how?

Related: What exactly do "u" and "r" string flags do in Python, and what are raw string literals?

Answer

Why don't you just use a raw string literal (r'....')? You don't need to specify u, because in Python 3 all strings are unicode strings.

>>> tamil_letter_ma = "\u0bae"
>>> marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma
>>> marked_text
'\\aம\\bthe Tamil\\cletter\\dMa\\e'

To make it also work in Python 2.x, add the following future import statement at the very beginning of your source code, so that the string literals in the source code become unicode.

from __future__ import unicode_literals
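Putting the two pieces together, a minimal sketch of a module that should behave the same on Python 2.7 and Python 3 might look like this. The name text_type is a hypothetical helper introduced only for the type check; it is not part of the original question or answer.

```python
from __future__ import unicode_literals  # must come before any other statement

import sys

# text_type is a hypothetical helper: `unicode` on Python 2, `str` on Python 3.
# Only the selected branch is evaluated, so `unicode` is never looked up on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# With unicode_literals, a plain literal is a unicode string on both versions.
tamil_letter_ma = "\u0bae"

# The r prefix keeps the backslashes of the markup format literal.
marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma

print(isinstance(marked_text, text_type))
print(marked_text.startswith("\\a"))
```

One caveat to keep in mind: on Python 2, a raw unicode literal still interprets \uXXXX and \UXXXXXXXX escapes, whereas a Python 3 raw string leaves them untouched. The example above is unaffected because the markup contains no \u sequences, but a raw literal that does contain one would produce different strings on the two versions.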