jlconlin jlconlin -3 years ago 66
Python Question

How to do string formatting with unicode emdash?

I am trying to do string formatting with a unicode variable. For example:

>>> x = u"Some text—with an emdash."
>>> x
u'Some text\u2014with an emdash.'
>>> print(x)
Some text—with an emdash.
>>> s = "{}".format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 9: ordinal not in range(128)

>>> t = "%s" %x
>>> t
u'Some text\u2014with an emdash.'
>>> print(t)
Some text—with an emdash.


You can see that I have a unicode string and that it prints just fine. The trouble is when I use Python's new (and improved?) format() function. If I use the old style (using %s) everything works out fine, but when I use {} and the format() function, it fails.

Any ideas of why this is happening? I am using Python 2.7.2.

Answer Source

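The new format() is not as forgiving when you mix byte strings and unicode strings. In Python 2, "{}" is a byte string, so "{}".format(x) has to turn x into a byte string too, and it does that with the default ASCII codec, which cannot encode the emdash. The old "%s" % x works because the % operator promotes the whole result to unicode when one of its arguments is unicode. Make the format string itself unicode and the problem goes away, so try this: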

s = u"{}".format(x)
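
For completeness, here is a small sketch of the fix next to the old-style formatting from the question. The variable names s, t, and x mirror the question; encoded is my own addition to illustrate converting to bytes explicitly when you need them (for a file or socket, say), and UTF-8 is an assumed target encoding. It uses u"" prefixes and a \u2014 escape so it should run unchanged on Python 2.7 and on Python 3.3+.

```python
# x is a unicode string; \u2014 is the emdash from the question.
x = u"Some text\u2014with an emdash."

# Unicode template + unicode argument: no implicit ASCII encode happens.
s = u"{}".format(x)

# Old-style formatting, shown for comparison; the result is unicode as well.
t = u"%s" % x

# Encode explicitly, and only at the point where bytes are really needed.
# (UTF-8 is an assumption here; use whatever your output expects.)
encoded = s.encode("utf-8")

print(s == t == x)
```

The general habit this suggests is to keep text as unicode everywhere inside the program and to encode only at the boundary, instead of letting Python 2 pick the ASCII codec for you.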