Hariom Singh - 1 year ago
Python Question

How to strip unicode in a list

I want to strip the unicode marker from the strings in a list, for example this list of airport codes:

[u'KATL',u'KCID']


Expected output:

[KATL,KCID]


I followed the question linked below:

Strip all the elements of a string list

I tried one of the solutions:

>>> my_list = ['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']
>>> map(str.strip, my_list)
['this', 'is', 'a', 'list', 'of', 'words']


but got the following error:

TypeError: descriptor 'strip' requires a 'str' object but received a 'unicode'

Answer

First, I strongly suggest you switch to Python 3, which treats Unicode strings as first-class citizens (all strings are Unicode strings, but they are called str).

But if you have to make it work in Python 2, you can strip unicode strings with unicode.strip (if your strings are true Unicode strings):

>>> lst = [u'KATL\n', u'KCID\n']
>>> map(unicode.strip, lst)
[u'KATL', u'KCID']
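
As a hedged sketch of an alternative, a list comprehension that calls strip() on each element avoids naming the type-specific method (str.strip versus unicode.strip), so the same line should work on both Python 2 and Python 3. The variable names here are illustrative only:

```python
# Illustrative data: airport codes with trailing newlines.
lst = [u'KATL\n', u'KCID\n']

# Calling the method on each element works whether the elements are
# unicode (Python 2) or str (Python 3), unlike map(str.strip, ...).
stripped = [s.strip() for s in lst]

# The u'' prefix is only part of the repr; joining shows the plain text.
print(', '.join(stripped))  # prints: KATL, KCID
```

Note that the u prefix appears only when the list itself is displayed; the strings contain just the four letters.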

If your unicode strings are limited to the ASCII subset, you can convert them to str with:

>>> lst = [u'KATL', u'KCID']
>>> map(str, lst)
['KATL', 'KCID']

Note that this conversion will fail for non-ASCII strings. To encode Unicode code points as a str (a string of bytes), you have to choose an encoding (usually UTF-8) and call the .encode() method on each string:

>>> lst = [u'KATL', u'KCID']
>>> map(lambda x: x.encode('utf-8'), lst)
['KATL', 'KCID']
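
To illustrate why the choice of encoding matters, here is a small sketch using a made-up non-ASCII value (u'Z\xfcrich' is an assumption for illustration, not data from the question). Encoding to ASCII, which is what str() attempts by default in Python 2, raises UnicodeEncodeError, while UTF-8 succeeds:

```python
# Hypothetical non-ASCII value, chosen only for illustration.
name = u'Z\xfcrich'

try:
    name.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    # U+00FC (u with diaeresis) has no ASCII representation.
    ascii_ok = False

# UTF-8 can represent this character; U+00FC becomes two bytes,
# so the encoded result is one byte longer than the text.
encoded = name.encode('utf-8')
```

Here the text has 6 characters and the UTF-8 result has 7 bytes, which is why the encoded form should be treated as bytes rather than as text.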