minerals minerals -4 years ago 110
Python Question

Python and Turkish capitalization

I have not found a good description of how to handle this problem on Windows, so I am asking here.

There are two letters in Turkish, ı (I) and i (İ), which are incorrectly handled by Python.

>>> [char for char in 'Mayıs']
['M', 'a', 'y', 'i', 's']

>>> 'ı'.upper().lower()
'i'


How it should be, given the locale is correct:

>>> [char for char in 'Mayıs']
['M', 'a', 'y', 'ı', 's']

>>> 'ı'.upper().lower()
'ı'


and

>>> 'i'.upper()
'İ'

>>> 'ı'.upper()
'I'


I tried locale.setlocale(locale.LC_ALL, 'Turkish_Turkey.1254') or even 'ı'.encode('cp857'), but it didn't help.

How do I make Python handle these two letters correctly?

Answer Source

You should use PyICU, which applies locale-specific case mappings (shown here for Python 3; on Python 2, use unicode() instead of str()):

>>> from icu import UnicodeString, Locale
>>> tr = Locale("TR")
>>> s = UnicodeString("i")
>>> print(str(s.toUpper(tr)))
İ
>>> s = UnicodeString("I")
>>> print(str(s.toLower(tr)))
ı
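
If installing PyICU is not an option, a rough pure-Python fallback is to remap the two Turkish-specific pairs before calling the built-in case methods. This is only a sketch, assuming Python 3: the helper names tr_upper and tr_lower are hypothetical (they are not part of the standard library or PyICU), and the approach ignores decomposed forms such as "I" followed by a combining dot.

```python
# Hypothetical helpers: not part of the standard library or PyICU.
# Remap only the two Turkish-specific pairs, then defer to str.upper/str.lower,
# which use locale-independent Unicode case mappings in Python 3.
_TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})
_TR_LOWER = str.maketrans({"İ": "i", "I": "ı"})


def tr_upper(text: str) -> str:
    """Uppercase text, treating i/ı according to Turkish rules."""
    return text.translate(_TR_UPPER).upper()


def tr_lower(text: str) -> str:
    """Lowercase text, treating İ/I according to Turkish rules."""
    return text.translate(_TR_LOWER).lower()


print(tr_upper("i"))      # İ
print(tr_lower("I"))      # ı
print(tr_upper("Mayıs"))  # MAYIS
print(tr_lower("MAYIS"))  # mayıs
```

The translation step runs first so that the built-in methods never see a plain "i" or "I"; every other character still goes through the normal Unicode case mapping. For full locale-aware behavior, PyICU remains the more complete option.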