Bin Bin - 4 months ago 10
Python Question

Check whether a string is alphabetical for languages other than English

I am working with text data that mixes several languages, and I am trying to test whether a token/string is alphabetical, meaning it is potentially a word.
Is there a built-in function like

'somestring'.isalpha()

to test whether a string is alphabetical in other languages (Portuguese and Spanish)? I tried 'ó'.isalpha(), which returns False.

My current idea is to look at the Unicode table, find the first and last letter of each alphabet, and test whether each character falls within that range.

Answer

Will this solve your problem? In Python 2, a plain 'ó' is a byte string, so test a unicode literal instead:

>>> u'é'.isalpha()
True

Just as an FYI, the example below works as-is in Python 3, where every str is Unicode:

words = ['você', 'quer', 'uma', 'maçã']
for word in words:
    # each word is reported as True
    print(word, word.isalpha())
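
For mixed-language tokens, the same check can be wrapped in a small filter. This is only a sketch for Python 3: the helper name is_word and the sample tokens are made up for illustration. It also normalizes to NFC first, on the assumption that some input may contain decomposed accents (a base letter followed by a combining mark), which isalpha() would otherwise reject.

```python
import unicodedata


def is_word(token):
    """Return True if every character of token is a letter (hypothetical helper)."""
    # Compose "e" + U+0301 into a single "é" so combining marks do not fail the test.
    normalized = unicodedata.normalize("NFC", token)
    # str.isalpha() is False for the empty string and for digits or punctuation.
    return normalized.isalpha()


# Illustrative tokens: Portuguese, Spanish, a number, punctuation, a decomposed accent.
tokens = ["você", "maçã", "niño", "2024", "olá!", "cafe\u0301"]
words = [t for t in tokens if is_word(t)]
print(words)
```

The list comprehension keeps the original tokens, so the decomposed form of "café" is returned unchanged; only the test runs on the normalized copy.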

In Python 2, you could decode the byte strings first:

for word in words:
    # assumes each word is a UTF-8 encoded byte string
    print(unicode(word, "utf-8").isalpha())
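
If you still want the Unicode-table approach from the question, you probably do not need to write the ranges by hand: the standard unicodedata module exposes each character's general category, and the letter categories all start with "L". The sketch below is for Python 3; is_letter and is_alphabetical are illustrative names, not built-ins. As far as I know it should agree with str.isalpha(), so treat it as a way to see what that method checks rather than a replacement.

```python
import unicodedata


def is_letter(ch):
    # General categories Lu, Ll, Lt, Lm and Lo are the Unicode "Letter" classes.
    return unicodedata.category(ch).startswith("L")


def is_alphabetical(token):
    # A non-empty token counts as alphabetical only if every character is a letter.
    return bool(token) and all(is_letter(ch) for ch in token)


for token in ["maçã", "señor", "abc123", "¿qué?"]:
    print(token, is_alphabetical(token))
```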