I am working with text data that mixes several languages. I am trying to test whether a token/string is alphabetic, meaning it is potentially a word.
Is there some built-in function like
Will this solve your problem?
>>> u'é'.isalpha()
True
Just as an FYI, the below example works perfectly in Python 3:
words = ['você', 'quer', 'uma', 'maçã']
for word in words:
    print(word.isalpha())  # prints True for each word
In Python 2, you could do something like:
for word in words:
    # decode the UTF-8 byte string to unicode before testing it
    print(unicode(word, "utf-8").isalpha())
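One caveat that may matter for mixed-language text: in Python 3, isalpha() only counts characters in the Unicode letter categories, so a word whose accent is stored as a separate combining mark (for example 'e' followed by U+0301) comes back False. Here is a minimal sketch that normalizes to NFC first. The helper name is_alphabetic is just something I made up for illustration, and it will not help for scripts where combining marks have no precomposed form.

```python
import unicodedata

def is_alphabetic(token):
    """Return True if the token is non-empty and all letters after NFC normalization."""
    # NFC merges a base letter and a combining mark into one code point
    # when a precomposed character exists (e.g. 'e' + U+0301 -> 'é').
    normalized = unicodedata.normalize("NFC", token)
    return normalized.isalpha()

# "cafe\u0301" is 'cafe' followed by a combining acute accent
words = ["você", "maçã", "cafe\u0301", "abc123"]
for word in words:
    print(repr(word), is_alphabetic(word))
```

Without the normalization step, "cafe\u0301".isalpha() is False even though it displays as café.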