sawa sawa - 7 months ago 34
Ruby Question

Character classes used in ffi-aspell

I am trying to use the ffi-aspell gem to spell check a text. To do that, it seems I have to extract the words myself. I am trying to do so by applying String#scan to the text with a regex, but it is not straightforward.

What is the easiest way to define the class of characters that may appear in an ffi-aspell dictionary for a given language? I want this to work not only for English, so things like /[a-zA-Z']/ for a character (or /[a-zA-Z']+/ for a word) do not work. /[[:word:]]/ seems to capture characters that are not in the dictionary, such as numerals, and it also does not match the apostrophe (single quote), which frequently appears within a word. Is there documentation that defines the character set used in an ffi-aspell dictionary?
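
To illustrate the two problems, here is a minimal check in plain Ruby. The sample string is made up for the demonstration and nothing here touches ffi-aspell itself:

```ruby
# Made-up sample containing an apostrophe, a numeral and a non-ASCII letter.
SAMPLE = "don't 42 café"

# ASCII-only class: keeps the apostrophe but cuts the word at "é".
ASCII_WORDS = SAMPLE.scan(/[a-zA-Z']+/)
# => ["don't", "caf"]

# POSIX word class: handles "é" but splits at the apostrophe and keeps digits.
POSIX_WORDS = SAMPLE.scan(/[[:word:]]+/)
# => ["don", "t", "42", "café"]
```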

Answer

I guess it would be easier to first scan the ffi-aspell dictionary for its entries, collect the unique characters they use, and then combine them into a single pattern with Regexp.union.
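
A rough sketch of that idea follows. It assumes you already have the dictionary entries as an array of strings. How you obtain them is left open: the aspell command line has a `dump master` mode that may help, but I have not checked whether ffi-aspell itself exposes the word list. The names `word_regex_from`, `extract_words` and `SAMPLE_ENTRIES` are made up for illustration.

```ruby
# Hypothetical helper: derive a word-matching regex from dictionary entries
# by collecting every distinct character that occurs in them.
def word_regex_from(entries)
  chars = entries.flat_map(&:chars).uniq
  # Regexp.union escapes each string and joins them as alternatives.
  char_class = Regexp.union(chars)
  # One or more dictionary characters in a row form a candidate word.
  /(?:#{char_class})+/
end

# Stand-in for a real dictionary dump (assumption: entries are UTF-8 strings).
SAMPLE_ENTRIES = ["don't", "café", "naïve", "hello"]

# Hypothetical helper: pull candidate words out of a text.
def extract_words(text, regex = word_regex_from(SAMPLE_ENTRIES))
  text.scan(regex)
end

extract_words("don't hello 42")
# => ["don't", "hello"]
```

Because the pattern is built only from characters seen in the entries, numerals are skipped unless the dictionary contains them, and the apostrophe is included whenever some entry uses it. With a full dictionary the alternation will be long, so it is worth building the regex once and reusing it.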