notif notif - 4 months ago 14
Ruby Question

How to match both accented and unaccented Latin characters using a normalised version of a string?

How can I search for

Bartlomiej Zolc
and also match the same string written with Polish diacritics?
My current regex matches only the unaccented (English) version.

regex101

/Bartlomiej Zolc/g

hello Bartłomiej Żółć match me!
hello Bartlomiej Zolc match me too!

Answer

It may be prohibitively hard to normalize the thing you match against, so I recommend changing the regex.
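To illustrate why normalizing the text is less straightforward than it sounds, here is a minimal sketch that assumes Ruby 2.4+ and UTF-8 strings. The helper names `strip_accents` and `fold_polish` are hypothetical, not part of any library. Unicode NFD decomposition splits ó, ć and Ż into a base letter plus a combining mark, but ł has no canonical decomposition, so it has to be mapped by hand:

```ruby
# Hypothetical helper: decompose accented letters (NFD) and drop the
# combining marks, e.g. "ó" becomes "o" + U+0301, then just "o".
def strip_accents(text)
  decomposed = text.unicode_normalize(:nfd)
  decomposed.gsub(/\p{Mn}/, "")
end

# Hypothetical helper: "ł"/"Ł" survive NFD unchanged, so they need an
# explicit mapping on top of strip_accents.
def fold_polish(text)
  strip_accents(text).tr("łŁ", "lL")
end

haystack = "hello Bartłomiej Żółć match me!"
puts strip_accents(haystack)  # the "ł" characters are still there
puts fold_polish(haystack)
puts fold_polish(haystack).match?(/Bartlomiej Zolc/)
```

Every letter like ł needs its own entry, and any match offsets refer to the folded copy rather than the original text, which is part of why changing the regex can be the simpler route.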

I don't know whether Ruby supports the POSIX equivalence-class syntax [[=o=]] (which matches o and all its accented versions), but there is another way.

Replace every letter that has an accented alternative with a character class containing both forms. For example:

/Bart[lł]omiej [ZŻ][oó][lł][cć]/g
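The character-class pattern above can also be generated from the plain query instead of written by hand. The following is a rough sketch, not a definitive implementation: the `VARIANTS` table and the `accent_insensitive_regex` helper are hypothetical names, and the table covers only Polish letters as an assumption. Ruby regex literals have no `g` flag (that is regex101 notation), so `String#scan` is used to collect all matches.

```ruby
# Hypothetical mapping: base letter => every form that should match.
# Only Polish diacritics are listed; extend it for other languages.
VARIANTS = {
  "a" => "aą", "c" => "cć", "e" => "eę", "l" => "lł", "n" => "nń",
  "o" => "oó", "s" => "sś", "z" => "zźż",
  "A" => "AĄ", "C" => "CĆ", "E" => "EĘ", "L" => "LŁ", "N" => "NŃ",
  "O" => "OÓ", "S" => "SŚ", "Z" => "ZŹŻ"
}.freeze

# Builds a Regexp in which each letter with accented variants becomes
# a character class; every other character is escaped literally.
def accent_insensitive_regex(query)
  source = query.each_char.map do |ch|
    VARIANTS.key?(ch) ? "[#{VARIANTS[ch]}]" : Regexp.escape(ch)
  end.join
  Regexp.new(source)
end

pattern = accent_insensitive_regex("Bartlomiej Zolc")

text = "hello Bartłomiej Żółć match me!\n" \
       "hello Bartlomiej Zolc match me too!"

# scan returns every match, the Ruby counterpart of the /g flag.
text.scan(pattern).each { |match| puts match }
```

Note that the generated pattern is slightly broader than the handwritten one, since it also turns letters such as a and e into classes.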