Adam Adam - 1 month ago 11
JavaScript Question

How to match all 4-byte UTF-8 characters in JavaScript?

I've tried a lot of variations, like /[\u0FFF-\uFFFF]/, but none of them worked as I expected.

The reason I ask is that the MySQL version I use doesn't support these characters and truncates strings at an emoticon or similar character. Upgrading MySQL to a newer version is not an option at the moment.

Answer

According to this, code points U+10000 to U+10FFFF are encoded with 4 bytes.
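
To make the byte counts concrete, here is a small sketch that encodes a few sample characters as UTF-8 and prints how many bytes each one needs. The helper name utf8ByteLength and the sample characters are my own choices for illustration, and it assumes a runtime where TextEncoder is available as a global (current browsers and recent Node versions).

```javascript
// Hypothetical helper: number of bytes a string occupies when encoded as UTF-8.
// TextEncoder always produces UTF-8 output.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength('a'));          // U+0061  -> 1
console.log(utf8ByteLength('\u00E9'));     // U+00E9  -> 2
console.log(utf8ByteLength('\u20AC'));     // U+20AC  -> 3
console.log(utf8ByteLength('\u{1F600}'));  // U+1F600 -> 4
```

This also suggests why a range like /[\u0FFF-\uFFFF]/ does not do what the question wants: without the u flag it matches individual UTF-16 code units, so it catches ordinary 3-byte characters such as U+20AC as well.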

With a recent enough Node version (v6; perhaps v5 as well, but I didn't test it), you can use those code points in a regular expression like this (note the u flag):

const str = 'hello world
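
A minimal sketch of the approach described in this answer, assuming a sample string with one emoji in it. The names fourByteChars, stripFourByteChars, and surrogatePairs are hypothetical, and stripping the characters (rather than replacing or escaping them) is just one possible way to handle them before writing to MySQL.

```javascript
// Matches any code point from U+10000 to U+10FFFF, i.e. the ones UTF-8 encodes with 4 bytes.
// The u flag makes the regex work on code points instead of UTF-16 code units.
const fourByteChars = /[\u{10000}-\u{10FFFF}]/gu;

// Hypothetical helper: drop every 4-byte character from the input.
function stripFourByteChars(input) {
  return input.replace(fourByteChars, '');
}

// Assumed sample input: U+1F600 is an emoji outside the Basic Multilingual Plane.
const sample = 'hello\u{1F600} world';
console.log(stripFourByteChars(sample)); // hello world

// Fallback for engines without the u flag: the same code points appear in a
// JavaScript string as a high surrogate followed by a low surrogate.
const surrogatePairs = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
console.log(sample.replace(surrogatePairs, '')); // hello world
```

Characters that need 3 bytes or fewer, such as U+20AC, are left untouched by both patterns.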