Adam Adam - 1 year ago 142
Javascript Question

How to match all 4byte UTF-8 characters in JavaScript?

I've tried a lot of variations, like

, but it never worked for me as I expected.

I ask because the MySQL version I use doesn't support these characters and cuts strings off wherever there is an emoticon or something similar. Upgrading MySQL to a newer version is not an option at the moment.

Answer Source

In UTF-8, code points U+10000 to U+10FFFF are the ones encoded with 4 bytes.
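
As a quick sanity check of that range, here is a small sketch that assumes Node (it relies on `Buffer.byteLength`). The two sample characters are my own picks for illustration: one just below the range, one inside it.

```javascript
// Buffer.byteLength reports how many bytes a string occupies in an encoding.
const bmpChar = '\u20AC';       // U+20AC (euro sign), below U+10000
const astralChar = '\u{1F600}'; // U+1F600 (an emoji), inside U+10000..U+10FFFF

const bmpBytes = Buffer.byteLength(bmpChar, 'utf8');
const astralBytes = Buffer.byteLength(astralChar, 'utf8');

console.log(bmpBytes);    // 3
console.log(astralBytes); // 4

// In a JavaScript string the 4-byte character is stored as a surrogate pair,
// so its .length is 2 even though it is a single code point.
console.log(astralChar.length); // 2
```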

With a recent enough Node version (v6, perhaps v5 as well but I didn't test), you can use those in a regular expression like this (notice the u flag):

const str = 'hello world
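
A minimal sketch of that approach, assuming the goal is to strip the characters before the string reaches MySQL. The sample string and the `stripFourByteChars` helper name are illustrative, not from the original answer.

```javascript
// Matches every code point from U+10000 to U+10FFFF, i.e. everything that
// UTF-8 encodes with 4 bytes. The `u` flag is required for \u{...} escapes
// and makes the regex treat each astral code point as a single unit.
const FOUR_BYTE_CHARS = /[\u{10000}-\u{10FFFF}]/gu;

// Equivalent pattern for engines without the `u` flag: in a JavaScript
// string these code points are stored as a high + low surrogate pair.
const FOUR_BYTE_CHARS_LEGACY = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;

// Hypothetical helper: remove all 4-byte characters from a string.
function stripFourByteChars(input) {
  return input.replace(FOUR_BYTE_CHARS, '');
}

// Illustrative input; \u{1F600} is an emoji above U+FFFF.
const sample = 'hello \u{1F600} world';
const cleaned = stripFourByteChars(sample);
const cleanedLegacy = sample.replace(FOUR_BYTE_CHARS_LEGACY, '');

console.log(cleaned);       // 'hello  world' (the emoji is gone, both spaces remain)
console.log(cleanedLegacy); // same result without the `u` flag
```

Passing a different replacement string, for example `'\uFFFD'`, keeps a visible placeholder instead of silently dropping the character.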