4EverLive - 1 month ago
Javascript Question

Why does JavaScript string.split(/[ -_]+/) act as if quotes (' and ") are included in the [group]?

When using

/[ -_]+/
as the parameter to string.split in JavaScript, it acts as if it were
/['\"]+/


"a'b".split(/[ -_]+/)
'a"b'.split(/[ -_]+/)


returns

["a", "b"]


I only see this behavior with the exact character class:
[ -_]
i.e. space, hyphen, and underscore. If I remove any of these three characters, it behaves (as far as I can tell) correctly and does not split on ' and ".

Is this behavior correct?

Answer

Inside a character class, - has special meaning: it denotes a range of characters. Here the range runs from space (ASCII 32) to underscore (ASCII 95). The ASCII codes for ' and " are 39 and 34 respectively, so both fall within that range.
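
To check the range for yourself, here is a minimal sketch that lists every character /[ -_]/ matches and confirms that both quotes are among them. The variable names are only for illustration.

```javascript
// Character codes of the range endpoints and of the two quote characters.
const lo = " ".charCodeAt(0); // 32
const hi = "_".charCodeAt(0); // 95
const quoteCodes = ["'", '"'].map((c) => c.charCodeAt(0)); // [39, 34]

// Build a string of every character covered by the range space..underscore.
let matched = "";
for (let code = lo; code <= hi; code++) {
  matched += String.fromCharCode(code);
}

// Both quotes sit inside the range, so the class matches them.
const quotesInRange = quoteCodes.every((code) => code >= lo && code <= hi);

// Lowercase letters start at 97, so they are outside the range.
const lowercaseMatches = /[ -_]/.test("a");

console.log(matched);
console.log(quotesInRange, lowercaseMatches);
```

The printed string also contains the digits, the uppercase letters, and most punctuation, so the pattern splits on far more than the two quote characters.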

Escape it if you want to split on a literal -:

"a'b".split(/[ \-_]+/)

Or make the hyphen the first character of the character class, where it is treated as a literal:

"a'b".split(/[- _]+/)
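
As a quick comparison, the sketch below runs the original pattern and both fixes against one sample string. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample containing a quote, a space, a hyphen, and an underscore.
const sample = "a'b c-d_e";

// Original pattern: the hyphen forms a range from space to underscore.
const withRange = sample.split(/[ -_]+/);

// Fix 1: escape the hyphen so it is a literal.
const withEscaped = sample.split(/[ \-_]+/);

// Fix 2: put the hyphen first so it cannot form a range.
const withHyphenFirst = sample.split(/[- _]+/);

console.log(withRange);       // ["a", "b", "c", "d", "e"]
console.log(withEscaped);     // ["a'b", "c", "d", "e"]
console.log(withHyphenFirst); // ["a'b", "c", "d", "e"]
```

Only the original pattern splits on the quote; both fixes keep "a'b" intact.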