Blaise - 7 days ago
JavaScript Question

Split a string by whitespace, keeping quoted segments, allowing escaped quotes

I currently have this regular expression to split strings by all whitespace, unless it's in a quoted segment:

let keywords = 'pop rock "hard rock"';
keywords = keywords.match(/\w+|"[^"]+"/g);
console.log(keywords); // ['pop', 'rock', '"hard rock"']

However, I also want it to be possible to have quotes in keywords, like this:

keywords = 'pop rock "hard rock" "\"dream\" pop"';

This should return

[pop, rock, "hard rock", "\"dream\" pop"]

What's the easiest way to achieve this?


You can change your regex to:

keywords = keywords.match(/\w+|"(?:\\"|[^"])+"/g);

Instead of [^"]+, the quoted part is now (?:\\"|[^"])+, which allows either an escaped quote (\") or any character other than a quote inside the segment, so an escaped quote no longer ends it.
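As a quick check, here is a minimal sketch of that regex run against the example from the question. The variable names `input` and `pattern` are only illustrative, and the backslashes are doubled in the string literal so that they actually reach the regex.

```javascript
// Doubled backslashes in the literal produce single backslashes in the string.
const input = 'pop rock "hard rock" "\\"dream\\" pop"';

// \w+             : a bare word
// "(?:\\"|[^"])+" : a quoted segment in which \" does not close the segment
const pattern = /\w+|"(?:\\"|[^"])+"/g;

const matches = input.match(pattern);
console.log(matches.length); // 4
matches.forEach((m) => console.log(m));
// pop
// rock
// "hard rock"
// "\"dream\" pop"
```

Note that the surrounding quotes stay part of each quoted match; strip them afterwards if only the keyword text is needed.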

One important note: for the string itself to contain a literal backslash, the backslash has to be escaped in the JavaScript string literal:

keywords = 'pop rock "hard rock" "\\"dream\\" pop"'; // note the escaped backslashes
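To see why the doubling matters, the sketch below compares the two literals. The names `single`, `doubled` and `raw` are illustrative only; `String.raw` is shown as one possible alternative to doubling, not something the answer above relies on.

```javascript
// In a normal string literal, \" is just an escaped quote: the backslash is dropped.
const single = '"\"dream\" pop"';
// \\ produces one real backslash in the resulting string.
const doubled = '"\\"dream\\" pop"';
// String.raw keeps the backslashes exactly as typed in the template literal.
const raw = String.raw`"\"dream\" pop"`;

console.log(single);          // ""dream" pop"
console.log(doubled);         // "\"dream\" pop"
console.log(raw === doubled); // true
```

If the keywords come from user input (a form field, for instance) rather than from a literal in the source code, the backslashes are already part of the string and no doubling is needed.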

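Also, there's a slight inconsistency between \w+ and [^"]+: the pattern matches "ab*d" as a single quoted keyword, but splits the unquoted ab*d into ab and d, because * is not a word character. Consider using [^"\s]+ instead of \w+; it matches any run of characters that are neither whitespace nor quotes.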
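The difference can be illustrated with a small comparison. This is only a sketch; `sample`, `withWord` and `withNonSpace` are made-up names for the example.

```javascript
const sample = 'ab*d "ab*d"';

// Original first alternative: \w+ stops at *, so the unquoted keyword is split.
const withWord = sample.match(/\w+|"(?:\\"|[^"])+"/g);
console.log(withWord); // [ 'ab', 'd', '"ab*d"' ]

// Suggested first alternative: any run of characters that are not quotes or whitespace.
const withNonSpace = sample.match(/[^"\s]+|"(?:\\"|[^"])+"/g);
console.log(withNonSpace); // [ 'ab*d', '"ab*d"' ]
```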