Peter Peter - 4 months ago 9
Javascript Question

Regex group matches with space

I have a simple problem with a regex, but I have no idea how to solve it. I have a string (each label is the part before a colon; the labels were highlighted in grey):


cccc:ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg


and this regex:

/(aaa aa|bbbb|cccc)+:([\sa-zA-Z]*)(?:$|\s)/ig


https://regex101.com/r/mR3vK5/1

After parsing the string, the label 'aaa aa' is ignored: because it contains a space, part of it is taken into the second group of the previous match. I want the first group to capture the label (with or without whitespace), followed by the colon, and the second group to capture everything after the colon (spaces included) up to the next label or the end of the line.
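To make the problem concrete, here is a small sketch that runs the regex above against the sample string and collects both groups of every match. It assumes the input is a single line of label:value pairs; the variable names are only illustrative.

```javascript
// The regex from the question, unchanged.
var rx = /(aaa aa|bbbb|cccc)+:([\sa-zA-Z]*)(?:$|\s)/ig;
// Assumption: the sample input is one line of label:value pairs.
var s = "cccc:ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg";

var found = [];
var m;
while ((m = rx.exec(s)) !== null) {
    // m[1] is the label group, m[2] is the value group.
    found.push([m[1], m[2]]);
}
console.log(found);
// [["cccc", "ddddd"], ["bbbb", "fgggg aaa"], ["cccc", "ggggggg"]]
```

The greedy character class in the second group swallows "aaa" as part of the value of bbbb, so the pair aaa aa:ddd ddd never shows up as a match of its own.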

Any suggestions?

Answer

If you know all the keys, you can use them inside a positive lookahead and match the values with a lazy dot:

/(aaa aa|bbbb|cccc):(.*?)(?=$|\s+(?:aaa aa|bbbb|cccc))/gi

See the JS demo:

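var block = "aaa aa|bbbb|cccc"; // the known labels, as a regex alternation
var rx = new RegExp("(" + block + "):(.*?)(?=$|\\s+(?:" + block + "))", "ig");
var s = "cccc:ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg";
var m; // declared explicitly so it does not become an implicit global
while ((m = rx.exec(s)) !== null) {
    // m[1] is the label, m[2] is the value
    document.body.innerHTML += m[1] + ": " + m[2] + "<br/>";
}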
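If the labels come from a list rather than a hand-written alternation, the same pattern can be built programmatically. The following is only a sketch under assumptions: `escapeRegExp` and `parsePairs` are hypothetical helper names, and sorting longer labels first is an extra precaution that the answer itself does not require.

```javascript
// Hypothetical helper: escape regex metacharacters in a label.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split `input` into [label, value] pairs.
function parsePairs(input, labels) {
    // Longer labels first, so a label that is a prefix of another
    // one does not win the alternation.
    var block = labels
        .slice()
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp)
        .join("|");
    var rx = new RegExp("(" + block + "):(.*?)(?=$|\\s+(?:" + block + "))", "gi");
    var pairs = [];
    var m;
    while ((m = rx.exec(input)) !== null) {
        pairs.push([m[1], m[2]]);
    }
    return pairs;
}

var pairs = parsePairs(
    "cccc:ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg",
    ["aaa aa", "bbbb", "cccc"]
);
console.log(pairs);
// [["cccc", "ddddd"], ["bbbb", "fgggg"], ["aaa aa", "ddd ddd"], ["cccc", "ggggggg"]]
```

Escaping matters as soon as a label contains a character such as a dot or a parenthesis, which would otherwise be interpreted as regex syntax.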

Pattern explanation:

  • (aaa aa|bbbb|cccc) - Group 1: one of aaa aa, bbbb, or cccc
  • : - a literal colon
  • (.*?) - Group 2: zero or more characters other than a newline, as few as possible, up to the first...
  • (?=$|\s+(?:aaa aa|bbbb|cccc)) - a positive lookahead that limits how far .*? can match:
    • $ - ... end of string
    • | - or...
    • \s+ - one or more whitespace characters followed by...
      • (?:aaa aa|bbbb|cccc) - any of the three alternatives (a non-capturing group, used only for grouping)
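
The role of the lazy quantifier can be checked with a short comparison. This is only an illustrative sketch: the greedy variant is not part of the answer and is included just to show the difference, and the variable names are assumptions.

```javascript
var s = "cccc:ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg";
var labels = "aaa aa|bbbb|cccc";

// Lazy quantifier, as in the answer: the value stops before the next label.
var lazy = new RegExp("(" + labels + "):(.*?)(?=$|\\s+(?:" + labels + "))", "i");
// Greedy variant, for comparison only: the value runs to the end of the string.
var greedy = new RegExp("(" + labels + "):(.*)(?=$|\\s+(?:" + labels + "))", "i");

var lazyValue = lazy.exec(s)[2];
var greedyValue = greedy.exec(s)[2];
console.log(lazyValue);   // "ddddd"
console.log(greedyValue); // "ddddd bbbb:fgggg aaa aa:ddd ddd cccc:ggggggg"
```

With the greedy .* the engine first consumes the rest of the string, where the $ alternative of the lookahead already succeeds, so it never backtracks to the next label.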