XCanG - 3 months ago
HTML Question

How to properly exclude group in regex?

I need to match a pattern in some text, but the match must not contain another pattern.
I use some groups in HTML, and the HTML page does not add new lines. Instead of a new line, the HTML contains <br>,
so I run into trouble here.

I tried this regex:

/\|([^\r\n|]+?(?!<br>))\|/igm


and the example is:

test1 | test2 | test3<br>| test4<br>| test5 |<br>test6


It should match only | test2 | with the group test2, but right now it also matches | test4<br>| and not the right | test5 |. I need to exclude the test4 match, but I don't know how to combine that with [], because (?!<br>) is ignored.

P.S. Of course, | test2 | may also be | text1 <span ...>text2</span> text3 |, so placing <> into [] is not a solution for me.

Answer

The regex you need should be based on a tempered greedy token:

/\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^

See the regex demo

The token is (?:(?!<br\s*\/?>)[^\r\n|])*. It matches, zero or more times, any character other than CR, LF, or | (that is what the negated character class [^\r\n|] is for), provided that character does not start a <br> tag (or <br >, <br/>, <br />, etc.). The text matched by the token is captured into group #1 because the token is wrapped in capturing parentheses (...).
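To see why the (?!<br>) in the question's pattern appears to be ignored, it may help to run both patterns side by side. This is only a sketch: the variable names and the allMatches helper are made up for illustration. In the original pattern the lookahead sits after the lazy quantifier, so for a match to succeed it only has to pass at the position right before the closing |, where the next character is |, never <br>. In the tempered pattern the lookahead is checked before every character that gets consumed.

```javascript
// Sample string from the question.
var str = 'test1 | test2 | test3<br>| test4<br>| test5 |<br>test6';

// Original attempt: the lookahead only matters right before the closing \|,
// and at that position the next character is "|", so it cannot reject anything.
var original = /\|([^\r\n|]+?(?!<br>))\|/gi;

// Tempered version: the lookahead guards each character inside the group.
var tempered = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi;

// Hypothetical helper (not part of the answer): full text of every match.
function allMatches(re, text) {
  return text.match(re) || [];
}

var originalMatches = allMatches(original, str);
var temperedMatches = allMatches(tempered, str);

console.log(originalMatches); // [ '| test2 |', '| test4<br>|' ]
console.log(temperedMatches); // [ '| test2 |', '| test5 |' ]
```

The original pattern lets <br> slip inside the match, while the tempered one stops as soon as a <br> tag would begin and the attempt from that | fails.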

JS demo:

var re = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/ig;
var str = 'test1 | test2 | test3<br>| test4<br>| test5 |<br>test6|';
var res = [];
var m; // declare the match variable so it is not an implicit global
while ((m = re.exec(str)) !== null) {
  res.push(m[1]); // Grab Group 1 value only
}
console.log(res); // [" test2 ", " test5 "]
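The P.S. in the question asks about cells that contain other tags such as <span>. A minimal sketch of that case follows. The input string is invented for illustration, and trimming the captured group is an extra step that the answer itself does not do. Because the lookahead only rejects <br> variants, other tags pass through untouched.

```javascript
var re = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi;

// Hypothetical input: one cell with a <span>, and cells broken up by <br /> and <br/>.
var str = 'a | text1 <span class="x">text2</span> text3 | b<br />| c<br/>| d';

var cells = [];
var m;
while ((m = re.exec(str)) !== null) {
  // Group 1 keeps the surrounding spaces, so trim them here.
  cells.push(m[1].trim());
}

console.log(cells); // [ 'text1 <span class="x">text2</span> text3' ]
```

The <span> cell is kept, while "| c<br/>|" is rejected because the token refuses to step over the <br/> tag.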
