Asara Asara - 1 year ago 181
Markdown Question

Regex: ignoring match with two brackets

I am trying to match markup with a regex:

1. thats an [ External Link], as you can see
2. thats an [[Internal Link]], as you can see

That should result in

1. thats an [External Link](, as you can see
2. thats an [Internal Link]( Link), as you can see

Each of them works fine on its own with these preg_replace calls:

1. $line = preg_replace("/(\[)(.*?)( )(.*)(\])/", "[$4]($2)", $line);
2. $line = preg_replace("/(\[\[)(.*)(\]\])/", "[$2]($2)", $line);

But they interfere with each other, so applying the replacements one after the other gives ugly results. So I am trying to make one of the patterns ignore what the other one matches. I tried replacing the first regex with this one:

([^\[]{0,})(\[)([^\[]{1,})( )(.*)(])

It should check that there is only one [ and that the characters before and after it aren't a [. But it is still matching the [Internal Link] within [[Internal Link]], when it should ignore this part completely.
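The behaviour described above can be reproduced with a short sketch. The negated class [^\[]{0,} is allowed to match nothing, so the engine is free to start the match at the second bracket. The second pattern, $guarded, is only one possible guard based on lookarounds; it is an assumption of this sketch, not something taken from the question, and the variable names are hypothetical.

```php
<?php
// The attempted pattern from the question, unchanged.
$attempt = '/([^\[]{0,})(\[)([^\[]{1,})( )(.*)(])/';
$line = '2. thats an [[Internal Link]], as you can see';

// Group 1 may be empty, so the match can begin at the second "[".
preg_match($attempt, $line, $m);
echo $m[0], PHP_EOL;

// Hypothetical alternative: require that the "[" is neither preceded
// nor followed by another "[", and that the "]" is not followed by "]".
$guarded = '/(?<!\[)\[(?!\[)([^\]\s]+) ([^\]]+)\](?!\])/';
$guardedHits = preg_match($guarded, $line);
echo $guardedHits, PHP_EOL;
```

With the lookarounds in place, the single-bracket pattern no longer fires inside a double-bracket link, so the two replacements can run in either order.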

Answer Source

With preg_replace_callback you can build a single pattern that handles both cases and choose the replacement conditionally in the callback function. This way the string is parsed only once.

$str =  <<<'EOD'
1. thats an [ External Link], as you can see
2. thats an [[Internal Link]], as you can see
EOD;

$domain = '';
$pattern = '~\[(?:\[([^]]+)]|([^] ]+) ([^]]+))]~';    

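$str = preg_replace_callback($pattern, function ($m) use ($domain) {
    // Group 1 is filled only for [[internal]] links; groups 2 (target) and 3 (text) belong to external links.
    return empty($m[1]) ? "[$m[3]]($m[2])" : "[$m[1]]($domain/$m[1])";
}, $str);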

echo $str;

The pattern uses an alternation (?: xxx | yyy). The first branch describes internal links and the second external links.
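To make the two branches easier to read, the same pattern can be written with the x (extended) modifier, which allows whitespace and # comments inside the regex. This is only a sketch: the variable names and the sample URL are placeholders of mine, not part of the original.

```php
<?php
// The compact pattern from the answer.
$compact = '~\[(?:\[([^]]+)]|([^] ]+) ([^]]+))]~';

// The same pattern in extended mode. A literal space outside a character
// class has to be escaped here, hence the "\ " between target and text.
$verbose = '~
    \[                  # opening bracket shared by both link types
    (?:
        \[ ([^]]+) ]    # branch 1: a second "[", the internal name (group 1), then "]"
      |
        ([^] ]+)        # branch 2: the target, up to the first space (group 2)
        \               # the separating space
        ([^]]+)         # the link text (group 3)
    )
    ]                   # closing bracket shared by both link types
~x';

// Hypothetical sample line; the URL is a placeholder.
$sample = 'an [https://example.org/page External Link] and an [[Internal Link]]';

preg_match_all($compact, $sample, $compactMatches, PREG_SET_ORDER);
preg_match_all($verbose, $sample, $verboseMatches, PREG_SET_ORDER);

// Both patterns should yield the same matches.
var_dump($compactMatches === $verboseMatches);
```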

When the second branch succeeds, capture group 1 is empty (but defined). The callback function tests it to find out which branch succeeded and returns the appropriate replacement string.
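The group test can be checked in isolation with a small helper. This is a sketch under assumptions: classifyLink is a hypothetical name, and the URL is a placeholder rather than anything from the question.

```php
<?php
// Hypothetical helper: reports which branch of the pattern matched.
function classifyLink(string $link): ?string
{
    $pattern = '~\[(?:\[([^]]+)]|([^] ]+) ([^]]+))]~';
    if (!preg_match($pattern, $link, $m)) {
        return null; // neither link form was found
    }
    // Group 1 is an empty string when the external branch matched.
    return $m[1] !== '' ? 'internal' : 'external';
}

echo classifyLink('[[Internal Link]]'), PHP_EOL;
echo classifyLink('[https://example.org/page External Link]'), PHP_EOL;
```

The helper compares against '' instead of calling empty(), because empty() also treats the string "0" as empty, so an internal link named 0 would otherwise be classified as external.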
