Asara Asara - 4 months ago 22
Markdown Question

Regex: ignoring match with two brackets

I am trying to convert markup with regular expressions:

1. thats an [www.external.com External Link], as you can see
2. thats an [[Internal Link]], as you can see


That should result in:

1. thats an [External Link](www.external.com), as you can see
2. thats an [Internal Link](wiki.com/Internal Link), as you can see


Each of them works fine on its own with these preg_replace calls:

1. $line = preg_replace("/(\[)(.*?)( )(.*)(\])/", "[$4]($2)", $line);
2. $line = preg_replace("/(\[\[)(.*)(\]\])/", "[$2](wiki.com/$2)", $line);


But they interfere with each other, so running the replacements one after the other returns ugly results. So I am trying to make one of the patterns ignore what the other one matches. I tried replacing the first regex with this one:

([^\[]{0,})(\[)([^\[]{1,})( )(.*)(])


It should check that there is only one [ and that the characters before and after it aren't [. But it still matches the [Internal Link] inside the outer [], although it should ignore this part completely.

Answer

With preg_replace_callback you can build a single pattern that handles both cases and choose the replacement conditionally in the callback function. This way the string is parsed only once.

$str =  <<<'EOD'
1. thats an [www.external.com External Link], as you can see
2. thats an [[Internal Link]], as you can see
EOD;

$domain = 'wiki.com';
$pattern = '~\[(?:\[([^]]+)]|([^] ]+) ([^]]+))]~';    

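$str = preg_replace_callback($pattern, function ($m) use ($domain) {
    // Group 1 is non-empty only for internal links.
    // For external links, group 2 is the URL and group 3 is the link text.
    return empty($m[1]) ? "[$m[3]]($m[2])" : "[$m[1]]($domain/$m[1])";
}, $str);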

echo $str;

The pattern uses an alternation (?: xxx | yyy). The first branch describes internal links and the second external links.
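To make the two branches easier to see, here is a sketch of the same pattern written in extended mode (the x flag), where whitespace outside character classes is ignored and # starts a comment. The layout and comments are my own illustration, not part of the original answer; the literal space between URL and label is written as [ ] because a bare space would be ignored in this mode.

```php
<?php
$pattern = '~
    \[                         # opening bracket shared by both link types
    (?:
        \[ ([^]]+) ]           # branch 1, internal link: group 1 = page title
      |
        ([^] ]+) [ ] ([^]]+)   # branch 2, external link: group 2 = URL, group 3 = label
    )
    ]                          # closing bracket shared by both link types
~x';

$domain = 'wiki.com';
$input  = "1. thats an [www.external.com External Link], as you can see\n"
        . "2. thats an [[Internal Link]], as you can see";

$result = preg_replace_callback($pattern, function ($m) use ($domain) {
    // An empty group 1 means the external-link branch matched.
    return empty($m[1]) ? "[$m[3]]($m[2])" : "[$m[1]]($domain/$m[1])";
}, $input);

echo $result, "\n";
```

Because the internal-link branch comes first, a [[ is always tried as an internal link before the external-link branch gets a chance, which is what keeps the two cases from interfering.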

When the second branch succeeds, capture group 1 is empty (but defined). The callback function has to test it to know which branch succeeded and to return the appropriate replacement string.
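A small sketch to inspect what the match array looks like in each case; the variable names $external and $internal are only for illustration. It uses preg_match on one link of each kind.

```php
<?php
$pattern = '~\[(?:\[([^]]+)]|([^] ]+) ([^]]+))]~';

// One sample of each link type.
preg_match($pattern, '[www.external.com External Link]', $external);
preg_match($pattern, '[[Internal Link]]', $internal);

// External link: group 1 did not take part in the match, but it precedes
// groups that did, so it is present as an empty string.
var_dump(count($external));    // int(4)
var_dump($external[1]);        // string(0) ""
var_dump($external[2]);        // string(16) "www.external.com"
var_dump($external[3]);        // string(13) "External Link"

// Internal link: only group 1 matched; trailing unmatched groups are
// dropped from the array, so indexes 2 and 3 do not exist.
var_dump(count($internal));    // int(2)
var_dump($internal[1]);        // string(13) "Internal Link"
var_dump(isset($internal[2])); // bool(false)
```

This is why the callback tests group 1 rather than reading group 2 or 3 directly: for an internal link those indexes are not set, and accessing them would raise an undefined-index notice or warning.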