Oleksandr Kaleniuk - 2 months ago
PHP Question

How to replace a substring with the help of preg_replace

I have a string that consists of repeated words. I want to replace the substring 'OK' located between 'L3' and 'L4'. My code is below:

$search = "/(?<=L3).*(OK).*(?=L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, $replace, $subject);

If I use that pattern with preg_match, it finds the correct substring (the third 'OK'). However, when I apply the same pattern with preg_replace, it replaces the substring that matches the full pattern instead of the parenthesized subpattern.

So could you please advise what I should change in my code? I know that there are plenty of similar questions about regex, but as far as I understand my pattern is correct and I'm only confused by the preg_replace function.


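Your regex does match a position in the string that is preceded by L3, then 0+ chars other than line break chars, as many as possible, up to the last OK that can still be followed by L4, and then any 0+ chars up to the position followed by L4. However, preg_replace replaces the whole match, not just the captured group, so everything between L3 and L4 gets replaced. See your regex demo.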
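To make the difference between the whole match and the captured group visible, here is a small sketch that runs the pattern from the question through both preg_match and preg_replace. The variable names $wholeMatch, $group1 and $result are only illustrative.

```php
<?php
// Pattern from the question: lookbehind for L3, greedy .* on both sides of OK.
$search  = "/(?<=L3).*(OK).*(?=L4)/";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";

preg_match($search, $subject, $m);

$wholeMatch = $m[0]; // everything between L3 and L4: ' => ('Paul', 'Paris', 'OK'), '
$group1     = $m[1]; // only the captured substring: OK

// preg_replace substitutes the whole match ($m[0]), not the group ($m[1]).
$result = preg_replace($search, "REPLACEMENT", $subject);

echo $result, PHP_EOL;
// => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3REPLACEMENTL4' => ('John', 'Madrid', 'OK')
```

This is why the text on both sides of OK has to be captured and put back in the replacement.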

A possible solution is to wrap the subpatterns before and after OK in two capturing groups and refer to them with backreferences in the replacement pattern:

$search = "/(L3.*?)OK(.*?L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, '$1'.$replace.'$2', $subject);
echo $str; // => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')

See the PHP demo
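One caveat with concatenating '$1' and the replacement: if the replacement text happens to start with a digit, preg_replace reads that digit as part of the group number. A minimal sketch of the ${1} form that avoids this, assuming a hypothetical replacement value "200 OK" and a shortened subject:

```php
<?php
$search  = "/(L3.*?)OK(.*?L4)/";
$subject = "'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$replace = "200 OK"; // hypothetical replacement that begins with a digit

// '$1' . '200 OK' becomes "$1200 OK", which is parsed as a reference to group 12.
$ambiguous = preg_replace($search, '$1' . $replace . '$2', $subject);

// '${1}' keeps the group number separate from the digits that follow it.
$safe = preg_replace($search, '${1}' . $replace . '${2}', $subject);

echo $safe, PHP_EOL;
// => 'L3' => ('Paul', 'Paris', '200 OK'), 'L4' => ('John', 'Madrid', 'OK')
```

With a replacement like "REPLACEMENT" both forms behave the same, so this only matters when the replacement value is not under your control.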

If there cannot be any L3.5 in between L3 and L4, the (L3.*?)OK(.*?L4) pattern is safe to use. It will match and capture L3 and then 0+ chars other than a line break up to the first OK, then match OK, and then match and capture 0+ chars up to the first L4.

If there can be no L4, use a (?:(?!L4).)* tempered greedy token, which matches any char other than a line break char that does not start an L4 sequence:
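As a rough sketch of how such a tempered greedy token could be put to use: the exact pattern below is an assumption (it is not necessarily the one behind the linked demo), built so that the text after L3 is consumed lazily but never across the start of an L4. The three sample subjects are made up for illustration.

```php
<?php
// Assumed pattern: after L3, consume chars lazily, but never one that starts "L4".
$search  = "/(L3(?:(?!L4).)*?)OK/";
$replace = "REPLACEMENT";

// Case 1: OK is present between L3 and L4, so it is replaced.
$withOk = "'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$r1 = preg_replace($search, '${1}' . $replace, $withOk);

// Case 2: no OK in the L3 entry. The token stops at L4, so nothing matches
// and the OK that belongs to L4 is left alone.
$withoutOk = "'L3' => ('Paul', 'Paris'), 'L4' => ('John', 'Madrid', 'OK')";
$r2 = preg_replace($search, '${1}' . $replace, $withoutOk);

// Case 3: L3 is the last entry and there is no L4 at all.
$noL4 = "'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK')";
$r3 = preg_replace($search, '${1}' . $replace, $noL4);

echo $r1, PHP_EOL, $r2, PHP_EOL, $r3, PHP_EOL;
```

Only one capturing group is needed here because nothing after OK is consumed, so the replacement is just '${1}' followed by the new text.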


See the regex demo

NOTE: If you want to make the regexps safer, add single quotes (') around L# inside the patterns.
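To illustrate why the quotes help, here is a sketch with a made-up subject in which a value ('AL3X') happens to contain L3. The unquoted pattern latches onto that value, while the quoted one only matches the 'L3' key.

```php
<?php
$replace = "REPLACEMENT";
// Hypothetical data: the name 'AL3X' contains the text L3.
$subject = "'L2' => ('AL3X', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";

// Unquoted: the first L3 is found inside 'AL3X', so the OK of the L2 entry is replaced.
$loose = preg_replace("/(L3.*?)OK(.*?L4)/", '${1}' . $replace . '${2}', $subject);

// Quoted: only the key 'L3' can start the match, so the OK of the L3 entry is replaced.
$strict = preg_replace("/('L3'.*?)OK(.*?'L4')/", '${1}' . $replace . '${2}', $subject);

echo $loose, PHP_EOL;
// => 'L2' => ('AL3X', 'Paris', 'REPLACEMENT'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')
echo $strict, PHP_EOL;
// => 'L2' => ('AL3X', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')
```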