Diksha Diksha - 3 months ago 7
PHP Question

Regex to remove HTML tags between curly braces

I need to remove the HTML tags that appear between curly braces. Here is my code. Any help would be appreciated.

$string = '<table><tr>Hello{<strong><br/>name<br/></strong>}</tr></table>';
echo preg_replace("/\{<.*?>\}/","",$string);


Required output is

<table><tr>Hello name</tr></table>

Answer

You cannot do this with a simple regex alone, but you can use a regex to find the curly-brace blocks, as follows:

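// Callback: receives the match array and returns the replacement text.
function process_paranthesis($match) {
  // $match[1] is the content between the braces; strip all HTML tags from it.
  return strip_tags($match[1]);
}

$string  = '<table><tr>Hello { <strong>name</strong>}</tr></table>';
// Replace every {...} block with its tag-free content.
echo preg_replace_callback('/\{([^\}]*)\}/', 'process_paranthesis', $string);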

The regex was changed to find every {...} block, and preg_replace_callback() is used so that a function computes the replacement string for each match. The callback's $match parameter is an array describing the match: $match[0] contains the whole matched text, and $match[1] contains the text captured by the first pair of parentheses in the pattern. Inside the callback, strip_tags() removes all HTML tags. It is a built-in PHP function and should be used instead of reinventing the wheel.
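To make the roles of $match[0] and $match[1] concrete, here is a minimal sketch that records what the callback receives. The names $input, $seen and $output are only illustrative and do not come from the answer above.

```php
<?php
// Illustrative sketch: inspect what preg_replace_callback() hands to the callback.
$input = 'Hello {<strong>name</strong>}!';
$seen  = [];

$output = preg_replace_callback(
    '/\{([^\}]*)\}/',
    function ($match) use (&$seen) {
        // $match[0]: the whole match, braces included.
        // $match[1]: only the text captured by the first group.
        $seen[] = $match;
        return strip_tags($match[1]);
    },
    $input
);

echo $seen[0][0], "\n"; // whole match, with the curly braces
echo $seen[0][1], "\n"; // captured content, without the braces
echo $output, "\n";     // input with the block replaced by its tag-free text
```

An anonymous function is used here instead of a named one so that the sketch can collect the match arrays; the named callback from the answer behaves the same way.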

The regex is constructed as follows:

  1. A match starts with a { and ends with a }; we escape both, so we write \{...\}.
  2. We want to process everything except the surrounding { and }, so we put round parentheses inside: \{(...)\}. The whole content within the curly braces then arrives as $match[1], with no need to strip the braces using other string functions.
  3. We want to allow any character between the { and } except } itself; [^\}] matches any character but }, and since we want to allow any number of them, the result is [^\}]*
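The three steps can be sketched by assembling the pattern piece by piece and checking it with preg_match_all(). The variable names ($inner, $group, $pattern, $text, $matches) are hypothetical and chosen only for this illustration.

```php
<?php
// Illustrative sketch: build the pattern from the three steps described above.
$inner   = '[^\}]*';               // step 3: any run of characters except }
$group   = '(' . $inner . ')';     // step 2: capture the content between the braces
$pattern = '/\{' . $group . '\}/'; // step 1: escaped literal { and } around it

$text = 'a {one} b {<i>two</i>} c';
preg_match_all($pattern, $text, $matches);

print_r($matches[0]); // whole matches, braces included
print_r($matches[1]); // captured contents only, braces already gone
```

With the default flags, preg_match_all() puts all whole matches in $matches[0] and all first-group captures in $matches[1], which mirrors $match[0] and $match[1] in the callback.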

NOTE: .* is greedy. If we used .* instead of [^\}]*, we would get unexpected results when the string contains several curly-brace blocks: the match would start at the first opening { and end at the last } in the string, swallowing all blocks and everything between them. For "Text {in first} something between {and second one}. And some more." the greedy pattern would produce a single match, "{in first} something between {and second one}". What we want instead is two separate matches, "{in first}" and "{and second one}".
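The difference is easy to check on the example sentence. The sketch below compares the greedy pattern with the negated character class; it also tries the lazy quantifier .*? as a further alternative, which is an addition for comparison and not part of the answer above. The variable names are illustrative.

```php
<?php
// Illustrative sketch: greedy .* versus [^\}]* versus lazy .*? on the same text.
$text = 'Text {in first} something between {and second one}. And some more.';

preg_match_all('/\{(.*)\}/', $text, $greedy);       // greedy: runs to the last }
preg_match_all('/\{([^\}]*)\}/', $text, $negated);  // stops at the first }
preg_match_all('/\{(.*?)\}/', $text, $lazy);        // lazy: also stops at the first }

print_r($greedy[0]);  // a single match spanning both blocks
print_r($negated[0]); // two separate matches
print_r($lazy[0]);    // two separate matches
```

One reason to prefer [^\}]* over .*? is that, without the s modifier, the dot does not match a newline, whereas the negated class does, so blocks that span several lines are still found.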
