
How can I remove all HTML tags from a PHP string?

I have a PHP string like this:

$string = "<b class='classname'>this</b> is a `<a href='#'>link</a>`

    <p>and this is a test</p>

Also this is <i>another</i> test.";


I want this output:

$string = "this is a `<a href='#'>link</a>`

    <p>and this is a test</p>

Also this is another test.";


As you can see, I want to remove all HTML tags except those that are:


  • surrounded by backticks ("`"), or

  • on a line that begins with four spaces and has a blank line above and below it.



Note: I can use strip_tags() to remove all HTML tags, but it also removes the tags that should be kept. htmlspecialchars() doesn't work for this either.
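
To illustrate the problem, here is a minimal sketch of what the two functions do with two lines taken from the sample string. The variable names are only for illustration.

```php
<?php
// Two lines taken from the sample string above.
$inline = "this is a `<a href='#'>link</a>`";
$plain  = "Also this is <i>another</i> test.";

// strip_tags() removes every tag, including the one inside the backticks.
$stripped = strip_tags($inline);
echo $stripped, "\n";   // this is a `link`

// htmlspecialchars() removes nothing; it only escapes the markup characters.
$escaped = htmlspecialchars($plain);
echo $escaped, "\n";    // Also this is &lt;i&gt;another&lt;/i&gt; test.
```

Neither result is what I need: the first loses the tag that should be kept, and the second keeps the tag that should be removed.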

Answer Source

Well, it's ugly, but it works on this example:

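<?php
// Callback for preg_replace_callback(): strips tags from the matched text,
// but leaves a trailing backtick span untouched.
function translate($m) {
    if (isset($m[1]) && $m[1] != "") {
        // $m[1] is the text before the backtick span; the rest of $m[0] is the span.
        // substr() cuts off only that prefix, whereas str_replace() would also
        // delete any copy of the prefix that occurs inside the span.
        $code = substr($m[0], strlen($m[1]));
        return strip_tags($m[1]) . $code;
    } else {
        return strip_tags($m[0]);
    }
}

// First alternative: the text of a line up to and including a `...` span.
// Second alternative: a newline plus the following line, unless the four
// characters before that line are all whitespace.
$re = "/(.*)`.*`|\n((?<![[:space:]]{4})(.*)\n)/m";
$string = "this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.";
// Repeat the sample four times.
$string = $string.$string.$string.$string;
echo preg_replace_callback($re, "translate", $string);
?>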

Output:

this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.
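
Note that the sample string above contains no tags outside the protected regions, so it does not exercise the stripping itself. With the string from the question, the last line ("Also this is <i>another</i> test.") appears to fall outside both alternatives of the regex and would keep its tags. As an alternative, here is a minimal sketch that splits the text into protected and unprotected pieces and runs strip_tags() only on the unprotected ones. The function name strip_tags_outside_code() is hypothetical, and the sketch assumes "\n" line endings and single-line indented blocks.

```php
<?php
/**
 * Strip HTML tags everywhere except inside `inline code` spans and on
 * four-space-indented lines that have a blank line above and below.
 * Hypothetical helper; assumes "\n" line endings.
 */
function strip_tags_outside_code(string $text): string
{
    // Protected regions, wrapped in one capture group:
    //   1. a backtick span on a single line, or
    //   2. a line starting with four spaces, preceded by a blank line
    //      (or the start of the text) and followed by a blank line
    //      (or the end of the text).
    $protected = '/(`[^`\n]*`|(?:\A|(?<=\n\n)) {4}[^\n]*(?=\n\n|\n?\z))/';

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes hold ordinary text, odd indexes hold protected code.
    $parts = preg_split($protected, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = strip_tags($part);
        }
    }
    return implode('', $parts);
}

// Usage with the string from the question.
$string = "<b class='classname'>this</b> is a `<a href='#'>link</a>`\n\n"
        . "    <p>and this is a test</p>\n\n"
        . "Also this is <i>another</i> test.";
echo strip_tags_outside_code($string);
```

Because each unprotected piece is stripped on its own, a tag that straddles a backtick span (for example one whose attribute value contains a backtick) would not be handled correctly; the sketch does not try to cover that case.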