stack stack - 3 months ago 10
HTML Question

How can I remove all HTML tags from a PHP string?

I have a PHP string like this:

$string = "<b class='classname'>this</b> is a `<a href='#'>link</a>`

    <p>and this is a test</p>

Also this is <i>another</i> test.";


I want this output:

$string = "this is a `<a href='#'>link</a>`

    <p>and this is a test</p>

Also this is another test.";


As you can see, I want to remove all HTML tags except those that are:


  • enclosed in backticks (`).

  • on a line that begins with four spaces and has a blank line above and below it.



Note: I can use strip_tags() to remove all HTML tags, but it also removes the tags that should be kept. Also, htmlspecialchars() doesn't work as expected.
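
To illustrate the problem, here is a minimal sketch showing that strip_tags() also strips the tag inside the backticks. The shortened input and the variable name $stripped are assumptions made only for this demonstration:

```php
<?php
// Shortened, illustrative input (not the full string from the question).
$string = "<b class='classname'>this</b> is a `<a href='#'>link</a>`";

// strip_tags() has no notion of backticks, so the <a> tag is removed too.
$stripped = strip_tags($string);
echo $stripped, "\n"; // this is a `link`
```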

Answer

Well, it's ugly, but it works on this example:

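<?php
// Callback for preg_replace_callback(): receives one regex match in $m.
function translate($m) {
    if (isset($m[1]) && $m[1] != "") {
        // $m[1] is the text before a backtick span: cut it off the match,
        // strip its tags, and re-attach the remaining span unchanged.
        $m[0] = str_replace($m[1], "", $m[0]);
        return strip_tags($m[1]) . $m[0];
    } else {
        // Second alternative matched, or nothing precedes the span:
        // strip tags from the whole match.
        return strip_tags($m[0]);
    }
}

// Alternative 1: the text before a backtick span on a line (group 1),
// followed by the span itself.
// Alternative 2: a line enclosed by newlines whose start is not preceded
// by four whitespace characters.
$re = "/(.*)`.*`|\n((?<![[:space:]]{4})(.*)\n)/m";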
$string = "this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.";
$string = $string.$string.$string.$string;
echo preg_replace_callback($re, "translate", $string);
?>

Output:

this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.this is a `<a href='#'>link</a>`

               <p>and this is a test</p>

           Also this is another test.
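
A more general variant can be sketched without one large regex by processing the text line by line. This is only a rough sketch, not a tested drop-in solution: the function name strip_tags_outside_code() is hypothetical, and it assumes that backtick spans do not cross line breaks, that no tag spans several lines, and that an indented block is a single line with a blank line (or the start or end of the text) on either side. The input is the string from the question, with the four-space indent it describes.

```php
<?php
// Hypothetical helper: strips HTML tags everywhere except inside
// `backtick spans` and on protected four-space-indented lines.
function strip_tags_outside_code(string $text): string
{
    $lines = explode("\n", $text);
    $last = count($lines) - 1;
    $out = [];

    foreach ($lines as $i => $line) {
        // A protected line starts with four spaces and has a blank line
        // (or the start/end of the text) directly above and below it.
        $blankAbove = $i === 0 || trim($lines[$i - 1]) === '';
        $blankBelow = $i === $last || trim($lines[$i + 1]) === '';
        if (strncmp($line, '    ', 4) === 0 && $blankAbove && $blankBelow) {
            $out[] = $line;
            continue;
        }

        // Split the line so that odd-indexed parts are the backtick spans
        // and even-indexed parts are the text between them.
        $parts = preg_split('/(`[^`]*`)/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);
        foreach ($parts as $j => $part) {
            if ($j % 2 === 0) {
                $parts[$j] = strip_tags($part);
            }
        }
        $out[] = implode('', $parts);
    }

    return implode("\n", $out);
}

$input = "<b class='classname'>this</b> is a `<a href='#'>link</a>`\n\n"
       . "    <p>and this is a test</p>\n\n"
       . "Also this is <i>another</i> test.";

echo strip_tags_outside_code($input);
```

Splitting on the backtick spans first means strip_tags() only ever sees text that is allowed to lose its tags, so no lookbehind is needed to protect the spans.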