Martin AJ - 1 month ago
HTML Question

How can I strip HTML tags that have attribute(s) from a string?

I have a question-and-answer website like SO. I also have a textarea with a preview under it (exactly the same as SO). I use a markdown library to convert some symbols to HTML tags. For example, that JS library replaces

. Ok all fine.

Now I need to escape HTML tags that have attributes. I can do that in PHP like this:


$data = <<<DATA
<p>These line shall stay</p>
<p class="myclass">Remove this one</p>
<p>But keep this</p>
<div style="color: red">and this</div>
DATA;

$dom = new DOMDocument();
$dom->loadHTML($data);

$xpath = new DOMXPath($dom);

// select every element that has at least one attribute
$lines_to_be_removed = $xpath->query("//*[count(@*)>0]");

foreach ($lines_to_be_removed as $line) {
    // detach the matched node from its parent
    $line->parentNode->removeChild($line);
}

// just to check
echo $dom->saveHTML($dom->documentElement);

I'm not sure the code above is the best approach, but as you can see (in the fiddle I've linked) it works as expected: it removes nodes that have at least one attribute. Now I need to do the same in JS (or jQuery), because I need it for that textarea preview. How can I do that? Do I need a regex?


You could do something like this:

$('.myTextArea *').each(function(){
    // remove every descendant that carries at least one attribute
    if (this.attributes.length)
        $(this).remove();
});

It's not the most efficient, but if it's just a textarea preview it should be fine. I'd recommend running it as little as possible though. As far as I know there is no selector (jQuery or otherwise) that matches elements having any attribute at all, so you have to make the JS do the work.

Edit based on comment:

To not remove the element, just the surrounding tag, do something like this:

$('.myTextArea *').each(function(){
    // replace the element with its text, dropping only the tags
    if (this.attributes.length)
        this.outerHTML = this.textContent;
});
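
If what you have is really a string rather than live DOM (as the title suggests), here is a rough sketch of the same two behaviours without jQuery. The name `stripAttributedElements` and its `keepText` flag are made up for this illustration. It assumes well-nested markup with no ">" inside attribute values, so treat it as a starting point only; a real HTML parser will be more robust.

```javascript
// Hypothetical helper for illustration; not part of jQuery or any markdown library.
// Assumes well-nested markup and no ">" inside attribute values.

// Void elements never have a closing tag, so they are not pushed on the stack.
const VOID_TAGS = new Set(['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr']);

function stripAttributedElements(html, keepText = false) {
  // Alternatives, in order: closing tag, opening tag, run of text, stray "<".
  const token = /<\/([a-zA-Z][\w-]*)\s*>|<([a-zA-Z][\w-]*)([^<>]*)>|[^<]+|</g;
  const stack = [];  // one boolean per open element: did it carry attributes?
  let dropDepth = 0; // how many currently open elements carry attributes
  let out = '';

  for (const [text, closeName, openName, rest] of html.matchAll(token)) {
    if (openName !== undefined) {
      // Ignore a trailing "/" (as in <br/>) when looking for attributes.
      const hasAttrs = rest.replace(/\/\s*$/, '').trim().length > 0;
      if (!VOID_TAGS.has(openName.toLowerCase())) {
        stack.push(hasAttrs);
        if (hasAttrs) dropDepth++;
      }
      // Keep the tag only if neither it nor any open ancestor has attributes.
      if (dropDepth === 0 && !hasAttrs) out += text;
    } else if (closeName !== undefined) {
      if (dropDepth === 0) out += text;
      if (stack.pop()) dropDepth--;
    } else if (dropDepth === 0 || keepText) {
      out += text; // plain text (or a stray "<")
    }
  }
  return out;
}
```

Calling `stripAttributedElements(str)` drops attributed elements together with their contents, like the first snippet; `stripAttributedElements(str, true)` keeps their text and drops only the tags, like the second one.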