Martin AJ - 3 months ago
HTML Question

How can I strip HTML tags that have attribute(s) from a string?

I have a question and answer website like SO. I also have a textarea with a preview under it (exactly the same as SO). I use a markdown library to convert some symbols to HTML tags. For example, that JS library replaces ** with <b>. So far, so good.

Now I need to remove HTML tags that have attributes. I can do that in PHP like this:

<?php

$data = <<<DATA
<div>
<p>These line shall stay</p>
<p class="myclass">Remove this one</p>
<p>But keep this</p>
<div style="color: red">and this</div>
</div>
DATA;

$dom = new DOMDocument();
// LIBXML_HTML_NOIMPLIED: don't wrap the fragment in <html><body>
$dom->loadHTML($data, LIBXML_HTML_NOIMPLIED);

$xpath = new DOMXPath($dom);

// every element that has at least one attribute
$lines_to_be_removed = $xpath->query("//*[count(@*)>0]");

foreach ($lines_to_be_removed as $line) {
    $line->parentNode->removeChild($line);
}

// just to check
echo $dom->saveHTML($dom->documentElement);
?>


I'm not sure the code above is the best approach, but as you can see (in the fiddle I've linked) it works as expected: it removes every node that has at least one attribute. Now I need to do the same thing in JS (or jQuery), because I need it for that textarea preview. How can I do that? Do I need a regex?

Answer

You could do something like this:

$('.myTextArea *').each(function(){
    if (this.attributes.length)
        $(this).remove();
});

JSFIDDLE

It's not the most efficient approach, but if it's just for a textarea preview it should be fine. I'd recommend running it as seldom as possible, though. As far as I know, there is no selector (jQuery or otherwise) that matches "any element with at least one attribute", so you have to make the JS do the work.
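There may be no selector for this, but browsers do expose XPath through document.evaluate, so the query from the PHP version in the question can be carried over more or less directly. The following is only a sketch under that assumption; removeAttributedElements is a made-up helper name, not part of jQuery or the DOM.

```javascript
// Sketch: remove every element under `root` that has at least one attribute.
// `removeAttributedElements` is a hypothetical helper name used for illustration.
function removeAttributedElements(root) {
  // ownerDocument is null when `root` is itself a Document
  const doc = root.ownerDocument || root;
  // 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
  // A snapshot is static, so removing nodes while looping over it is safe.
  const ORDERED_NODE_SNAPSHOT_TYPE = 7;
  // ".//*[@*]" selects descendant elements of `root` that have any attribute
  const hits = doc.evaluate(".//*[@*]", root, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  let removed = 0;
  for (let i = 0; i < hits.snapshotLength; i++) {
    const el = hits.snapshotItem(i);
    if (el.parentNode) {
      el.parentNode.removeChild(el);
      removed++;
    }
  }
  // Number of elements detached (nested matches inside a removed subtree count too)
  return removed;
}
```

In the page from the answer, this would be called as removeAttributedElements(document.querySelector('.myTextArea')). Because the expression starts with ".//", the container itself is never removed, only its descendants.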


Edit based on comment:

To keep the element's text and remove only the surrounding tag, do something like this:

$('.myTextArea *').each(function(){
    if (this.attributes.length) {
        // swap in a text node so the text is not re-parsed as HTML
        $(this).replaceWith(document.createTextNode(this.textContent));
    }
});

JSFIDDLE
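Note that the snippet above keeps only the text of a matched element, so nested tags without attributes (a <b> inside a <p class="...">, for instance) are flattened to text as well. If the nested markup should survive, a variant along these lines might work. It is a sketch only; unwrapAttributedElements is a made-up helper name, and it relies on the standard DOM replaceWith method rather than jQuery.

```javascript
// Sketch: replace each element that has attributes with its own child nodes,
// so nested markup without attributes is kept.
// `unwrapAttributedElements` is a hypothetical helper name used for illustration.
function unwrapAttributedElements(root) {
  // querySelectorAll returns a static list, so changing the tree while looping is safe
  const all = Array.from(root.querySelectorAll("*"));
  let unwrapped = 0;
  for (const el of all) {
    if (el.attributes.length > 0) {
      // The spread copies the live childNodes list before any node is moved;
      // with no children, replaceWith() simply removes the element.
      el.replaceWith(...el.childNodes);
      unwrapped++;
    }
  }
  return unwrapped;
}
```

With the markup from the answer, the call would be unwrapAttributedElements(document.querySelector('.myTextArea')). Children that carry attributes themselves are still in the collected list, so they get unwrapped in turn after being moved up.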