
Remove the style attribute from HTML tags using regex in C#

I want to remove the style attribute from HTML tags using C#. The result should contain only the plain HTML tags.

For example:

String = <p style="margin: 15px 0px; padding: 0px; border: 0px; outline: 0px;">Hello</p>

Then it should return
String = <p>Hello</p>

The same should apply to all other HTML tags, such as
<strong></strong>, <b></b>
etc.

Please help me with this.

Answer

First, as others suggest, using a proper HTML parser such as HtmlAgilityPack or CsQuery is a much better approach.

If you really want a regex solution, here it is:

Replace this pattern: (<.+?)\s+style\s*=\s*(["']).*?\2(.*?>)
With: $1$3

Demo: http://regex101.com/r/qJ1vM1/1
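
As a rough sketch of how that replacement might be wired up in C#, the pattern above can be passed to Regex.Replace unchanged. The class name StyleStripper and the method name RemoveStyle are made up for illustration and are not part of the original answer; only the pattern and the $1$3 replacement come from it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is an assumption made for this example.
public static class StyleStripper
{
    // Pattern from the answer:
    //   $1 = the tag text before the style attribute
    //   $2 = the quote character that opens the attribute value
    //   $3 = the rest of the tag up to and including '>'
    private static readonly Regex StyleAttribute = new Regex(
        @"(<.+?)\s+style\s*=\s*([""']).*?\2(.*?>)");

    public static string RemoveStyle(string html)
    {
        // Keep everything around the attribute, drop the attribute itself.
        return StyleAttribute.Replace(html, "$1$3");
    }

    public static void Main()
    {
        string input =
            "<p style=\"margin: 15px 0px; padding: 0px; border: 0px; outline: 0px;\">Hello</p>";
        Console.WriteLine(RemoveStyle(input)); // <p>Hello</p>
    }
}
```

Note that the pattern is applied to raw text, so it may also match something that merely looks like a style attribute outside of a tag. That is one reason the parser-based approach is preferable.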

To remove multiple attributes, since you're using .NET (whose regex engine supports variable-length lookbehind), this should work:

Replace (?<=<[^<>]+)\s+(?:style|class)\s*=\s*(["']).*?\1
With an empty string
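
A minimal sketch of the multi-attribute version, again using Regex.Replace with the pattern exactly as given. The names AttributeStripper and RemoveStyleAndClass are hypothetical and chosen only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is an assumption made for this example.
public static class AttributeStripper
{
    // Pattern from the answer. The lookbehind (?<=<[^<>]+) requires that the
    // match sits inside an opening tag; \1 refers to the quote character
    // that opened the attribute value.
    private static readonly Regex StyleOrClass = new Regex(
        @"(?<=<[^<>]+)\s+(?:style|class)\s*=\s*([""']).*?\1");

    public static string RemoveStyleAndClass(string html)
    {
        // Each matched attribute (with its leading whitespace) is replaced
        // by an empty string.
        return StyleOrClass.Replace(html, string.Empty);
    }

    public static void Main()
    {
        string input =
            "<div class=\"box\" style='color: red'><b class=\"x\">Hi</b></div>";
        Console.WriteLine(RemoveStyleAndClass(input)); // <div><b>Hi</b></div>
    }
}
```

To strip other attributes, more names could be added to the (?:style|class) alternation.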