CSAT CSAT - 6 months ago 75
HTML Question

Remove the style attribute from HTML tags using Regex in C#

I want to remove the style attribute from HTML tags using C#, so that only the plain HTML tags are returned.


For example, if

String = <p style="margin: 15px 0px; padding: 0px; border: 0px; outline: 0px;">Hello</p>

Then it should return
String = <p>Hello</p>




The same should apply to all HTML tags, such as
<strong></strong>, <b></b>
etc.


Please help me with this.

Answer

First, as others suggest, using a proper HTML parser is a much better approach. Use either HtmlAgilityPack or CsQuery.
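
For the parser route, a minimal sketch with HtmlAgilityPack might look like the following. It assumes the HtmlAgilityPack NuGet package is referenced in the project; the class and method names (StyleStripper, RemoveStyleAttributes) are made up for illustration and are not part of the library.

```csharp
using HtmlAgilityPack;

public static class StyleStripper
{
    // Hypothetical helper: removes every style attribute from an HTML fragment.
    public static string RemoveStyleAttributes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var styledNodes = doc.DocumentNode.SelectNodes("//*[@style]");
        if (styledNodes != null)
        {
            foreach (var node in styledNodes)
            {
                node.Attributes.Remove("style");
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the parser works on elements and attributes rather than on raw text, it is not confused by quotes or angle brackets inside attribute values the way a regex can be.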

If you really want a regex solution, here it is:

Replace this pattern: (<.+?)\s+style\s*=\s*(["']).*?\2(.*?>)
With: $1$3

Demo: http://regex101.com/r/qJ1vM1/1
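
In C#, the replacement above could be wired up roughly as follows. This is only a sketch: the class and method names (RegexStyleStripper, RemoveStyle) are hypothetical, and the pattern is the one given above, written as a verbatim string.

```csharp
using System.Text.RegularExpressions;

public static class RegexStyleStripper
{
    // The pattern from the answer; inside a verbatim string, "" is an escaped quote.
    private static readonly Regex StyleAttribute = new Regex(
        @"(<.+?)\s+style\s*=\s*([""']).*?\2(.*?>)");

    // Hypothetical helper: keeps group 1 (start of the tag) and group 3
    // (rest of the tag), dropping the style attribute in between.
    public static string RemoveStyle(string html)
    {
        return StyleAttribute.Replace(html, "$1$3");
    }
}
```

Calling RegexStyleStripper.RemoveStyle on the string from the question returns <p>Hello</p>. Note that `.` does not match newlines by default, so a style value that spans several lines would likely need RegexOptions.Singleline.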


To remove multiple attributes, since you're using .NET, this should work:

Replace (?<=<[^<>]+)\s+(?:style|class)\s*=\s*(["']).*?\1
With an empty string
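
A possible C# wrapper for this second pattern is sketched below. The names AttributeStripper and RemoveStyleAndClass are hypothetical; the pattern relies on variable-length lookbehind, which the .NET regex engine supports.

```csharp
using System.Text.RegularExpressions;

public static class AttributeStripper
{
    // The lookbehind requires that the match sits inside a tag, after "<" and
    // some characters that are neither "<" nor ">".
    private static readonly Regex StyleOrClass = new Regex(
        @"(?<=<[^<>]+)\s+(?:style|class)\s*=\s*([""']).*?\1");

    // Hypothetical helper: removes every style and class attribute,
    // including the whitespace in front of it.
    public static string RemoveStyleAndClass(string html)
    {
        return StyleOrClass.Replace(html, string.Empty);
    }
}
```

To strip further attributes, extend the alternation, for example (?:style|class|id).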