Santiago Trejo - 2 years ago
HTML Question

Remove HTML nodes from HTTP Request

I have some HTML code stored in a string variable, resulting from an HTTP request:


<div>Lots of scripts and libraries</div>
<div>Some very useful data</div>
<div>Not interesting stuff</div>

How can I remove all the unnecessary nodes and end up with this:

<div>Some very useful data</div>

Answer Source

The easiest way is to use HtmlAgilityPack to grab just the body tag.

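// html is the string variable holding the markup from the HTTP response
var document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(html);

// SelectSingleNode returns null when nothing matches the XPath expression
HtmlAgilityPack.HtmlNode body = document.DocumentNode.SelectSingleNode("//body");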

From there, you can use HtmlAgilityPack to further parse the body node for more detail.
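
As a rough sketch of that further parsing, the snippet below pulls a single div out of the loaded markup. It assumes the useful div can be identified by its position among its siblings (the second one in the question's sample), which may not hold for a real page; an id or class selector would usually be more robust. The class name UsefulDivExtractor and the method ExtractDiv are hypothetical names used only for illustration. Because a fragment like the one in the question may have no body tag at all, the sketch falls back to the document root when "//body" matches nothing.

```csharp
using HtmlAgilityPack;

public static class UsefulDivExtractor
{
    // Hypothetical helper: returns the outer HTML of the div at the given
    // 1-based position among the direct children of <body> (or of the
    // document root when there is no <body>), or null if no such div exists.
    public static string ExtractDiv(string html, int position)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Fall back to the root node for fragments that lack a <body> tag.
        HtmlNode root = document.DocumentNode.SelectSingleNode("//body")
                        ?? document.DocumentNode;

        // XPath positions are 1-based; "./div[2]" is the second child div.
        HtmlNode div = root.SelectSingleNode("./div[" + position + "]");
        return div?.OuterHtml;
    }
}
```

Calling UsefulDivExtractor.ExtractDiv(html, 2) on the three-div sample from the question should hand back only the middle div. If the position is not stable, swapping the XPath for something like "//div[@id='...']" follows the same pattern.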

