Zach Johnson - 4 months ago
C# Question

Remove parent node but keep child node with HtmlAgilityPack?

OK, I'm stumped here. How can I remove a parent node and replace it with its child?

My goal here is to remove outbound links from images. I do not want to remove normal links from the document, just the ones that turn an image into a link, while keeping the image intact.

<a href=""><img src="logo_w3s.gif"></a>

Should be replaced and become:

<img src="logo_w3s.gif">

Here's my code that doesn't work but I feel is getting close:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
var allimages = doc.DocumentNode.Descendants("img").ToList();

if (scrapeimages.Checked) {
    // The user does want images scraped. Remove image outbound links.
    try {
        foreach (HtmlNode n in allimages) {
            if (n.ParentNode.Name == "a") {
                string outer = n.OuterHtml;
                HtmlNode newnode = HtmlNode.CreateNode(outer);

                n.ParentNode.ReplaceChild(n.ParentNode, newnode);
            }
        }
        maintext = doc.DocumentNode.OuterHtml;
    } catch {
    }
}

var node = doc.DocumentNode.SelectSingleNode(yourANode);
node.ParentNode.RemoveChild(node, true);

Something like this should help. It removes your <a> node from its parent but keeps the <a>'s own children (the parent's grandchildren), so the <img> ends up where the <a> used to be. The true argument to RemoveChild is the keepGrandChildren parameter.

If every <img> is wrapped in an <a>:

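var nodeList = doc.DocumentNode.SelectNodes("//img");

foreach (HtmlNode node in nodeList)
{
    // node.ParentNode is the <a>; its parent is the node the <a> is removed from.
    var parentATagNode = node.ParentNode.ParentNode;
    parentATagNode.RemoveChild(node.ParentNode, true);
}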
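
If only some images are wrapped in links, the same RemoveChild(node, true) idea can be applied starting from the <a> elements rather than from the images. The following is a minimal sketch, not a definitive implementation: the sample markup, the example.com URLs, and the XPath //a[img] (anchors with an <img> as a direct child) are assumptions chosen for illustration. Note that an <a> containing both text and an image would also be unwrapped by this XPath.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Illustrative markup: one image link and one ordinary text link.
        string html =
            "<p><a href=\"http://example.com\"><img src=\"logo_w3s.gif\"></a> " +
            "<a href=\"http://example.com\">plain link</a></p>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only <a> elements that have an <img> as a direct child.
        // SelectNodes returns null when nothing matches.
        var imageLinks = doc.DocumentNode.SelectNodes("//a[img]");

        if (imageLinks != null)
        {
            // ToList() takes a snapshot so that editing the tree
            // cannot disturb the iteration.
            foreach (HtmlNode link in imageLinks.ToList())
            {
                // Remove the <a> from its parent but keep the <a>'s children.
                link.ParentNode.RemoveChild(link, true);
            }
        }

        // Text links are never selected, so they stay in the output.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Selecting the anchors directly avoids the ParentNode.ParentNode hop and skips images that are not inside a link at all.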