Muhammad Shakir Aziz - 1 year ago
C# Question

Selecting attribute value using XPath and HtmlAgilityPack

I am trying to get the value of the second attribute (content) of a meta tag using an XPath expression in Html Agility Pack.
The meta tag:

<meta name="pubdate" content="2012-08-30" />

The XPath expression I am using:


But it does not return anything. I searched around and tried this solution:


Another way:


But it throws an XML exception in Html Agility Pack.
Another solution did not work either.


For various reasons I want to use only an XPath expression (and not Html Agility Pack functions) to get the attribute value. The function I use is below:

date = TextfromOneNode(document.DocumentNode.SelectSingleNode(".//body"), "meta[@name='pubdate']/@content");

public static string TextfromOneNode(HtmlNode node, string xmlPath)
{
    string toReturn = "";
    if (node.SelectSingleNode(xmlPath) != null)
        toReturn = node.SelectSingleNode(xmlPath).InnerText;
    return toReturn;
}

So far it looks like there is no way to use an XPath expression to get an attribute value directly.
Any ideas?

Answer Source

There is a way, using HtmlNodeNavigator:

public static string TextfromOneNode(HtmlNode node, string xmlPath)
{
    string toReturn = "";
    // The navigator can select attribute nodes, which HtmlNode.SelectSingleNode() cannot return
    var navigator = (HtmlAgilityPack.HtmlNodeNavigator)node.CreateNavigator();
    var result = navigator.SelectSingleNode(xmlPath);
    if (result != null)
        toReturn = result.Value;
    return toReturn;
}

The following console app example demonstrates how HtmlNodeNavigator.SelectSingleNode() works both with an XPath that returns an element and with an XPath that returns an attribute:

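// Sample markup. The <span> element and its text are assumed here so that xpath2 has something to match.
var raw = @"<div>
<meta name='pubdate' content='2012-08-30' />
<span>foo</span>
</div>";
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(raw);

var navigator = (HtmlAgilityPack.HtmlNodeNavigator)doc.CreateNavigator();

var xpath1 = "//meta[@name='pubdate']/@content";
var xpath2 = "//span";

// XPath that returns an attribute
var result = navigator.SelectSingleNode(xpath1);
Console.WriteLine(result.Value);

// XPath that returns an element
result = navigator.SelectSingleNode(xpath2);
Console.WriteLine(result.Value);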

dotnetfiddle demo

Output: