csprv - 2 months ago
HTML Question

HtmlAgilityPack SelectSingleNode returns an HtmlNode without InnerHtml

I'm a bit confused about the SelectSingleNode method.
I pass it a simple XPath expression and expect to get the node with its full content, including all nested nodes. Instead I get only the HTML tag I was looking for, without any inner or outer text, and the node has no children.

XPath:

//form


Here is the html:

<HTML>
<BODY>
<FORM METHOD="POST" ACTION="https://test.com/action">
<INPUT TYPE="hidden" NAME="attribute1" VALUE="some value"/>
<INPUT TYPE="hidden" NAME="attribute2" VALUE="another value"/>
</FORM>
</BODY>
</HTML>


And here is the method:

public List<Parameter> CollectFilledInputsFromResponseForm(IRestResponse response, string formXpath)
{
    var responseAsHtml = new HtmlDocument();
    responseAsHtml.LoadHtml(response.Content);
    var formDoc = responseAsHtml.DocumentNode.SelectSingleNode(formXpath);

    if (formDoc == null)
        throw new Exception(string.Format("No form found for '.{0}' xPath", formXpath));

    // Re-parse the form's markup on its own and look for inputs inside it.
    var formHtml = new HtmlDocument();
    formHtml.LoadHtml(formDoc.OuterHtml);
    var inputs = formHtml.DocumentNode.SelectNodes("//input");

    var parameters = new List<Parameter>();

    // SelectNodes returns null (not an empty collection) when nothing matches.
    if (inputs == null)
        return parameters;

    foreach (var input in inputs)
    {
        var name = input.GetAttributeValue("name", "Name not found");
        var value = input.GetAttributeValue("value", "Value not found");

        // Skip inputs that are missing either attribute.
        if (name.Equals("Name not found") || value.Equals("Value not found"))
            continue;

        parameters.Add(new Parameter() { Name = name, Value = value, Type = ParameterType.GetOrPost });
    }

    return parameters;
}



Please advise.

Bob
Answer

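Call HtmlNode.ElementsFlags.Remove("form"); before loading the document. By default HtmlAgilityPack gives the form tag special handling because form tags are allowed to overlap other elements, so the inputs are not parsed as children of the form node. Removing the flag makes form parse like any other element.

See http://stackoverflow.com/a/4219060/4033466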
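
A minimal sketch of that fix, assuming the HtmlAgilityPack NuGet package is referenced. The class name FormParsingSketch and the method CountFormInputs are hypothetical names used only for illustration; they are not part of the question's code. The important detail is that the flag is removed before LoadHtml is called, and that ElementsFlags is a static dictionary, so the change affects every document parsed afterwards in the same process.

```csharp
using HtmlAgilityPack;

public static class FormParsingSketch
{
    // Hypothetical helper: returns how many <input> nodes sit inside the first <form>.
    public static int CountFormInputs(string html)
    {
        // Must happen before LoadHtml. ElementsFlags is static, so this
        // changes parsing for all documents loaded later in this process.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var form = doc.DocumentNode.SelectSingleNode("//form");
        if (form == null)
            return 0;

        // ".//input" is relative to the form node; a bare "//input"
        // would search the whole document instead.
        var inputs = form.SelectNodes(".//input");

        // SelectNodes returns null when nothing matches.
        return inputs == null ? 0 : inputs.Count;
    }
}
```

With the flag removed, the form node carries its inputs as children, so the second HtmlDocument and the re-parse of formDoc.OuterHtml in the question's method should no longer be needed.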