romellem - 3 months ago
PowerShell Question

PowerShell 2.0 - Using HtmlAgilityPack to get children of FORM elements

The main problem stems from the fact that HtmlAgilityPack won't return the child nodes of a <form> element by default. See "How to get all input elements in a form with HtmlAgilityPack without getting a null reference error" for more information.

The problem is that the linked question shows how to fix the issue in C#, but I need to fix it in PowerShell. Any ideas?




I'll simplify my HTML:

<form method="POST" action="post.aspx" id="form">
<div>
<input type="hidden" name="test1" id="test1" value="1" />
</div>
<input type="text" name="test2" id="test2" value="12345" />
</form>


Now I see that when I select the <form> element, I don't get any children back, which is why I couldn't select the <input> elements.

Add-Type -Path "C:\Program Files (x86)\HtmlAgilityPack\HtmlAgilityPack.dll"
$HTMLDocument = New-Object HtmlAgilityPack.HtmlDocument
$HTMLDocument.Load("C:\users\smithj\Desktop\test2.html")
$inputNodes=$HTMLDocument.DocumentNode.SelectNodes("//form")
$inputNodes

# Output shortened to show important bits ...
ChildNodes : {}
HasChildNodes : False


You can see that HasChildNodes is False.

According to the C# link I provided, I somehow need to run HtmlNode.ElementsFlags.Remove("form"); but I can't figure out what the PowerShell equivalent would be.

Thanks again!




EDIT



Thanks to har07 for pointing me in the right direction. [HtmlAgilityPack.HtmlNode]::ElementsFlags.Remove("form") was what I needed to run.

Note that I need to run that command before I load the HTML.

> Add-Type -Path ".\Net40\HtmlAgilityPack.dll"
> [HtmlAgilityPack.HtmlNode]::ElementsFlags.Remove("form")
True
>
> $HTMLDocument = New-Object HtmlAgilityPack.HtmlDocument
> $HTMLDocument.Load(".\file.html")
> $HTMLDocument.DocumentNode.SelectNodes("//form")

# Output shortened to show important bits ...
ChildNodes : {#text, div, #text, input...}
HasChildNodes : True
OuterHtml : <form method="POST" action="post.aspx" id="form">
<div>
<input type="hidden" name="test1" id="test1" value="1">
</div>
<input type="text" name="test2" id="test2" value="12345">
</form>
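
With the children available, the original goal of selecting the <input> elements inside the form can be sketched roughly as follows. This is only a minimal sketch: the DLL path, the inline here-string standing in for the HTML file, and the variable names ($html, $doc, $form, $nodes, $node) are assumptions for illustration. The null check is there because SelectNodes returns null when nothing matches, and PowerShell 2.0 would otherwise run the loop once over a null value.

```powershell
# Assumed location of the assembly; adjust to wherever HtmlAgilityPack.dll lives.
Add-Type -Path ".\Net40\HtmlAgilityPack.dll"

# Must run before any HTML is parsed; [void] just suppresses the "True" output.
[void][HtmlAgilityPack.HtmlNode]::ElementsFlags.Remove("form")

# Inline copy of the simplified HTML, used here instead of a file on disk.
$html = @'
<form method="POST" action="post.aspx" id="form">
<div>
<input type="hidden" name="test1" id="test1" value="1" />
</div>
<input type="text" name="test2" id="test2" value="12345" />
</form>
'@

$doc = New-Object HtmlAgilityPack.HtmlDocument
$doc.LoadHtml($html)

# Select the form by id, then every <input> anywhere beneath it.
$form = $doc.DocumentNode.SelectSingleNode("//form[@id='form']")
$nodes = $form.SelectNodes(".//input")

if ($nodes) {
    foreach ($node in $nodes) {
        # GetAttributeValue returns the supplied default ("") if the attribute is missing.
        "{0} = {1}" -f $node.GetAttributeValue("name", ""), $node.GetAttributeValue("value", "")
    }
}
```

The loop variable is named $node rather than $input because $input is an automatic variable in PowerShell.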

Answer

I'm not actually a PowerShell user, but according to this blog post, you may want to try something like this:

[HtmlAgilityPack.HtmlNode]::ElementsFlags.Remove("form")
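
The square brackets in PowerShell name a .NET type, and :: reaches a static member of that type, so ElementsFlags belongs after the :: rather than inside the brackets. A slightly more defensive variant might look like the sketch below; the DLL path and the $flags variable are assumptions, and it presumes ElementsFlags is a dictionary keyed by tag name, as the True returned by Remove in the question suggests.

```powershell
# Assumed location of the assembly; adjust as needed.
Add-Type -Path ".\Net40\HtmlAgilityPack.dll"

# ElementsFlags is a static member of the HtmlNode type.
$flags = [HtmlAgilityPack.HtmlNode]::ElementsFlags

# Only remove the entry if it is still present, so the snippet can be re-run safely.
if ($flags.ContainsKey("form")) {
    [void]$flags.Remove("form")
}
```

Because the member is static, the change applies to every document parsed afterwards in the same PowerShell session, not just to one HtmlDocument instance.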