H.Fadlallah - 1 month ago
Vb.net Question

Parsing an HTML table containing images into a DataTable

I used the following code to parse the inner text of an HTML table into a DataTable (using Html-Agility-Pack):

Imports System.Data
Imports System.IO
Imports System.Linq
Imports System.Net

Public Sub ParseHtmlTable(ByVal HtmlFilePath As String)

    ' Request the local HTML file through a file:// URI
    Dim req As WebRequest = WebRequest.Create("file:///" & HtmlFilePath)
    req.Method = "GET"

    Dim htmldoc As New HtmlAgilityPack.HtmlDocument

    ' Send the request and read the whole response into the document
    Using res As WebResponse = req.GetResponse()
        Using webStreamReader As New StreamReader(res.GetResponseStream())
            htmldoc.LoadHtml(webStreamReader.ReadToEnd())
        End Using
    End Using

    ' HtmlAgilityPack does not add an implied <tbody>, so this XPath
    ' matches only when the markup contains one explicitly
    Dim nodes As HtmlAgilityPack.HtmlNodeCollection = htmldoc.DocumentNode.SelectNodes("//table/tbody/tr")

    ' SelectNodes returns Nothing when no row matches
    If nodes Is Nothing Then Return

    Dim dtTable As New DataTable("Table1")

    ' The first row holds the <th> header cells
    Dim Headers As List(Of String) = nodes(0).Elements("th").Select(Function(x) x.InnerText.Trim).ToList

    For Each Hr In Headers
        dtTable.Columns.Add(Hr)
    Next

    ' Skip the header row: it has no <td> cells and would add an empty row
    For Each node As HtmlAgilityPack.HtmlNode In nodes.Skip(1)
        Dim Row = node.Elements("td").Select(Function(x) x.InnerText.Trim).ToArray
        dtTable.Rows.Add(Row)
    Next

    dtTable.WriteXml("G:\1.xml", XmlWriteMode.WriteSchema)

End Sub


How can I parse an HTML table that contains images into a DataTable, saving the images either as binary data or as their links, using VB.NET?

Answer

I finally found the answer. The images look like this:

<img src="img.jpg"/> 

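We can use

.SelectSingleNode("./img").Attributes("src").Value

on the node that contains the image (for example a td cell) to get the image path. SelectSingleNode returns Nothing when the node has no direct img child, so check for that before reading the attribute.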
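To show how this could fit into the row loop from the question, here is a minimal sketch. The names ImageCellSketch, CellValue and TableFromHtml are made up for illustration and are not part of Html-Agility-Pack. It assumes the first row holds the th headers, that each img is a direct child of its td, and it uses the looser XPath //table//tr so that tables without a tbody also match.

```vbnet
Imports System.Data
Imports System.Linq
Imports HtmlAgilityPack

Module ImageCellSketch

    ' Hypothetical helper: returns the src of an <img> that is a direct child
    ' of the cell, or the trimmed inner text when the cell has no image.
    ' Use ".//img" instead if the image may be wrapped in another element.
    Function CellValue(ByVal cell As HtmlNode) As String
        Dim img As HtmlNode = cell.SelectSingleNode("./img")
        If img IsNot Nothing Then
            Return img.GetAttributeValue("src", String.Empty)
        End If
        Return cell.InnerText.Trim()
    End Function

    ' Hypothetical helper: builds a DataTable from an HTML string, taking
    ' column names from the first row and cell values from the other rows.
    Function TableFromHtml(ByVal html As String) As DataTable
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim table As New DataTable("Table1")
        Dim rows As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//table//tr")
        If rows Is Nothing Then Return table ' no rows found

        For Each th As HtmlNode In rows(0).Elements("th")
            table.Columns.Add(th.InnerText.Trim())
        Next

        ' Skip the header row; image cells contribute their link
        For Each tr As HtmlNode In rows.Skip(1)
            Dim values As Object() = tr.Elements("td").Select(Function(td) CObj(CellValue(td))).ToArray()
            table.Rows.Add(values)
        Next

        Return table
    End Function

End Module
```

With this sketch, a cell such as <td><img src="img.jpg"/></td> ends up in the DataTable as the string img.jpg. To store the image bytes rather than the link, one option would be to resolve that path against the folder of the HTML file and read it with File.ReadAllBytes into a Byte() column; that part is left out here.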