Parsing an HTML table containing images into a DataTable

I used the following code to parse the inner text of an HTML table into a DataTable (using Html Agility Pack):

Imports System.Collections.Generic
Imports System.Data
Imports System.IO
Imports System.Linq
Imports System.Net

Public Sub ParseHtmlTable(ByVal HtmlFilePath As String)

    ' Request the local HTML file through a file:// URI
    Dim req As WebRequest = WebRequest.Create("file:///" & HtmlFilePath)
    req.Method = "GET"

    Dim res As WebResponse = req.GetResponse() ' Send the request
    Dim webStream As Stream = res.GetResponseStream() ' Get the response stream

    ' Read the markup from the stream and load it into the document
    Dim htmldoc As New HtmlAgilityPack.HtmlDocument
    Using webStreamReader As New StreamReader(webStream)
        htmldoc.LoadHtml(webStreamReader.ReadToEnd())
    End Using

    ' Select every table row; SelectNodes returns Nothing when no row matches
    Dim nodes As HtmlAgilityPack.HtmlNodeCollection = htmldoc.DocumentNode.SelectNodes("//table/tbody/tr")
    If nodes Is Nothing Then Return

    Dim dtTable As New DataTable("Table1")

    ' The first row holds the column headers
    Dim Headers As List(Of String) = nodes(0).Elements("th").Select(Function(x) x.InnerText.Trim).ToList

    For Each Hr In Headers
        dtTable.Columns.Add(Hr)
    Next

    For Each node As HtmlAgilityPack.HtmlNode In nodes
        Dim Row = node.Elements("td").Select(Function(x) x.InnerText.Trim).ToArray
        ' Skip the header row, which has no td cells
        If Row.Length > 0 Then dtTable.Rows.Add(Row)
    Next

    dtTable.WriteXml("G:\1.xml", XmlWriteMode.WriteSchema)

End Sub

How can I parse an HTML table containing images into a DataTable, saving the images either as binary data or as their links, using VB.NET?

Answer Source

I finally found the answer. The images look like this:

<img src="img.jpg"/> 

We can read the src attribute of the img element to get the image path from the node that contains it.
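
A minimal sketch of how the image path might be read from a table cell with Html Agility Pack, since the snippet itself is not shown above. The helper names CellValue and ImageBytes are hypothetical, and ImageBytes assumes that src is a path relative to the folder holding the HTML file. It relies only on HtmlNode.SelectSingleNode and HtmlNode.GetAttributeValue.

```vbnet
Imports System.IO
Imports HtmlAgilityPack

Module ImageCellSketch

    ' Hypothetical helper: returns the src of the first <img> inside the cell,
    ' or the cell's trimmed inner text when the cell holds no image.
    Function CellValue(ByVal cell As HtmlNode) As String
        Dim img As HtmlNode = cell.SelectSingleNode(".//img")
        If img IsNot Nothing Then
            Return img.GetAttributeValue("src", String.Empty)
        End If
        Return cell.InnerText.Trim()
    End Function

    ' Hypothetical helper: reads the image file as bytes, assuming src is a
    ' path relative to the folder that contains the HTML file.
    Function ImageBytes(ByVal htmlFilePath As String, ByVal src As String) As Byte()
        Dim folder As String = Path.GetDirectoryName(htmlFilePath)
        Return File.ReadAllBytes(Path.Combine(folder, src))
    End Function

End Module
```

In the row loop from the question, replacing x.InnerText.Trim with CellValue(x) would store the image link in place of the cell text. Storing the binary data instead would likely need a column declared with GetType(Byte()) and the row built as an Object array, so that the bytes from ImageBytes can sit alongside the text cells.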

