Pierre M Fiorini - 4 months ago
HTML Question

XPath - Need to Parse Some Html

I have some HTML in a document that looks like this:

[Image: screenshot of the HTML markup]

I am using this code:

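' Requires a reference to HtmlAgilityPack (Imports HtmlAgilityPack).
Dim web As New HtmlWeb()
' web.Load already returns a new HtmlDocument, so there is no need to create one first.
Dim doc As HtmlDocument = web.Load("http://www.reedmantollchryslerdodgejeepram.com/new-inventory/index.htm?search=&saveFacetState=true&year=2017&lastFacetInteracted=inventory-listing1-facet-anchor-year-0")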

Dim label = doc.DocumentNode.SelectNodes("//*[@class='facetmulti-label make']")


That "kind of" works because I get this result (as one item in a collection of two):

[Image: screenshot of the resulting node]

Ideally, the end result of this scrape would be "Chrysler (35)".

I'm new to XPath...can this be done?

EDITED:

To be clear (hopefully), on the webpage, I am trying to parse these quantities (i.e., the "Chrysler (35)" and "Fiat (3)"):

[Image: screenshot of the make filter on the webpage]

Thanks for help.

Answer

You can use the value attribute of the input tag to locate the relevant part of the HTML, then step up to its parent with /..:

"//input[@value='Chrysler']/..//text()"
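
As a rough illustration of the parent step only, here is a minimal sketch in Python using the standard library's ElementTree, which supports a limited subset of XPath (it has no text() step, for example). This is not the HtmlAgilityPack code from the question, and the markup below is a hypothetical stand-in, since the real structure is only visible in the screenshot. The helper name label_for is also made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the facet markup; the real page may differ.
SAMPLE = """
<ul>
  <li><label class="facetmulti-label make">
    <input type="checkbox" value="Chrysler"/>
    Chrysler
    <span>(35)</span>
  </label></li>
  <li><label class="facetmulti-label make">
    <input type="checkbox" value="Fiat"/>
    Fiat
    <span>(3)</span>
  </label></li>
</ul>
"""

def label_for(root, make):
    """Find the input by its value attribute, then step up to its parent with '..'."""
    return root.find(f".//input[@value='{make}']/..")

root = ET.fromstring(SAMPLE)
label = label_for(root, "Chrysler")
print(label.tag, label.get("class"))
```

If no input carries the requested value, find returns None, so real code would want to check for that before reading the label.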

Or select the label directly, filtering on an input child whose value attribute is 'Chrysler':

"//label[@class='facetmulti-label  make'][input/@value='Chrysler']//text()" 

Or use contains to match the label by its text:

"//label[@class='facetmulti-label  make' and contains(.,'Chrysler')]//text()"

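You can also wrap any of these in normalize-space to trim leading and trailing whitespace and collapse newlines and runs of spaces into single spaces. Note that normalize-space returns a string rather than a node set, so it needs an API that evaluates XPath expressions rather than one that only selects nodes: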

"normalize-space(//input[@value='Chrysler']/..)"
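
To show what normalize-space does to the label's text, here is a minimal sketch in Python with the standard library only. It imitates the XPath function by hand rather than calling an XPath engine; the markup is a hypothetical stand-in for the page's HTML, and the helper names string_value and normalize_space are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one facet label; the real markup may differ.
SAMPLE = """<label class="facetmulti-label make">
  <input type="checkbox" value="Chrysler"/>
  Chrysler
  <span>(35)</span>
</label>"""

def string_value(elem):
    """Concatenate all descendant text nodes, like XPath's string value of an element."""
    return "".join(elem.itertext())

def normalize_space(text):
    """Imitate XPath normalize-space: collapse whitespace runs to one space and trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")

label = ET.fromstring(SAMPLE)
raw = string_value(label)      # still contains the newlines and indentation
clean = normalize_space(raw)
print(repr(clean))
```

With this sample markup the raw text is spread over several indented lines, and the normalized result is the single string "Chrysler (35)".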