Kevin George - 5 months ago
HTML Question

Using XPath to select the href attribute of the following-sibling

I am attempting to scrape the following site: http://www.hudson211.org/zf/profile/service/id/659837

I am trying to select the href next to the "Web Address" text. The following XPath selector gets the tag I am after:

$x("//th[contains(text(), 'Web Address')]/following-sibling::td/a")


returns

<a href="http://www.co.sullivan.ny.us">www.co.sullivan.ny.us</a>


However, when I specifically try to extract the href using @href, the return value is an empty array:

$x("//th[contains(text(), 'Web Address')]/following-sibling::td/a/@href")


returns
[]


This is the html of the row I am looking at:

<tr valign="top">
<td class="profile_view_left"></td>
<th align="left" class="profile_view_center">Web Address</th>
<td class="profile_view_right">
<a href="http://www.co.sullivan.ny.us">www.co.sullivan.ny.us</a> </td>
<td></td>
</tr>

Answer

I assume you're using the Google Chrome console, given the $x() function. Your XPath that selects the @href attribute actually works; I tested it in my Chrome. The result just isn't displayed in the console the way it is when you select an element, for a reason I'm not sure of at the moment:

>var result = $x("//th[contains(text(), 'Web Address')]/following-sibling::td/a/@href")
undefined
>result[0].value
"http://www.co.sullivan.ny.us"
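
As a small illustration of the session above: the array returned for an expression ending in /@href holds attribute nodes, so each URL is read from the node's value property. The helper name attrValues is hypothetical, and the object in the usage line only stands in for a real attribute node so the sketch can run outside the browser:

```javascript
// Hypothetical helper: convert the attribute nodes returned by an
// XPath query ending in /@href into an array of plain strings.
function attrValues(nodes) {
  // Each attribute node exposes its text through the `value` property.
  return Array.from(nodes, (node) => node.value);
}

// Stand-in for the array that $x() returned in the session above.
const sample = [{ name: "href", value: "http://www.co.sullivan.ny.us" }];
console.log(attrValues(sample));
```

In the Chrome console, the same helper could presumably be called as attrValues($x("//th[contains(text(), 'Web Address')]/following-sibling::td/a/@href")).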

Note that with the exact same expression, the variable result contains the expected URL. If you simply want to display a single href value in the console without further processing, this will do:

>$x("//th[contains(text(), 'Web Address')]/following-sibling::td/a/@href")[0].value
"http://www.co.sullivan.ny.us"
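
Since $x() is a console utility and is not available to scripts running in the page, here is a minimal sketch of the same lookup with the standard document.evaluate() method. The helper name xpathString and its doc parameter are assumptions for illustration; requesting a string result returns the string value of the first matching node, which for an attribute is its value:

```javascript
// XPathResult.STRING_TYPE has the numeric value 2 in the DOM specification.
const STRING_TYPE = 2;

// Hypothetical helper: evaluate `expr` against `doc` and return the string
// value of the first matching node (an empty string when nothing matches).
function xpathString(doc, expr) {
  return doc.evaluate(expr, doc, null, STRING_TYPE, null).stringValue;
}
```

In a page script this would be called as xpathString(document, "//th[contains(text(), 'Web Address')]/following-sibling::td/a/@href").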