Robert Harvey - 1 month ago
C# Question

How do I get my XPath to search only within each table?

I have a bit of HTML that looks like this:

<table class="resultsTable">
<tbody>
<tr class="even">
<td width="35%"><strong>Name</strong></td>
<td>ACME ANVILS, INC.</td>
</tr>
</tbody>
</table>


and some C# code that looks like this:

var name = document.DocumentNode
    .SelectSingleNode("//*[text()='Name']/following::td").InnerText;


which happily returns

ACME ANVILS, INC.


However, there's a new wrinkle. The page in question now returns multiple results:

<table class="resultsTable">
<tbody>
<tr class="even">
<td width="35%"><strong>Name</strong></td>
<td>ACME ANVILS, INC.</td>
</tr>
</tbody>
</table>
<table class="resultsTable">
<tbody>
<tr class="even">
<td width="35%"><strong>Name</strong></td>
<td>ROAD RUNNER RACES, LLC</td>
</tr>
</tbody>
</table>


So now I'm working with

var tables = document.DocumentNode.SelectNodes("//table/tbody");
foreach (var table in tables)
{
    var name = table.SelectSingleNode("//*[text()='Name']/following::td").InnerText;
    ...
}


Which falls over, because SelectSingleNode returns null.

How do I get my XPath to actually return a result, searching only within the specific table I have selected?

Answer

Change your absolute XPath,

 //*[text()='Name']/following::td

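to a relative XPath. A leading // always starts the search at the document root, no matter which node you call it on; a leading .// starts at the context node (here, table):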

 .//*[text()='Name']/following::td
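
To see the difference in isolation, here is a minimal sketch. It is not the asker's setup: it assumes the two-table snippet is well-formed XML wrapped in a hypothetical <root> element, and it uses System.Xml's XmlDocument as a stand-in for the HTML parser's document, so the exact behavior of the absolute path in your parser may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Assumption: the HTML from the question, wrapped in a hypothetical <root>
// element so that it parses as XML with System.Xml.
var doc = new XmlDocument();
doc.LoadXml(
    "<root>" +
    "<table class='resultsTable'><tbody><tr class='even'>" +
    "<td width='35%'><strong>Name</strong></td><td>ACME ANVILS, INC.</td>" +
    "</tr></tbody></table>" +
    "<table class='resultsTable'><tbody><tr class='even'>" +
    "<td width='35%'><strong>Name</strong></td><td>ROAD RUNNER RACES, LLC</td>" +
    "</tr></tbody></table>" +
    "</root>");

var absolute = new List<string>();
var relative = new List<string>();

foreach (XmlNode table in doc.SelectNodes("//table/tbody")!)
{
    // Leading "//": evaluated from the document root, ignoring 'table'.
    absolute.Add(table.SelectSingleNode("//*[text()='Name']/following::td")?.InnerText ?? "(null)");

    // Leading ".//": evaluated from the current 'table' node.
    relative.Add(table.SelectSingleNode(".//*[text()='Name']/following::td")?.InnerText ?? "(null)");
}

Console.WriteLine("absolute: " + string.Join(" | ", absolute));
Console.WriteLine("relative: " + string.Join(" | ", relative));
```

In this sketch the absolute path does the same document-wide search on every pass, so it cannot tell the tables apart; the relative path picks up each table's own value.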

Update: Ah, yes, per your comment, the following:: axis can return multiple td elements now that there are multiple tables, because it selects every td after the matched node in document order, including those in later tables.

Grab just the first,

 (.//*[text()='Name']/following::td)[1]

or, better, note the difference between testing text() nodes and testing string values in XPath (see "Testing text() nodes vs string values in XPath"), and use the following-sibling:: axis instead:

 .//td[.='Name']/following-sibling::td
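
A minimal sketch of why this last form is tighter, under the same assumptions as before: the question's snippet wrapped in a hypothetical <root> element and parsed with System.Xml's XmlDocument rather than the HTML parser used in the question. It counts what each expression selects from the first table and then runs the sibling-based lookup per table.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Assumption: the HTML from the question, wrapped in a hypothetical <root>
// element so that it parses as XML with System.Xml.
var doc = new XmlDocument();
doc.LoadXml(
    "<root>" +
    "<table class='resultsTable'><tbody><tr class='even'>" +
    "<td width='35%'><strong>Name</strong></td><td>ACME ANVILS, INC.</td>" +
    "</tr></tbody></table>" +
    "<table class='resultsTable'><tbody><tr class='even'>" +
    "<td width='35%'><strong>Name</strong></td><td>ROAD RUNNER RACES, LLC</td>" +
    "</tr></tbody></table>" +
    "</root>");

XmlNode first = doc.SelectSingleNode("//table/tbody")!;

// text() tests the td's own text-node children; here "Name" sits inside <strong>.
int byTextNode = first.SelectNodes(".//td[text()='Name']")!.Count;

// "." tests the td's string value: all descendant text concatenated.
int byStringValue = first.SelectNodes(".//td[.='Name']")!.Count;

// following:: keeps going past the end of the table; following-sibling:: stays in the row.
int viaFollowing = first.SelectNodes(".//*[text()='Name']/following::td")!.Count;
int viaSibling = first.SelectNodes(".//td[.='Name']/following-sibling::td")!.Count;

var names = new List<string>();
foreach (XmlNode table in doc.SelectNodes("//table/tbody")!)
{
    var cell = table.SelectSingleNode(".//td[.='Name']/following-sibling::td");
    names.Add(cell?.InnerText ?? "(not found)");
}

Console.WriteLine($"td[text()='Name']: {byTextNode}, td[.='Name']: {byStringValue}");
Console.WriteLine($"following::td: {viaFollowing}, following-sibling::td: {viaSibling}");
Console.WriteLine(string.Join(" | ", names));
```

Because following-sibling:: never leaves the row, no [1] is needed even when more tables are added to the page.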