prehawk - 24 days ago
CSS Question

XPath: how to extract nodes from a path containing an unknown tag name?

<html><body>
<div id="start">
<div>
<div>NOT A TARGET</div>
</div>
<aBcDeFG>
<div>target</div>
</aBcDeFG>
</div>
</body></html>


There is a document similar to this one. The <aBcDeFG> tag is a random tag generated on every page refresh. I wrote an XPath expression with a wildcard to locate the target div:

$x('/html/body/div/*/div')


The expression returns two divs, because NOT A TARGET is matched as well: [div, div].

$x('/html/body/div/*[2]/div') doesn't work; the return value is empty.

$x('/html/body/div/node()[2]/div') doesn't work either; the return value is empty.

How can I locate an unknown tag just by its index?

Answer

Your /html/body/div/*[2]/div selector is correct. I suspect the difficulty is in reading the node out of the result in the developer tools console: $x returns an array of matches, so take its first element. Try this:

$x('/html/body/div/*[2]/div')[0].innerHTML

Example using document.evaluate

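// Evaluate the XPath against the whole document; for a node-set, ANY_TYPE yields an iterator.
var r = document.evaluate('/html/body/div/*[2]/div', document, null, XPathResult.ANY_TYPE, null);
// iterateNext() returns the first matching node, or null when nothing matched.
var n = r.iterateNext();
console.log(n ? n.innerHTML : 'no match');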
<html><body>
<div id="start">
   <div>
      <div>NOT A TARGET</div>
   </div>
   <aBcDeFG>
      <div>target</div>
   </aBcDeFG>
</div>
</body></html>
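
If you want every match as an array outside the console (where $x is not available), a snapshot result is probably the easiest route. The following is only a minimal sketch: xpathAll is a hypothetical helper name I made up, not a browser API, and the constant 7 is assumed to stand in for XPathResult.ORDERED_NODE_SNAPSHOT_TYPE so that the function does not depend on that global being defined.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM standard.
// Using the number keeps the helper loadable where XPathResult is not a global.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// xpathAll is a hypothetical helper name, not part of any browser API.
// It returns every node matched by `expr` as a plain array, in document order.
function xpathAll(expr, doc) {
  const snapshot = doc.evaluate(expr, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const nodes = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Browser usage (needs a real DOM):
//   const matches = xpathAll('/html/body/div/*[2]/div', document);
//   console.log(matches.length, matches[0] && matches[0].innerHTML);
```

An empty array then means the expression matched nothing, which is easier to check than a null from iterateNext(). A snapshot also stays valid if the page changes after the query, whereas an iterator result is invalidated by DOM mutations.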