Dave - 6 days ago
Ruby Question

How do I find direct children and not nested children using Rails and Nokogiri?

I’m using Rails 4.2.7 with Ruby 2.3 and Nokogiri. How do I find only the direct tr children of a table, as opposed to rows nested deeper inside it? Currently I find the rows within a table like so:

tables = doc.css('table')
tables.each do |table|
  rows = table.css('tr')
  # ...
end


This not only finds direct rows of a table, e.g.

<table>
<tbody>
<tr>…</tr>


but it also finds rows of tables nested inside those rows, e.g.

<table>
<tbody>
<tr>
<td>
<table>
<tr>This is found</tr>
</table>
</td>
</tr>


How do I refine my search to only find the direct tr elements?

Answer

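You can do it in a couple of steps using XPath. First find the “level” of the table, i.e. how deeply it is nested in other tables, counting the table itself. Then select the descendant tr elements that have exactly that many table ancestors. A row that belongs to a nested table has at least one more table ancestor, so it is filtered out: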

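tables = doc.xpath('//table')
tables.each do |table|
  # Nesting depth of this table, counting the table itself.
  # XPath count() is returned as a Float, so convert it to an Integer.
  level = table.xpath('count(ancestor-or-self::table)').to_i
  # Keep only the rows whose number of table ancestors matches that depth.
  rows = table.xpath(".//tr[count(ancestor::table) = #{level}]")
  # do what you want with rows...
end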