sayth - 5 months ago
Ruby Question

How to get descendant nodes from XML based on an attribute

I'm trying to get descendant children of a node:

require 'nokogiri'

@doc = Nokogiri::XML(File.open('data/20160521RHIL0.xml'))
nom_id = @doc.xpath('//race/nomination/@id')

race_id.each do |x|
puts race_id.traverse {|race_id| puts nom_id }
end


I'm looking at two sources of info:


  1. The documentation for Nokogiri::XML::Node, which has

    Nokogiri::XML::Node#children

  2. sparklemotion's Cheat-sheet:

    node.traverse {|node| } # yields all children and self to a block, _recursively_

This is my test XML:

<meeting id="42977">
<race id="215411">
<nomination number="8" saddlecloth="8" horse="Chipanda" id="198926" />
<nomination number="2" saddlecloth="2" horse="Chifries" id="198965" />
<nomination number="1" saddlecloth="1" horse="Itpanda" id="199260" />
</race>
<race id="215412">
<nomination number="1" saddlecloth="1" horse="Ruby" id="199634" />
<nomination number="2" saddlecloth="2" horse="Gems" id="208926" />
<nomination number="3" saddlecloth="3" horse="Rock" id="122923" />
</race>
</meeting>


I can use XPath to easily get the race id:

require 'nokogiri'

@doc = Nokogiri::XML(File.open('data/20160521RHIL0.xml'))

race_id = @doc.xpath('//race/@id')
nom_id = @doc.xpath('//race/nomination/@id')

...
215411
215412


How can I get the id and number of each nomination for just race_id 215411 and store them in a hash (like below)?

{215411 => [{id:198926, number:8},{id:198965, number:2}]}

Answer
require 'nokogiri'

# xml data
str =<<-EOS
<meeting id="42977">
  <race id="215411">
    <nomination number="8" saddlecloth="8" horse="Chipanda" id="198926" />
    <nomination number="2" saddlecloth="2" horse="Chifries" id="198965" />
    <nomination number="1" saddlecloth="1" horse="Itpanda" id="199260" />
  </race>
  <race id="215412">
    <nomination number="1" saddlecloth="1" horse="Ruby" id="199634" />
    <nomination number="2" saddlecloth="2" horse="Gems" id="208926" />
    <nomination number="3" saddlecloth="3" horse="Rock" id="122923" />
  </race>
</meeting>
EOS

# create doc
doc = Nokogiri::XML(str)

# clean; via http://stackoverflow.com/a/1528247
doc.xpath('//text()[not(normalize-space())]').remove

# parse doc
parsed_doc = doc.xpath('//race').inject({}) {|h,x| h[x.get_attribute('id').to_i] = x.children.map {|y| {id: y.get_attribute('id').to_i, number: y.get_attribute('number').to_i}}; h}
# {215411=>
#  [{:id=>198926, :number=>8},
#   {:id=>198965, :number=>2},
#   {:id=>199260, :number=>1}],
# 215412=>
#  [{:id=>199634, :number=>1},
#   {:id=>208926, :number=>2},
#   {:id=>122923, :number=>3}]}

# select via id
parsed_doc.select {|k,v| k == 215411}
# {215411=>
#  [{:id=>198926, :number=>8},
#   {:id=>198965, :number=>2},
#   {:id=>199260, :number=>1}]}

Here's the one-liner as a multi-liner:

parsed_doc = doc.xpath('//race').inject({}) do |h,x|
  h[x.get_attribute('id').to_i] = x.children.map do |y|
    {
      id: y.get_attribute('id').to_i,
      number: y.get_attribute('number').to_i
    }
  end
  h
end