Oscar321 - 6 months ago
Ruby Question

How to get path from XML file using Nokogiri

I have the following XML file named "test.xml":

<REGSET>
<PATHELEMENT node="REGISTRY">
<PATHELEMENT node="CLASSES">
<PATHELEMENT node="(Default)" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="MACHINE">
<PATHELEMENT node="SOFTWARE">
<PATHELEMENT node="Wow6432Node">
<PATHELEMENT node="Microsoft">
<PATHELEMENT node="Windows">
<PATHELEMENT node="CurrentVersion">
<PATHELEMENT node="Uninstall">
<PATHELEMENT node="pgAgent 3.3.0-5">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="PgBouncer 1.5.4-3">
<PATHELEMENT node="(Default)" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="PostgreSQL 8.3">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="SYSTEM">
<PATHELEMENT node="CurrentControlSet">
<PATHELEMENT node="Control">
<PATHELEMENT node="Session Manager">
<PATHELEMENT node="Environment">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="Services">
<PATHELEMENT node="pgAgent">
<PATHELEMENT node="Enum">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="Security">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="pgbouncer">
<PATHELEMENT node="Enum">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="Security">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="postgresql-8.3">
<PATHELEMENT node="Enum">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="Security">
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
<PATHELEMENT node="(Default)
" keyValue="true"/>
</PATHELEMENT>
</REGSET>


I need to access elements like the following:

\REGISTRY\CLASSES\(Default)

\MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\pgAgent 3.3.0-5\(Default)

\MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\PgBouncer 1.5.4-3\(Default)


How do I read the XML with Nokogiri and retrieve the values from those elements?

Answer

Using XPath, first select the leaf nodes, that is, the elements that have no child elements (".//REGSET//*[not(*)]"). Then, for each leaf, walk back up the tree and prepend each parent's node attribute to the path until the parent is <REGSET>. The code should be something like this:

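       require 'nokogiri'

       doc = Nokogiri::XML(File.read('test.xml'))

       # Leaf elements are the ones with no child elements.
       doc.xpath('.//REGSET//*[not(*)]').each do |leaf|
         key_value = leaf['keyValue']
         # Some node attributes in the sample end with a line break, so strip them.
         path = leaf['node'].to_s.strip
         node = leaf
         # Walk up the tree, prepending each parent's name, until we reach <REGSET>.
         while node.parent.name != 'REGSET'
           node = node.parent
           path = node['node'].to_s.strip + '\\' + path
         end
         puts "\\#{path} (keyValue=#{key_value})"
       end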