Hightower - 6 months ago
Python Question

Recursively search for parent-child combinations and build a tree in Python from XML

I am trying to traverse this XML data, which is full of parent->child relationships, and need a way to build a tree from it. Any help would be really appreciated. Also, in this case, is it better to use attributes or nodes for the parent->child relationship?

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodes>
<node name="Car" child="Engine"/>
<node name="Car" child="Wheel"/>
<node name="Engine" child="Piston"/>
<node name="Engine" child="Carb"/>
<node name="Carb" child="Bolt"/>
<node name="Spare Wheel"/>
<node name="Bolt" child="Thread"/>
<node name="Carb" child="Float"/>
<node name="Truck" child="Engine"/>
<node name="Engine" child="Bolt"/>
<node name="Wheel" child="Hubcap"/>
</nodes>


This is what I have in the Python script so far. My brain is fried and I cannot get the logic right. Please help.

import xml.etree.ElementTree as ET
tree = ET.parse('rec.xml')
root = tree.getroot()
def find_node(data,search):
    #str = root.find('.//node[@child="1.2.1"]')
    for node in data.findall('.//node'):
        if node.attrib['name']==search:
            print('Child-->', node)

for nodes in root.findall('node'):
    parent = nodes.attrib.get('name')
    child = nodes.attrib.get('child')
    print (parent,'-->', child)
    find_node(root,child)


A possible expected output is something like this (I really don't care about the sort order, as long as all node items are represented somewhere in the tree):

Car --> Engine --> Piston
Car --> Engine --> Carb --> Float
Car --> Engine --> Carb --> Bolt --> Thread
Car --> Wheel --> Hubcaps
Truck --> Engine --> Piston
Truck --> Engine --> Carb --> Bolt --> Thread
Truck --> Loading Bin
Spare Wheel -->

Answer

rec.xml:

<?xml version="1.0"?>
<nodes>
    <node name="Car" child="Engine"></node>
    <node name="Engine" child="Piston"></node>
    <node name="Engine" child="Carb"></node>
    <node name="Car" child="Wheel"></node>
    <node name="Wheel" child="Hubcaps"></node>
    <node name="Truck" child="Engine"></node>
    <node name="Truck" child="Loading Bin"></node>
    <node name="Piston" child="Loa"></node>
    <node name="Piston" child="Loaqq"></node>
    <node name="Piston" child="Loaww"></node>
    <node name="Loaww" child="Loawwqqqqq"></node>
    <node name="Spare Wheel" child=""></node>
</nodes>

parse.py:-

import xml.etree.ElementTree as ET

tree = ET.parse('rec.xml')
root = tree.getroot()
data = {}        # parent name -> list of child names
child_list = []  # every name that appears as a child


def recursive_print(string, x):
    # Extend the path with each child of x and print once a leaf is reached.
    if x in data:
        for x_child in data[x]:
            if x_child in data:
                recursive_print(string + '-------->' + x_child, x_child)
            else:
                print(string + '-------->' + x_child)
    else:
        print(string)


for nodes in root.findall('node'):
    parent = nodes.attrib.get('name')
    child = nodes.attrib.get('child', '')  # a missing child attribute counts as empty
    child_list.append(child)
    if parent not in data:
        data[parent] = []
    data[parent].append(child)

# Roots are names that never appear as a child.
# Lines are printed in dict iteration order, which can vary between Python versions.
for key in data:
    if key not in child_list:
        for x in data[key]:
            string = key + '------->' + x
            recursive_print(string, x)

output:-

Spare Wheel------->
Car------->Engine-------->Piston-------->Loa
Car------->Engine-------->Piston-------->Loaqq
Car------->Engine-------->Piston-------->Loaww-------->Loawwqqqqq
Car------->Engine-------->Carb
Car------->Wheel-------->Hubcaps
Truck------->Engine-------->Piston-------->Loa
Truck------->Engine-------->Piston-------->Loaqq
Truck------->Engine-------->Piston-------->Loaww-------->Loawwqqqqq
Truck------->Engine-------->Carb
Truck------->Loading Bin
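
If you would rather get the paths back as data instead of printing them, here is a rough sketch of the same idea. It is not the only way to do it. The names build_children, iter_paths and all_paths are made up for this example, the XML string is a trimmed copy of the data from the question, and the cycle check is an extra assumption (the question's data has no cycles).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the question's data, inlined so the sketch runs on its own.
XML = """<nodes>
    <node name="Car" child="Engine"/>
    <node name="Car" child="Wheel"/>
    <node name="Engine" child="Piston"/>
    <node name="Engine" child="Carb"/>
    <node name="Carb" child="Bolt"/>
    <node name="Spare Wheel"/>
    <node name="Bolt" child="Thread"/>
    <node name="Wheel" child="Hubcap"/>
</nodes>"""


def build_children(root):
    """Map each parent name to its list of child names, in document order."""
    children = defaultdict(list)
    for node in root.findall('node'):
        name = node.get('name')
        child = node.get('child')
        children[name]  # touching the key registers childless nodes too
        if child:       # skip a missing or empty child attribute
            children[name].append(child)
    return children


def iter_paths(children, name, path=()):
    """Yield every path from name down to a leaf, as a list of names."""
    path = path + (name,)
    # Ignore children already on the current path so a cycle cannot recurse forever.
    kids = [k for k in children.get(name, []) if k not in path]
    if not kids:
        yield list(path)
    for kid in kids:
        yield from iter_paths(children, kid, path)


def all_paths(root):
    """Return all root-to-leaf paths; a root is a name that is never a child."""
    children = build_children(root)
    is_child = {c for kids in children.values() for c in kids}
    roots = [name for name in children if name not in is_child]
    return [p for r in roots for p in iter_paths(children, r)]


if __name__ == '__main__':
    for p in all_paths(ET.fromstring(XML)):
        print(' --> '.join(p))
```

On attributes versus nested nodes: since a part like Engine sits under both Car and Truck, flat name/child pairs let you list each relationship once, while nesting would probably mean repeating the whole Engine subtree under every parent. That is a judgment call and depends on who else consumes the file.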