Madhavan Kumar - 10 months ago
Python Question

ElementTree - findall to recursively select all child elements

Python code:

import xml.etree.ElementTree as ET
root = ET.parse("h.xml").getroot()
print(root.findall('saybye'))

h.xml code:


Code outputs,

[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]

The <saybye> element which is a child of another <saybye> is not selected here. So, how do I instruct findall to recursively walk down the DOM tree and collect all three <saybye> elements?

Answer Source

Quoting the documentation for findall,

Element.findall() finds only elements with a tag which are direct children of the current element.
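
To see that behaviour in isolation, here is a minimal sketch. The inline XML is only an assumed stand-in for h.xml (two top-level saybye elements, one of them containing a nested saybye); the real file may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for h.xml: two top-level <saybye> elements,
# the first of which contains a nested <saybye>.
SAMPLE = """<root>
    <saybye>
        <saybye/>
    </saybye>
    <saybye/>
</root>"""

root = ET.fromstring(SAMPLE)

# A bare tag name matches direct children of root only.
direct = root.findall('saybye')
print(len(direct))

# The nested element is reachable only by searching from its parent.
nested = direct[0].findall('saybye')
print(len(nested))
```

With this sample, the first search returns the two top-level elements and the second returns the single nested one.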

Since it finds only the direct children, we need to recursively find other children, like this

>>> import xml.etree.ElementTree as ET
>>> def find_rec(node, element, result):
...     # Collect direct children with the requested tag, then recurse into each
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]

Even better, make it a generator function, like this

>>> def find_rec(node, element):
...     # Yield each matching direct child, then everything found beneath it
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
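
As an alternative that is not part of the answer above, ElementTree's limited XPath support and Element.iter() can do the recursive walk without a helper. The sketch below assumes Python 3 (for yield from) and uses a hypothetical inline document instead of h.xml. It also includes an extra wrapper element, added purely for illustration, to show where the approaches differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the three <saybye> elements described in the
# question, plus one extra <saybye> under a differently named parent.
SAMPLE = """<root>
    <saybye>
        <saybye/>
    </saybye>
    <saybye/>
    <wrapper>
        <saybye/>
    </wrapper>
</root>"""

root = ET.fromstring(SAMPLE)


def find_rec(node, tag):
    """Python 3 form of the recursive generator, using yield from."""
    for item in node.findall(tag):
        yield item
        yield from find_rec(item, tag)


# Descends only through elements that themselves match the tag.
chained = list(find_rec(root, 'saybye'))

# './/tag' matches every descendant with that tag, at any depth.
by_path = root.findall('.//saybye')

# iter(tag) walks the whole subtree in document order.
by_iter = list(root.iter('saybye'))

print(len(chained), len(by_path), len(by_iter))
```

The recursive helper only follows saybye-inside-saybye chains, so it misses the element under wrapper, while './/saybye' and iter('saybye') pick it up. Note that iter() also yields the starting element itself if its tag matches.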