Madhavan Kumar - 7 months ago
Python Question

ElementTree - findall to recursively select all child elements

Python code:

import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print root.findall('saybye')

h.xml code:


The code outputs:

[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]

The saybye element that is a child of another saybye element is not selected here. So, how do I instruct findall to recursively walk down the DOM tree and collect all three saybye elements?


Quoting the findall documentation:

Element.findall() finds only elements with a tag which are direct children of the current element.

Since it finds only direct children, we need to call it again on each match to collect the nested ones, like this:

>>> import xml.etree.ElementTree as ET
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]

Even better, make it a generator function, like this:

>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
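Note that the recursive helpers above only descend through elements that themselves match, so a saybye nested inside some other tag would be missed. ElementTree also has built-in ways to search at any depth: the './/tag' path form of findall, and Element.iter. Here is a minimal sketch comparing them. The contents of h.xml are not shown above, so the XML string below (including the sayhello root tag) is a hypothetical stand-in that only matches the described shape: two top-level saybye elements, one of which contains a third.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for h.xml (the real file is not shown in the question):
# two saybye elements directly under the root, one of them holding a nested saybye.
XML = """
<sayhello>
    <saybye>
        <saybye/>
    </saybye>
    <saybye/>
</sayhello>
"""

root = ET.fromstring(XML)

# Plain tag name: direct children of root only.
direct = root.findall('saybye')

# './/' prefix: all descendants at any depth, in document order.
nested = root.findall('.//saybye')

# iter() walks the whole subtree; unlike './/', it would also yield
# root itself if root's tag matched.
via_iter = list(root.iter('saybye'))

print(len(direct))    # 2
print(len(nested))    # 3
print(len(via_iter))  # 3
```

With a file on disk, the same calls should work on ET.parse("h.xml").getroot().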