randombits - 2 months ago
Python Question

How do I parse XML in Python?

I have many rows in a database that contain XML, and I'm trying to write a Python script that will go through those rows and count how many instances of a particular node attribute show up. For instance, my tree looks like:

<type foobar="1"/>
<type foobar="2"/>

How can I access the attribute values 1 and 2 in the XML using Python?


I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself (deprecated since Python 3.3 and removed in 3.9, because xml.etree.ElementTree now picks up the C accelerator automatically). In this context, what they chiefly add is even more speed; the ease of programming depends on the API, which ElementTree defines.

After building an Element instance e from the XML, e.g. with the XML function, or by parsing a file with something like

import xml.etree.ElementTree
e = xml.etree.ElementTree.parse('thefile.xml').getroot()

or any of the many other ways shown in the ElementTree documentation, you just do something like:

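for atype in e.findall('type'):
    # get() returns the attribute value as a string, or None if it is absent
    print(atype.get('foobar'))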

and similar, usually pretty simple, code patterns.
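For the counting part of the question, here is a minimal sketch of one way to do it. It assumes each database row has already been fetched as an XML string. The `rows` list, the wrapping `<root>` element, and the `count_attribute` helper are made up for illustration; they are not part of ElementTree or of the original question.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for the XML strings fetched from the database rows.
# The <root> wrapper is an assumption: a document needs a single root element.
rows = [
    '<root><type foobar="1"/><type foobar="2"/></root>',
    '<root><type foobar="1"/></root>',
]

def count_attribute(xml_rows, tag="type", attr="foobar"):
    """Count how often each value of `attr` appears on `tag` elements."""
    counts = Counter()
    for xml_text in xml_rows:
        root = ET.fromstring(xml_text)  # parse from a string instead of a file
        # iter() walks the whole subtree (including the root itself),
        # so nested <type> elements are counted too
        for elem in root.iter(tag):
            value = elem.get(attr)
            if value is not None:
                counts[value] += 1
    return counts

counts = count_attribute(rows)
print(counts)  # Counter({'1': 2, '2': 1})
```

Note that findall('type') only matches direct children of the element it is called on, while iter('type') searches the whole tree, so pick whichever matches the shape of your XML.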