Alex Morega - 7 months ago
Python Question

How to stream XML output quickly from Python

What is a quick way of writing an XML file iteratively (i.e. without having the whole document in memory)?

xml.sax.saxutils.XMLGenerator works, but it is slow: around 1 MB/s on an i7 machine. Here is a test case.
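
For context, the XMLGenerator approach mentioned above looks roughly like the following. This is only a minimal sketch, not the test case referred to in the question; the function name write_items and the element names root and item are made up for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_items(stream, count):
    """Stream `count` <item> elements to a text stream, one at a time."""
    gen = XMLGenerator(stream, encoding="utf-8")
    gen.startDocument()              # writes the XML declaration
    gen.startElement("root", {})
    for i in range(count):
        # Each element is written out immediately; no tree is kept in memory.
        gen.startElement("item", {"id": str(i)})
        gen.characters("value %d" % i)
        gen.endElement("item")
    gen.endElement("root")
    gen.endDocument()


# Illustrative usage: write three items to an in-memory text buffer.
buf = io.StringIO()
write_items(buf, 3)
xml_text = buf.getvalue()
```

Every element costs several Python-level method calls, which is a plausible reason for the low throughput.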

Answer

I realize this question was asked a while ago, but in the meantime lxml has introduced an API that looks promising for this problem: http://lxml.de/api.html ; see the section "Incremental XML generation".
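
A minimal sketch of that API, assuming lxml is installed, might look like this. It is not the code I benchmarked; write_items and the element names root and item are hypothetical and only there for illustration.

```python
def write_items(stream, count):
    """Write `count` <item> elements to a binary stream incrementally."""
    # lxml is a third-party package (pip install lxml). It is imported
    # inside the function so the dependency is explicit and local.
    from lxml import etree

    with etree.xmlfile(stream, encoding="utf-8") as xf:
        with xf.element("root"):
            for i in range(count):
                # Build one small element, serialize it, then let it go;
                # the full document is never held in memory.
                item = etree.Element("item", id=str(i))
                item.text = "value %d" % i
                xf.write(item)
```

To use it, pass any binary file-like object, for example a file opened with open("out.xml", "wb") or an io.BytesIO() buffer.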

I quickly tested it by streaming a 10M file, just as in your benchmark, and it took a fraction of a second on my old laptop. That is by no means a scientific measurement, but it is in the same ballpark as your generate_large_xml() function.