Max Cohen - 25 days ago
Python Question

Passing XML through XSLT sheet results in duplicated nodes

I have a long XML document that I'm editing with Python. It's generated every month and then edited down by hand, and I'm trying to automate at least some of that process. The first part works fine. However, when I run the document through a final reordering step using an XSLT stylesheet (change()), the output contains the reordered elements followed by the original elements in their original order, and I have no idea why. I thought it was because I was rewriting the same file over and over, but the duplicates don't appear until after change() runs. So I assume it has something to do with how I'm using XSLT, but I'm a real beginner at that.

from __future__ import print_function

from lxml import etree
import xml.etree.ElementTree as et

def adultSmash():
    def adultGrab():  # copy all adult events into dest_tree.xml
        src_tree = et.parse('quartertwo.xml')
        src_root = src_tree.getroot()
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in src_root.findall('event'):
            agerange = event.find('AgeRanges')
            if agerange is None:
                continue
            if agerange.text == 'Adult':
                dest_root.append(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def clean():  # drop the book-related event types
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_root.findall('event'):
            book = event.find('EventType')
            if book.text in ('Book Groups', 'Book Sales', 'Bookmobile Stop'):
                dest_root.remove(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def cleanNodes():  # strip the internal <Notes> from every event
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_tree.findall('event'):
            for notes in event.findall('Notes'):
                event.remove(notes)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def change():  # reorder each event's children with change.xslt
        # XSLT is provided by lxml (etree), not by xml.etree.ElementTree (et)
        dom = etree.parse('dest_tree.xml')
        xslt = etree.parse('change.xslt')
        transform = etree.XSLT(xslt)
        newdom = transform(dom)
        with open('dest_tree.xml', 'w') as log:
            print(str(newdom), file=log)

    adultGrab()
    clean()
    cleanNodes()
    change()


This is the XML:

<?xml version="1.0" encoding="utf-8"?>
<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
</events>


This is the XSLT I'm using to change it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" />
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="event">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="title" />
      <xsl:apply-templates select="RelatedLocations" />
      <xsl:apply-templates select="Date" />
      <xsl:apply-templates select="DateYear" />
      <xsl:apply-templates select="DateMonth" />
      <xsl:apply-templates select="DateDay" />
      <xsl:apply-templates select="Body" />
      <xsl:apply-templates select="AgeRanges" />
      <xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

And finally, this is the result:

<?xml version="1.0" encoding="UTF-8"?>
<events>
  <event>
    <title>Blah</title>
    <RelatedLocations>Derp</RelatedLocations>
    <Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
    <DateYear>2016</DateYear>
    <DateMonth>10</DateMonth>
    <DateDay>01</DateDay>
    <Body>Blah</Body>
    <AgeRanges>Adult</AgeRanges>
    <AgeRanges>Adult</AgeRanges>
    <title></title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
    <DateYear>2016</DateYear>
    <DateMonth>10</DateMonth>
    <DateDay>01</DateDay>
    <Body>Blah</Body>

So any help would be appreciated.

Answer

You are getting duplicate nodes in your output because you are applying templates to the same nodes twice. For example, you do:

<xsl:apply-templates select="title" />

and then:

<xsl:apply-templates select="*[not(self::Location or self::EventType)]" />

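The title element is neither Location nor EventType, so the second instruction applies templates to it again. The same is true of every other child you selected explicitly (RelatedLocations, Date, DateYear, and so on), which is why each of them appears a second time in its original position.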
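
One way to avoid this in the stylesheet is to add the already-selected element names to the not(...) predicate of the last instruction, so that each child is selected exactly once. The same "place each child once" idea can also be sketched in plain Python with ElementTree, which the question's script already uses for its earlier steps. This is only an illustration of the principle and an alternative to XSLT, not the method used above; the names FIRST, DROPPED and reorder_event are made up for the example, and the tag order is copied from the question's stylesheet.

```python
import xml.etree.ElementTree as ET

# Order copied from the question's stylesheet (assumption: this is the wanted order).
FIRST = ["title", "RelatedLocations", "Date", "DateYear",
         "DateMonth", "DateDay", "Body", "AgeRanges"]
# Tags the stylesheet's last instruction leaves out of the output.
DROPPED = {"Location", "EventType"}


def reorder_event(event):
    """Reorder one <event> in place so that every kept child appears exactly once."""
    children = list(event)
    # Named children first, in the requested order.
    ordered = [c for tag in FIRST for c in children if c.tag == tag]
    # Then the remaining children, skipping dropped tags and anything already placed.
    ordered += [c for c in children
                if c.tag not in DROPPED and c.tag not in FIRST]
    for child in children:
        event.remove(child)
    event.extend(ordered)
    return event


# Small hypothetical sample in the shape of the question's XML.
sample = ET.fromstring(
    "<events><event>"
    "<EventType>Blah</EventType><title>Blah Blah</title>"
    "<Notes>n</Notes><Body>Derp</Body><AgeRanges>Adult</AgeRanges>"
    "</event></events>"
)
for ev in sample.findall("event"):
    reorder_event(ev)

tags = [child.tag for child in sample.find("event")]
print(tags)
```

The second list comprehension plays the role of the catch-all apply-templates: it excludes both the dropped tags and the tags that were already placed, which is the exclusion the original predicate is missing.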
