er4z0r er4z0r - 5 months ago 30
Java Question

Remove Empty Attributes from XML

I have a buggy XML document that contains empty attributes, and a parser that chokes on them.
I have no control over how the XML is generated, nor over the parser. So what I want is a pre-processing step that simply removes all empty attributes.

I have managed to find the empty attributes, but now I don't know how to remove them:

XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
XPathExpression expr = xpath.compile("//@*");
Object result = expr.evaluate(d, XPathConstants.NODESET);

if (result != null) {
    NodeList nodes = (NodeList) result;
    for (int node = 0; node < nodes.getLength(); node++) {
        Node n = nodes.item(node);
        if (isEmpty(n.getTextContent())) {
            this.log.warn("Found empty attribute declaration " + n.toString());
            NamedNodeMap parentAttrs = n.getParentNode().getAttributes();
            parentAttrs.removeNamedItem(n.getNodeName());
        }
    }
}


This code throws a NullPointerException (NPE) at n.getParentNode().getAttributes().
But how can I remove the empty attribute from an element when I cannot access the element?

Answer

The following stylesheet copies all content of the source document except attributes that are empty or contain only whitespace. The first template is the identity template: it copies everything, including empty attributes. The second template matches only attributes whose whitespace-normalized value is empty. Because its match pattern uses a predicate, it has a higher default priority than the more general first template, so it is chosen whenever an empty attribute is encountered. Since it has no body, it generates no output and the attribute is dropped.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> 
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="@*[normalize-space()='']"/>
</xsl:stylesheet>
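
For completeness, here is a rough sketch of how the stylesheet could be applied from Java as the pre-processing step, using the standard javax.xml.transform API. It also includes a DOM-only alternative: in the DOM, an attribute node has no parent (getParentNode() returns null for an Attr, which would explain the NPE in the question), so the owning element has to be reached through Attr.getOwnerElement() instead. The class name EmptyAttributeStripper and the method names stripWithXslt, stripWithDom and parse are made up for this illustration, and the code assumes the XML is small enough to hold in memory as a String.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
final class EmptyAttributeStripper {

    // The stylesheet from the answer above, inlined as a string.
    private static final String XSLT =
          "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"@*[normalize-space()='']\"/>"
        + "</xsl:stylesheet>";

    // Option 1: run the XML through the stylesheet and return the cleaned text.
    static String stripWithXslt(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Option 2: stay in the DOM and remove the attributes in place.
    static void stripWithDom(Document doc) throws Exception {
        // Select only attributes that are empty or whitespace-only.
        NodeList attrs = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//@*[normalize-space()='']", doc, XPathConstants.NODESET);
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            // An Attr has no parent node; its element is the "owner element".
            attr.getOwnerElement().removeAttributeNode(attr);
        }
    }

    // Small convenience for parsing a String into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

Either option should do as a pre-processing step: stripWithXslt fits when the buggy XML arrives as text and the downstream parser wants text again, while stripWithDom fits when a Document (the variable d in the question) is already at hand.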