Luigi Cortese - 27 days ago
Java Question

Anonymizing XML: how to remove the data while leaving the tags in place, in Java?

Given an XML structure held in a String, I'm looking for a way to replace the data with four asterisks while leaving the tags in place. That is, starting from this:

<one> <two> abc </two> <two> def </two> </one>


I want it to become

<one> <two> **** </two> <two> **** </two> </one>


I've tried

requestBody.replaceAll(">[^<]+?<","> **** <")


but this also matches the whitespace between two adjacent tags, so I end up with

<one> **** <two> **** </two> **** <two> **** </two> **** </one>


How could I achieve my goal? Any suggestions?
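
A note on the regex attempt: the pattern over-matches because `[^<]+?` is also satisfied by a run of whitespace only. Below is a minimal sketch of a narrower pattern that requires at least one non-whitespace character between `>` and `<`. It assumes the input has no comments, CDATA sections, or attribute values containing `>`, since a regex cannot reliably handle those. The names `RegexMasker` and `mask` are made up for this example.

```java
import java.util.regex.Pattern;

final class RegexMasker {
    // Text between '>' and '<' containing at least one non-whitespace character.
    // Lookbehind/lookahead keep the angle brackets themselves out of the match.
    private static final Pattern NON_BLANK_TEXT =
            Pattern.compile("(?<=>)[^<]*[^<\\s][^<]*(?=<)");

    // Replaces every non-blank text run with " **** ", leaving
    // whitespace-only gaps between adjacent tags untouched.
    static String mask(String xml) {
        return NON_BLANK_TEXT.matcher(xml).replaceAll(" **** ");
    }

    public static void main(String[] args) {
        String input = "<one> <two> abc </two> <two> def </two> </one>";
        System.out.println(mask(input));
        // prints: <one> <two> **** </two> <two> **** </two> </one>
    }
}
```

This only works for simple, well-behaved input like the sample above. For arbitrary XML, a real XML tool is the safer route.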

Here for some tests.

Edit



Following Michael Kay's suggestion, I arrived at this solution:

// Requires imports from java.io (StringReader, StringWriter),
// javax.xml.transform and javax.xml.transform.stream.

/**
 * Anonymizes an XML structure, replacing all data between tags with 4 asterisks.
 * Tags won't be replaced.
 *
 * @param xmlInput the string representing the XML to be anonymized
 * @return the anonymized XML structure, or null if the transformation fails
 */
private String anonymizeXml(String xmlInput) {
    String anonymizedXml = null;
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Stylesheet: copy every element, replace every non-blank text node with " **** "
        Source xslt = new StreamSource(new StringReader("<xsl:transform version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"><xsl:template match=\"*\"> <xsl:copy> <xsl:apply-templates/> </xsl:copy></xsl:template><xsl:template match=\"text()[normalize-space()]\"> **** </xsl:template></xsl:transform>"));
        Transformer transformer = factory.newTransformer(xslt);
        Source text = new StreamSource(new StringReader(xmlInput));

        StringWriter writer = new StringWriter();
        transformer.transform(text, new StreamResult(writer));
        anonymizedXml = writer.toString();
    } catch (TransformerException e) {
        // Also covers TransformerConfigurationException, which is a subclass
        e.printStackTrace();
    }
    return anonymizedXml;
}

Answer

This is a job for a very simple XSLT transformation:

<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="*">
  <xsl:copy>
   <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

<xsl:template match="text()[normalize-space()]">****</xsl:template>
</xsl:transform>
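
To try the stylesheet above from Java, something along these lines should work with the JAXP classes that ship with the JDK. This is only a sketch: the class name `XsltMasker` and the method `mask` are invented for the example, and omitting the XML declaration is an assumption about the desired output, not part of the answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltMasker {
    // The stylesheet from the answer, inlined as a string.
    private static final String STYLESHEET =
            "<xsl:transform version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"*\"><xsl:copy><xsl:apply-templates/></xsl:copy></xsl:template>"
          + "<xsl:template match=\"text()[normalize-space()]\">****</xsl:template>"
          + "</xsl:transform>";

    // Applies the stylesheet to the given XML string and returns the result.
    static String mask(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Assumption: the caller does not want a leading <?xml ...?> line.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Every non-blank text node is replaced by ****; whitespace between tags is kept.
        System.out.println(mask("<one> <two> abc </two> <two> def </two> </one>"));
    }
}
```

Two things to be aware of. The replacement is exactly `****`, so the spaces that surrounded `abc` inside the text node are not preserved. Also, `xsl:copy` on an element does not copy its attributes, so any attributes in the input are dropped unless the stylesheet is extended to handle them.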