Java Question

XML without namespace: validate against one of several XSDs

We receive a batch of XML files on a regular basis. We have no control over them, they carry no namespace information, and we would really like to avoid changing them.

We have an XSD that we need to validate the XML files against, and it works when it is applied explicitly in code. Now we would like to hint to a SAX parser that this particular XML dialect should be validated against this XSD (which we have on the file system). The only way I have found is to put a noNamespaceSchemaLocation attribute in the XML file, which we would really like to avoid.

Suggestions? Will an EntityResolver always be called with a null/empty namespace?

(A working solution will earn a 500-point bounty as soon as I am allowed to offer one.)

Answer

Using javax.xml.validation, you can specify the XSD schema that an XML document should be validated against, without the document having to reference it:

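import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

...
// Compile the XSD once; a Schema object can be reused for many documents.
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new File("<path to the xsd>"));

SAXParserFactory spf = SAXParserFactory.newInstance();
// Schema validation generally needs a namespace-aware parser,
// even when the documents themselves use no namespace.
spf.setNamespaceAware(true);
// setValidating controls DTD validation only; leave it off when using setSchema.
spf.setValidating(false);
spf.setSchema(schema);

XMLReader xmlReader = spf.newSAXParser().getXMLReader();
xmlReader.setContentHandler(...);
// Validation errors are reported to the ErrorHandler, not thrown as exceptions,
// so register one with xmlReader.setErrorHandler(...) to see them.
xmlReader.parse(new InputSource(...)); // will validate against the schema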
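
For reference, here is a minimal self-contained sketch of the same approach that also collects the validation errors. The class name NoNamespaceValidation, the validate helper, and the tiny inline order/quantity schema are made up for illustration; in the real setup the schema would come from the XSD on the file system. It assumes a namespace-aware parser is needed for the schema to match the no-namespace elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class NoNamespaceValidation {

    /**
     * Parses the XML with a SAX parser bound to the given XSD and
     * returns the validation messages (empty if the document is valid).
     * Hypothetical helper, not part of any library.
     */
    public static List<String> validate(String xsd, String xml) throws Exception {
        SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // needed for schema validation
        spf.setValidating(false);    // DTD validation stays off
        spf.setSchema(schema);       // the XML itself never mentions the XSD

        final List<String> problems = new ArrayList<>();
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler()); // placeholder handler
        reader.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add("error: " + e.getMessage()); // schema violations land here
            }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // not well-formed: stop parsing
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative schema without a targetNamespace, matching no-namespace XML.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:element name='order'><xs:complexType><xs:sequence>"
                + "<xs:element name='quantity' type='xs:int'/>"
                + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

        System.out.println(validate(xsd, "<order><quantity>3</quantity></order>"));
        System.out.println(validate(xsd, "<order><quantity>three</quantity></order>"));
    }
}
```

The first call should return an empty list and the second a non-empty one. To handle several XSDs, one option is to build one Schema per dialect up front and pick the matching SAXParserFactory for each incoming file.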