Validating Schemas in JAXP

I try to avoid moaning in blogs, personal or especially professional.  However, having been in .NET land for so long, coming back to Java and trying to do some XML processing has simply been a nightmare.

So many magic strings, badly named methods, poorly designed interfaces and needless “pluggability” have meant that I’ve spent around 3 hours using search engines to try to find techniques for achieving what I consider to be very simple and common use cases.

Rather than a long diatribe listing all the things that I’ve found frustrating, I wanted to share just one method which, for me, sums up XML processing in Java quite neatly:

/**
 * Create a SAX parser instance that is configured to validate against the schemas used within this application. 
 * @return A SAX Parser instance, never returns null (exceptions thrown for all failures).
 * @throws SAXException If unable to create the parser.
 * @throws ParserConfigurationException If unable to create the parser.
 */
private SAXParser getParser() throws SAXException, ParserConfigurationException {
	
	// Where the XSD file is within my application resources, just one so far, but others will follow.
	final String[] schemaResourceNames = new String[] { "com/locima/xml2csv/inputparser/xml/MappingSet.xsd" };

	// So far, so good.
	SAXParserFactory factory = SAXParserFactory.newInstance();
	
	// To enable schema validation, ensure you set validating to false.  Yes, really.
	factory.setValidating(false);
	
	// Apparently, namespaces are a bit complicated, so the default is to ignore them.  Override that.
	factory.setNamespaceAware(true);
	
	// Now tell it what language (using a magic string), as the parser can't work it out for itself,
	// as if XML files could declare what they are...
	SchemaFactory schemaFactory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
	
	// Pass a set of schemas (schemata if you're feeling pedantic) to a method called setSchema <-- singular.
	factory.setSchema(schemaFactory.newSchema(getSchemasFromResourceNames(schemaResourceNames)));
	
	SAXParser parser = factory.newSAXParser();
	
	return parser;
}
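
The method above calls a getSchemasFromResourceNames helper that isn't shown, and it stops before any parsing happens. What follows is a minimal sketch of how those two gaps might be filled, not the actual code from my application: the class name SchemaValidationSketch and the newFactory and isValid methods are made up for illustration, and the body of getSchemasFromResourceNames is a guess that assumes the schemas live on the classpath. Two things worth knowing: the magic string is available as the constant XMLConstants.W3C_XML_SCHEMA_NS_URI, and DefaultHandler.error() does nothing by default, so schema violations pass silently unless you override it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: a sketch only, not the original application's code.
final class SchemaValidationSketch {

	// Assumed implementation of the helper used above: loads each XSD from the classpath.
	static Source[] getSchemasFromResourceNames(String[] resourceNames) {
		ClassLoader loader = SchemaValidationSketch.class.getClassLoader();
		Source[] sources = new Source[resourceNames.length];
		for (int i = 0; i < resourceNames.length; i++) {
			InputStream in = loader.getResourceAsStream(resourceNames[i]);
			if (in == null) {
				throw new IllegalArgumentException("Schema resource not found: " + resourceNames[i]);
			}
			// The resource name doubles as the system ID so error messages can name the schema.
			sources[i] = new StreamSource(in, resourceNames[i]);
		}
		return sources;
	}

	// Same configuration as getParser() above, but with the schemas passed in.
	static SAXParserFactory newFactory(Source[] schemas) throws SAXException {
		SAXParserFactory factory = SAXParserFactory.newInstance();
		factory.setValidating(false); // this flag is about DTD validation, not XSD
		factory.setNamespaceAware(true);
		SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
		factory.setSchema(schemaFactory.newSchema(schemas));
		return factory;
	}

	// Returns true if the document parses and validates, false on any SAX error.
	static boolean isValid(SAXParserFactory factory, String xml)
			throws ParserConfigurationException, IOException {
		// DefaultHandler.error() is a no-op, so rethrow to make validation failures visible.
		DefaultHandler strict = new DefaultHandler() {
			@Override
			public void error(SAXParseException e) throws SAXException {
				throw e;
			}
		};
		try {
			SAXParser parser = factory.newSAXParser();
			parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), strict);
			return true;
		} catch (SAXException e) {
			return false;
		}
	}
}
```

Calling newFactory(getSchemasFromResourceNames(schemaResourceNames)) once and reusing the factory avoids re-reading the XSD streams, which can only be consumed a single time.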