XHTML validation against DTD schema in Java

Recently I had to validate an XHTML file against a DTD schema in Java. Since there is little to be found on the Internet about DTD validation, I would like to explain briefly how it can be done using only standard Java resources.

What happened?

An external component delivered an XHTML file that should have conformed to “xhtml1-strict” according to the W3C schema definition. Unfortunately, the file was invalid, because no schema validation took place during or after its generation.

A downstream XML comparator (DeltaXML), which was supposed to compare this file using its default settings, crashed because the delivered XHTML was invalid.

How could the problem be solved?

A validator class, DOMValidateDTD, was written that uses only standard JDK libraries. In this example the resources (XHTML file and DTD schema) are loaded from the classpath (resources package), so example.xhtml and xhtml1-strict.dtd must be stored in that folder. Remember to set the DOCTYPE path to your own resource directory.
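As an alternative to writing a directory into the DOCTYPE, the parser can be given an org.xml.sax.EntityResolver that serves the DTD itself. The following is only a sketch under assumptions: the class name DtdResolverSketch, the tiny note.dtd and its in-memory text are made up for illustration so that the example runs without any files. In a real project the resolver would more likely return an InputSource wrapping ClassLoader.getSystemResourceAsStream("xhtml1-strict.dtd").

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdResolverSketch {

	// Hypothetical stand-in for a DTD file stored in the resources folder
	static final String DTD_NAME = "note.dtd";
	static final String DTD_TEXT = "<!ELEMENT note (to, body)>\n"
			+ "<!ELEMENT to (#PCDATA)>\n"
			+ "<!ELEMENT body (#PCDATA)>";

	static boolean isValid(String xml) {
		try {
			DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
			factory.setValidating(true);
			DocumentBuilder builder = factory.newDocumentBuilder();
			// Serve the DTD by file name, whatever directory the system ID names
			builder.setEntityResolver((publicId, systemId) -> {
				if (systemId != null && systemId.endsWith(DTD_NAME)) {
					return new InputSource(new StringReader(DTD_TEXT));
				}
				return null; // let the parser use its default lookup
			});
			builder.setErrorHandler(new ErrorHandler() {
				public void warning(SAXParseException e) {
					System.out.println(e.getMessage());
				}

				// Validation error: abort parsing
				public void error(SAXParseException e) throws SAXParseException {
					throw e;
				}

				// Document is not well-formed: abort parsing
				public void fatalError(SAXParseException e) throws SAXParseException {
					throw e;
				}
			});
			builder.parse(new InputSource(new StringReader(xml)));
			return true;
		} catch (Exception e) {
			return false;
		}
	}

	public static void main(String[] args) {
		String doctype = "<!DOCTYPE note SYSTEM \"any/dir/note.dtd\">";
		System.out.println(isValid(doctype + "<note><to>A</to><body>B</body></note>"));
		System.out.println(isValid(doctype + "<note><to>A</to></note>"));
	}
}
```

With a resolver like this, the directory part of the system identifier no longer matters, because the lookup only checks the file name.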


1. Content of example.xhtml:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html SYSTEM "<YOUR DTD PATH>/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
		<title>XHTML 1.0 Strict Example</title>
	</head>
	<body>
		<p>This is an example of an XHTML page</p>
	</body>
</html>

2. Current xhtml1-strict.dtd schema:

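The schema must be downloaded from the W3C and stored in the resources folder. The three entity files it references (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent) must be stored next to it.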


3. Validator class DOMValidateDTD.java:

package it.heber.sandbox;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class DOMValidateDTD {

	public static void main(String args[]) {
		if (validateAgainstDTD()) {
			System.out.println("XML file against DTD is valid ");
		} else {
			System.out.println("XML file against DTD is invalid ");
		}
	}
	
	static boolean validateAgainstDTD() {
		try {
			DocumentBuilderFactory factory = DocumentBuilderFactory
					.newInstance();
			factory.setValidating(true);
			DocumentBuilder builder = factory.newDocumentBuilder();
			builder.setErrorHandler(new org.xml.sax.ErrorHandler() {
				// Fatal errors (document is not well-formed): abort parsing
				public void fatalError(SAXParseException exception)
						throws SAXException {
					throw exception;
				}

				// Validation errors: report the line and abort, so that
				// the catch block below prints the message and returns false
				public void error(SAXParseException e) throws SAXParseException {
					System.out.println("Error at line " + e.getLineNumber()
							+ ".");
					throw e;
				}

				// Show warnings, but keep parsing
				public void warning(SAXParseException err)
						throws SAXParseException {
					System.out.println(err.getMessage());
				}
			});
			Document xmlDocument = builder.parse(ClassLoader
					.getSystemResourceAsStream("example.xhtml"));
			DOMSource source = new DOMSource(xmlDocument);
			StreamResult result = new StreamResult(System.out);
			TransformerFactory tf = TransformerFactory.newInstance();
			Transformer transformer = tf.newTransformer();
			transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
					"xhtml1-strict.dtd");
			transformer.transform(source, result);
			return true;
		} catch (Exception e) {
			System.out.println(e.getMessage());
			return false;
		}
	}
}
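
When a generated file contains several mistakes, it can be more convenient to see all of them in one run instead of stopping at the first. The following sketch shows one way to do that with the same JDK classes. The class name DtdErrorCollectorSketch and the small inline DTD are assumptions made for illustration, so that the example runs without any files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdErrorCollectorSketch {

	// Returns one entry per problem found; an empty list means "valid"
	static List<String> validate(String xml) {
		List<String> problems = new ArrayList<>();
		try {
			DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
			factory.setValidating(true);
			DocumentBuilder builder = factory.newDocumentBuilder();
			builder.setErrorHandler(new ErrorHandler() {
				// Warnings are not treated as validity problems here
				public void warning(SAXParseException e) {
				}

				// Validation error: record it and let the parser continue
				public void error(SAXParseException e) {
					problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
				}

				// Not well-formed: the parser cannot continue
				public void fatalError(SAXParseException e) throws SAXParseException {
					throw e;
				}
			});
			builder.parse(new InputSource(new StringReader(xml)));
		} catch (Exception e) {
			problems.add("fatal: " + e.getMessage());
		}
		return problems;
	}

	public static void main(String[] args) {
		// Hypothetical inline DTD (internal subset) instead of xhtml1-strict.dtd
		String doctype = "<!DOCTYPE note [<!ELEMENT note (to, body)>"
				+ "<!ELEMENT to (#PCDATA)><!ELEMENT body (#PCDATA)>]>";
		validate(doctype + "<note><to>A</to><extra/></note>")
				.forEach(System.out::println);
	}
}
```

The design choice is that error() only records the problem and returns, which lets the parser carry on, while fatalError() still aborts because a document that is not well-formed cannot be parsed any further.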
