General FAQs


	Questions

Querying Xerces Version
JIRA
Jar file changes
Using JAXP 1.3 on JDK 1.4
ClassCastException using Xerces
Xerces HTML, XHTML, and XML Serializers
Obtaining smaller jars
Validation against DTD
International Encodings
Accessing Documents on the Internet
JDK Compatibility


	Answers


	How do I find out which Xerces version I am using?

To find out the release version of Xerces, execute the following: java org.apache.xerces.impl.Version.
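
The version can also be read from application code. The following is a minimal sketch, assuming that org.apache.xerces.impl.Version exposes a static getVersion() method; it uses reflection so that it still compiles and runs when xercesImpl.jar is not on the classpath. The class name XercesVersionCheck is hypothetical and not part of Xerces.

```java
import java.lang.reflect.Method;

class XercesVersionCheck {

    /** Returns the Xerces version string, or null if Xerces cannot be found. */
    static String xercesVersion() {
        try {
            // Assumption: the Version class has a public static getVersion() method.
            Class<?> versionClass = Class.forName("org.apache.xerces.impl.Version");
            Method getVersion = versionClass.getMethod("getVersion");
            return (String) getVersion.invoke(null);
        } catch (ReflectiveOperationException e) {
            // Class or method missing: Xerces is not visible to this classloader.
            return null;
        }
    }

    public static void main(String[] args) {
        String version = xercesVersion();
        System.out.println(version != null ? version : "Xerces not found on the classpath");
    }
}
```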


	How do I use JIRA to report bugs?

Please refer to "Reporting bugs in JIRA".


	What happened to xerces.jar?

Because this parser is very often used together with other XML technologies, such as XSLT processors, that also rely on standard APIs like DOM and SAX, xerces.jar was split into two jar files:

xml-apis.jar contains the DOM Level 3, SAX 2.0.2, and JAXP 1.3 APIs;
xercesImpl.jar contains the implementation of these APIs as well as the XNI API.

For backwards compatibility, we have retained the ability to generate xerces.jar. For instructions, see the installation documentation.


	How can I use JAXP 1.3 on JDK 1.4?

Use the Endorsed Standards Override Mechanism to specify xml-apis.jar and xercesImpl.jar. This will override the version of JAXP in the JDK. A more complete description is available here.

The following methods do not work:

Using the CLASSPATH environment variable or using -classpath to place the new classes in the classpath.
Using the -jar option to explicitly execute the classes inside the new jar files.
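
One way to check whether the override took effect is to print which JAXP factory implementation is actually in use and where it was loaded from. This is a minimal sketch; the class name JaxpOriginCheck and the helper describe() are hypothetical. If the endorsed jars are being picked up, the reported class would be expected to live in an org.apache.xerces package and to come from xercesImpl.jar rather than from the JDK.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

class JaxpOriginCheck {

    /** Describes which class is in use and where it was loaded from. */
    static String describe(Class<?> c) {
        CodeSource source = c.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader usually have no code source.
        String location = (source == null || source.getLocation() == null)
                ? "no code source (typically the JDK's own classes)"
                : source.getLocation().toString();
        return c.getName() + " loaded from " + location;
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println(describe(factory.getClass()));
    }
}
```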


	Why do I get a ClassCastException when I use Xerces and WebSphere Application Server?

Xerces uses the ObjectFactory class to load some classes dynamically, e.g. the parser configuration. The ObjectFactory finds the specified implementation class by querying the system property, reading the META-INF/services/factoryId file, or using a fallback classname, in that order. After the implementation is found, the ObjectFactory tries to load the class using the context classloader; if the context classloader is null, the ObjectFactory uses the system classloader.
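
The lookup order described above can be illustrated with a simplified sketch. This is not Xerces' actual ObjectFactory code: the class FactoryLookupSketch and its methods are hypothetical, and details such as caching and security checks are left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class FactoryLookupSketch {

    /** Prefers the context classloader and falls back to the system classloader. */
    static ClassLoader chooseLoader() {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        return context != null ? context : ClassLoader.getSystemClassLoader();
    }

    /** Resolves an implementation class name for the given factory id. */
    static String findClassName(String factoryId, String fallbackClassName) {
        // 1. System property named after the factory id.
        String fromProperty = System.getProperty(factoryId);
        if (fromProperty != null && !fromProperty.trim().isEmpty()) {
            return fromProperty.trim();
        }
        // 2. First line of META-INF/services/<factoryId>, if such a resource exists.
        String resource = "META-INF/services/" + factoryId;
        try (InputStream in = chooseLoader().getResourceAsStream(resource)) {
            if (in != null) {
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8));
                String line = reader.readLine();
                if (line != null && !line.trim().isEmpty()) {
                    return line.trim();
                }
            }
        } catch (IOException e) {
            // Unreadable services file: fall through to the fallback name.
        }
        // 3. Fallback class name.
        return fallbackClassName;
    }

    /** Loads the resolved class through the chosen classloader. */
    static Class<?> load(String factoryId, String fallbackClassName)
            throws ClassNotFoundException {
        return Class.forName(findClassName(factoryId, fallbackClassName), true, chooseLoader());
    }
}
```

Because the class is loaded through whichever classloader happens to be the context classloader at the time, two callers in different modules can end up with two distinct copies of the same class, which is what leads to the ClassCastException.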

If you run Xerces in an environment that has multiple classloaders, such as WebSphere® Application Server, you may get ClassCastExceptions thrown from Xerces because different classloaders might be involved in loading Xerces classes. For example, ClassCastExceptions may occur when utility EAR classes that use Xerces load Xerces classes from WAR modules.

We suggest you read the "Avoiding ClassCastExceptions..." article which explains a workaround for this problem. Also you might want to read the "J2EE Class Loading Demystified" article that explains how multiple classloaders work in WebSphere Application Server.


	What can I use instead of Xerces' HTML, XHTML, or XML serializers?

If you want to achieve interoperability, you should not be using Xerces serialization code directly. Instead, the JAXP Transformer API should be used to serialize HTML, XHTML, and SAX. The DOM Level 3 Load and Save API (or JAXP Transformer API) should be used to serialize DOM.

Using JAXP you can serialize HTML and XHTML as follows:

// Create an "identity" transformer - copies input to output
Transformer t = TransformerFactory.newInstance().newTransformer();

// For "XHTML" serialization, use the output method "xml"
// and set the public and system identifiers as shown
t.setOutputProperty(OutputKeys.METHOD, "xml");

t.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
                    "-//W3C//DTD XHTML 1.0 Transitional//EN");

t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
               "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd");

// For "HTML" serialization, use the output method "html" instead
t.setOutputProperty(OutputKeys.METHOD, "html");

// Serialize DOM tree
t.transform(new DOMSource(doc), new StreamResult(System.out));
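
For DOM trees, the DOM Level 3 Load and Save route mentioned earlier might look like the following. This is a minimal sketch using only the standard org.w3c.dom.ls interfaces; the class name LoadSaveSketch and the sample document are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class LoadSaveSketch {

    /** Serializes a DOM document to a string using DOM Level 3 Load and Save. */
    static String serialize(Document doc) {
        Object feature = doc.getImplementation().getFeature("LS", "3.0");
        if (!(feature instanceof DOMImplementationLS)) {
            throw new IllegalStateException("DOM implementation does not support LS 3.0");
        }
        LSSerializer serializer = ((DOMImplementationLS) feature).createLSSerializer();
        return serializer.writeToString(doc);
    }

    /** Builds a tiny document to serialize (illustrative content only). */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("greeting");
        root.setTextContent("hello");
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```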

You can find more details about the future of Xerces' serializers in the archives.

The HTML and XHTML serializers (org.apache.xml.serialize) have been deprecated in the Xerces 2.6.2 release. We might deprecate XMLSerializer in a future release.


	I don't need all the features Xerces provides, but I'm running in an environment where space is at a premium. Is there anything I can do?

Partly to address this issue, we've recently begun to distribute compressed jar files instead of the uncompressed files we traditionally shipped. If you still need a smaller jar and don't need features such as XML Schema support or the WML/HTML DOM implementations that Xerces provides, look at the dtdjars target in our build file.


	How do I turn on DTD validation?

You can turn validation on and off via methods available on the SAX2 XMLReader interface. While only the SAXParser implements the XMLReader interface, the methods required for turning on validation are available to both parser classes, DOM and SAX.
The code snippet below shows how to turn validation on. It assumes that parser is an instance of either org.apache.xerces.parsers.SAXParser or org.apache.xerces.parsers.DOMParser.

parser.setFeature("http://xml.org/sax/features/validation", true);
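
A fuller sketch of the same idea follows. It differs from the snippet above in one assumption: the XMLReader is obtained through the JAXP SAXParserFactory rather than by instantiating org.apache.xerces.parsers.SAXParser directly, so that the example runs with whatever parser the JDK provides. It also registers a SAX ErrorHandler so that the caller can collect validation errors. The class name DtdValidationSketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationSketch {

    /** Parses the XML text with DTD validation on and returns the validation errors. */
    static List<String> validate(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature("http://xml.org/sax/features/validation", true);

        final List<String> errors = new ArrayList<String>();
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity violations arrive here as recoverable errors.
                errors.add(e.getMessage());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";
        System.out.println(validate(dtd + "<note>hi</note>"));
        System.out.println(validate(dtd + "<note><extra/></note>"));
    }
}
```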


	What international encodings are supported by Xerces-J?

UTF-8
UTF-16 Big Endian and Little Endian
UCS-2 (ISO-10646-UCS-2) Big Endian and Little Endian
UCS-4 (ISO-10646-UCS-4) Big Endian and Little Endian
IBM-1208
ISO Latin-1 (ISO-8859-1)
ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbocroatian, Slovak, Slovenian, Upper and Lower Sorbian]
ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]
ISO Latin-4 (ISO-8859-4)
ISO Latin Cyrillic (ISO-8859-5)
ISO Latin Arabic (ISO-8859-6)
ISO Latin Greek (ISO-8859-7)
ISO Latin Hebrew (ISO-8859-8)
ISO Latin-5 (ISO-8859-9) [Turkish]
ISO Latin-7 (ISO-8859-13)
ISO Latin-9 (ISO-8859-15)
Extended Unix Code, packed for Japanese (euc-jp, eucjis)
Japanese Shift JIS (shift-jis)
Chinese (big5)
Chinese for PRC (mixed 1/2 byte) (gb2312)
Japanese ISO-2022-JP (iso-2022-jp)
Cyrillic (koi8-r)
Extended Unix Code, packed for Korean (euc-kr)
Russian Unix, Cyrillic (koi8-r)
Windows Thai (cp874)
Latin 1 Windows (cp1252) (and all other cp125? encodings recognized by IANA)
cp858
EBCDIC encodings:

EBCDIC US (ebcdic-cp-us)
EBCDIC Canada (ebcdic-cp-ca)
EBCDIC Netherlands (ebcdic-cp-nl)
EBCDIC Denmark (ebcdic-cp-dk)
EBCDIC Norway (ebcdic-cp-no)
EBCDIC Finland (ebcdic-cp-fi)
EBCDIC Sweden (ebcdic-cp-se)
EBCDIC Italy (ebcdic-cp-it)
EBCDIC Spain, Latin America (ebcdic-cp-es)
EBCDIC Great Britain (ebcdic-cp-gb)
EBCDIC France (ebcdic-cp-fr)
EBCDIC Hebrew (ebcdic-cp-he)
EBCDIC Switzerland (ebcdic-cp-ch)
EBCDIC Roece (ebcdic-cp-roece)
EBCDIC Yugoslavia (ebcdic-cp-yu)
EBCDIC Iceland (ebcdic-cp-is)
EBCDIC Urdu (ebcdic-cp-ar2)
Latin 0 EBCDIC
EBCDIC Arabic (ebcdic-cp-ar1)
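
As a quick check that a given encoding is handled, a document can be encoded to bytes with a matching encoding declaration and parsed back. This is a minimal sketch, assuming a JAXP parser is available and that the encoding name is one that both the parser and the Java runtime recognize (ISO-8859-1 in the example). The class name EncodingSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EncodingSketch {

    /** Encodes a small document in the given encoding, parses it, and returns the root text. */
    static String parseRootText(String encoding, String text) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>"
                + "<doc>" + text + "</doc>";
        // The bytes are produced with the same encoding that the declaration names.
        byte[] bytes = xml.getBytes(encoding);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // "caf\u00e9" contains a non-ASCII character that ISO-8859-1 can represent.
        System.out.println(parseRootText("ISO-8859-1", "caf\u00e9").equals("caf\u00e9"));
    }
}
```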


	Why is the parser unable to access schema documents or external entities available on the Internet?

The parser may be unable to access external entities or schema documents (imported, included, and so on) that are hosted on the Internet, such as the Schema for Schemas at "http://www.w3.org/2001/XMLSchema.xsd" or the schema that defines attributes such as xml:base and xml:lang at "http://www.w3.org/2001/xml.xsd". There are several possible reasons for this problem.

One reason could be that your proxy settings do not allow the parser to make URL connections through a proxy server. To solve this problem, the application must set two system properties, "http.proxyHost" and "http.proxyPort", before parsing a document. Another reason could be strict firewall settings that do not allow any URL connection to be made to the outside web. The problem may also be caused by a server that is offline or inaccessible on the network, preventing documents hosted by that server from being accessed.
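
Setting the two proxy properties from code might look like this. It is a minimal sketch: the host proxy.example.com and port 8080 are placeholder values, and the class name ProxySettingsSketch is hypothetical.

```java
class ProxySettingsSketch {

    /** Sets the JVM-wide HTTP proxy properties; call this before parsing. */
    static void configureProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the proxy used on your network.
        configureProxy("proxy.example.com", 8080);
        // Create the parser and parse the document after this point.
    }
}
```

The same properties can also be passed on the command line with -Dhttp.proxyHost=... and -Dhttp.proxyPort=..., which avoids hard-coding them in the application.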


	What JDK level is required for Xerces?

As of version 2.6.2, Xerces requires JDK 1.2 or later both to run and to build the source code.