An XML catalog is made up of entries from one or more catalog entry files. A catalog entry file is an XML file whose document element is
catalog and whose content follows
the XML catalog DTD defined by OASIS at
http://www.oasis-open.org/committees/entity/spec.html.
Most of the elements are catalog entries, each of which serves to map an
identifier or URL to another location. Following are some useful
examples.
The DOCTYPE declaration at the top of an XML document gives the processor information to identify the DTD. Here is a declaration suggested by the DTD itself:
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
The first quoted string after PUBLIC is the DTD's PUBLIC identifier, and the second quoted string is the SYSTEM identifier. In this case, the SYSTEM identifier is a full URL to the OASIS website.
You can use a public catalog entry to resolve a DTD's PUBLIC identifier, or you can use a system catalog entry to resolve a DTD's SYSTEM identifier. These two kinds of catalog entries are used only to resolve DTD identifiers and system entity identifiers (external files), not stylesheet references. Here is a simple XML catalog file that shows how to resolve a DTD identifier:
Example 4.1. Catalog entry to resolve DTD location
<?xml version="1.0"?>
<!DOCTYPE catalog
   PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
   "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <group prefer="public" xml:base="file:///usr/share/xml/">
    <public
       publicId="-//OASIS//DTD DocBook XML V4.4//EN"
       uri="docbook44/docbookx.dtd"/>
    <system
       systemId="http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"
       uri="docbook44/docbookx.dtd"/>
    <system
       systemId="docbook4.4.dtd"
       uri="docbook44/docbookx.dtd"/>
  </group>
</catalog>
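Note these features of this catalog: the catalog element declares the OASIS catalog namespace; the group element sets prefer="public" and an xml:base that applies to every entry it wraps; the public entry maps the DTD's PUBLIC identifier; and the two system entries map two different SYSTEM identifiers. All three entries have the same uri value.
Why have multiple entries? So that documents that specify their DOCTYPE differently can all resolve to the same location. Suppose a DocBook document that has this DOCTYPE declaration is processed with this catalog and a catalog resolver: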
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
The catalog resolver loads the catalog, and as it reads the files to be processed, it looks for items to resolve. In this case there is a DOCTYPE with both a PUBLIC identifier (-//OASIS//DTD DocBook XML V4.4//EN) and a SYSTEM identifier (http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd). The resolver finds a match on the public identifier in the catalog, and since that entry's group wrapper element prefers the public identifier, it uses that entry. It takes the entry's uri attribute value and prepends the xml:base value from its group wrapper. The result is the full pathname /usr/share/xml/docbook44/docbookx.dtd.
If it turns out that no such file exists at that location, the catalog resolver looks for other catalog entries to resolve the item. It tries the first system entry, which in this case maps the www.oasis-open.org URL to the same local file. If no catalog entry works, the resolver gives up. The XML processor then falls back to the DOCTYPE's literal SYSTEM identifier, http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd, without catalog resolution, and tries to retrieve the DTD over the web.
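As a rough illustration of the xml:base step described above, the grouped public entry in Example 4.1 should behave like the following standalone entry, in which the base and the relative uri value have been combined by hand. This is a sketch for comparison only; the combined URI is simply the two attribute values from the example joined together.

```xml
<!-- Sketch: same mapping as the grouped entry, with xml:base already applied -->
<public
   publicId="-//OASIS//DTD DocBook XML V4.4//EN"
   uri="file:///usr/share/xml/docbook44/docbookx.dtd"/>
```

Keeping xml:base on a group is usually more convenient, since moving the DTD directory then means editing a single attribute.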
The XML catalog file that ships with version 4.3 of the DocBook XML DTD is missing an entry for the htmltblx.mod file. If your resolver reports it as missing, then add an entry like this to your catalog file:
<public publicId="-//OASIS//ELEMENTS DocBook XML HTML Tables V4.3//EN" uri="htmltblx.mod"/>
This problem was fixed in version 4.4.
When you are specifying an xml:base or uri attribute for use on a Microsoft Windows system, you must include the drive letter in the full URI syntax if you want it to work across processors. A Windows URI has this form:
file:///c:/xml/docbook/
Note the use of forward slashes, which is standard URI syntax.
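A Windows version of the earlier group might look like the sketch below. The drive letter and the c:/xml/docbook/ directory come from the URI form shown above; the docbook44 subdirectory is an assumption made for illustration.

```xml
<!-- Sketch: xml:base written as a full file URI with a drive letter -->
<group prefer="public" xml:base="file:///c:/xml/docbook/">
  <public
     publicId="-//OASIS//DTD DocBook XML V4.4//EN"
     uri="docbook44/docbookx.dtd"/>
</group>
```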
Another document might have a much simpler DOCTYPE declaration:
<!DOCTYPE book SYSTEM "docbook4.4.dtd">
If processed with the same catalog, there is no PUBLIC identifier to match on. So despite the prefer="public" attribute, it is forced to try to match the DOCTYPE's SYSTEM identifier with a system catalog entry. It finds a match in the systemId attribute and the uri value maps it to the same location.
Unfortunately, XML catalog entries that try to use relative system identifiers like systemId="docbook4.4.dtd" don't work with the Java resolver software currently available. The problem is that when a document with the example DOCTYPE is processed, the SAX interface in the XML parser resolves such references relative to the current document's location before the resolver gets to see it. So the resolver never has a chance to match on the original string. If you are going to use catalog files, you should probably stick with the recommended value of http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd for the SYSTEM identifier.
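The prefer attribute discussed above can also be set to system. As the OASIS specification describes it, public entries in such a group are consulted only when the reference supplies no SYSTEM identifier at all, so a DOCTYPE that carries both identifiers would have to match one of the system entries. Individual resolvers may differ in detail, so treat the following as an illustration rather than a guarantee.

```xml
<!-- Sketch: with prefer="system", the public entry is skipped whenever
     the DOCTYPE also supplies a SYSTEM identifier -->
<group prefer="system" xml:base="file:///usr/share/xml/">
  <public
     publicId="-//OASIS//DTD DocBook XML V4.4//EN"
     uri="docbook44/docbookx.dtd"/>
  <system
     systemId="http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"
     uri="docbook44/docbookx.dtd"/>
</group>
```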
You use the uri element in an XML catalog to locate stylesheets and other files. It can be used for everything that is not a declared PUBLIC or SYSTEM identifier for a DTD or system entity file. Here is an example of mapping a relative stylesheet reference to an absolute path:
Example 4.2. Catalog entry to locate a stylesheet
<?xml version="1.0"?>
<!DOCTYPE catalog
PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
"http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<uri
name="docbook.xsl"
uri="file:///usr/share/xml/docbook-xsl-1.68.1/html/docbook.xsl"/>
</catalog>
With a catalog entry like this, your scripts and Makefiles can refer to the stylesheet file simply as docbook.xsl and let the catalog find its location on the system. By using a different catalog, you can map the name to a different stylesheet file without changing the script or Makefile command line.
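For instance, a second catalog could map the same name to the FO stylesheet instead. This is only a sketch: it assumes the same docbook-xsl-1.68.1 installation directory used above and that the distribution's fo/docbook.xsl file is present there.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Same name as before, but resolved to the FO stylesheet -->
  <uri
     name="docbook.xsl"
     uri="file:///usr/share/xml/docbook-xsl-1.68.1/fo/docbook.xsl"/>
</catalog>
```

Which catalog is in effect then decides which stylesheet a script gets, with no change to the script itself.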
As mentioned above, you can specify a web URL for the DTD or stylesheet to fetch it over the Internet. For efficiency, though, it is better to map the URLs to local files if they are available. The following catalog does that.
Example 4.3. Catalog entry to map web address to local file
<?xml version="1.0"?>
<!DOCTYPE catalog
   PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
   "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system
     systemId="http://www.oasis-open.org/docbook/xml/4.4/"
     uri="file:///usr/share/xml/docbook44/" />
  <uri
     name="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"
     uri="file:///usr/share/xml/docbook-xsl-1.68.1/html/docbook.xsl" />
  <uri
     name="http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl"
     uri="file:///usr/share/xml/docbook-xsl-1.68.1/html/chunk.xsl" />
</catalog>
There are two uri entries here, to handle both the regular and the chunking stylesheets.
To reduce the number of catalog entries, you can map a prefix instead of many similar names. Two catalog entry elements named rewriteSystem and rewriteURI let you map the first part of a reference to a different prefix. That lets you map many files in the same location with a single catalog entry. Use rewriteSystem to remap a DOCTYPE system identifier, and use rewriteURI to remap other URLs such as stylesheet references.
Here is the previous example done with rewrite entries:
<?xml version="1.0"?>
<!DOCTYPE catalog
PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
"http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<rewriteSystem
systemIdStartString="http://www.oasis-open.org/docbook/xml/4.4/"
rewritePrefix="file:///usr/share/xml/docbook44/" />
<rewriteURI
uriStartString="http://docbook.sourceforge.net/release/xsl/current/"
rewritePrefix="file:///usr/share/xml/docbook-xsl-1.68.1/" />
</catalog>
The two stylesheet uri entries are replaced with a single rewriteURI entry. Any directory structure below the matched prefix is carried over unchanged, so every file under that point can be mapped as long as the local tree mirrors the remote one. For example:
This URL:      http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl
is mapped to:  file:///usr/share/xml/docbook-xsl-1.68.1/html/docbook.xsl

This URL:      http://docbook.sourceforge.net/release/xsl/current/fo/custom.xsl
is mapped to:  file:///usr/share/xml/docbook-xsl-1.68.1/fo/custom.xsl
You can use the nextCatalog element to include other catalog entry files in the process. If a reference can't be resolved in the current catalog entry file, then the processor moves on to the next catalog specified by such an element. You can put nextCatalog elements anywhere in a catalog entry file, since they aren't looked at until all catalog entries in the current file have been tried. Each new catalog file can also contain nextCatalog entries.
Using this feature lets you organize your catalog entries into modular files which can be combined in various ways. For example, you could separate your DTD lookups from your stylesheet lookups. Since the DocBook DTD comes with a catalog file, you can just point to that catalog to resolve DTD PUBLIC identifiers.
For DocBook 4.4:
<nextCatalog catalog="/usr/share/xml/docbook44/catalog.xml" />

For DocBook 4.1.2:
<nextCatalog catalog="/usr/share/xml/docbook412/docbook.cat" />
The latter example points to the SGML catalog that was included with an older version of the DTD. The references in both of those catalog files are relative to the catalog file location, so the resolver should be able to find any of the DTD files by its PUBLIC identifier. Do not move the DocBook catalog file out of the directory that contains the DTD files, or the relative references will stop working.
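Putting the pieces together, a modular setup might look like the sketch below: one small catalog handles stylesheet lookups itself and hands DTD lookups off to the catalog that ships with the DTD. The paths are the illustrative ones used earlier in this section and would need to match your own installation.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Stylesheet lookups are handled in this file -->
  <rewriteURI
     uriStartString="http://docbook.sourceforge.net/release/xsl/current/"
     rewritePrefix="file:///usr/share/xml/docbook-xsl-1.68.1/"/>
  <!-- DTD lookups fall through to the catalog shipped with DocBook 4.4 -->
  <nextCatalog catalog="/usr/share/xml/docbook44/catalog.xml"/>
</catalog>
```

Because nextCatalog is consulted only after the entries in the current file have been tried, its position within the file does not matter.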
| DocBook XSL: The Complete Guide - 3rd Edition | PDF version available | Copyright © 2002-2005 Sagehill Enterprises |