<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>The DocBook toolchain</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="DocBook Demystification HOWTO"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="Other DTDs"
HREF="x120.html"><LINK
REL="NEXT"
TITLE="asciidoc"
HREF="x183.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>DocBook Demystification HOWTO</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x120.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x183.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="AEN128"
></A
>6. The DocBook toolchain</H1
><P
>The easiest way to format and render XML-DocBook documents is to
use the <SPAN
CLASS="application"
>xmlto</SPAN
> toolchain. This ships with
Red Hat; Debian users can get it with the command <B
CLASS="command"
>apt-get
install xmlto</B
>.</P
><P
>Normally, what you'll do to make XHTML from your
DocBook sources will look like this:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
> bash$ xmlto xhtml foo.xml
bash$ ls *.html
ar01s02.html ar01s03.html ar01s04.html index.html
</PRE
></FONT
></TD
></TR
></TABLE
><P
>In this example, you converted an XML-DocBook document named
<TT
CLASS="filename"
>foo.xml</TT
> with four top-level sections into an
index page, which also holds the first section, and three further
section pages. Making one big page is just as easy:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
> bash$ xmlto xhtml-nochunks foo.xml
bash$ ls *.html
foo.html
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Finally, here is how you make PDF for printing:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
> bash$ dblatex foo.xml # To make PDF
bash$ ls *.pdf
foo.pdf
</PRE
></FONT
></TD
></TR
></TABLE
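><P
>The three invocations above can be collected into one small shell
helper. What follows is a minimal sketch under stated assumptions, not
something either tool provides: the names run, build_all, and DRY_RUN and
the output directory names are made up for illustration. It relies on the
-o option of xmlto (output directory) and of dblatex (output file).</P>
```shell
#!/bin/sh
# Sketch: build chunked XHTML, single-page XHTML, and PDF from one source.
# run, build_all, DRY_RUN, and the directory names are illustrative only;
# they are not features of xmlto or dblatex.

run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_all() {
    src=$1
    base=$(basename "$src" .xml)
    # Separate directories keep the two HTML builds from overwriting
    # each other's index.html and section pages.
    run mkdir -p html-chunked html-single || return 1
    # xmlto -o names the output directory.
    run xmlto -o html-chunked xhtml "$src" || return 1
    run xmlto -o html-single xhtml-nochunks "$src" || return 1
    # dblatex -o names the output file.
    run dblatex -o "$base.pdf" "$src"
}
```
<P
>With the functions loaded into a shell, build_all foo.xml runs all
three conversions; setting DRY_RUN=1 first only prints the commands,
which is a cheap way to check what would be executed.</P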
><P
>Some older versions of <B
CLASS="command"
>xmlto</B
> may be
more verbose, emitting noise like "Converting to XHTML" and so forth.</P
><P
>To turn your documents into HTML or PDF, you need an
engine that can apply the combination of the DocBook DTD and
a suitable stylesheet to your document. Here is how the
open-source tools for doing this fit together:</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="figure2.png"><DIV
CLASS="caption"
><P
>Present-day XML-DocBook toolchain</P
></DIV
></P
></DIV
><P
>Parsing your document and applying the stylesheet transformation
will be handled by one of three programs. The most likely one is
<SPAN
CLASS="application"
>xsltproc</SPAN
>. The other
possibilities are two Java programs,
<SPAN
CLASS="application"
>Saxon</SPAN
>
and
<SPAN
CLASS="application"
>Xalan</SPAN
>.</P
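><P
>To see this step without the xmlto wrapper, xsltproc can be run by
hand. The sketch below is illustrative rather than definitive: the two
stylesheet directories are assumptions about where a distribution might
install the DocBook XSL stylesheets, and find_stylesheet, to_xhtml, and
DOCBOOK_XSL_DIRS are hypothetical names introduced here.</P>
```shell
#!/bin/sh
# Sketch: apply the DocBook XSL XHTML stylesheet with xsltproc directly.
# The default directories are assumptions; the install location varies by
# distribution. DOCBOOK_XSL_DIRS is a hypothetical override holding a
# space-separated list of directories (paths with spaces are not handled).

find_stylesheet() {
    for dir in ${DOCBOOK_XSL_DIRS:-/usr/share/xml/docbook/stylesheet/docbook-xsl /usr/share/sgml/docbook/xsl-stylesheets}; do
        if [ -f "$dir/xhtml/docbook.xsl" ]; then
            echo "$dir/xhtml/docbook.xsl"
            return 0
        fi
    done
    echo "DocBook XSL stylesheets not found" >&2
    return 1
}

to_xhtml() {
    # Usage: to_xhtml SOURCE.xml OUTPUT.html
    sheet=$(find_stylesheet) || return 1
    # xsltproc takes the stylesheet first, then the document;
    # -o names the output file.
    xsltproc -o "$2" "$sheet" "$1"
}
```
<P
>With these loaded, to_xhtml foo.xml foo.html should produce a single
page comparable to the xhtml-nochunks output, provided one of the
candidate directories exists on your system.</P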
><P
>It is relatively easy to generate high-quality XHTML from
DocBook; the fact that XHTML is simply another XML DTD helps a lot.
Translation to HTML is done by applying a rather simple stylesheet,
and that's the end of the story. RTF is also simple to generate in
this way, and from XHTML or RTF it's easy to generate a flat ASCII
text approximation in a pinch.</P
><P
>The awkward case is print. Generating high-quality printed
output (which means, in practice, Adobe's
PDF or Portable Document
Format, a packaged form of PostScript) is difficult. Doing it right
requires algorithmically duplicating the delicate judgments of a human
typesetter moving from content to presentation level.</P
><P
>So, first, a stylesheet translates DocBook's structural markup
into another dialect of XML:
FO
(Formatting Objects). FO markup is very much presentation-level; you
can think of it as a sort of XML functional equivalent of troff. It
has to be translated to PostScript for packaging in a PDF.</P
><P
>In the toolchain shipped with most present-day Linux
distributions, this job is best handled by a program called
<SPAN
CLASS="application"
>dblatex</SPAN
>
(this obsoletes the older passivetex package that previous versions of
this HOWTO described).</P
><P
><B
CLASS="command"
>dblatex</B
> translates the formatting objects
generated by <B
CLASS="command"
>xsltproc</B
> into Donald Knuth's TeX
language. TeX was one of the earliest open-source projects, an old
but powerful presentation-level formatting language much beloved of
mathematicians (to whom it provides particularly elaborate facilities
for describing mathematical notation). TeX is also famously good at
basic typesetting tasks like kerning, line filling, and hyphenating.
TeX's output is then massaged into PDF.</P
><P
>If you think this bucket chain of XML to TeX macros to
PDF sounds like an awkward kludge, you're right. It clanks, it
wheezes, and it has ugly warts. Fonts are a significant problem,
since XML and TeX and PDF have very different models of how fonts
work; also, handling internationalization and localization is a
nightmare. About the only thing this code path has going for it is
that it works.</P
><P
>The elegant way will be <A
HREF="http://xmlgraphics.apache.org/fop/"
TARGET="_top"
>FOP</A
>, a direct
FO-to-PostScript translator being developed by the Apache project.
With FOP, the internationalization problem is, if not solved, at least
well confined; XML tools handle Unicode all the way through to FOP.
Glyph-to-font mapping is also strictly FOP's problem. The only
trouble with this approach is that it doesn't entirely work yet. As
of October 2010 FOP is at 1.0 and usable, but with rough edges and
missing features. I recommend dblatex for production use.</P
><P
>Here is what the FOP toolchain looks like:</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="figure3.png"><DIV
CLASS="caption"
><P
>Future XML-DocBook toolchain with FOP.</P
></DIV
></P
></DIV
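><P
>If you want to try the FOP route by hand, the two stages can be
sketched as below. This is a rough illustration, not a recommended
production setup: fo_to_pdf, run, and DRY_RUN are hypothetical names, and
the path to the DocBook XSL fo/docbook.xsl stylesheet is passed in by the
caller because its location depends on your installation.</P>
```shell
#!/bin/sh
# Sketch: DocBook to FO with xsltproc, then FO to PDF with FOP.
# fo_to_pdf, run, and DRY_RUN are illustrative names only.

run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

fo_to_pdf() {
    # Usage: fo_to_pdf SOURCE.xml /path/to/fo/docbook.xsl
    src=$1
    fo_sheet=$2
    base=$(basename "$src" .xml)
    # Stage 1: the stylesheet turns structural markup into formatting objects.
    run xsltproc -o "$base.fo" "$fo_sheet" "$src" || return 1
    # Stage 2: FOP renders the formatting objects as PDF.
    run fop -fo "$base.fo" -pdf "$base.pdf"
}
```
<P
>Running it with DRY_RUN=1 prints the two commands without needing
either tool installed, which makes the shape of the pipeline easy to
inspect before committing to it.</P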
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x120.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x183.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Other DTDs</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>asciidoc</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>