<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>Migration tools</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="DocBook Demystification HOWTO"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="Who are the projects and the players?"
HREF="x191.html"><LINK
REL="NEXT"
TITLE="Editing tools"
HREF="x253.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>DocBook Demystification HOWTO</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x191.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x253.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="AEN206"
></A
>9. Migration tools</H1
><P
>The second biggest problem with DocBook is the effort needed to
convert old-style presentation markup to DocBook markup. Human beings
can usually parse the presentation of a document into logical
structure automatically, because (for example) they can tell from
context when an italic font means `emphasis' and when it means
something else such as `this is a foreign phrase'.</P
><P
>Somehow, in converting documents to DocBook, those
sorts of distinctions need to be made explicit. Sometimes
they're present in the old markup; often they are not, and the
missing structural information has to be either deduced by
clever heuristics or added by a human.</P
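><P
>To make the idea of a heuristic concrete, here is a minimal Python
sketch of how a converter might guess what an italic span means. The
phrase list and the function name are hypothetical, invented for this
illustration; a real conversion tool needs far more context than a
lookup table, and anything it cannot classify still falls to a
human.</P
><PRE
CLASS="programlisting"
>
```python
# Hypothetical phrase list, for illustration only.
FOREIGN_PHRASES = {"ad hoc", "de facto", "et al.", "vice versa"}


def classify_italic(text):
    """Guess which DocBook element an italicized span should become."""
    normalized = text.strip().lower()
    if normalized in FOREIGN_PHRASES:
        # Known borrowed phrase: mark it up as a foreign phrase.
        return "foreignphrase"
    # Nothing more specific matched, so fall back to plain emphasis.
    return "emphasis"


if __name__ == "__main__":
    for span in ["very", "de facto"]:
        print(span, "maps to", classify_italic(span))
```
</PRE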
><P
>Here is a summary of the state of conversion tools from
various other formats:</P
><P
></P
><DIV
CLASS="variablelist"
><DL
><DT
>GNU Texinfo</DT
><DD
><P
>The Free Software Foundation has made a policy decision to
support DocBook as an interchange format. Texinfo has enough
structure to make reasonably good automatic conversion possible, and
the 4.x versions of <B
CLASS="command"
>makeinfo</B
> feature a
<TT
CLASS="option"
>--docbook</TT
> switch that generates DocBook.
More at the <A
HREF="http://www.gnu.org/directory/texinfo.html"
TARGET="_top"
>makeinfo project
page</A
>.</P
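><P
>As a rough sketch of how the switch might be driven from a script,
the Python below builds and runs the command. It assumes a 4.x or later
<B
CLASS="command"
>makeinfo</B
> is on your path; the helper function names and the file name
manual.texi are placeholders invented for this example.</P
><PRE
CLASS="programlisting"
>
```python
import subprocess


def makeinfo_docbook_command(texinfo_path):
    """Build the argument list for converting one Texinfo file to DocBook."""
    return ["makeinfo", "--docbook", texinfo_path]


def convert(texinfo_path):
    """Run the conversion; raises CalledProcessError if makeinfo fails."""
    subprocess.run(makeinfo_docbook_command(texinfo_path), check=True)


if __name__ == "__main__":
    # "manual.texi" is a placeholder; show the command without running it.
    print(" ".join(makeinfo_docbook_command("manual.texi")))
```
</PRE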
></DD
><DT
>POD</DT
><DD
><P
>There is a <A
HREF="http://www.cpan.org/modules/by-module/Pod/"
TARGET="_top"
>Pod::DocBook</A
>
module that translates Plain Old Documentation markup to DocBook. It
claims to translate every POD tag except the L&#60;&#62; link tag.
The man page also says "Nested =over/=back lists are not supported
within DocBook", but notes that the module has been heavily
tested.</P
></DD
><DT
>LaTeX</DT
><DD
><P
>LaTeX is a (mostly) structural markup macro language built on
top of the TeX formatter. There is a project called <A
HREF="http://www.lrz-muenchen.de/services/software/sonstiges/tex4ht/mn.html"
TARGET="_top"
>TeX4ht</A
> that (according to the author of PassiveTeX) can
generate DocBook from LaTeX.</P
></DD
><DT
>man pages and other troff-based markups</DT
><DD
><P
>This is generally considered the biggest and nastiest conversion
problem. And indeed, the basic
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>troff</SPAN
>(1)</SPAN
> markup is at too low a level, purely presentational, for
automatic conversion tools to do much good. However,
the gloom in the picture lightens significantly if we consider
translation from sources of documents written in macro packages like
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>man</SPAN
>(7)</SPAN
>. These have enough structural
features for automatic translation to get some traction.</P
><P
>I wrote a tool to do this myself, because I couldn't find
anything else that did a half-decent job of it (and the problem is
interesting). It's called <A
HREF="http://www.catb.org/~esr//doclifter/"
TARGET="_top"
>doclifter</A
>. It will
translate to either SGML or XML DocBook from
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>man</SPAN
>(7)</SPAN
>,
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>mdoc</SPAN
>(7)</SPAN
>,
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>ms</SPAN
>(7)</SPAN
>, or
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>me</SPAN
>(7)</SPAN
> macros. See the documentation
for details.</P
></DD
></DL
></DIV
></DIV
></BODY
></HTML
>