<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>Migration tools</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="DocBook Demystification HOWTO"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="Who are the projects and the players?"
HREF="x191.html"><LINK
REL="NEXT"
TITLE="Editing tools"
HREF="x253.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>DocBook Demystification HOWTO</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x191.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x253.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="AEN206"
></A
>9. Migration tools</H1
><P
>The second biggest problem with DocBook is the effort needed to
convert old-style presentation markup to DocBook markup. Human beings
can usually parse the presentation of a document into logical
structure automatically, because (for example) they can tell from
context when an italic font means `emphasis' and when it means
something else such as `this is a foreign phrase'.</P
><P
>Somehow, in converting documents to DocBook, those
sorts of distinctions need to be made explicit. Sometimes
they're present in the old markup; often they are not, and the
missing structural information has to be either deduced by
clever heuristics or added by a human.</P
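><P
>As a rough illustration of what such a heuristic might look like, here
is a minimal Python sketch that guesses whether an italic span should
become a DocBook emphasis or foreignphrase element. The phrase list and
the function name classify_italic are assumptions made up for this
example; they are not taken from any real conversion tool.</P
><PRE
CLASS="programlisting"
>
```python
# Hypothetical phrase list for illustration; a real converter would need
# a much larger lexicon and more contextual cues than this.
FOREIGN_PHRASES = {"ad hoc", "de facto", "in situ", "vice versa"}


def classify_italic(text):
    """Guess which DocBook element an italic span should become."""
    # Normalize case and whitespace so "De  Facto" matches "de facto".
    normalized = " ".join(text.lower().split())
    if normalized in FOREIGN_PHRASES:
        return "foreignphrase"
    # With no other cue, assume the italics were meant as emphasis.
    return "emphasis"


print(classify_italic("de facto"))  # a phrase from the list
print(classify_italic("never"))     # no cue, so the fallback applies
```
</PRE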
><P
>Here is a summary of the state of conversion tools from
various other formats:</P
><P
></P
><DIV
CLASS="variablelist"
><DL
><DT
>GNU Texinfo</DT
><DD
><P
>The Free Software Foundation has made a policy decision to
support DocBook as an interchange format. Texinfo has enough
structure to make reasonably good automatic conversion possible, and
the 4.x versions of <B
CLASS="command"
>makeinfo</B
> feature a
<TT
CLASS="option"
>--docbook</TT
> switch that generates DocBook.
See the <A
HREF="http://www.gnu.org/directory/texinfo.html"
TARGET="_top"
>makeinfo project
page</A
> for more.</P
></DD
><DT
>POD</DT
><DD
><P
>There is a <A
HREF="http://www.cpan.org/modules/by-module/Pod/"
TARGET="_top"
>Pod::DocBook</A
>
module that translates Plain Old Documentation markup to DocBook. It
claims to translate every POD tag except the L&lt;&gt; link tag.
The man page also says "Nested =over/=back lists are not supported
within DocBook", but notes that the module has been heavily
tested.</P
></DD
><DT
>LaTeX</DT
><DD
><P
>LaTeX is a (mostly) structural markup macro language built on
top of the TeX formatter. There is a project called <A
HREF="http://www.lrz-muenchen.de/services/software/sonstiges/tex4ht/mn.html"
TARGET="_top"
>TeX4ht</A
> that (according to the author of PassiveTeX) can
generate DocBook from LaTeX.</P
></DD
><DT
>man pages and other troff-based markups</DT
><DD
><P
>This is generally considered the biggest and nastiest conversion
problem. And indeed, the basic
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>troff</SPAN
>(1)</SPAN
> markup works at too low a level of
presentation for automatic conversion tools to do much good. However,
the gloom in the picture lightens significantly if we consider
translating the sources of documents written with macro packages like
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>man</SPAN
>(7)</SPAN
>. These have enough structural
features for automatic translation to get some traction.</P
><P
>I wrote a tool to do this myself, because I couldn't find
anything else that did a half-decent job of it (and the problem is
interesting). It's called <A
HREF="http://www.catb.org/~esr//doclifter/"
TARGET="_top"
>doclifter</A
>. It will
translate to either SGML or XML DocBook from
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>man</SPAN
>(7)</SPAN
>,
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>mdoc</SPAN
>(7)</SPAN
>,
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>ms</SPAN
>(7)</SPAN
>, or
<SPAN
CLASS="citerefentry"
><SPAN
CLASS="refentrytitle"
>me</SPAN
>(7)</SPAN
> macros. See the documentation
for details.</P
></DD
></DL
></DIV
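><P
>For the GNU Texinfo entry in the list above, the conversion could be
scripted roughly as in this Python sketch. Only the
<TT
CLASS="option"
>--docbook</TT
> switch comes from the text; the
<TT
CLASS="option"
>-o</TT
> output option, the helper names, and the file names are
assumptions chosen for illustration, so check them against your
installed <B
CLASS="command"
>makeinfo</B
> before relying on them.</P
><PRE
CLASS="programlisting"
>
```python
import shutil
import subprocess


def makeinfo_docbook_command(texi_path, out_path):
    """Build the argument list for a Texinfo-to-DocBook conversion."""
    # --docbook selects DocBook output; -o names the output file.
    return ["makeinfo", "--docbook", "-o", out_path, texi_path]


def convert(texi_path, out_path):
    """Run makeinfo if it is installed; return True on success."""
    if shutil.which("makeinfo") is None:
        return False  # makeinfo is not on the PATH
    result = subprocess.run(makeinfo_docbook_command(texi_path, out_path))
    return result.returncode == 0


# Hypothetical file names, shown only to display the command line.
print(" ".join(makeinfo_docbook_command("manual.texi", "manual.xml")))
```
</PRE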
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x191.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x253.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Who are the projects and the players?</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Editing tools</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>