LDP/LDP/guide/docbook/LDP-Author-Guide/x2docbook.xml

209 lines
6.5 KiB
XML

<!--
<!DOCTYPE book PUBLIC '-//OASIS//DTD DocBook XML V4.2//EN'>
-->
<section id="docbook">
<title>Converting Documents to DocBook XML</title>
<section id="xmldifferences">
<title>Differences between XML and SGML</title>
<para>
There are a few changes between DocBook XML and SGML. Handling
these differences should be relatively easy for most small documents,
and many authors will not need to make any changes
to convert their documents other than the XML and DocBook
declarations at the start of their document.
</para>
<para>
For others, here is a list of what you should keep in mind
when converting your documents from SGML to XML.
</para>
<note>
<para>
An element typically has three parts: the start tag,
the content (your words) and the end tag. Qualifiers
are added in the start tag and are known as
attributes. They will always have a name and a
quoted value.</para>
<screen>&lt;filename
class="directory"&gt;/usr/local&lt;filename&gt;</screen>
<para>
The start tag contains one attribute (class)
with a value of directory. The end tag (also filename)
must not contain any attributes.
</para>
</note>
<itemizedlist>
<listitem>
<para>
Element names (tags) and their attributes are
case-dependent--typically lowercase.
The following will not validate because the end tag
&tl;PARA&gt; is uppercase:
</para>
<screen>
&lt;para&gt;This part will fail XML validation&lt;/PARA&gt;
</screen>
</listitem>
<listitem>
<para>
Most XML-specific tags (like entity)
have to be in all capital letters (ENTITY).
</para>
</listitem>
<listitem>
<para>
All attributes in the start tag must be
"quoted". This can
be either single (') or double (") quotes, but
not
reverse (`) or <quote>smart quotes</quote>.
The quote used to start a name="value"
pair must be the same quote used at the end of
the value. In other words: "this" would
validate, but 'that" would not.
</para>
</listitem>
<listitem>
<para>
Tags that have a start tag, but no end tag are
referred to as <quote>empty</quote> because they do
not contain (wrap around) anything. These tags must still be
closed with a trailing slash (/). For example:
<sgmltag>xref</sgmltag> must be written as
&lt;xref linkend="software"/&gt;. You may not
have any spaces between the / and &gt;.
(Although you may have a space after the final
attribute: &lt;xref linkend="foo" /&gt;.)
</para>
</listitem>
<listitem>
<para>
Processing instructions that get sent to the DSSSL
must have a question mark at the
end of the tag. The XML version of this tag
would look like this:
</para>
<screen>
&lt;?dbhtml filename="foo"?&gt;
</screen>
</listitem>
<listitem>
<para>
If you're converting from SGML to XML, be sure
file names refer to .xml files instead of .sgml.
Some tools may get confused if a .sgml file contains XML.
</para>
</listitem>
<listitem>
<para>
Tag minimizations were used in SGML instead of
writing out the element name in the end tag.
e.g. <sgmltag>para<sgmltag>This is foo.<sgmltag
class="endtag"></sgmltag> Tag minimizations are
not supported in XML and their use is
discouraged in DocBook.
</para>
</listitem>
</itemizedlist>
</section>
<section id="differences">
<title>Changing DTDs</title>
<para>
The big changes between version changes in the DTD
involve changed elements (tags). Elements may be:
depricated (which means they will be removed in
future versions); removed; modified; or added.
Almoost all authors will run into a changed or
depricated tag when going to from a lower version
of DocBook to a higher version (e.g. 3.x to 4.x).
</para>
<para>
The <cite>DocBook: The Definitive Guide</cite> does
an excellent job of showing you how elements fit
together. For each element it tells you what an
element must contain (its content model) and what is
may be contained in (who its parents are). For
example: a <sgmltag>note</sgmltag> must contain a
<sgmltag>para<sgmltag>. If you try to write
&lt;note&gt;Content in a
note&lt;/note&gt; your document will not validate.
Learning how elements are assembled will make it a
lot easier to understand any validation errors that
are thrown at you. If you get trully stuck you can
also email the LDP's docbook mailing list for extra
hints. Information on subscribing is available from
<xref linkend="mailinglists" />
</para>
<para>
All tags that have been depricated or changed for 4.x are listed
in <cite>DocBook: The definitive guide</cite>,
published by O'Reilly and Associates. This book is
also available online from <ulink
url="http://www.docbook.org">http://www.docbook.org</ulink>.
</para>
<itemizedlist>
<section id="differences3040">
<title>Differences between version 3.x and 4.x</title>
<para>
Here are a few elements that are of particular
relevence to LDP authors:
</para>
<listitem>
<formalpara>
<title><sgmltag>artheader</sgmltag></title>
has been changed to
<sgmltag>articleinfo</sgmltag>.
Most other header elements have been renamed to info.
</formalpara>
</listitem>
<listitem>
<formalpara>
<title><sgmltag>graphic</sgmltag></title>
has being depricated and will be removed as of DocBook 5.x.
To prepare for this, start using
<sgmltag>mediaobject</sgmltag>. There is more
information about <sgmltag>mediaobject</sgmltag>
in <xref linkend="images"/>.
</formalpara>
</listitem>
<listitem>
<formalpara>
<title><sgmltag>imagedata</sgmltag> file formats
must now be written in UPPERCASE letters. If you
use lowercase or mixed-case spellings
for your file formats, it will fail.
</formalpara>
<para>
Valid:
</para>
<screen>
&lt;imagedata format="EPS" fileref="foo.eps"&gt;
</screen>
<para>
Invalid:
</para>
<screen>
&lt;imagedata format="eps" fileref="foo.eps"&gt;
</screen>
</listitem>
</itemizedlist>
</section>
</section>