mirror of https://github.com/tLDP/LDP
209 lines
6.5 KiB
XML
209 lines
6.5 KiB
XML
<!--
|
|
<!DOCTYPE book PUBLIC '-//OASIS//DTD DocBook XML V4.2//EN'>
|
|
-->
|
|
<section id="docbook">
|
|
<title>Converting Documents to DocBook XML</title>
|
|
|
|
<section id="xmldifferences">
|
|
<title>Differences between XML and SGML</title>
|
|
<para>
|
|
There are a few changes between DocBook XML and SGML. Handling
|
|
these differences should be relatively easy for most small documents,
|
|
and many authors will not need to make any changes
|
|
to convert their documents other than the XML and DocBook
|
|
declarations at the start of their document.
|
|
</para>
|
|
|
|
<para>
|
|
For others, here is a list of what you should keep in mind
|
|
when converting your documents from SGML to XML.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
An element typically has three parts: the start tag,
|
|
the content (your words) and the end tag. Qualifiers
|
|
are added in the start tag and are known as
|
|
attributes. They will always have a name and a
|
|
quoted value.</para>
|
|
<screen><filename
|
|
class="directory">/usr/local<filename></screen>
|
|
<para>
|
|
The start tag contains one attribute (class)
|
|
with a value of directory. The end tag (also filename)
|
|
must not contain any attributes.
|
|
</para>
|
|
</note>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Element names (tags) and their attributes are
|
|
case-dependent--typically lowercase.
|
|
The following will not validate because the end tag
|
|
&tl;PARA> is uppercase:
|
|
</para>
|
|
<screen>
|
|
<para>This part will fail XML validation</PARA>
|
|
</screen>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Most XML-specific tags (like entity)
|
|
have to be in all capital letters (ENTITY).
|
|
</para>
|
|
</listitem>
|
|
|
|
|
|
<listitem>
|
|
<para>
|
|
All attributes in the start tag must be
|
|
"quoted". This can
|
|
be either single (') or double (") quotes, but
|
|
not
|
|
reverse (`) or <quote>smart quotes</quote>.
|
|
The quote used to start a name="value"
|
|
pair must be the same quote used at the end of
|
|
the value. In other words: "this" would
|
|
validate, but 'that" would not.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Tags that have a start tag, but no end tag are
|
|
referred to as <quote>empty</quote> because they do
|
|
not contain (wrap around) anything. These tags must still be
|
|
closed with a trailing slash (/). For example:
|
|
<sgmltag>xref</sgmltag> must be written as
|
|
<xref linkend="software"/>. You may not
|
|
have any spaces between the / and >.
|
|
(Although you may have a space after the final
|
|
attribute: <xref linkend="foo" />.)
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Processing instructions that get sent to the DSSSL
|
|
must have a question mark at the
|
|
end of the tag. The XML version of this tag
|
|
would look like this:
|
|
</para>
|
|
<screen>
|
|
<?dbhtml filename="foo"?>
|
|
</screen>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
If you're converting from SGML to XML, be sure
|
|
file names refer to .xml files instead of .sgml.
|
|
Some tools may get confused if a .sgml file contains XML.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Tag minimizations were used in SGML instead of
|
|
writing out the element name in the end tag.
|
|
e.g. <sgmltag>para<sgmltag>This is foo.<sgmltag
|
|
class="endtag"></sgmltag> Tag minimizations are
|
|
not supported in XML and their use is
|
|
discouraged in DocBook.
|
|
</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
</section>
|
|
|
|
<section id="differences">
|
|
<title>Changing DTDs</title>
|
|
<para>
|
|
The big changes between version changes in the DTD
|
|
involve changed elements (tags). Elements may be:
|
|
depricated (which means they will be removed in
|
|
future versions); removed; modified; or added.
|
|
Almoost all authors will run into a changed or
|
|
depricated tag when going to from a lower version
|
|
of DocBook to a higher version (e.g. 3.x to 4.x).
|
|
</para>
|
|
<para>
|
|
The <cite>DocBook: The Definitive Guide</cite> does
|
|
an excellent job of showing you how elements fit
|
|
together. For each element it tells you what an
|
|
element must contain (its content model) and what is
|
|
may be contained in (who its parents are). For
|
|
example: a <sgmltag>note</sgmltag> must contain a
|
|
<sgmltag>para<sgmltag>. If you try to write
|
|
<note>Content in a
|
|
note</note> your document will not validate.
|
|
Learning how elements are assembled will make it a
|
|
lot easier to understand any validation errors that
|
|
are thrown at you. If you get trully stuck you can
|
|
also email the LDP's docbook mailing list for extra
|
|
hints. Information on subscribing is available from
|
|
<xref linkend="mailinglists" />
|
|
</para>
|
|
<para>
|
|
All tags that have been depricated or changed for 4.x are listed
|
|
in <cite>DocBook: The definitive guide</cite>,
|
|
published by O'Reilly and Associates. This book is
|
|
also available online from <ulink
|
|
url="http://www.docbook.org">http://www.docbook.org</ulink>.
|
|
</para>
|
|
<itemizedlist>
|
|
|
|
<section id="differences3040">
|
|
<title>Differences between version 3.x and 4.x</title>
|
|
<para>
|
|
Here are a few elements that are of particular
|
|
relevence to LDP authors:
|
|
</para>
|
|
<listitem>
|
|
<formalpara>
|
|
<title><sgmltag>artheader</sgmltag></title>
|
|
has been changed to
|
|
<sgmltag>articleinfo</sgmltag>.
|
|
Most other header elements have been renamed to info.
|
|
</formalpara>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<formalpara>
|
|
<title><sgmltag>graphic</sgmltag></title>
|
|
has being depricated and will be removed as of DocBook 5.x.
|
|
To prepare for this, start using
|
|
<sgmltag>mediaobject</sgmltag>. There is more
|
|
information about <sgmltag>mediaobject</sgmltag>
|
|
in <xref linkend="images"/>.
|
|
</formalpara>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<formalpara>
|
|
<title><sgmltag>imagedata</sgmltag> file formats
|
|
must now be written in UPPERCASE letters. If you
|
|
use lowercase or mixed-case spellings
|
|
for your file formats, it will fail.
|
|
</formalpara>
|
|
<para>
|
|
Valid:
|
|
</para>
|
|
<screen>
|
|
<imagedata format="EPS" fileref="foo.eps">
|
|
</screen>
|
|
<para>
|
|
Invalid:
|
|
</para>
|
|
<screen>
|
|
<imagedata format="eps" fileref="foo.eps">
|
|
</screen>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
</section>
|
|
</section>
|