<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>Better Web Page Design Under Linux LG #43</title>
<LINK href="special.css" rel="stylesheet" type="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
ALINK="#FF0000">
<!--endcut ============================================================-->
<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<br><br><hr><br><br>
<!--===================================================================-->
<center>
<H1><font color="maroon">Better Web Page Design Under Linux</font></H1>
<H4>By <a href="mailto:chrisgibbs@geocities.com">Chris Gibbs</a></H4>
</center>
6-Jul-1999 revision: changes for the style sheet, also updated the URL
for the Netscape multimedia plugin.
<BR>
<BR> Note: The author does not have regular Internet access at this time and may be
slow in responding to e-mails.<BR>
<br><br><hr><br><br>
<H2>Contents</H2>
<P ALIGN=LEFT><A HREF="#bwd1"><FONT COLOR="#280099">Wysiwyg Editors</FONT></A>
</P>
<P ALIGN=LEFT><A HREF="#bwd2"><FONT COLOR="#280099">The Advantage of
Linux</FONT></A>
</P>
<P ALIGN=LEFT><A HREF="#bwd3"><FONT COLOR="#280099">Setting up Apache</FONT></A></P>
<P ALIGN=LEFT><A HREF="#bwd3.1"><FONT COLOR="#280099">Starting and
Testing Apache</FONT></A></P>
<P ALIGN=LEFT><A HREF="#bwd4"><FONT COLOR="#280099">Search Engines</FONT></A></P>
<P ALIGN=LEFT><A HREF="#bwd5"><FONT COLOR="#280099">SGML Support</FONT></A></P>
<H2>Introduction</H2>
<P>Recently an article was published in Linux Gazette entitled <B>Web
Page Design Under Linux</B>. This article produced some criticism in
later issues. The main criticism seems to have been of the author's
preference for hand coding HTML rather than using an HTML editor like
the Windows HotDog editor. This is an argument I do not really want
to get involved with. Neither do I want to spend much time on style.
Whilst in most cases users want simple, fast-loading, clear pages,
there will always be a place for garish eye candy, huge graphics and
all kinds of complexities that take forever to download on a 28k
modem. What I do want to address are the great things that Linux
offers: things that are free and would cost a fortune to
implement on other operating systems. In particular I shall explain
how to set up your Linux box as your own intranet server, and
thereby fully exploit the abilities Linux offers for designing
applications for the Web.
</P>
<p>
One point I think needs making, and which does not fit in with the rest of this
article, is the <b>Plugger</b> Plug-in for Netscape Navigator. In the past
many people have complained that Netscape plug-ins are not generally available
for Linux. Plugger from <a href="http://fredrik.hubbe.net/plugger.html">
http://fredrik.hubbe.net/plugger.html</a>,
seeks to address this by providing support for many audio/video/image types.
<H2><A NAME="bwd1"></A>Wysiwyg Editors</H2>
<P>By way of introduction though, I will put my two pennyworth into
the 'editor argument'. I have never yet found an HTML editor that I
like! I am writing this article in StarOffice 5.0. I have never used
it to write HTML so this is something of a test. I expect I'll have
to edit the source when I finish writing. Another editor that seems
as good as any other I have tried is the composer part of Netscape
Communicator. I find this irritating, very very irritating. Why?
Because I like my text to be fully justified. OK I know that some
people think that full justification 'goes against the spirit of
HTML', but personally I would rather read text that is fully
justified than text which is not. I do not believe I am alone in this
preference.</P>
<P>What happens with Netscape is that after I have spent a couple of
hours designing some pages until I am happy with them, I load them
all into vi and change every occurrence of <B>&lt;P&gt;</B> into <B>&lt;P
align=justify&gt;</B>, which can take some time if I've written a lot
of text. A little later I want to make some changes, so I load
the pages into Netscape Composer and edit them. But whilst
Communicator understands <B>&lt;P align=justify&gt;</B>, Composer
does not. In fact Composer does not allow <B>&lt;P align=justify&gt;</B>
and changes each occurrence back to <B>&lt;P&gt;</B>... Bummer... I
have to re-edit all the source by hand again. If I thought there was
some advantage to using Composer rather than hand writing my HTML, I
guess I would write a little program to search HTML files for <B>&lt;P&gt;</B>
and replace it with <B>&lt;P align=justify&gt;</B>. But this is not the
only shortcoming of HTML wysiwyg editors. They just don't seem able
to do exactly what I want, how I want.
</P>
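<P>The "little program" mentioned above need not be long. What follows is
only a minimal sketch in Python, not something the author wrote: the function
name <B>justify_paragraphs</B> is made up for illustration, and it assumes
that only bare <B>&lt;P&gt;</B> tags should be changed, leaving tags that
already carry attributes (and <B>&lt;PRE&gt;</B>) alone.
</P>

```python
import re

# Matches a bare <P> tag in any case, but not <P align=...> or <PRE>.
BARE_P = re.compile(r"<p\s*>", re.IGNORECASE)


def justify_paragraphs(html: str) -> str:
    """Return html with every bare <P> replaced by <P align=justify>."""
    return BARE_P.sub("<P align=justify>", html)


# Small in-memory demonstration; a real run would read and rewrite files.
sample = "<P>First.</P>\n<p>Second.</p>\n<P align=left>Third.</P>"
print(justify_paragraphs(sample))
```

<P>To process real pages one would read each file, pass its text through the
function and write the result back, keeping a backup of the original first.
</P>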
<P>OK in fairness I am now impressed with StarOffice! Although there
is no button to give full justification, it is easy to edit the <B>Text
Body style</B> so that full justification is automatic. It is also
easy to automatically indent the first line of a paragraph, set
double line spacing etc. etc. Maybe I will be converted to using a
wysiwyg editor for my HTML after all.
</P>
<P>One feature that seems to be missing from StarOffice 5.0, is any
easy way to define lists. Tables are well supported, but lists are
not. I guess that it should be possible to define some new styles to
allow the use of different kinds of list, but one would have thought
that a button should be available for them. Also given the different
kinds of list available for HTML, one might find that the styles menu
becomes cumbersome and more difficult than it should be.
</P>
<P>OK simple layouts are quicker with an HTML editor, but if you want
full control you have to hand edit at some point. So to my way of
thinking if you want to write good HTML you must learn HTML. It is a
very bad idea to think you can skip learning HTML by getting an
editor that works like a word processor. You will not have the skills
you need to produce good web pages. HTML is very easy to learn. Once
you know it then you might find that Netscape or StarOffice provide
useful tools to help you. But please do not think such tools replace
the need to be able to hand code HTML.
</P>
<P>The essential document to read if you want to produce great
Web-Pages efficiently is <A HREF="http://www.w3.org/TR/REC-html40">HTML
4.0 (W3C: HTML 4.0 Specification)</A>. This is the full specification
of HTML 4.0, including its SGML Document Type Definition. For once I
have taken my own advice and
read it! The problems I mention above regarding text formatting have
all been solved for me! Looking at the HTML source StarOffice has
produced, I am impressed but not happy. Again I think that an
editor like <B>vi</B> or <B>emacs</B> really is better and more
efficient than using a <B>wysiwyg</B> editor.
</P>
<P>The reason is that HTML 4.0 allows the use of <B>Style Sheets</B>.
This article depends upon the use of a style sheet, <A HREF="special.txt">special.css</A>.
This is a document that says how a browser should render my document.
An important feature is that browsers that cannot display certain
things (e.g. graphics) are not disadvantaged. All browsers can access
this page in the way I intend them to. In the past authors have been
forced to use techniques to format their pages that cannot be
displayed correctly on all browsers. Proprietary HTML extensions, the
conversion of text into graphics, the use of images for white space
control, the use of tables for layout and even the use of programs
have all been used to format text. All these methods cause difficulty
for users and extra work for developers. The correct use of style
sheets avoids these problems.
</P>
<P>Once you are familiar with the use of <B>style sheets</B>, it will
not matter how badly Netscape Composer performs, or how unfamiliar
you are with StarOffice: using an editor like <B>vi</B> really can
be simpler than using something like <B>HotDog</B>. Load <A
HREF="special.txt">my style sheet</A> into your favorite editor and see for
yourself how easy it is to change the look and feel of this document (this link
and the one above are to an identical copy called <tt>special.txt</tt>, so
that you can see the source without the browser parsing it).
</P>
<p>STOP PRESS.....
<P>Even as I am writing this document, I have found yet another web browser
for Linux! This one is worth some attention since it is produced by the W3
Consortium, the same people who define the HTML specification. In fact
this is the browser they use to test their specification. The following text
is displayed when you start it for the first time:
<dl>
<dt><b>Amaya</b></dt><dd> is a Web client that acts both as a browser and as
an authoring tool. It has been designed with the primary purpose of
demonstrating new Web technologies in a WYSIWYG environment. The current
version implements HTML, MathML, CSS, and HTTP.<br><br></dd>
<dt><b> Main Features</b></dt><dd>
With Amaya, you can manipulate rich Web pages containing forms, tables and
the most advanced features from HTML. You can create and edit complex
mathematical expressions within Web pages. You can style your documents
using Cascading Style Sheets. You can publish documents on local or remote
servers with the HTTP Put method.<br>
Browsing and authoring are integrated seamlessly. You can browse and edit
Web pages at the same time. For that reason, a simple click just moves the
caret to allow text editing; to follow a link, you have to double click.
<br><br></dd>
<dt><b>Online Manual</b></dt><dd>
A User's Manual is available online. You can browse it with the Help menu,
which displays each section separately. You can also print it: just follow
the Online Manual link below. You'll get the front page. Then build the
whole book with the "Make book" entry from the Special menu and print the
result.</dd>
</dl>
<p>
This browser certainly has some advantages. The version I have is still beta
(1.3b), so there are some shortcomings. I found that the <b>File -
Open Document</b> dialog can resize its file box so that it is non-functional.
Also, for some reason not all directories appear in the directory box. At
least one can specify the required file in the <b>URL</b> box! The fact
that the manual does not come with the package is a definite minus for me.
<p>What is nice about the browser is the pleasant way it renders pages. This
page, for instance, uses full text justification, and Amaya can actually split
words in the traditional manner when required.
<p>
The really nice thing about this browser is the fact that you can edit files as you browse them. So if you are creating a document with many pages it is
easy to switch between them. The down side of this is that there seems to be
no way to edit or view document source. Something that I would like to
see in other browsers is the ability to create a "Table of Contents". With
Amaya you can generate one based on the <b>&lt;H...&gt;</b> elements in your
document. This will pop up as a separate window and allow you to easily
navigate through a document that has no links of its own.
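<p>To give an idea of how a table of contents can be derived from the
<b>&lt;H...&gt;</b> elements, here is a rough sketch in Python using the
standard <tt>html.parser</tt> module. It is not how Amaya does it; the class
and function names are invented for this illustration, and it assumes headings
are not nested inside one another.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every <H1>..<H6> element."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading being read, if any
        self._text = []      # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of white space inside the heading text.
            text = " ".join("".join(self._text).split())
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return the headings of an HTML document in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


page = '<H2><A NAME="bwd1"></A>Wysiwyg Editors</H2><P>text</P><H3>Why?</H3>'
for level, text in table_of_contents(page):
    print("  " * (level - 1) + text)
```

<p>Indenting each entry by its heading level, as the loop at the end does, is
one simple way to show the structure of a page that has no links of its own.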
<p>At about 4.5 megabytes, this is probably a very good alternative to
StarOffice if you do not have the disk space required for StarOffice. I
am certainly interested in seeing how this browser develops in the future.
If you want to give it a try you can obtain it from <a href="http://www.w3.org/Amaya">the Amaya homepage</a>. Additionally, there was a review of an earlier
release of Amaya in Linux Gazette some years ago (<a href="../../issue15/amaya.html">see issue 15</a>). All I have to add to that review is that improvements must have been made. It seems the same in appearance as the screen shots show. Amaya displays the old style of Linux Gazette Contents pages quite well, but the new style in the last three or four issues is completely garbled. When Amaya starts up it no longer looks for a page on its home site, and
I have not seen it seg fault as described. On the whole it does a very good job.
<H2><A NAME="bwd2"></A>The Advantage of Linux</H2>
<P>Now I've got that out of my system I'll get on to my main point.
Drum roll please..... With linux it is simple to build a system you
can gain http access to. Trumpet fanfare please.</P>
<H3>Why is http access to your machine important?</H3>
<P>Even if, like me, you have a stand alone machine with no kind of
network, it is easy to start up your favorite browser and <B>http://</B>
yourself. This means you can get into the wonderful worlds of cgi
scripts, client server applications, Java, etc. Without the
need to access a 'real' network you can test any network application
you care to develop for the Internet. You can test every aspect of
your web design without running up a phone bill. You can test
applications safe in the knowledge that no matter what mistakes are
in your code, only the machine you are using will be at risk; the
&quot;<I>real</I>&quot; network will be unaffected until you decide
your code is working correctly.</P>
<P>Web page design is not just about putting text/graphics and links
onto the Internet. Increasingly it is about providing good user
interfaces to network applications and providing an efficient means
of communication. In the past only the largest corporations could
afford to implement a WAN (Wide area network). Today anybody with a
modem and pc can join the Internet, or implement their own intranet
(a private network that acts in the same way as the Internet).</P>
<P>To illustrate my points consider the following scenario. You own a
small tobacconists and live in a village called Tiny. Because the
village is small you do not have many customers, so you don't sell
items in vast numbers. That means you do not buy in large quantities
from your suppliers and you cannot get the kind of discounts larger
shops would get. But you have many relations and friends in other,
similar villages who also run small tobacconists. If you all clubbed
together and ordered your supplies as one entity you could take the
discount advantages of bulk buying from your suppliers. The only
problem is knowing which shop needs what items at any given time. You
know that the discounts you would get would allow you to employ a van
driver to deliver to all the shops and still leave each shop a
significant saving.
</P>
<P>How can web design under linux help you solve this problem?</P>
<P>The 'man with a van' needs information, what to buy in what
quantity and where to deliver it. This sounds like a classic database
application. Linux offers many SQL database solutions. We want to
keep costs to a minimum; we also want to maximize security and
reliability. So good choices might be Ingres or PostgreSQL. If we
look at these DBMSs we find that PostgreSQL comes with a Java
interface. So let's say we design a suitable database with PostgreSQL.
This database will be held on a box that will be our server.
</P>
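<P>To make the database idea concrete, here is a small sketch of the kind of
query the 'man with a van' would need. It is an assumption-laden illustration
only: it uses Python's built-in <tt>sqlite3</tt> module as a stand-in for
PostgreSQL so that it runs without a server, and the table name, column names,
the villages other than Tiny and all the quantities are made up.
</P>

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a few made-up shop orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (shop TEXT, item TEXT, quantity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("Tiny", "pipe tobacco", 10),
            ("Tiny", "matches", 20),
            ("Small", "pipe tobacco", 15),
            ("Small", "matches", 30),
            ("Little", "pipe tobacco", 5),
        ],
    )
    return conn


def bulk_order(conn):
    """What to buy: total quantity of each item across all shops."""
    query = (
        "SELECT item, SUM(quantity) FROM orders "
        "GROUP BY item ORDER BY item"
    )
    return conn.execute(query).fetchall()


def delivery_list(conn, shop):
    """What to drop off at one particular shop."""
    query = (
        "SELECT item, quantity FROM orders "
        "WHERE shop = ? ORDER BY item"
    )
    return conn.execute(query, (shop,)).fetchall()


conn = build_demo_db()
print(bulk_order(conn))
print(delivery_list(conn, "Tiny"))
```

<P>The first query answers "what to buy in what quantity" and the second
"where to deliver it"; a real design would of course need suppliers, prices
and dates as well.
</P>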
<P>What we need is the ability for each shop to communicate with the
server to tell it what stock we need to buy in. Shop keepers do not
have to be computer literate. They also do not want to spend much
money on computer systems. At least at this time it is unlikely that
they could be persuaded to learn a UNIX operating system like linux.
Cheap boxes already have Windows. An ideal solution is one where each
shop can dial into the server, the manager can start up his/her
favorite browser and use it to enter information to the server. It
should not matter what operating system each shop uses.
</P>
<P>What does our server need to do?</P>
<P>The first thing is to get Apache set up and running. Apache is a
web server and comes with most if not all linux distributions as
standard. What is not always clear is how to set it up correctly.
This is something an installation program cannot do (easily) and
needs to be done by hand. It is Apache that allows us to http
ourselves. Of course, we will also need to allow remote machines to
dial into our server, but that is a matter outside the scope of this
document.
</P>
<P>Once Apache is running we can design a java application to act as
a user interface to our database.</P>
<P>We can test both the client and the server parts of our
application on our server until we are certain it performs as
required.
</P>
<P>Then all we need to do is allow the shopkeepers to be able to dial
into the server and gain access via their browsers to the java
database interface.</P>
<P>The wonderful thing is that at the test stage we only need to use
one linux box which acts as both client and server at the same time.</P>
<H2><A NAME="bwd3"></A>Setting up Apache</H2>
<P>If you do not already know, then Apache is one of the most common
http servers in existence. A great many ISP's (Internet Service
Providers) use Apache to give their clients (i.e. You) access to the
world wide web.
</P>
<P>This document does not attempt to address the requirements of a
true Internet or intranet server. All I am concerned with here is
getting Apache up and running on a standalone machine so that
client/server software can be tested. In particular I am not
concerned with security issues here. If you do not intend to have a
permanent network connection then all should be well. If you intend
other machines to have access to your http server then you should
read all the relevant documentation. Complete configuration of Apache
can be a very complex issue which does not fall within the scope of
this document.</P>
<P>Modern Linux distributions, such as S.u.S.E., have special
requirements for setting up Apache correctly. To avoid confusion
please read the documentation that came with both your linux
distribution and your Apache distribution. The following steps will
work for any Linux distribution, but be warned, if your distribution
has special requirements I cannot be responsible for getting your
system startup files in a mess.
</P>
<P>For instance I shall describe how to start Apache automatically at
boot time by adding a line to your /etc/inittab. Whilst some
Slackware users will benefit from this approach S.u.S.E. users should
find that it is better to edit their /etc/rc.config file in the
appropriate manner.</P>
<H3>Preparing your machine for Apache</H3>
<P>These steps will prepare your machine for the installation of
Apache. You might find that Apache is already installed; following
the steps below will not hurt such installations.
</P>
<OL>
<LI>Make certain you have set your <B>/etc/HOSTNAME</B>
correctly. I call my machine <B>Hawklord</B><br><br>
<LI>Create a new account for the httpd
administrator. I use the user <B>wwwrun</B>, whose primary group is
<B>nogroup</B> (65534).<br><br>
<LI>Edit your <B>/etc/hosts</B> to reflect the name
of your machine. I have the entries
<PRE> 127.0.0.1   localhost
 127.0.0.2   Hawklord.Varteg Hawklord </PRE>
<LI>Edit your <B>/etc/hosts.allow</B>. I have
<PRE> ALL: 127.0.0.1
 ALL: 0.0.0.0
 ALL: localhost
 ALL: Hawklord.Varteg
</PRE>
</OL>
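<P>If it is not obvious what the <B>/etc/hosts</B> entries achieve, the
following Python sketch shows the idea: each line maps an address to one or
more names. This is a simplified model for illustration, not the system
resolver; the function name <B>parse_hosts</B> is hypothetical.
</P>

```python
def parse_hosts(text):
    """Map each host name or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address seen for a name.
            table.setdefault(name, address)
    return table


SAMPLE = """
127.0.0.1   localhost
127.0.0.2   Hawklord.Varteg Hawklord
"""

for name, address in parse_hosts(SAMPLE).items():
    print(f"{name} -> {address}")
```

<P>With the entries from step 3, both <B>Hawklord.Varteg</B> and the short
name <B>Hawklord</B> end up pointing at the same address, which is why either
name can be typed into a browser later on.
</P>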
<P>If Apache is not already installed, find a pre-compiled version
and install it as per the instructions. You should find that
configuration files are placed under <B>/etc/httpd</B>, and other
files are installed under <B>/usr/local/httpd</B>.</P>
<P>The directory <B>/usr/local/httpd/htdocs</B> should contain the
Apache user manual in html format. Actually this directory will
become the root directory of our http site, so you may want to move
this documentation elsewhere eg. <B>/usr/doc/Apache</B>.
</P>
<H3>Plan your http site</H3>
<P>When you log into a http site, eg <B>http://linux.org</B>, you
find yourself at the root of what can be a very complicated directory
structure. You can think of a http site as being a file system just
like your own root file system. Whilst it is true that to a user the
http site will look like a regular file system, the reality on the
server's hard disk(s) can be very different. It is important to
understand the differences and use them to your advantage.
</P>
<P>On my system the document root is at /usr/local/httpd/htdocs, and
this is the directory a user lands in when they access
<B>http://Hawklord.Varteg</B>. But there is only one file and no
sub-directories on my hard disk. I only keep index.html in the
physical location /usr/local/httpd/htdocs. All the documentation
users can access is held in other locations on my hard disks.
</P>
<P>Looking again at /usr/local/httpd you should find other
sub-directories, in particular cgi-bin and icons. These directories
should seem to be located under your document root because they will
contain files that should be available to any html file on your site
that requires them, though a user should not be able to directly
access these directories. Much of my documentation is under /usr/doc,
so I make that directory appear as /docs to the http server.</P>
<P>What this means is that you can store all your documentation on
the server in locations that seem logical to you. You do not need to
copy files or even make symbolic links to /usr/local/httpd/htdocs.
Instead plan how you want your documentation to appear to a user.
Also you can have directories that users cannot directly access, but
which html documents can access.
</P>
<P>For instance, the directory <B>/usr/doc/</B> contains
</P>
<PRE STYLE="margin-bottom: 0.50cm"> Linux_gazette Howto Ldp java-documentation</PRE>
<P>
I also want to access files under <B>/usr/hobbies/literature</B> and
<B>/usr/src/java/applets</B>
</P>
<P>I want my site to have the following structure:
</P>
<PRE> / ---&gt; cgi-bin
        docs ---&gt; Linux_gazette
                  Howto
                  Ldp
                  java-documentation
                  literature
        icons
        java_applets
</PRE>
<P>
Planning your http site in this way will save you headaches in the
future!</P>
<H3>httpd.conf</H3>
<P><B>/etc/httpd/httpd.conf</B> is the main configuration file for
Apache. Some versions of Apache and/or Linux distributions recommend
that all configuration information is kept in this file. Other
versions recommend that you use all three files I shall mention
below. If you want to keep everything in one file, simply put
the directives from all three files there; there is no real difference between
the two methods. You will find that the example files will contain
sufficient comments to enable you to make the best choices for your
system. I am only going to describe the changes you need to make to
get Apache to work for you. Careful reading of the files will let you
configure Apache better for your needs.
</P>
<P>I am aware that a TCL configuration utility called <a href="http://butler.disa.mil/ApacheConfigClient">Comanche</a> exists for Apache.
However, this is still in an early stage of development, so I do not
recommend it for beginners. I found in practice the utility would not
function correctly if you use only <B>httpd.conf</B> to configure
your system. However it could prove useful for experimenting with different
configurations.
</P>
<P><B>For each line in the configuration files you can assume that
your example file has a correct or sensible entry, unless I
specifically mention it. Back up the examples before you make any
changes!</B></P>
<DL>
<DT><B><I>ServerType</I> standalone.</B></DT><DD>
Please use standalone unless you know exactly what you are doing.<br><br></DD><DT>
<B><I>Port</I> 80</B></DT><DD>
Unless you have changed something this is correct, so do not change
it.<br><br></DD><DT>
<B><I>HostnameLookups</I> on</B></DT><DD>
Again, it is probably a mistake to change this unless you know
otherwise.<br><br></DD><DT>
<B><I>User</I> wwwrun</B></DT><DD>
This entry should refer to the user we set up above to be the httpd
administrator.<br><br></DD><DT>
<I><B>Group</B></I></DT><DD>
This entry should refer to the primary group you defined for the
httpd administrator.<br><br></DD><DT>
<B><I>ServerAdmin </I>root@localhost</B></DT><DD>
This is the address Apache will use to send e-mails with details
about problems with the server. Using <B>localhost</B> rather than
<B>Hawklord.Varteg</B> seems to be more reliable.<br><br></DD><DT>
<B><I>ServerRoot</I> /usr/local/httpd</B></DT><DD>
This should point to the location you installed Apache's main files.
By default this is <B>/usr/local/httpd</B><br><br></DD><DT>
<B><I>ServerName</I> Hawklord.Varteg</B></DT><DD>
This should be the fully qualified domain name of the server. It
should be the same as the entry you made in <B>/etc/hosts.allow</B>
and <B>/etc/hosts </B>above.<br><br></DD><DT>
<I><B>Logs</B></I></DT><DD>
Entries concerning log files should probably be left as they are
until you feel confident about changing them. Though you might want
to experiment with the <B><I>loglevel</I></B> entry if you
experience problems.
</DD></DL>
<H3>
srm.conf</H3>
<P>This file contains site specific information. It is where we
define how our site will look to a user.</P>
<DL>
<DT><I><B>DocumentRoot</B></I></DT><DD>
should refer to the directory on our hard disk that will be the root
directory of our site. For our example this is
<B>/usr/local/httpd/htdocs</B><br><br></DD><DT>
<I><B>DirectoryIndex</B></I></DT><DD>
is the name of the file that should be loaded by a browser when a
user enters a directory without specifying a filename, e.g.
<B>http://Hawklord.Varteg/</B> or <B>http://Hawklord.Varteg/docs/</B>.
index.html is a sensible default.<br><br></DD><DT>
<I><B>Alias .....</B></I></DT><DD>
Each line starting <B>Alias </B>will define a virtual directory on
our system. Apache uses the first <B>Alias</B> that matches a URL, so a
more specific path such as <B>/docs/literature/</B> must be listed before
<B>/docs/</B>. For the example above this should include:
<PRE> Alias /cgi-bin/                  /usr/local/httpd/cgi-bin/
 Alias /docs/Linux_gazette/       /usr/doc/Linux_gazette/
 Alias /docs/Howto/               /usr/doc/Howto/
 Alias /docs/LDP/                 /usr/doc/LDP/
 Alias /docs/java-documentation/  /usr/doc/java-documentation/
 Alias /docs/literature/          /usr/hobbies/literature/
 Alias /docs/                     /usr/doc/
 Alias /icons/                    /usr/local/httpd/icons/
 Alias /java_applets/             /usr/src/java/compiled/
</PRE>
</dD>
<dt><I><B>ErrorDocument</B></I></DT><DD>
Error documents are the response the server will give when the user
types a wrong <B>URL</B>, or tries to access a restricted file or
directory etc. Apache gives good default error documents, but you
can override this behavior and provide your own responses. I keep my
error documents in the directory <B>/usr/local/httpd/error</B></DD></DL>
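<P>How <B>Alias</B> lines turn a URL path into a location on disk can be
modelled in a few lines of Python. This is a simplified sketch, not Apache's
code: it only does prefix matching in the order the aliases are listed and
falls back to the document root, and the function name <B>resolve</B> is made
up. The paths are the ones used in this article.
</P>

```python
DOCUMENT_ROOT = "/usr/local/httpd/htdocs"

# (URL prefix, directory on disk), in the order they appear in srm.conf.
ALIASES = [
    ("/cgi-bin/", "/usr/local/httpd/cgi-bin/"),
    ("/docs/literature/", "/usr/hobbies/literature/"),
    ("/docs/", "/usr/doc/"),
    ("/icons/", "/usr/local/httpd/icons/"),
    ("/java_applets/", "/usr/src/java/compiled/"),
]


def resolve(url_path, aliases=ALIASES, document_root=DOCUMENT_ROOT):
    """Return the file system path a URL path would be served from."""
    for prefix, target in aliases:
        if url_path.startswith(prefix):
            # The first matching alias wins; later ones are never tried.
            return target + url_path[len(prefix):]
    return document_root + url_path


for path in ("/index.html", "/docs/Howto/x.html", "/docs/literature/poem.html"):
    print(path, "->", resolve(path))
```

<P>Because the first match wins, swapping the two <B>/docs/</B> entries in the
list would send requests for <B>/docs/literature/</B> to
<B>/usr/doc/literature/</B> instead, which is worth checking for in your own
alias list.
</P>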
<H3>
access.conf</H3>
<P>This file contains permissions for our site's directories. If, when
you test your configuration by starting <B>httpd</B> and pointing
your browser to (eg.) <B>http://Hawklord</B>, or <B>http://localhost</B>
(both will work for the above example), you get a file access error,
you will need to alter this file. Each directory in your site should
have its own entry.
</P>
<P>By default Apache has a very restricted set of permissions for the
root directory, I have found that changing to:
</P>
<PRE> &lt;Directory /&gt;
Options All
Order allow,deny
Allow from all
Options FollowSymLinks
&lt;/Directory&gt;</PRE>
<P>
solved some problems for me. It is important to realize that a
directory inherits its permissions from its parent directory. So if
you want to allow outside access to your site you need to take great
care when setting up your directory permissions.
</P>
<P><BR><BR>
</P>
<H2><A NAME="bwd3.1"></A>Starting and Testing Apache</H2>
<P>Once you are satisfied that you have correctly installed and
configured Apache, you will want to test it! Log into your machine as
root. At the prompt type:
</P>
<PRE> #: httpd &amp;</PRE>
<P>Now you can log into your machine as any user, start your favorite
browser and enter the <B>URL</B> <B><I>http://localhost</I></B>. If
all goes well you should load the Apache site file <B>index.html</B>.
That is unless you moved the Apache documentation and provided your
own <B>index.html</B> in <B>/usr/local/httpd/htdocs</B>
</P>
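<P>If you would rather check from a script than from a browser, something
like the following Python sketch will do. It is only an illustration: the
function name <B>probe</B> is invented, the default URL assumes the server
listens on port 80 of the local machine, and proxy settings are deliberately
ignored so that the request goes straight to your own box.
</P>

```python
import urllib.error
import urllib.request


def probe(url="http://localhost/", timeout=5):
    """Return the HTTP status code served at url, or None if unreachable."""
    # An empty ProxyHandler means: talk to the machine directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None       # nothing listening, or the name did not resolve
```

<P>Calling <B>probe()</B> should give 200 once Apache is serving
<B>index.html</B>; a result of <B>None</B> suggests that <B>httpd</B> is not
running or is listening somewhere else.
</P>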
<P>Once you are satisfied that all is well, you will want to have
<B>httpd</B> start at system boot time. Some Linux distributions,
such as Red Hat or S.u.S.E., will have a script to start Apache in
their <B>init.d</B> directory. If this is the case then you just need
to enable the script for <B>System V init</B> in the normal manner.
</P>
<P>As an alternative you can put the following line in your
<B>/etc/inittab</B>
</P>
<PRE STYLE="margin-bottom: 0.50cm"> ap:45:once:/bin/su --command=/usr/sbin/httpd</PRE>
<P>
'ap' must be a unique identifier. '45' refers to the runlevels for
which the command will be executed. 'once' is probably safer to use
than 'respawn', since with 'respawn' a mistake in this line means init
keeps re-running the command and you will see a lot of error messages ;-(
</P>
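<P>The four colon-separated fields of an <B>/etc/inittab</B> entry can be
pulled apart as in the Python sketch below. It is purely illustrative: init
itself does not work this way, and the names <B>InittabEntry</B> and
<B>parse_inittab_line</B> are hypothetical.
</P>

```python
from typing import NamedTuple


class InittabEntry(NamedTuple):
    ident: str       # unique identifier, e.g. 'ap'
    runlevels: str   # runlevels the entry applies to, e.g. '45'
    action: str      # how init runs the process, e.g. 'once' or 'respawn'
    process: str     # the command line to execute


def parse_inittab_line(line):
    """Split one inittab entry into its four fields."""
    # Split on the first three colons only, so colons inside the
    # command line are left untouched.
    ident, runlevels, action, process = line.strip().split(":", 3)
    return InittabEntry(ident, runlevels, action, process)


entry = parse_inittab_line("ap:45:once:/bin/su --command=/usr/sbin/httpd")
print(entry.ident, entry.runlevels, entry.action, entry.process, sep=" | ")
```

<P>Seeing the fields separately makes it easier to spot a mistake, such as a
missing colon, before init has to deal with it.
</P>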
<P>The final part of the line '/bin/su
--command=/usr/sbin/httpd', is intended to start up Apache as a
process owned by wwwrun. It would be wise to test this command before
you put it in your <B>/etc/inittab</B>.
</P>
<H2><A NAME="bwd4"></A>Search Engines</H2>
<P>If you have Apache running, and a large Linux installation, then
you might want to consider implementing a search engine. S.u.S.E.
Linux provides <A HREF="http://htdig.sdsu.edu">htdig</A>; in fact to
gain full benefit from the S.u.S.E. Help System you need to use
something like htdig. The only problem is the disk space you will
need. I have a 1Gig partition devoted to documentation, this may seem
a lot to many users! I have a lot of personal documentation, program
documentation (increasingly this is HTML), all issues of Linux
Gazette, Gimp documentation, java documentation etc. This takes about
500 Meg. The database htdig uses is between 200 - 300 Meg on my
system. To update the database I need 200 - 300 Meg spare under /tmp.
Actually when I update the database I change the location of /tmp
since I do not have enough space on my root partition. Now since I
have arranged all the documentation to be available to Apache, it is
all referenced in htdig's database. If I have a question about any
aspect of linux, or any of my personal subjects, all I have to do is
formulate a suitable search pattern. I cannot adequately describe the
savings in time this has given me! In the past I would have needed to
access newsgroups to find answers to my problems. With htdig I can
avoid this 99.9% of the time! Given the low cost of hard disk space,
the fact that current program documentation is usually given as html,
that most documentation of any kind is available as html, then it
makes good sense to use Apache in conjunction with a search engine in
order to have a most efficient information retrieval system.
</P>
<P>Htdig may not be perfect. If you are used to Infoseek or Lycos, it
is a bit annoying because you cannot search for a phrase, e.g.
&quot;starting the x server&quot;. Rather a document is searched for
that contains all the words you enter. An advantage is that related
words are searched for as well, e.g. if you search for 'god' you can
also get results for 'gods' and 'godly'. Once you get used to htdig
it becomes an indispensable tool. The time it saves you in looking
for information is well worth the cost in terms of disk space. (on my
system the real cost is about 250 Meg, though I need another
temporary 250Meg when re-building the database).</P>
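<P>The difference between an "all the words" search and a phrase search is
easy to show in a few lines of Python. This sketch says nothing about how
htdig is implemented; the function names are made up, and the matching of
related words such as 'gods' for 'god' is not modelled at all.
</P>

```python
import re


def word_list(text):
    """Lower-case text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_words(query, document):
    """True if every word of the query occurs somewhere in the document."""
    return set(word_list(query)) <= set(word_list(document))


def matches_phrase(query, document):
    """True only if the query words occur next to each other, in order."""
    # Padding with spaces stops a match from starting or ending mid-word.
    q = " " + " ".join(word_list(query)) + " "
    d = " " + " ".join(word_list(document)) + " "
    return q in d


doc = "The X server reads its configuration before starting the window manager."
print(matches_all_words("starting the x server", doc))
print(matches_phrase("starting the x server", doc))
```

<P>The sample document contains all four words but never the phrase itself,
which is exactly the kind of hit an all-words search returns and a phrase
search would not.
</P>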
<a name="bwd5"></a><h2>SGML Support</h2>
<p>Finally I shall mention Linux's <b>SGML</b> (Standard Generalized Markup Language) support. This is not normally considered part of web page design
since most home users will simply want to be able to create their own
HTML home pages and have no other use for such documents.
<p>However, a great many people will want to produce documents in many
formats. The same document might need to be available for publication as a book, or as an info page as well as being available as web pages. The Linux
Documentation Project contains many documents that are available in different formats according to users' needs.
<p>SGML allows a single source to be used to produce many different kinds of
text format. The following package descriptions are taken directly from
the S.u.S.E. 6.0 distribution, though they should all be available for
other distributions:
<br><hr>
Package "sgmltool"<br><br>
<b>SGML-Tools - a text-formatting package</b><br>
SGML-Tools is a text-formatting package based on SGML (Standard Generalized Markup
Language), which allows you to produce LaTeX, HTML, GNU info, LyX, RTF, and plain ASCII (via
groff) from a single source.
<br><br>
This system is tailored for writing technical software documentation, an example of which are the Linux
HOWTO documents. It should be useful for all kinds of printed and online documentation.
<br><br>
SGML-Tools is not able to process arbitrary SGML documents; in such a case, give jade_dsl a try and
write your own DSSSL scripts (take the docbk30 package as an example).
<hr>
Package "jade_dsl"<br><br>
<b>
DSSSL-Engine for SGML documents</b><br><br>
Jade is an implementation of DSSSL (Document Style, Semantics and Specification Language);
pronounce it as "dissl" -- it rhymes with whistle.
<br><br>
It has backends for SGML, RTF, MIF, TeX, and HTML.
<br><br>
The parser "nsgmls" and helper tools like "sgmlnorm", "spam", "spent", and "sx" are now included in
the separate package "sp".
<br><br>
You'll find the documentation at /usr/doc/packages/jade_dsl/.
<br><br><hr>
Package "sp"
<br><br>
<b>
SGML parser tools</b><br><br>
The tools of this package provide the possibility to manage SGML and XML documents.
<br><br>
It contains the parser `nsgmls' and the supporting programs `sgmlnorm', `spam', `spent', and `sx'. `sx' is
useful as a converting tool from SGML to XML, the coming WWW standard. You'll find the
documentation for all the programs under /usr/doc/packages/sp/.
<br><br>
<hr>
Package "sp_libs"
<br><br>
<b>Libraries required for sp and jade</b><br><br><hr>
Package "gf"
<br>
<br>
<b>
A "general formatter" for SGML documents
</b><br><br>
`gf' from Gary Houston is short for "general formatter", i.e., it can work on documents which use the
ISO "general" document type definition (DTD). It can convert SGML documents conforming to a small
number of DTDs into various output formats: LaTeX, ASCII, RTF and Texinfo. However not every
output format can be generated for every DTD.
<br><br>
Apart from the general DTD, gf supports the HTML DTD used in the WWW project and Gary's Snafu
DTD. `gf' is not intended as a flexible system for hacking up a formatter for a random DTD, but as a
usable document production system for a few DTDs.
<br><br><hr>
Package "jadetex"
<br>
<br>
<b>
JadeTeX - LaTeX macros to process TeX output from Jade (jade_dsl)
</b><br><br>
With Sebastian Rahtz' macro package `jadetex' it is possible to process the output of the TeX backend
of Jade (jade_dsl). Resulting DVI files are viewable e.g., with `xdvi' or printable like any other DVI file.
<hr>
<p>
I have no real experience with SGML so I will leave the appraisal of these packages to the reader. For some people these will prove indispensable tools for
producing HTML pages.
<!--===================================================================-->
<br><br><hr><br><br>
<center><H5>Copyright &copy; 1999, Chris Gibbs <BR>
Published in Issue 43 of <i>Linux Gazette</i>, July 1999</H5></center>
<!--===================================================================-->
<!--startcut ==========================================================-->
<br><br><hr><br><br>
<A HREF="../index.html"><IMG ALIGN=BOTTOM SRC="../../gx/indexnew.gif"
ALT="[ TABLE OF CONTENTS ]"></A>
<A HREF="../../index.html"><IMG ALIGN=BOTTOM SRC="../../gx/homenew.gif"
ALT="[ FRONT PAGE ]"></A>
<A HREF="../feinberg.html"><IMG SRC="../../gx/back2.gif"
ALT=" Back "></A>
<A HREF="../gm/gm.html"><IMG SRC="../../gx/fwd.gif" ALT=" Next "></A>
<br><br><hr><br><br>
</BODY>
</HTML>
<!--endcut ============================================================-->