912 lines
28 KiB
HTML
912 lines
28 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
|
<HTML>
|
|
<HEAD>
|
|
<title>Using m4 to write HTML.</title>
|
|
</HEAD>
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0020F0"
|
|
ALINK="#FF0000">
|
|
<H4>
|
|
"Linux Gazette...<I>making Linux just a little more fun!</I>"
|
|
</H4>
|
|
|
|
<P> <HR> <P>
|
|
<!--===================================================================-->
|
|
|
|
|
|
<!-- Copyright (C) 1997 Bob Hepple
|
|
|
|
This is an example of the use of m4 to pre-process HTML
|
|
code. Using the FSF version of m4, you generate HTML from
|
|
this file with:
|
|
|
|
m4 -P using_m4.m4 >using_m4.html
|
|
|
|
Please don't be put off by the use of nested quotation marks
|
|
in the code examples. This sample is rather an extreme
|
|
stress-test of the idea. Normal usage is much simpler.
|
|
|
|
This program is free software; you can redistribute it
|
|
and/or modify it under the terms of the GNU General Public
|
|
License as published by the Free Software Foundation; either
|
|
version 2 of the License, or (at your option) any later
|
|
version.
|
|
|
|
This program is distributed in the hope that it will be
|
|
useful, but WITHOUT ANY WARRANTY; without even the implied
|
|
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
|
|
PURPOSE. See the GNU General Public License for more
|
|
details.
|
|
|
|
You should have received a copy of the GNU General Public
|
|
License along with this program; if not, write to the Free
|
|
Software Foundation, Inc., 675 Mass Ave, Cambridge, MA
|
|
02139, USA. -->
|
|
|
|
<A NAME="Contents"><H2 align="center">Using <EM>m4</EM> to write
|
|
HTML.</H2></A>
|
|
<h4 align="center">By Bob Hepple
|
|
<a href="mailto:bhepple@pacific.net.sg">bhepple@pacific.net.sg</a>
|
|
</h4>
|
|
<P><HR>
|
|
<P>
|
|
<H4>Contents:</H4>
|
|
|
|
<UL><P><LI><A HREF="#1. Some limitations of HTML">1. Some limitations of HTML</A>
|
|
<LI><A HREF="#2. Using <EM>m4</EM>">2. Using <EM>m4</EM></A>
|
|
<LI><A HREF="#3. Examples of <EM>m4</EM> macros">3. Examples of <EM>m4</EM> macros</A>
|
|
<LI><A HREF="#3.1 Sharing HTML elements across several page">3.1 Sharing HTML elements across several page</A>
|
|
<LI><A HREF="#3.2 Managing HTML elements that often change">3.2 Managing HTML elements that often change</A>
|
|
<LI><A HREF="#3.3 Creating new text styles">3.3 Creating new text styles</A>
|
|
<LI><A HREF="#3.4 Typing and mnemonic aids">3.4 Typing and mnemonic aids</A>
|
|
<LI><A HREF="#3.5 Automatic numbering">3.5 Automatic numbering</A>
|
|
<LI><A HREF="#3.6 Automatic date stamping">3.6 Automatic date stamping</A>
|
|
<LI><A HREF="#3.7 Generating Tables of Contents">3.7 Generating Tables of Contents</A>
|
|
<LI><A HREF="#3.7.1 Simple to understand TOC">3.7.1 Simple to understand TOC</A>
|
|
<LI><A HREF="#3.7.2 Simple to use TOC">3.7.2 Simple to use TOC</A>
|
|
<LI><A HREF="#3.8 Simple tables">3.8 Simple tables</A>
|
|
<LI><A HREF="#4. <EM>m4</EM> gotchas">4. <EM>m4</EM> gotchas</A>
|
|
<LI><A HREF="#4.1 Gotcha 1 - quotes">4.1 Gotcha 1 - quotes</A>
|
|
<LI><A HREF="#4.2 Gotcha 2 - Word swallowing">4.2 Gotcha 2 - Word swallowing</A>
|
|
<LI><A HREF="#4.3 Gotcha 3 - Comments">4.3 Gotcha 3 - Comments</A>
|
|
<LI><A HREF="#4.4 Gotcha 4 - Debugging">4.4 Gotcha 4 - Debugging</A>
|
|
<LI><A HREF="#5. Conclusion">5. Conclusion</A>
|
|
<LI><A HREF="#6. Files to download">6. Files to download</A>
|
|
</UL><P>
|
|
|
|
|
|
|
|
<P><HR><SMALL><CENTER>This page last updated on Thu Sep 18 22:46:54 HKT 1997
|
|
<BR>
|
|
$Revision: 1.2 $</CENTER></SMALL><HR><P>
|
|
<A NAME="1. Some limitations of HTML"><H2>1. Some limitations of HTML</H2></A>
|
|
|
|
It's amazing how easy it is to write simple HTML pages - and
|
|
the availability of <EM>WYSIWYG</EM> HTML editors like
|
|
<EM>NETSCAPE GOLD</EM> lulls one into a mood of <EM>"don't
|
|
worry, be happy"</EM>. However, managing multiple,
|
|
interrelated pages of HTML rapidly gets very, very
|
|
difficult. I recently had a slightly complex set of pages
|
|
to put together and it started me thinking - <EM>"there has
|
|
to be an easier way"</EM>.
|
|
|
|
<P>
|
|
|
|
I immediately turned to the WWW and looked up all sorts of
|
|
tools - but quite honestly I was rather disappointed. Mostly,
|
|
they were what I would call <EM>Typing Aids</EM> - instead of
|
|
having to remember arcane incantations like <CODE><a
|
|
href="link">text</a></CODE>, you are given a button or a
|
|
magic keychord like ALT-CTRL-j which remembers the syntax and
|
|
does all that nasty typing for you.
|
|
|
|
<P>
|
|
|
|
<EM>Linux</EM> to the rescue! HTML is built as ordinary text
|
|
files and therefore the normal <EM>Linux</EM> text management
|
|
tools can be used. This includes the revision control tools
|
|
such as <EM>RCS</EM> and the text manipulation tools like
|
|
<EM>awk, perl, etc.</EM> These offer significant help in
|
|
version control and managing development by multiple users
|
|
as well as in automating the process of extracting from a
|
|
database and displaying the results (the classic <CODE>"grep
|
|
|sort |awk"</CODE> pipeline).
|
|
|
|
<P>
|
|
|
|
The use of these tools with HTML is documented elsewhere,
|
|
e.g. see Jim Weinrich's article in <EM>Linux Journal</EM> Issue
|
|
36, April 1997, "Using Perl to Check Web Links" which I'd
|
|
highly recommend as yet another way to really flex those
|
|
<EM>Linux</EM> muscles when writing HTML.
|
|
|
|
<P>
|
|
|
|
What I will cover here is a little work I've done recently
|
|
with using <EM>m4</EM> in maintaining HTML. The ideas can
|
|
probably be extended to the more general SGML case very
|
|
easily.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
<A NAME="2. Using <EM>m4</EM>"><H2>2. Using <EM>m4</EM></H2></A>
|
|
|
|
I decided to use <EM>m4</EM> after looking at various other
|
|
pre-processors including <EM>cpp</EM>, the <EM>C</EM>
|
|
front-end. While <EM>cpp</EM> is perhaps a little too
|
|
<EM>C</EM>-specific to be very useful with HTML, <EM>m4</EM> is a
|
|
very generic and clean macro expansion program - and it's
|
|
available under most Unices including <EM>Linux</EM>.
|
|
|
|
<P>
|
|
|
|
Instead of editing <EM>*.html</EM> files, I create
|
|
<EM>*.m4</EM> files with my favourite text editor.
|
|
These look something like this:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_include(stdlib.m4)
|
|
_HEADER(`This is my header')
|
|
<P>This is some plain text<P>
|
|
_HEAD1(`This is a main heading')
|
|
<P>This is some more plain text<P>
|
|
_TRAILER</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
The format is simple - just HTML code but you can now
|
|
include files and add macros rather like in <EM>C</EM>. I use
|
|
a convention that my new macros are in capitals and start
|
|
with "_" to make them stand out from HTML language and to
|
|
avoid name-space collisions.
|
|
|
|
<P>
|
|
|
|
The <EM>m4</EM> file is then processed as follows to create an
|
|
<EM>.html</EM> file e.g.
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4 -P <file.m4 >file.html</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
This is especially easy if you create a "makefile" to
|
|
automate this in the usual way. Something like:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>.SUFFIXES: .m4 .html
|
|
.m4.html:
|
|
m4 -P $*.m4 >$*.html
|
|
default: index.html
|
|
*.html: stdlib.m4
|
|
all: default PROJECT1 PROJECT2
|
|
PROJECT1:
|
|
(cd project2; make all)
|
|
PROJECT2:
|
|
(cd project2; make all)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
The most useful commands in <EM>m4</EM> include the following
|
|
which are very similar to the <EM>cpp</EM> equivalents (shown in
|
|
brackets):
|
|
|
|
<DL>
|
|
<DT><CODE>m4_include</CODE>:
|
|
<DD>includes a common file into your HTML (<CODE>#include</CODE>)
|
|
<DT><CODE>m4_define</CODE>:
|
|
<DD>defines an <EM>m4</EM> variable (<CODE>#define</CODE>)
|
|
<DT><CODE>m4_ifdef</CODE>:
|
|
<DD>a conditional (<CODE>#ifdef</CODE>)
|
|
</DL>
|
|
|
|
<P>
|
|
|
|
Some other commands which are useful are:
|
|
|
|
<DL>
|
|
<DT><CODE>m4_changecom</CODE>:
|
|
<DD>change the <EM>m4</EM> comment character (normally #)
|
|
<DT><CODE>m4_debugmode</CODE>:
|
|
<DD>control error disgnostics
|
|
<DT><CODE>m4_traceon/off</CODE>:
|
|
<DD>turn tracing on and off
|
|
<DT><CODE>m4_dnl</CODE>:
|
|
<DD>comment
|
|
<DT><CODE>m4_incr, m4_decr</CODE>:
|
|
<DD>simple arithmetic
|
|
<DT><CODE>m4_eval</CODE>:
|
|
<DD>more general arithmetic
|
|
<DT><CODE>m4_esyscmd</CODE>:
|
|
<DD>execute a <EM>Linux</EM> command and use the output
|
|
<DT><CODE>m4_divert(i)</CODE>:
|
|
|
|
<DD>This is a little complicated, so skip on first reading. It is a
|
|
way of storing text for output at the end of normal processing - it
|
|
will come in useful later, when we get to automatic numbering of
|
|
headings. It sends output from <EM>m4</EM> to a temporary file number
|
|
<EM>i</EM>. At the end of processing, any text which was diverted is then
|
|
output, in the order of the file number <EM>i</EM>. File number -1 is the
|
|
bit bucket and can be used to comment out chunks of comments. File
|
|
number 0 is the normal output stream. Thus, for example, you can
|
|
`m4_divert' text to file 1 and it will only be output at the end.
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3. Examples of <EM>m4</EM> macros"><H2>3. Examples of <EM>m4</EM> macros</H2></A>
|
|
<A NAME="3.1 Sharing HTML elements across several page"><H2>3.1 Sharing HTML elements across several page</H2></A>
|
|
<P>
|
|
|
|
In many "nests" of HTML pages, each page shares elements
|
|
such as a button bar like this:
|
|
|
|
<P>
|
|
<A name="nil"></A>
|
|
<BLOCKQUOTE><a href="#nil">[Home]</a> <a href="#nil">[Next]</a> <a
|
|
href="#nil">[Prev]</a> <a href="#nil">[Index]</a></BLOCKQUOTE>
|
|
<P>
|
|
|
|
This is fairly easy to create in each page - the trouble is
|
|
that if you make a change in the "standard" button-bar then
|
|
you then have the tedious job of finding each occurance of
|
|
it in every file and then manually make the changes.
|
|
|
|
<P>
|
|
|
|
With <EM>m4</EM> we can more easily do this by putting the
|
|
shared elements into an <CODE>m4_include</CODE> statement, just like
|
|
<EM>C</EM>.
|
|
|
|
<P>
|
|
|
|
While I'm at it, I might as well also automate the naming of
|
|
pages, perhaps by putting the following into an include
|
|
file, say <CODE>"button_bar.m4"</CODE>:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_BUTTON_BAR',
|
|
<a href="homepage.html">[Home]</a>
|
|
<a href="$1">[Next]</a>
|
|
<a href="$2">[Prev]</a>
|
|
<a href="indexpage.html">[Index]</a>)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
and then in the document itself:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_include button_bar.m4
|
|
_BUTTON_BAR(`page_after_this.html',
|
|
`page_before_this.html')</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
The $1 and $2 parameters in the macro definition are
|
|
replaced by the strings in the macro call.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
|
|
<P>
|
|
|
|
<A NAME="3.2 Managing HTML elements that often change"><H2>3.2 Managing HTML elements that often change</H2></A>
|
|
|
|
<P>
|
|
|
|
It is very troublesome to have items change in multiple HTML
|
|
pages. For example, if your email address changes then you
|
|
will need to change all references to the new
|
|
address. Instead, with <EM>m4</EM> you can do something like
|
|
this in your <CODE>stdlib.m4</CODE> file:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_EMAIL_ADDRESS', `MyName@foo.bar.com')</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
and then just put <CODE>_EMAIL_ADDRESS</CODE> in your
|
|
<EM>m4</EM> files.
|
|
|
|
<P>
|
|
|
|
A more substantial example comes from building strings up
|
|
with multiple components, any of which may change as the
|
|
page is developed. If, like me, you develop on one machine,
|
|
test out the page and then upload to another machine with a
|
|
totally different address then you could use the
|
|
<CODE>m4_ifdef</CODE> command in your <CODE>stdlib.m4</CODE> file (just
|
|
like the <CODE>#ifdef</CODE> command in <EM>cpp</EM>):
|
|
|
|
<P>
|
|
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_LOCAL')
|
|
.
|
|
.
|
|
m4_define(`_HOMEPAGE',
|
|
m4_ifdef(`_LOCAL', `//127.0.0.1/~YourAccount',
|
|
`http://ISP.com/~YourAccount'))
|
|
|
|
m4_define(`_PLUG', `<A REF="http://www.ssc.com/linux/">
|
|
<IMG SRC="_HOMEPAGE/gif/powered.gif"
|
|
ALT="[Linux Information]"> </A>')</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
Note the careful use of quotes to prevent the variable
|
|
<CODE>_LOCAL</CODE> from being expanded. <CODE>_HOMEPAGE</CODE>
|
|
takes on different values according to whether the variable
|
|
<CODE>_LOCAL</CODE> is defined or not. This can then ripple
|
|
through the entire project as you <EM>make</EM> the pages.
|
|
|
|
<P>
|
|
|
|
In this example, <CODE>_PLUG</CODE> is a macro to advertise
|
|
<EM>Linux</EM>. When you are testing your pages, you use the
|
|
local version of <CODE>_HOMEPAGE</CODE>. When you are ready to
|
|
upload, you can remove or comment out the <CODE>_LOCAL</CODE>
|
|
definition like this:
|
|
|
|
<P>
|
|
|
|
<BLOCKQUOTE><PRE><CODE>m4_dnl m4_define(`_LOCAL')</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
... and then <EM>re-make</EM>.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
|
|
<P>
|
|
<A NAME="3.3 Creating new text styles"><H2>3.3 Creating new text styles</H2></A>
|
|
|
|
Styles built into HTML include things like <CODE><EM></CODE> for emphasis and <CODE><CITE></CODE> for citations. With <EM>m4</EM> you can define your own, new styles like this:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_MYQUOTE',
|
|
<BLOCKQUOTE><EM>$1</EM></BLOCKQUOTE>)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
If, later, you decide you prefer <CODE><STRONG></CODE> instead
|
|
of <CODE><EM></CODE> it is a simple matter to change the
|
|
definition and then every <CODE>_MYQUOTE</CODE> paragraph falls
|
|
into line with a quick <CODE>make</CODE>.
|
|
|
|
<P>
|
|
|
|
The classic guides to good HTML writing say things like "It
|
|
is strongly recommended that you employ the logical styles
|
|
such as <CODE><EM>...</EM></CODE> rather than the physical
|
|
styles such as <CODE><I>...</I></CODE> in your documents."
|
|
Curiously, the <EM>WYSIWYG</EM> editors for HTML generate purely
|
|
physical styles. Using these <EM>m4</EM> styles may be a good
|
|
way to keep on using logical styles.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3.4 Typing and mnemonic aids"><H2>3.4 Typing and mnemonic aids</H2></A>
|
|
<P>
|
|
|
|
I don't depend on <EM>WYSIWYG</EM> editing (having been brought
|
|
up on <EM>troff</EM>) but all the same I'm not averse to using
|
|
help where it's available. There is a choice (and maybe it's
|
|
a fine line) to be made between:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE><BLOCKQUOTE><PRE><CODE>Some code you want to display.
|
|
</CODE></PRE></BLOCKQUOTE></CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
and:
|
|
<P>
|
|
|
|
<BLOCKQUOTE><PRE><CODE>_CODE(Some code you want to display.)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
In this case, you would define <CODE>_CODE</CODE> like this:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_CODE',
|
|
<BLOCKQUOTE><PRE><CODE>$1</CODE></PRE></BLOCKQUOTE>)</CODE></PRE></BLOCKQUOTE><P>
|
|
|
|
Which version you prefer is a matter of taste and convenience
|
|
although the <EM>m4</EM> macro certainly saves some typing and
|
|
ensures that HTML codes are not interleaved. Another example
|
|
I like to use (I can never remember the syntax for links) is:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_LINK', <a href="$1">$2</a>)</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
Then, <P>
|
|
|
|
<CODE><a href="URL_TO_SOMEWHERE">Click here to get to SOMEWHERE </a></CODE> <P>
|
|
|
|
becomes: <P>
|
|
|
|
<CODE>_LINK(`URL_TO_SOMEWHERE', `Click here to get to SOMEWHERE')</CODE>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3.5 Automatic numbering"><H2>3.5 Automatic numbering</H2></A>
|
|
<P>
|
|
|
|
<EM>m4</EM> has a simple arithmetic facility with two operators
|
|
<CODE>m4_incr</CODE> and <CODE>m4_decr</CODE> which act as you might
|
|
expect - this can be used to create automatic numbering,
|
|
perhaps for headings, e.g.:
|
|
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(_CARDINAL,0)
|
|
|
|
m4_define(_H, `m4_define(`_CARDINAL',
|
|
m4_incr(_CARDINAL))<H2>_CARDINAL.0 $1</H2>')
|
|
|
|
_H(First Heading)
|
|
_H(Second Heading)</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
This produces:
|
|
|
|
<BLOCKQUOTE><PRE><CODE><H2>1.0 First Heading</H2>
|
|
<H2>2.0 Second Heading</H2></CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3.6 Automatic date stamping"><H2>3.6 Automatic date stamping</H2></A>
|
|
|
|
For simple, datestamping of HTML pages I use the
|
|
<CODE>m4_esyscmd</CODE> command to maintain an automatic
|
|
timestamp on every page:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>This page was updated on m4_esyscmd(date)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
which produces:
|
|
|
|
<P>
|
|
This page was last updated on Fri May 9 10:35:03 HKT 1997
|
|
<P>
|
|
|
|
Of course, you could also use the date, revision and other
|
|
facilities of revision control systems like <EM>RCS</EM> or
|
|
<EM>SCCS</EM>, e.g. <CODE>$Date: 2002/10/09 22:24:22 $</CODE>.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3.7 Generating Tables of Contents"><H2>3.7 Generating Tables of Contents</H2></A>
|
|
<P>
|
|
|
|
Using <EM>m4</EM> allows you to define commonly repeated
|
|
phrases and use them consistently - I hate repeating myself
|
|
because I am lazy and because I make mistakes, so I find
|
|
this feature absolutely key.
|
|
|
|
<P>
|
|
|
|
A good example of the power of <EM>m4</EM> is in building a
|
|
table of contents in a big page (like this one). This
|
|
involves repeating the heading title in the table of
|
|
contents and then in the text itself. This is tedious and
|
|
error-prone especially when you change the titles. There are
|
|
specialised tools for generating tables of contents from
|
|
HTML pages but the simple facility provided by <EM>m4</EM> is
|
|
irresistable to me.
|
|
|
|
<P>
|
|
<A NAME="3.7.1 Simple to understand TOC"><H2>3.7.1 Simple to understand TOC</H2></A>
|
|
<P>
|
|
|
|
The following example is a fairly simple-minded Table of
|
|
Contents generator. First, create some useful macros in
|
|
<CODE>stdlib.m4</CODE>:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_LINK_TO_LABEL', <A HREF="#$1">$1</A>)
|
|
m4_define(`_SECTION_HEADER', <A NAME="$1"><H2>$1</H2></A>)</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
Then define all the section headings in a table at the
|
|
start of the page body:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(`_DIFFICULTIES', `The difficulties of HTML')
|
|
m4_define(`_USING_M4', `Using <EM>m4</EM>')
|
|
m4_define(`_SHARING', `Sharing HTML Elements Across Several Pages')</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
Then build the table:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE><UL><P>
|
|
<LI> _LINK_TO_LABEL(_DIFFICULTIES)
|
|
<LI> _LINK_TO_LABEL(_USING_M4)
|
|
<LI> _LINK_TO_LABEL(_SHARING)
|
|
<UL></CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
Finally, write the text:
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>.
|
|
.
|
|
_SECTION_HEADER(_DIFFICULTIES)
|
|
.
|
|
.</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
The advantages of this approach are that if you change
|
|
your headings you only need to change them in one place
|
|
and the table of contents is automatically regenerated;
|
|
also the links are guaranteed to work.
|
|
|
|
<P>
|
|
|
|
Hopefully, that simple version was fairly easy to understand.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
|
|
<P>
|
|
<A NAME="3.7.2 Simple to use TOC"><H2>3.7.2 Simple to use TOC</H2></A>
|
|
<P>
|
|
|
|
The Table of Contents generator that I normally use is a bit
|
|
more complex and will require a little more study, but is
|
|
much easier to use. It not only builds the Table, but it
|
|
also automatically numbers the headings on the fly - up to 4
|
|
levels of numbering (e.g. section 3.2.1.3 - although this
|
|
can be easily extended). It is very simple to use as
|
|
follows:
|
|
|
|
<P>
|
|
<OL>
|
|
<LI>Where you want the table to appear, call <CODE>Start_TOC</CODE>
|
|
<LI>at every heading use <CODE>_H1(`Heading for level 1')</CODE> or <CODE>_H2(`Heading for level 2')</CODE> as appropriate.
|
|
<LI>After the very last HTML code (probably after </HTML>), call <CODE>End_TOC</CODE>
|
|
<LI> and that's all!
|
|
</OL>
|
|
<P>
|
|
|
|
The code for these macros is a little complex, so hold your breath:
|
|
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(_Start_TOC,`<UL><P>m4_divert(-1)
|
|
m4_define(`_H1_num',0)
|
|
m4_define(`_H2_num',0)
|
|
m4_define(`_H3_num',0)
|
|
m4_define(`_H4_num',0)
|
|
m4_divert(1)')
|
|
|
|
m4_define(_H1, `m4_divert(-1)
|
|
m4_define(`_H1_num',m4_incr(_H1_num))
|
|
m4_define(`_H2_num',0)
|
|
m4_define(`_H3_num',0)
|
|
m4_define(`_H4_num',0)
|
|
m4_define(`_TOC_label',`_H1_num. $1')
|
|
m4_divert(0)<LI><A HREF="#_TOC_label">_TOC_label</A>
|
|
m4_divert(1)<A NAME="_TOC_label">
|
|
<H2>_TOC_label</H2></A>')
|
|
.
|
|
.
|
|
[definitions for _H2, _H3 and _H4 are similar and are
|
|
in the downloadable version of stdlib.m4]
|
|
.
|
|
.
|
|
|
|
m4_define(_End_TOC,`m4_divert(0)</UL><P>')</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
One restriction is that you should not use diversions within
|
|
your text, unless you preserve the diversion to file 1 used
|
|
by this TOC generator.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="3.8 Simple tables"><H2>3.8 Simple tables</H2></A>
|
|
<P>
|
|
|
|
Other than Tables of Contents, many browsers support tabular
|
|
information. Here are some funky macros as a short cut to
|
|
producing these tables. First, an example of their use:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE><CENTER>
|
|
_Start_Table(BORDER=5)
|
|
_Table_Hdr(,Apples, Oranges, Lemons)
|
|
_Table_Row(England,100,250,300)
|
|
_Table_Row(France,200,500,100)
|
|
_Table_Row(Germany,500,50,90)
|
|
_Table_Row(Spain,,23,2444)
|
|
_Table_Row(Denmark,,,20)
|
|
_End_Table
|
|
</CENTER></CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
<CENTER>
|
|
<TABLE BORDER=5>
|
|
<tr><th></th><th>Apples</th><th>Oranges</th><th>Lemons</th></tr>
|
|
<tr><td>England</td><td>100</td><td>250</td><td>300</td></tr>
|
|
<tr><td>France</td><td>200</td><td>500</td><td>100</td></tr>
|
|
<tr><td>Germany</td><td>500</td><td>50</td><td>90</td></tr>
|
|
<tr><td>Spain</td><td></td><td>23</td><td>2444</td></tr>
|
|
<tr><td>Denmark</td><td></td><td></td><td>20</td></tr>
|
|
</TABLE>
|
|
</CENTER><P>
|
|
|
|
...and now the code. Note that this example utilises <EM>m4's</EM>
|
|
ability to recurse:
|
|
|
|
|
|
<BLOCKQUOTE><PRE><CODE>m4_dnl _Start_Table(Columns,TABLE parameters)
|
|
m4_dnl defaults are BORDER=1 CELLPADDING="1" CELLSPACING="1"
|
|
m4_dnl WIDTH="n" pixels or "n%" of screen width
|
|
m4_define(_Start_Table,`<TABLE $1>')
|
|
|
|
m4_define(`_Table_Hdr_Item', `<th>$1</th>
|
|
m4_ifelse($#,1,,`_Table_Hdr_Item(m4_shift($@))')')
|
|
|
|
m4_define(`_Table_Row_Item', `<td>$1</td>
|
|
m4_ifelse($#,1,,`_Table_Row_Item(m4_shift($@))')')
|
|
|
|
m4_define(`_Table_Hdr',`<tr>_Table_Hdr_Item($@)</tr>')
|
|
m4_define(`_Table_Row',`<tr>_Table_Row_Item($@)</tr>')
|
|
|
|
m4_define(`_End_Table',</TABLE>)</CODE></PRE></BLOCKQUOTE>
|
|
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="4. <EM>m4</EM> gotchas"><H2>4. <EM>m4</EM> gotchas</H2></A>
|
|
<P>
|
|
|
|
Unfortunately, <EM>m4</EM> is not unremitting sweetness and
|
|
light - it needs some taming and a little time spent on
|
|
familiarisation will pay dividends. Definitive documentation
|
|
is available (for example in <EM>emacs' info</EM> documentation
|
|
system) but, without being a complete tutorial, here are a
|
|
few tips based on my fiddling about with the thing.
|
|
|
|
<P>
|
|
|
|
|
|
<A NAME="4.1 Gotcha 1 - quotes"><H2>4.1 Gotcha 1 - quotes</H2></A>
|
|
<P>
|
|
|
|
<EM>m4's</EM> quotation characters are the
|
|
<EM>grave</EM> accent ` which starts the quote, and the
|
|
<EM>acute</EM> accent ' which ends it. It may
|
|
help to put all arguments to macros in quotes, e.g.
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>_HEAD1(`This is a heading')</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
The main reason for this is in case there are commas in an
|
|
argument to a macro - <EM>m4</EM> uses commas to separate macro
|
|
parameters, e.g. <CODE>_CODE(foo, bar)</CODE> would print the
|
|
<CODE>foo</CODE> but not the <CODE>bar</CODE>. <CODE>_CODE(`foo,
|
|
bar')</CODE> works properly.
|
|
|
|
<P>
|
|
|
|
This becomes a little complicated when you nest macro
|
|
calls as in the <EM>m4</EM> source code for the examples in this
|
|
paper - but that is rather an extreme case and normally you
|
|
would not have to stoop to that level.
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
|
|
<A NAME="4.2 Gotcha 2 - Word swallowing"><H2>4.2 Gotcha 2 - Word swallowing</H2></A>
|
|
<P>
|
|
|
|
The worst problem with <EM>m4</EM> is that some versions of
|
|
it "swallow" key words that it recognises, such as
|
|
"include", "format", "divert", "file", "gnu", "line",
|
|
"regexp", "shift", "unix", "builtin" and "define". You
|
|
can protect these words by putting them in <EM>m4</EM> quotes,
|
|
for example:
|
|
|
|
<P>
|
|
|
|
<BLOCKQUOTE><PRE><CODE>Smart people `include' Linux in their list
|
|
of computer essentials.</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
The trouble is, this is a royal pain to do - and you're
|
|
likely to forget which words need protecting.
|
|
|
|
<P>
|
|
|
|
Another, safer way to protect keywords (my preference) is to
|
|
invoke <EM>m4</EM> with the <CODE>-P</CODE> or
|
|
<CODE>--prefix-builtins</CODE> option. Then, all builtin macro
|
|
names are modified so they all start with the prefix
|
|
<CODE>m4_</CODE> and ordinary words are left alone. For example,
|
|
using this option, one should write <CODE>m4_define</CODE> instead
|
|
of <CODE>define</CODE> (as shown in the examples in this
|
|
article).
|
|
|
|
<P>
|
|
|
|
The only trouble is that not all versions of
|
|
<EM>m4</EM> support this option - notably some PC versions under
|
|
M$-DOS. Maybe that's just another reason to steer clear of
|
|
hack code on M$-DOS and stay with <EM>Linux!</EM>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
|
|
<A NAME="4.3 Gotcha 3 - Comments"><H2>4.3 Gotcha 3 - Comments</H2></A>
|
|
<P>
|
|
|
|
Comments in <EM>m4</EM> are introduced with the # character -
|
|
everything from the # to the end of the line is ignored by
|
|
<EM>m4</EM> and simply passed unchanged to the output. If you
|
|
want to use # in the HTML page then you would need to quote it
|
|
like this - `#'. Another option (my preference) is to
|
|
change the <EM>m4</EM> comment character to something exotic
|
|
like this: <CODE>m4_changecom(`[[[[')</CODE> and not have to
|
|
worry about `#' symbols in your text.
|
|
|
|
<P>
|
|
|
|
If you want to use comments in the <EM>m4</EM> file which do not
|
|
appear in the final HTML file, then the macro
|
|
<CODE>m4_dnl</CODE> (dnl = <STRONG>D</STRONG>elete to <STRONG>N</STRONG>ew <STRONG>L</STRONG>ine) is for you. This suppresses everything
|
|
until the next newline.
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_define(_NEWMACRO, `foo bar') m4_dnl This is a comment</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
|
|
Yet another way to have source code ignored is the
|
|
<CODE>m4_divert</CODE> command. The main purpose of
|
|
<CODE>m4_divert</CODE> is to save text in a temporary buffer for
|
|
inclusion in the file later on - for example, in building a
|
|
table of contents or index. However, if you divert to "-1"
|
|
it just goes to limbo-land. This is useful for getting rid
|
|
of the whitespace generated by the <CODE>m4_define</CODE>
|
|
command, e.g.:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_divert(-1) diversion on
|
|
m4_define(this ...)
|
|
m4_define(that ...)
|
|
m4_divert diversion turned off</CODE></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
|
|
<A NAME="4.4 Gotcha 4 - Debugging"><H2>4.4 Gotcha 4 - Debugging</H2></A>
|
|
<P>
|
|
|
|
Another tip for when things go wrong is to increase the
|
|
amount of error diagnostics that <EM>m4</EM> emits. The
|
|
easiest way to do this is to add the following to your
|
|
<EM>m4</EM> file as debugging commands:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE>m4_debugmode(e)
|
|
m4_traceon
|
|
.
|
|
.
|
|
buggy lines
|
|
.
|
|
.
|
|
m4_traceoff</CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
<P>
|
|
|
|
<A NAME="5. Conclusion"><H2>5. Conclusion</H2></A>
|
|
|
|
"ah ha!", I hear you say. "HTML 3.0 already has an include
|
|
statement". Yes it has, and it looks like this:
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><CODE><!--#include file="junk.html" --></CODE></PRE></BLOCKQUOTE>
|
|
<P>
|
|
|
|
The problem is that:
|
|
<UL>
|
|
|
|
<LI> The work of including and interpreting the
|
|
include is done on the server-side before downloading and
|
|
adds a big overhead as the server has to scan files for
|
|
`include' statements.
|
|
|
|
<LI> Consequently most servers (especially public ISP's) deactivate this feature.
|
|
|
|
<LI> `include' is all you get - no macro
|
|
substitution, no parameters to macros, no ifdef, etc, etc.
|
|
|
|
</UL>
|
|
|
|
<P>
|
|
|
|
There are several other features of <EM>m4</EM> that I have not
|
|
yet exploited in my HTML ramblings so far, such as regular
|
|
expressions and doubtless many others. It might be
|
|
interesting to create a "standard" <CODE>stdlib.m4</CODE> for
|
|
general use with nice macros for general text processing and
|
|
HTML functions. By all means download my version of
|
|
<CODE>stdlib.m4</CODE> as a base for your own hacking. I would be
|
|
interested in hearing of useful macros and if there is
|
|
enough interest, maybe a Mini-HOWTO could evolve from this
|
|
paper.
|
|
|
|
<P>
|
|
|
|
There are many additional advantages in using <EM>Linux</EM> to
|
|
develop HTML pages, far beyond the simple assistance given
|
|
by the typical <EM>Typing Aids</EM> and <EM>WYSIWYG</EM> tools.
|
|
|
|
<P>
|
|
|
|
Certainly, this little hacker will go on using <EM>m4</EM> until
|
|
HTML catches up - I will then do my last <EM>make</EM> and drop
|
|
back to using pure HTML.
|
|
|
|
<P>
|
|
|
|
I hope you enjoy these little tricks and encourage you to
|
|
contribute your own. Happy hacking!
|
|
|
|
<P>
|
|
<A NAME="6. Files to download"><H2>6. Files to download</H2></A>
|
|
|
|
You can get the HTML and the <EM>m4</EM> source code for this
|
|
article here (for the sake of completeness, they're
|
|
copylefted under GPL 2):
|
|
|
|
<P>
|
|
<BLOCKQUOTE><PRE><A HREF="./using_m4.html">using_m4.html</A> :this file
|
|
<A HREF="./using_m4.m4">using_m4.m4</A> :<EM>m4</EM> source
|
|
<A HREF="./stdlib.m4">stdlib.m4</A> :Include file
|
|
<A HREF="./makefile_m4">makefile</A></PRE></BLOCKQUOTE>
|
|
|
|
<P>
|
|
<A HREF="#Contents">Contents</A>
|
|
|
|
<P>
|
|
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<center><H5>Copyright © 1997, Bob Hepple <BR>
|
|
Published in Issue 22 of the Linux Gazette, October 1997</H5></center>
|
|
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<A HREF="./index.html"><IMG ALIGN=BOTTOM SRC="../gx/indexnew.gif"
|
|
ALT="[ TABLE OF CONTENTS ]"></A>
|
|
<A HREF="../index.html"><IMG ALIGN=BOTTOM SRC="../gx/homenew.gif"
|
|
ALT="[ FRONT PAGE ]"></A>
|
|
<A HREF="./notes-mode.html"><IMG SRC="../gx/back2.gif"
|
|
ALT=" Back "></A>
|
|
<A HREF="./conn.html"><IMG SRC="../gx/fwd.gif" ALT=" Next "></A>
|
|
<P> <hr> <P>
|
|
</BODY>
|
|
</HTML>
|