<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>Word Processing vs. Text Processing?</title>
</head>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#003380"
ALINK="#FF0000">
<!--endcut ============================================================-->
<H4>
&quot;Linux Gazette...<I>making Linux just a little more fun!</I>&quot;
</H4>
<P> <HR> <P>
<!--===================================================================-->
<center><h1>Word Processing and Text Processing</h1></center>
<center>
<h4><a href="mailto:layers@marktwain.net">by Larry Ayers</a></h4>
</center>
<P><HR><P>
One of the most common questions posted in the various Linux newsgroups is
"Where can I find a good word-processor for Linux?" This question has several
interesting ramifications:<br>
<ul>
<li>There is an unspoken assumption that a word processor is a vital
application for an operating system.
<li>The query implies that the questioner has investigated the text
processing capabilities readily available for Linux and has found them
either too daunting to learn or not suited to the tasks at hand, or...
<li>The questioner is a recent migrant from one of the commercial OS's and
is accustomed to a standard word processor.
</ul>
<hr>
<center><h3> Vital For Some...</h3></center>
<p>A notion has become prevalent in the minds of many computer users these
days: the idea that a complex word processor is the only tool suitable for
creating text on a computer. I've talked with several people who think of an
editor as a primitive relic of the bad old DOS days, a type of software which
has been superseded by the modern word-processor. There is an element of
truth to this, especially in a business environment in which even the simplest
memos are distributed in one of several proprietary word-processor formats.
But when it is unnecessary to use one of these formats, a good text editor has
more power to manipulate text and is faster and more responsive.
<p>The ASCII format, intended to be a universal means of representing
and transferring text, does have several limitations. The font used is
determined by the terminal type and capability rather than by the application,
and is normally a fixed, monospace font. These limitations in one sense are virtues,
though, as this least-common-denominator approach to representing text assures
readability by everyone on all platforms. This is why ASCII is still the core
format of e-mail and Usenet messages, though there is a tendency in the large
software firms to promote HTML as a replacement. Unfortunately, HTML can now
be written so that it is essentially unreadable by anything other than a
modern graphical browser. Of course, HTML is ASCII-based as well, but is
meant to be interpreted or parsed rather than read directly.
<p>Working with ASCII text directly has many advantages. The output is
compact and easily stored, and separating the final formatting from actual
writing allows the writer to focus on content rather than appearance. An
ASCII document is not dependent on one application; the simplest of editors or
even <b>cat</b> can access its content. There is an interesting parallel,
perhaps coincidental, between the Unix use of ASCII and other OSes' use of
binary formats. Configuration files on Linux or any other Unix system are
generally in plain ASCII format: compact, editable, and easily backed up or
transferred. Many programmers use Linux; source code is written in ASCII
format, so perhaps using the format for other forms of text is a natural
progression. The main configuration files for Win95, NT and OS/2 are in
binary format, easily corruptible and not easily edited. Perhaps this is one
reason users of these systems tend towards proprietary word-processing formats
which, while not necessarily in binary format, aren't readable by ASCII-based
editors or even other word-processors. But I digress...
<p>There are several methods of producing professional-looking printable
documents from ASCII input, the most popular being LaTeX, Lout, and Groff.
<hr>
<center><h3>Text Formatting with Mark-Up Languages</h3></center>
<center><h4>LaTeX</h4></center>
<p>LaTeX, Leslie Lamport's macro package for the TeX low-level formatting
system, is widely used in the academic world. It has become a standard, and
has been refined to the point that bugs are rare. Its ability to represent
mathematical equations is unparalleled, but this very fact has deterred some
potential users. Mentioning LaTeX to people will often elicit a response such
as: "Isn't that mainly used by scientists and mathematicians? I have no need
to include equations in my writing, so why should I use it?" A full-featured
word-processor (such as WordPerfect) also includes an equation editor, but (as
with LaTeX) just because a feature exists doesn't mean you have to use it.
LaTeX is well-suited to creating a wide variety of documents, from a simple
business letter to articles, reports or full-length books. A wealth of
documentation is available, including documents bundled with the distribution
as well as those available on the internet. A good source is this
<a href="ftp://ftp.cdrom.com/.1/tex/ctan/info">ftp site</a>, which is a mirror
of CTAN, the largest on-line repository of TeX and LaTeX material.
<p>LaTeX is easily installed from any Linux distribution, and in my experience
works well "out of the box". Hardened LaTeX users type the formatting tags
directly, but there are several alternative approaches which can expedite the
process, especially for novices. There <em>is</em> quite a learning curve
involved in mastering LaTeX from scratch, but using an intermediary interface
will allow a beginner to create usable documents immediately.
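<p>As a rough idea of what typing the tags directly involves, here is a
minimal sketch of a LaTeX source file. It assumes a LaTeX2e installation with
the standard <i>article</i> class; the file name <i>sample.tex</i> and the
text itself are only illustrative.
<pre>
% sample.tex: a minimal LaTeX2e document (file name is illustrative)
\documentclass[12pt]{article}   % standard class, 12-point body text

\begin{document}

\section{A First Section}

Paragraphs are typed as ordinary text; a blank line starts a new
paragraph. Tags such as \emph{this one} mark up the structure and
leave the typesetting details to LaTeX.

\end{document}
</pre>
<p>Running <kbd>latex sample.tex</kbd> on such a file should produce
<i>sample.dvi</i>, which can then be viewed with <kbd>xdvi sample.dvi</kbd>.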
<p>AucTeX is a package for either GNU Emacs or XEmacs which offers a multitude
of features helpful in writing LaTeX documents. Not only does the package
provide hot-keys and menu-items for tags and environments, but it also allows
easy movement through the document. You can run LaTeX or TeX interactively
from Emacs, and even view the resulting output DVI file with xdvi. Emacs
provides excellent syntax highlighting for LaTeX files, which greatly improves
their readability. In effect AucTeX turns Emacs into a "front-end" for LaTeX.
If you don't like the overhead incurred when running Emacs or especially
XEmacs, John Davis' Jed and Xjed editors have a very functional LaTeX/TeX mode
which is patterned after AucTeX. The console-mode Jed editor does
syntax-highlighting of TeX files well without extensive fiddling with config
files, which is rare in a console editor.
<p>If you don't use Emacs or its variants there is a Tcl/Tk based front-end
for LaTeX available called <b>xtem</b>. It can be set up to use any editor;
the September 1996 issue of <b>Linux Journal</b> has a good introductory
article on the package. <b>Xtem</b> has one feature which is useful for LaTeX
beginners: on-line syntax help-files for the various LaTeX commands.
The
<a href="http://pax.st.usm.edu/~kolibal/tex_html/xtem_html/xtem_texmenu.html">
homepage</a> for the package can be visited if you're interested.
<p>It is fairly easy to produce documents if the default formats included with
a TeX installation are used; more knowledge is needed to produce customized
formats. Luckily TeX has a large base of users, many of whom have contributed
a variety of style-formatting packages, some of which are included in the
distribution, while others are freely available from TeX archive sites
such as CTAN.
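<p>Contributed packages are normally loaded in the document preamble with
<kbd>\usepackage</kbd>. The lines below are only a sketch; they assume that
the <i>times</i> and <i>multicol</i> packages are present in your
installation, which is common but worth checking.
<pre>
% Preamble fragment (assumes the named packages are installed)
\documentclass{article}
\usepackage{times}      % Times fonts in place of Computer Modern
\usepackage{multicol}   % provides the multicols environment
</pre>
<p>If a package is missing, LaTeX stops with a "file not found" error naming
the style file, which is usually enough of a clue to fetch it from CTAN.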
<p>At a further remove from raw LaTeX is the LyX document processor. This
program (still under development, but very usable) at first seems to be a
WYSIWYG interface for LaTeX, but this isn't quite true. The text you type
doesn't have visible LaTeX tagging, but it is formatted to fit the window on
your screen, which doesn't necessarily reflect the document's appearance
when printed or viewed with GV or Ghostscript. In other words, the appearance
of the text you type is just a user convenience. There are several things
which can be done with a document typed in LyX. You can let LyX handle the
entire LaTeX conversion process with a DVI or PostScript file as a result,
which is similar to using a word-processor. I don't like to do it this way;
one of the reasons I use Linux is because I'm interested in the underlying
processes and how they work, and Linux is transparent. If I'm curious as to
how something is happening in a Linux session I can satisfy that curiosity to
whatever depth I like. Another option LyX offers is more to my taste: LyX can
convert the document's format from the LaTeX-derived internal format to
standard LaTeX, which is readable and can be loaded into an editor.
<p>Load a LyX-created LaTeX file into an Emacs/AucTeX session (if you have
AucTeX set up right it will be called whenever a file with the <i>.tex</i>
suffix is loaded), and your document will be displayed with new LaTeX tags
interspersed throughout the text. The syntax-highlighting can make the text
easier to read if you have font-locking set up to give a subdued color to the
tagging (backslashes (\) and $ signs). This is an effective way to learn
something about how LaTeX documents are written. Changes can be made from
within the editor and you can let AucTeX call the LaTeX program to format the
document, or you can continue with LyX. In effect this is using LyX as a
preprocessor for AucTeX. This expands the user's options; if you are having
trouble convincing LyX to do what you want, perhaps AucTeX can do it more
easily.
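<p>For readers who haven't seen LaTeX source, the tagging looks roughly like
the fragment below. This is a hand-written illustration rather than actual
LyX output, which will also carry its own preamble.
<pre>
% Fragment of a document body (hand-written, not real LyX output)
\section{Results}

The area of a circle is $A = \pi r^{2}$, where $r$ is the
\emph{radius}. Backslashes introduce the tags, and dollar
signs enclose mathematics.
</pre>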
<p>Like many Linux software projects, LyX is still in a state of flux. The
release of beta version 0.12 is imminent; after that release the developers
are planning to switch to another GUI toolkit (the current versions use the
XForms toolkit). The 0.11.38 version I've been using has been working
dependably for me (hint: if it won't compile, give the configure script the
switch <kbd>--disable-nls</kbd>, which disables the internationalization
support).
<hr>
<center><h4>YODL</h4></center>
<p>YODL (Yet One-Other Document Language) is another way of interacting with
LaTeX. This system has a simplified tagging format which isn't hard to
learn. The advantage of YODL is that from one set of marked-up source
documents, output can be generated in LaTeX, HTML, and Groff man and ms
formats. The package is well-documented. I wrote a short introduction to
YODL in issue #9 of the Gazette. The current source for the package is this
<a href="ftp://ftp.icce.rug.nl/pub/unix/">ftp site</a>.
<hr>
<center><h4>Lout</h4></center>
<p>About thirteen years ago Jeffrey Kingston (of the University of Sydney,
Australia) began to develop a document formatting system which became known as
Lout. This system bears quite a bit of resemblance to LaTeX: it uses
formatting tags (using the @ symbol rather than \) and its output is
PostScript. Mr. Kingston calls Lout a high-level language with some
similarities to Algol, and claims that user extensions and modifications are
much easier to implement than in LaTeX. The package comes with hundreds of
pages of PostScript documentation along with the Lout source files which were
used to generate those book-length documents.
<p>The Lout system is still maintained and developed, and in my trials seemed
to work well, but there are some drawbacks. I'm sure Lout has nowhere near as
many users as LaTeX. LaTeX is installed on enough machines that if you should
want to e-mail a TeX file to someone (especially someone in academia), chances
are that the recipient will have access to a machine with TeX installed and will
be able to format and print or view it. LaTeX's large user-base also has
resulted in a multitude of contributed formatting packages.
<p>Another drawback (for me, at least) is the lack of available front-ends or
editor-macro packages for Lout. I don't mind using markup languages if I can
use, say, an Emacs mode with key-bindings and highlighting set up for the
language. There may be such packages out there for Lout, but I haven't run
across them.
<p>Lout does have the advantage of being much more compact than a typical TeX
installation. If you have little use for some of the more esoteric aspects
of LaTeX, Lout might be just the thing. It can include tables, various types
of lists, graphics, foot- and marginal notes, and equations in a document, and
the PostScript output is the equal of what LaTeX generates.
<p>Both RedHat and Debian have Lout packages available, and the
source/documentation package is available from the Lout
<a href="ftp://ftp.cs.su.oz.au/jeff/lout/">home FTP site</a>.
<hr>
<center><h4>Groff</h4></center>
<p>Groff belongs to a family of formatters older than TeX/LaTeX, dating back to
the early days of Unix. Often a first-time Linux user will neglect to install the Groff
package, only to find that the <i>man</i> command won't work and that the
man-pages are therefore inaccessible. As well as in day-to-day invocation
by the <i>man</i> command, Groff is used in the publishing industry to
produce books, though other formatting systems such as SGML are more common.
<p>Groff is the epitome of the non-user-friendly and cryptic Unix command-line
tool. There are several man-pages covering various Groff components, but
they seem to assume a level of prior knowledge without any hint as to where
that knowledge might be acquired. I found them to be nearly incomprehensible.
A search on the internet didn't turn up any introductory documents or
tutorials, though there may be some out there. I suspect more complete
documentation might be supplied with some of the commercial Unix
implementations. The original and now-proprietary typesetting program is called
troff, and nroff is its companion for terminal and line-printer output;
Groff is short for GNU roff.
<p>Groff can generate PostScript, DVI, HP LaserJet 4, and ASCII text formats.
<p>Learning to use Groff on a Linux system might be an uphill battle, though
Linux software developers must have learned enough of it at one time or another,
as most programs come with Groff-tagged man-page files. Groff's apparent
opacity and difficulty make LaTeX look easy in contrast!
<hr>
<center><h3>A Change in Mind-Set</h3></center>
<p>Processing text with a mark-up language requires a different mode of
thought concerning documents. On the one hand, writing blocks of ASCII is
convenient and no thought needs to be given to the marking-up process until the
end. A good editor provides so many features to deal with text that using any
word-processor afterwards can feel constrictive. Many users, though, are
attracted by the integration of functions in a word processor, using one
application to produce a document without intermediary steps.
<p>Though there <em>are</em> projects underway (such as <b>Wurd</b>)
which may eventually result in a native Linux word-processor, there may be a
reason why this type of application is still rare in the Linux world.
Adapting oneself to Linux, or any Unix variant, is an adaptation to what has
been called "the Unix philosophy", the practice of using several
highly-refined and specific tools to accomplish a task, rather than one tool
which tries to do it all. I get the impression that programmers attracted to
free software projects prefer working on smaller specialized programs. As an
example look at the plethora of mail- and news-readers available compared to
the dearth of all-in-one internet applications. Linux itself is really just
the kernel, which has attracted to itself all of the GNU and other software
commonly distributed with it in the form of a distribution.
<p>Christopher B. Browne has written an essay titled <b>An Opinionated Rant
About Word-Processors</b> which deals with some of the issues discussed in
this article; it's available at <a href="http://www.hex.net/~cbbrowne/wp.html">
this site</a>.
<p>The StarOffice suite is an interesting case, one of the few instances of a
large software firm (StarDivision) releasing a Linux version of an office
productivity suite. The package has been available for some time now, first
in several time-limited beta versions and now in a freely available release.
It's a large download but it's also available on CDROM from
<a href="http://www.caldera.com">Caldera</a>.
You would think that users would be flocking to it if the demand is really that
high for such an application suite for Linux. Judging by the relatively
sparse Usenet postings I've seen, StarOffice hasn't exactly taken the Linux
world by storm. I can think of a few possible reasons:
<ul>
<li>Many hard-core Linux users aren't working in a corporate office setting
in which such a product would be valuable; they are scientists,
engineers or academics who are perfectly happy with LaTeX, Lout, Groff,
et al.
<li>Then there are the users who have dual or multiple-boot set-ups; if they
need to use MS Word they just boot from their Win95 or NT partitions.
<li>Another group of users run Linux at home and whatever OS their job
requires at work.
<li>StarOffice is written with a cross-platform development tool-kit; this
may be responsible for its bulk and lack of speed.
</ul>
<hr>
<p>I remember the first time I started up the StarOffice word-processor. It
was slow to load on a Pentium 120 with 32 MB of RAM (and I thought XEmacs was
slow!), and once the main window appeared it occurred to me that it just
didn't look "at home" on a Linux desktop. All those icons and button-bars!
It seemed to work well, but with the lack of English documentation (and not
being able to convince it to print anything!) I eventually lost interest in
using it. I realized that I prefer my familiar editors, and learning a little
LaTeX seemed to be easier than trying to puzzle out the workings of an
undocumented suite of programs. This may sound pretty negative, and I don't
wish to denigrate the efforts of the StarDivision team responsible for the
Linux porting project. If you're a StarOffice user happy with the suite
(especially if you speak German and therefore can read the docs) and
would like to present a dissenting view, write a piece on it for the Gazette!
<p>Two other commercial word-processors for Linux are Applix and WordPerfect.
Applix, available from <a href="http://www.redhat.com">RedHat</a>,
has received favorable reviews from many Linux users.
<p>A company called SDCorp in Utah has ported Corel's WordPerfect 7 to
Linux, and a (huge!) demo is available now from both the SDCorp
<a href="ftp://ftp.sdcorp.com/pub/demos/linux/">ftp</a> site and
<a href="ftp://ftp.corel.com/pub/WordPerfect/wpunix/demos/linux/">Corel's</a>.
Unfortunately both FTP servers are unable to resume interrupted
downloads (usually indicating an NT server) so the CDROM version, available
from the SDCorp <a href="http://www.sdcorp.com/wplinux">website</a>,
is probably the way to go, if you'd like to try it out. Paying for the demo
transforms it into a registered program: a key is e-mailed to you which
registers the program, but only for the machine on which it is installed.
<p>Addendum: I recently had an exchange of e-mail with Brad Caldwell, product
manager for the SDCorp WordPerfect port. I complained about the difficulty of
downloading the 36 MB demo, and a couple of days later I was informed that
the file had been split into nine parts, and that they were investigating the
possibility of changing to an FTP server which supports resuming interrupted
downloads. The smaller files are available from
<a href="http://www.sdcorp.com/demos/smallftp.htm">this web page</a>.
<hr>
<p>There exists a curious dichotomous attitude these days in the Linux
community. I assume most people involved with Linux would like the operating
system to gain more users and perhaps move a little closer to the mainstream.
Linux advocates bemoan the relative lack of "productivity apps" for Linux,
which would make the OS more acceptable in corporate or business
environments. But how many of these advocates would use the applications if
they were more common? Often the change of mindset discussed above militates
against acceptance of Windows-like programs, with no source code available and
limited access to the developers. Linux has strong roots in the GNU and free
software movements (not always synonymous) and this background might be a
barrier to the development of a thriving commercial software market.
<!--===================================================================-->
<P> <hr> <P>
<center>
<h5>Copyright &copy; 1997, Larry Ayers <br>
Published in Issue 22 of the Linux Gazette, October 1997</h5>
</center>
<!--===================================================================-->
<P> <hr> <P>
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->