<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>Word Processing vs. Text Processing?</title>
</head>

<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#003380"
ALINK="#FF0000">
<!--endcut ============================================================-->

<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>

<P> <HR> <P>
<!--===================================================================-->

<center><h1>Word Processing and Text Processing</h1></center>

<center>
<h4><a href="mailto: layers@marktwain.net">by Larry Ayers</a></h4>
</center>

<P><HR><P>

One of the most common questions posted in the various Linux newsgroups is
"Where can I find a good word-processor for Linux?" This question has several
interesting ramifications:<br>

<ul>
<li>There is an unspoken assumption that a word processor is a vital
application for an operating system.
<li>The query implies that the questioner has investigated the text
processing capabilities readily available for Linux and has found them
either too daunting to learn or not suited to the tasks at hand, or...
<li>The questioner is a recent migrant from one of the commercial OS's and
is accustomed to a standard word processor.
</ul>
<hr>

<center><h3> Vital For Some...</h3></center>

<p>A notion has become prevalent in the minds of many computer users these
days: the idea that a complex word processor is the only tool suitable for
creating text on a computer. I've talked with several people who think of an
editor as a primitive relic of the bad old DOS days, a type of software which
has been superseded by the modern word-processor. There is an element of
truth to this, especially in a business environment in which even the simplest
memos are distributed in one of several proprietary word-processor formats.
But when it is unnecessary to use one of these formats, a good text editor has
more power to manipulate text and is faster and more responsive.

<p>The ASCII format, intended to be a universal means of representing
and transferring text, does have several limitations. The font used is
determined by the terminal's type and capabilities rather than by the
application, and is normally a fixed-width (monospace) font. These limitations
in one sense are virtues, though, as this least-common-denominator approach to
representing text assures readability by everyone on all platforms. This is why
ASCII is still the core format of e-mail and Usenet messages, though there is a
tendency in the large software firms to promote HTML as a replacement.
Unfortunately, HTML can now be written so that it is essentially unreadable by
anything other than a modern graphical browser. Of course, HTML is ASCII-based
as well, but is meant to be interpreted or parsed rather than read directly.

<p>Working with ASCII text directly has many advantages. The output is
compact and easily stored, and separating the final formatting from actual
writing allows the writer to focus on content rather than appearance. An
ASCII document is not dependent on one application; the simplest of editors or
even <b>cat</b> can access its content. There is an interesting parallel,
perhaps coincidental, between the format an operating system uses for its
configuration files and the format its users favor for documents. Nearly all
configuration files on a Linux or other Unix system are in plain ASCII format:
compact, editable, and easily backed up or transferred. Many programmers use
Linux; source code is written in ASCII format, so perhaps using the format for
other forms of text is a natural progression. The main configuration files for
Win95, NT and OS/2 are in binary format, easily corruptible and not easily
edited. Perhaps this is one reason users of these systems tend towards
proprietary word-processing formats which, while not necessarily in binary
format, aren't readable by ASCII-based editors or even other word-processors.
But I digress...

<p>There are several methods of producing professional-looking printable
documents from ASCII input, the most popular being LaTeX, Lout, and Groff.

<hr>

<center><h3>Text Formatting with Mark-Up Languages</h3></center>

<center><h4>LaTeX</h4></center>

<p>LaTeX, Leslie Lamport's macro package for the TeX low-level formatting
system, is widely used in the academic world. It has become a standard, and
has been refined to the point that bugs are rare. Its ability to represent
mathematical equations is unparalleled, but this very fact has deterred some
potential users. Mentioning LaTeX to people will often elicit a response such
as: "Isn't that mainly used by scientists and mathematicians? I have no need
to include equations in my writing, so why should I use it?" A full-featured
word-processor (such as WordPerfect) also includes an equation editor, but (as
with LaTeX) just because a feature exists doesn't mean you have to use it.
LaTeX is well-suited to creating a wide variety of documents, from a simple
business letter to articles, reports or full-length books. A wealth of
documentation is available, including documents bundled with the distribution
as well as those available on the internet. A good source is this
<a href="ftp://ftp.cdrom.com/.1/tex/ctan/info">ftp site</a>, which is a mirror
of CTAN, the largest on-line repository of TeX and LaTeX material.

<p>LaTeX is easily installed from any Linux distribution, and in my experience
works well "out of the box". Hardened LaTeX users type the formatting tags
directly, but there are several alternative approaches which can expedite the
process, especially for novices. There <em>is</em> quite a learning curve
involved in mastering LaTeX from scratch, but using an intermediary interface
will allow the immediate creation of usable documents by a beginner.
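
<p>To give a rough idea of what typing the tagging directly involves, here is
a minimal sketch of a complete LaTeX document. The file name, title, author
and text are placeholders invented for this illustration; it assumes nothing
beyond the standard <i>article</i> class.

<pre>
% sample.tex: a hypothetical minimal document
\documentclass{article}

\title{A Short Note}
\author{A. Writer}

\begin{document}
\maketitle

\section{Why Mark-Up?}
The tags describe structure rather than appearance. Words can be
\emph{emphasized}, and a list is written as an environment:

\begin{itemize}
  \item commands begin with a backslash;
  \item their arguments go inside braces.
\end{itemize}

\end{document}
</pre>

<p>Saved as <i>sample.tex</i>, a file like this can be formatted with
<kbd>latex sample.tex</kbd>, which writes a DVI file ready for previewing.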

<p>AucTeX is a package for either GNU Emacs or XEmacs which has a multitude of
features helpful in writing LaTeX documents. Not only does the package
provide hot-keys and menu-items for tags and environments, but it also allows
easy movement through the document. You can run LaTeX or TeX interactively
from Emacs, and even view the resulting output DVI file with xdvi. Emacs
provides excellent syntax highlighting for LaTeX files, which greatly improves
their readability. In effect AucTeX turns Emacs into a "front-end" for LaTeX.
If you don't like the overhead incurred when running Emacs or especially
XEmacs, John Davis' Jed and Xjed editors have a very functional LaTeX/TeX mode
which is patterned after AucTeX. The console-mode Jed editor does
syntax-highlighting of TeX files well without extensive fiddling with config
files, which is rare in a console editor.

<p>If you don't use Emacs or its variants there is a Tcl/Tk based front-end
for LaTeX available called <b>xtem</b>. It can be set up to use any editor;
the September 1996 issue of <b>Linux Journal</b> has a good introductory
article on the package. <b>Xtem</b> has one feature which is useful for LaTeX
beginners: on-line syntax help-files for the various LaTeX commands.
The
<a href="http://pax.st.usm.edu/~kolibal/tex_html/xtem_html/xtem_texmenu.html">homepage</a>
for the package can be visited if you're interested.

<p>It is fairly easy to produce documents if the default formats included with
a TeX installation are used; more knowledge is needed to produce customized
formats. Luckily TeX has a large base of users, many of whom have contributed
a variety of style-formatting packages, some of which are included in the
distribution, while others are freely available from TeX archive sites
such as CTAN.

<p>At a further remove from raw LaTeX is the LyX document processor. This
program (still under development, but very usable) at first seems to be a
WYSIWYG interface for LaTeX, but this isn't quite true. The text you type
doesn't have visible LaTeX tagging, but it is formatted to fit the window on
your screen, which doesn't necessarily reflect the document's appearance
when printed or viewed with GV or Ghostscript. In other words, the appearance
of the text you type is just a user convenience. There are several things
which can be done with a document typed in LyX. You can let LyX handle the
entire LaTeX conversion process with a DVI or PostScript file as a result,
which is similar to using a word-processor. I don't like to do it this way;
one of the reasons I use Linux is that I'm interested in the underlying
processes and how they work, and Linux is transparent. If I'm curious as to
how something is happening in a Linux session I can satisfy that curiosity to
whatever depth I like. Another option LyX offers is more to my taste: LyX can
convert the document's format from the LaTeX-derived internal format to
standard LaTeX, which is readable and can be loaded into an editor.

<p>Load a LyX-created LaTeX file into an Emacs/AucTeX session (if you have
AucTeX set up right it will be called whenever a file with the <i>.tex</i>
suffix is loaded), and your document will be displayed with the LaTeX tags
now visible, interspersed throughout the text. The syntax-highlighting can
make the text easier to read if you have font-locking set up to give a subdued
color to the tagging (backslashes (\) and $ signs). This is an effective way
to learn something about how LaTeX documents are written. Changes can be made
from within the editor and you can let AucTeX call the LaTeX program to format
the document, or you can continue with LyX. In effect this is using LyX as a
preprocessor for AucTeX. This expands the user's options; if you are having
trouble convincing LyX to do what you want, perhaps AucTeX can do it more
easily.
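
<p>As a hand-written illustration (not actual LyX output, which also carries
a preamble), a fragment of such a file might look something like this, with
backslashes introducing commands and $ signs enclosing mathematics:

<pre>
\section{A Sample Section}
The area of a circle of radius $r$ is $\pi r^{2}$, a fact
which \textbf{every} reader is assumed to remember.
</pre>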

<p>Like many Linux software projects LyX is still in a state of flux. The
release of beta version 0.12 is imminent; after that release the developers
are planning to switch to another GUI toolkit (the current versions use the
XForms toolkit). The 0.11.38 version I've been using has been working
dependably for me. (Hint: if it won't compile, give the configure script the
switch <kbd>--disable-nls</kbd>, which disables the internationalization
support.)

<hr>

<center><h4>YODL</h4></center>

<p>YODL (Yet One-Other Document Language) is another way of interacting with
LaTeX. This system has a simplified tagging format which isn't hard to
learn. The advantage of YODL is that from one set of marked-up source
documents, output can be generated in LaTeX, HTML, and Groff man and ms
formats. The package is well-documented. I wrote a short introduction to
YODL in issue #9 of the Gazette. The current source for the package is this
<a href="ftp://ftp.icce.rug.nl/pub/unix/">ftp site</a>.

<hr>

<center><h4>Lout</h4></center>

<p>About thirteen years ago Jeffrey Kingston (of the University of Sydney,
Australia) began to develop a document formatting system which became known as
Lout. This system bears quite a bit of resemblance to LaTeX: it uses
formatting tags (using the @ symbol rather than \) and its output is
PostScript. Mr. Kingston calls Lout a high-level language with some
similarities to Algol, and claims that user extensions and modifications are
much easier to implement than in LaTeX. The package comes with hundreds of
pages of PostScript documentation along with the Lout source files which were
used to generate those book-length documents.

<p>The Lout system is still maintained and developed, and in my trials seemed
to work well, but there are some drawbacks. I'm sure Lout has nowhere near as
many users as LaTeX. LaTeX is installed on enough machines that if you should
want to e-mail a TeX file to someone (especially someone in academia) chances
are that the recipient will have access to a machine with TeX installed and
will be able to format and print or view it. LaTeX's large user-base also has
resulted in a multitude of contributed formatting packages.

<p>Another drawback (for me, at least) is the lack of available front-ends or
editor-macro packages for Lout. I don't mind using markup languages if I can
use, say, an Emacs mode with key-bindings and highlighting set up for the
language. There may be such packages out there for Lout, but I haven't run
across them.

<p>Lout does have the advantage of being much more compact than a typical TeX
installation. If you have little use for some of the more esoteric aspects
of LaTeX, Lout might be just the thing. It can include tables, various types
of lists, graphics, foot- and marginal notes, and equations in a document, and
the PostScript output is the equal of what LaTeX generates.

<p>Both RedHat and Debian have Lout packages available, and the
source/documentation package is available from the Lout
<a href="ftp://ftp.cs.su.oz.au/jeff/lout/">home FTP site</a>.

<hr>

<center><h4>Groff</h4></center>

<p>Groff is an older system than TeX/LaTeX, dating back to the early days of
Unix. Often a first-time Linux user will neglect to install the Groff
package, only to find that the <i>man</i> command won't work and that the
man-pages are therefore inaccessible. As well as in day-to-day invocation
by the <i>man</i> command, Groff is used in the publishing industry to
produce books, though other formatting systems such as SGML are more common.

<p>Groff is the epitome of the non-user-friendly and cryptic Unix command-line
tool. There are several man-pages covering various of Groff's components, but
they seem to assume a level of prior knowledge without any hint as to where
that knowledge might be acquired. I found them to be nearly incomprehensible.
A search on the internet didn't turn up any introductory documents or
tutorials, though there may be some out there. I suspect more complete
documentation might be supplied with some of the commercial Unix
implementations. The original and now-proprietary typesetting program is
called troff, and its companion for terminals and line printers is nroff;
Groff is short for GNU roff.

<p>Groff can generate PostScript, DVI, HP LaserJet4, and ASCII text formats.

<p>Learning to use Groff on a Linux system might be an uphill battle, though
Linux software developers must have learned enough of it at one time or other,
as most programs come with Groff-tagged man-page files. Groff's apparent
opacity and difficulty make LaTeX look easy in contrast!

<hr>

<center><h3>A Change in Mind-Set</h3></center>

<p>Processing text with a mark-up language requires a different mode of
thought concerning documents. On the one hand, writing blocks of ASCII is
convenient and no thought needs to be given to the marking-up process until the
end. A good editor provides so many features to deal with text that using any
word-processor afterwards can feel constrictive. Many users, though, are
attracted by the integration of functions in a word processor, using one
application to produce a document without intermediary steps.

<p>Though there <em>are</em> projects underway (such as <b>Wurd</b>)
which may eventually result in a native Linux word-processor, there may be a
reason why this type of application is still rare in the Linux world.
Adapting oneself to Linux, or any Unix variant, is an adaptation to what has
been called "the Unix philosophy", the practice of using several
highly-refined and specific tools to accomplish a task, rather than one tool
which tries to do it all. I get the impression that programmers attracted to
free software projects prefer working on smaller specialized programs. As an
example look at the plethora of mail- and news-readers available compared to
the dearth of all-in-one internet applications. Linux itself is really just
the kernel, which has attracted to itself all of the GNU and other software
commonly bundled with it in the form of a distribution.

<p>Christopher B. Browne has written an essay titled <b>An Opinionated Rant
About Word-Processors</b> which deals with some of the issues discussed in
this article; it's available at
<a href="http://www.hex.net/~cbbrowne/wp.html">this site</a>.

<p>The StarOffice suite is an interesting case, one of the few instances of a
large software firm (StarDivision) releasing a Linux version of an office
productivity suite. The package has been available for some time now, first
in several time-limited beta versions and now in a freely available release.
It's a large download but it's also available on CDROM from
<a href="http://www.caldera.com">Caldera</a>.
You would think that users would be flocking to it if the demand were really
that high for such an application suite for Linux. Judging by the relatively
sparse Usenet postings I've seen, StarOffice hasn't exactly taken the Linux
world by storm. I can think of a few possible reasons:

<ul>
<li>Many hard-core Linux users aren't working in a corporate office setting
in which such a product would be valuable; they are scientists,
engineers or academics who are perfectly happy with LaTeX, Lout, Groff,
et al.
<li>Then there are the users who have dual or multiple-boot set-ups; if they
need to use MS Word they just boot from their Win95 or NT partitions.
<li>Another group of users run Linux at home and whatever OS their job
requires at work.
<li>StarOffice is written with a cross-platform development tool-kit; this
may be responsible for its bulk and lack of speed.
</ul>
<hr>

<p>I remember the first time I started up the StarOffice word-processor. It
was slow to load on a Pentium 120 with 32 MB of RAM (and I thought XEmacs was
slow!), and once the main window appeared it occurred to me that it just
didn't look "at home" on a Linux desktop. All those icons and button-bars!
It seemed to work well, but with the lack of English documentation (and not
being able to convince it to print anything!) I eventually lost interest in
using it. I realized that I prefer my familiar editors, and learning a little
LaTeX seemed to be easier than trying to puzzle out the workings of an
undocumented suite of programs. This may sound pretty negative, and I don't
wish to denigrate the efforts of the StarDivision team responsible for the
Linux porting project. If you're a StarOffice user happy with the suite
(especially if you speak German and therefore can read the docs) and
would like to present a dissenting view, write a piece on it for the Gazette!

<p>Two other commercial word-processors for Linux are Applix and WordPerfect.
Applix, available from <a href="http://www.redhat.com">RedHat</a>,
has received favorable reviews from many Linux users.

<p>A company called SDCorp in Utah has ported Corel's WordPerfect 7 to
Linux, and a (huge!) demo is available now from both the SDCorp
<a href="ftp://ftp.sdcorp.com/pub/demos/linux/">ftp</a> site and
<a href="ftp://ftp.corel.com/pub/WordPerfect/wpunix/demos/linux/">Corel's</a>.
Unfortunately both FTP servers are unable to resume interrupted
downloads (usually indicating an NT server), so the CDROM version, available
from the SDCorp <a href="http://www.sdcorp.com/wplinux">website</a>,
is probably the way to go if you'd like to try it out. The demo can be
transformed into a registered program by paying for it, in which case a key is
e-mailed to you which registers the program, but only for the machine it is
installed on.

<p>Addendum: I recently had an exchange of e-mail with Brad Caldwell, product
manager for the SDCorp WordPerfect port. I complained about the difficulty of
downloading the 36 MB demo, and a couple of days later I was informed that
the file had been split into nine parts, and that they were investigating the
possibility of changing to an FTP server which supports resuming interrupted
downloads. The smaller files are available from
<a href="http://www.sdcorp.com/demos/smallftp.htm">this web page</a>.

<hr>
<p>There exists a curious dichotomous attitude these days in the Linux
community. I assume most people involved with Linux would like the operating
system to gain more users and perhaps move a little closer to the mainstream.
Linux advocates bemoan the relative lack of "productivity apps" for Linux,
which would make the OS more acceptable in corporate or business
environments. But how many of these advocates would use the applications if
they were more common? Often the change of mindset discussed above militates
against acceptance of Windows-like programs, with no source code available and
limited access to the developers. Linux has strong roots in the GNU and free
software movements (not always synonymous) and this background might be a
barrier towards development of a thriving commercial software market.

<!--===================================================================-->
<P> <hr> <P>
<center>
<h5>Copyright © 1997, Larry Ayers <br>
Published in Issue 22 of the Linux Gazette, October 1997</h5>
</center>

<!--===================================================================-->
<P> <hr> <P>
<A HREF="./index.html"><IMG ALIGN=BOTTOM SRC="../gx/indexnew.gif"
ALT="[ TABLE OF CONTENTS ]"></A>
<A HREF="../index.html"><IMG ALIGN=BOTTOM SRC="../gx/homenew.gif"
ALT="[ FRONT PAGE ]"></A>
<A HREF="./bench.html"><IMG SRC="../gx/back2.gif"
ALT=" Back "></A>
<A HREF="./new_emacs.html"><IMG SRC="../gx/fwd.gif" ALT=" Next "></A>
<P> <hr> <P>
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->