<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html><head>
<title>A Remembrance of Text Past</title>
</head>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
ALINK="#FF0000" >
<!--endcut ============================================================-->
<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <HR> <P>
<!--===================================================================-->
<center><font color="maroon"><h1>A Remembrance of Text Past</h1>
</font></center>
<center><font color="maroon">
<h3>The Remembrance Agent As An Aid To Writers</h3>
</font>
</center>
<center>
<h4><a href="mailto:layers@marktwain.net">by Larry Ayers</a></h4>
</center>
<hr>
<center><font color="maroon"><h3>Introduction</h3></font></center>
<p>In 1930 a writer named C.E. Montague offered this speculative view of
one aspect of the writer's craft:
<blockquote>
So, to a writer happily engaged on his work and excited by it, there may come
a curious extension of his ordinary faculties; he will find portions of
knowledge floating back into his brain, available for use, which he had
supposed to be thrown away long ago on the rubbish-heap outside the back door
of his mind; relevant passages will quote themselves to his mind from books he
scarcely remembers to have ever read; and he suddenly sees germane connections
where in his ordinary state of mind he would see nothing. The field of
consciousness has expanded again. People of strong social instinct often
derive the same experience from animated conversation; the exercise of their
own vivacity stirs latent powers of apprehension in them; the area upon which
they are able to draw for those piquant incongruities, which are the chief
material of wit, is for the moment widened; the field of comic consciousness
is enlarged.
</blockquote>
<p>Excerpted from
<cite> A Writer's Notes On His Trade, by C.E. Montague</cite>
<p>In Mr. Montague's day a writer had to rely on good memory and serendipity,
then hope for the best. The advent of the personal computer has provided
writers with new methods of recalling previously read texts which pertain to a
work-in-progress. After all, the pleasant state of being &quot;happily
engaged ... and excited&quot; is unpredictable and difficult to summon at
will.
One function of a computer is to act as an extension of human memory. The
serendipity of chance thoughts and ideas is missing, but searching and indexing
along with regular expressions can add a new dimension to the retrieval of
information.
<p>Scott Rosenberg, in a recent issue of the on-line magazine Salon, wrote
about a variety of PIM software which he and his readers have used to organize
personal notes and random saved text. (The article is available
<a href="http://www.salonmagazine.com/21st/rose/1999/03/23straight.html">here.</a>)
Rosenberg writes that several popular pieces of organizing software have
been orphaned by their parent companies (always a risk when using proprietary
programs), leaving users in the unfortunate position of depending on
unmaintained and static software. At the end of the article, after discussing
the pros and cons of various organizing methods, he mentions that several
free-software users had e-mailed him descriptions of how Emacs can be used to
simplify access to collections of textual information. That brings me to the
subject of this article: the Remembrance Agent, a system written by Bradley
Rhodes and several undergraduate students at MIT.
<hr>
<center><font color="maroon"><h3>The Remembrance Agent</h3>
</font></center>
<p>Imagine typing an essay or letter with an attentive and devoted servant
peering over your shoulder. Imagine as well that convenient to this servant's
reach is a well-ordered filing cabinet containing reams of documents which
concern your various interests. This servant's sole duty is to notice the
subject matter of the sentence you are typing, then to rapidly find all
documents which contain any mention of that subject and array them upon a
table convenient to your occasional glance.
<p>Aside from the distraction caused by this bustling servant hovering about,
the above fantasy is unlikely; no one is as patient and nimble-fingered as
this servant would need to be. The Remembrance Agent is a software system
which in effect acts as this superhuman servant. It consists of the
following components:
<ul>
<li>A compiled C program, <i>ra-index</i>, which constructs an indexed
binary database of the text files in a directory
<li>Another executable, <i>ra-retrieve</i>, which will find documents
matching keywords in the text being currently typed
<li>An Emacs-Lisp file, <i>remem.el</i>, which provides an Emacs or XEmacs
interface to the first two utilities
<li>Another Lisp file, <i>remem-custom.el</i>, which allows a user to easily
customize the directories to be searched, the default screen colors, and
the time intervals between searches.
</ul>
<p>The ra-index program is the core of the system. It is run with this
syntax:<br>
<pre><kbd>
ra-index [-v] &lt;base-dir&gt; &lt;source1&gt; [&lt;source2&gt;] ...
[-e &lt;excludee1&gt; [&lt;excludee2&gt;] ...]
</kbd></pre>
<p>The &quot;base-dir&quot; in the command is by default a subdirectory
of <kbd>~/RA-indexes</kbd>, while &quot;source1&quot;, etc. can be either
individual files or entire directories. The files can be saved e-mail messages
or news-group postings, HTML files, or any other ASCII text files. The
optional &quot;excludees&quot; are subdirectories which ra-index will ignore.
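<p>As a rough illustration of the synopsis above, the following shell sketch
assembles an ra-index command line and prints it instead of running it. The
function name <i>ra_index_cmd</i> and every path shown are hypothetical
examples, not defaults shipped with RA, and the simple string-building used
here would not cope with paths containing spaces.

```shell
#!/bin/sh
# Print (rather than run) an ra-index command that follows the synopsis:
#   ra-index [-v] <base-dir> <source1> ... [-e <excludee1> ...]
# The function name and all paths are hypothetical examples.
ra_index_cmd() {
    base_dir=$1      # directory that will receive the index files
    source_path=$2   # file or directory to be indexed
    exclude_dir=$3   # optional subdirectory to skip (may be empty)

    cmd="ra-index -v $base_dir $source_path"
    if [ -n "$exclude_dir" ]; then
        cmd="$cmd -e $exclude_dir"
    fi
    printf '%s\n' "$cmd"
}

# Example: index a tree of Gazette back-issues, skipping its image directory.
ra_index_cmd "$HOME/RA-indexes/linux_gazette" "$HOME/doc/lg" "$HOME/doc/lg/gx"
```

<p>Printing the command first makes it easy to check the argument order
before letting ra-index loose on a large directory.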
<p>The &nbsp;<kbd>-v</kbd>&nbsp; switch at the beginning of the command turns
on verbose mode, letting you see just what the program is doing. The result
is a binary database of keywords along with some index files. As an example I
ran the command on a directory containing all of the back-issues of the
Gazette. Afterwards a listing of the contents of
<kbd>~/RA-indexes/linux_gazette</kbd> looked like this:<br>
<pre><kbd>
-rw-r--r-- 1 liatris liatris 41796 Mar 27 13:12 doclens
-rw-r--r-- 1 liatris liatris 34894 Mar 27 13:12 doclocs
-rw-r--r-- 1 liatris liatris 23220 Mar 27 13:12 doclocs_offs
-rw-r--r-- 1 liatris liatris 60270 Mar 27 13:12 titles
-rw-r--r-- 1 liatris liatris 4644 Mar 27 13:12 titles_offs
-rw-r--r-- 1 liatris liatris 2794112 Mar 27 13:12 wordvecs
-rw-r--r-- 1 liatris liatris 571152 Mar 27 13:12 wvoffs
</kbd></pre>
<p>As a rough idea of the relation of database to source, the directory listed
above occupies three and one-half MB, while the LG back-issue directory is
nearly forty MB; in this case, at least, the database is nearly nine percent as
large as the source.
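<p>The arithmetic behind that estimate can be checked with a few lines of
shell. The byte counts are copied from the listing above; the source size is
an assumption, taking &quot;nearly forty&quot; to mean 40,000,000 bytes.

```shell
#!/bin/sh
# Sum the index file sizes from the directory listing and compare the
# total with the size of the indexed source tree.
index_bytes=$(printf '%s\n' 41796 34894 23220 60270 4644 2794112 571152 |
    awk '{ total += $1 } END { print total }')

source_bytes=40000000   # assumption: "nearly forty MB" taken as 40,000,000

percent=$(awk -v i="$index_bytes" -v s="$source_bytes" \
    'BEGIN { printf "%.1f", 100 * i / s }')

echo "index: $index_bytes bytes, about $percent% of the source"
```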
<p>Ra-index deals intelligently with several common file formats. Headers of
e-mail and Usenet messages are ignored, as are the tags in HTML files.
<p>The corresponding retrieval program, ra-retrieve, is normally run by the
Emacs front-end to Remembrance Agent. Typing <b>C-c r r</b> in an Emacs
session activates RA; a new Agent window appears and the database is searched
for matches to a configurable number of words surrounding the point position
in the file being edited.
<p>Here is a screenshot which shows the Agent in action; this HTML file is
being edited in XEmacs, with the RA window showing files in a directory
containing past issues of LG:<br>
<p><img alt="Agent and XEmacs" src="./gx/ayers/remembrance.gif">
<p>Each file shown contains one or more keywords which match a word within
three hundred characters of the cursor position in the file being edited. The
number in the leftmost field can be selected with the mouse and that file will
be loaded in the main editing window; the decimal fractions in the next field
are the file's &quot;score&quot;, with 1.00 the high end of the range.
Clicking the right mouse button on a file's line in the RA window will enable
the user to see (in a small pop-up box) what keywords relate the file to the
file in the main window. Every few seconds the listings in the RA window are
updated to correspond to what is currently being written; the update interval
is configurable, so it can be matched to your typing speed.
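<p>To make the idea of matching against the text near the cursor more
concrete, here is a deliberately simplified sketch in shell. It is not RA's
algorithm: the ranking used by ra-retrieve is not described in this article, so
the word-overlap score below is only a stand-in, and the function names
<i>query_window</i> and <i>overlap_score</i> are invented for the illustration.

```shell
#!/bin/sh
# Toy illustration only; RA's real ranking is done by ra-retrieve.

# query_window TEXT POINT [RADIUS]
# Print the characters within RADIUS (default 300) of the 1-based cursor
# position POINT. Newlines are flattened to spaces first.
query_window() {
    printf '%s' "$1" | tr '\n' ' ' |
        awk -v p="$2" -v r="${3:-300}" '{
            start = p - r
            if (start < 1) start = 1
            print substr($0, start, p + r - start + 1)
        }'
}

# overlap_score QUERY FILE
# Print the fraction of distinct query words that occur as whole words in
# FILE, ignoring case. 1.00 means every query word was found.
overlap_score() {
    words=$(printf '%s\n' "$1" | tr -cs '[:alpha:]' '\n' |
        tr '[:upper:]' '[:lower:]' | sort -u)
    total=0
    hits=0
    for w in $words; do
        total=$((total + 1))
        if grep -qiw "$w" "$2"; then
            hits=$((hits + 1))
        fi
    done
    awk -v h="$hits" -v t="$total" \
        'BEGIN { if (t == 0) print "0.00"; else printf "%.2f\n", h / t }'
}

# Example: score one small document against a short query.
doc=$(mktemp)
printf 'Emacs can index plain text files.\n' > "$doc"
overlap_score "an Emacs text editor" "$doc"
rm -f "$doc"
```

<p>A real retrieval system would weight rare words more heavily than common
ones; this sketch treats every word alike to keep the example short.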
<hr>
<center><font color="maroon"><h3>Installing RA</h3></font></center>
<p>After the two executables have been compiled and moved to
<kbd>/usr/local/bin</kbd> or another binary directory, installation consists
of copying the two Lisp files to a location Emacs knows about, such as a
site-lisp directory. Then the <kbd>remem-custom.el</kbd> file needs to be
edited. In this file are options which set the location of the two
executables, databases to be searched, and font-lock highlighting colors.
The last step is insertion of these lines in your .emacs file:<br>
<pre><kbd>
(load "remem.el")
(load "remem-custom.el")
</kbd></pre>
<p>Before starting Emacs (after RA is installed), give some thought to which
files and directories should be indexed. Too many files will result in
bulky databases and slower searches. I tried indexing some old mailbox files
and discovered that I save too many messages.
The right selection depends on the type of writing and subject matter planned.
Run ra-index on some varied material, creating several subdirectories in
<kbd>~/RA-indexes</kbd>, and try them separately, editing
<kbd>remem-custom.el</kbd> each time to let the Emacs interface know which
database to use.
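<p>One way to organize such an experiment is a small dry-run script that prints
one ra-index command per collection, so the plan can be reviewed before any
indexing is done. This is only a sketch: the function name
<i>plan_indexes</i>, the collection names, and the source paths are made up,
and creating each index directory beforehand is a precaution rather than a
documented requirement of ra-index.

```shell
#!/bin/sh
# Dry run: print the commands for several separate indexes instead of
# executing them. Collection names and source paths are made-up examples.
index_root="$HOME/RA-indexes"

# plan_indexes NAME=SOURCE_PATH ...
plan_indexes() {
    for pair in "$@"; do
        name=${pair%%=*}          # text before the first "="
        source_path=${pair#*=}    # text after the first "="
        echo "mkdir -p $index_root/$name"
        echo "ra-index -v $index_root/$name $source_path"
    done
}

plan_indexes "linux_gazette=$HOME/doc/lg" "mail=$HOME/Mail/saved"
```

<p>Keeping each collection in its own subdirectory makes it easy to compare
them one at a time from the Emacs side.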
<hr>
<center><font color="maroon"><h3>Conclusion</h3></font></center>
<p>It takes some practice to be able to use RA effectively. At first the
periodically-refreshed Agent window is distracting; I'd find myself wondering
just what connection a retrieved filename had to what I was writing and lose
my train of thought while investigating. One technique is to toggle RA off
when actively writing (the <b>C-c r r</b> command both starts and stops RA),
then toggle it back on while reviewing newly-typed material. Choosing
appropriate files to index makes a big difference in RA's usefulness and this
takes experimentation as well.
<p>I noticed some problems (mainly with the mouse) when running RA under
XEmacs. The current version (2.01) is the first to support XEmacs at all, so
possibly future releases will fix the aberrant behavior. Under GNU Emacs the
package is well-behaved. The documentation at this point is minimal but
adequate; enough is supplied to install and configure the package, but there
is little explanation of internals and advanced usage, though the source is
well-commented.
<p>Aside from these minor complaints, the Remembrance Agent is that <i>rara
avis</i> in the software world, a software package which truly breaks new
ground. The concept is innovative enough to explain why using RA is rather
disconcerting at first and requires some user adaptation. Normal user-level software
responds to a command, then quiescently waits for the next one. RA resembles
a daemon process which goes about its work in the background, but daemons
don't display the fruits of this activity in an editor window!
<p>If you would like to give the Remembrance Agent a try, the current version
is available from the
<a href="http://www.media.mit.edu/~rhodes/RA/">RA home web-site.</a>
<hr>
<!-- hhmts start -->
Last modified: Sun Mar 28 10:26:43 CST 1999
<!-- hhmts end -->
<!--===================================================================-->
<P> <hr> <P>
<center><H5>Copyright &copy; 1999, Larry Ayers <BR>
Published in Issue 39 of <i>Linux Gazette</i>, April 1999</H5></center>
<!--===================================================================-->
<P> <hr> <P>
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->