254 lines
12 KiB
HTML
254 lines
12 KiB
HTML
<!--startcut ==========================================================-->
|
|
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
|
|
<html><head>
|
|
<title>A Remembrance of Text Past</title>
|
|
</head>
|
|
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
|
|
ALINK="#FF0000" >
|
|
<!--endcut ============================================================-->
|
|
|
|
<H4>
|
|
"Linux Gazette...<I>making Linux just a little more fun!</I>"
|
|
</H4>
|
|
|
|
<P> <HR> <P>
|
|
<!--===================================================================-->
|
|
|
|
<center><font color="maroon"><h1>A Remembrance of Text Past</h1>
|
|
</font></center>
|
|
<center><font color="maroon">
|
|
<h3>The Remembrance Agent As An Aid To Writers</h3>
|
|
</font>
|
|
</center>
|
|
|
|
<center>
|
|
<h4><a href="mailto: layers@marktwain.net">by Larry Ayers</a></h4>
|
|
</center>
|
|
<hr>
|
|
|
|
<center><font color="maroon"><h3>Introduction</font></h3></center>
|
|
|
|
<p>In 1930 a writer named C.E. Montague offered this speculative view of
|
|
one aspect of the writer's craft:
|
|
|
|
<blockquote>
|
|
So, to a writer happily engaged on his work and excited by it, there may come
|
|
a curious extension of his ordinary faculties; he will find portions of
|
|
knowledge floating back into his brain, available for use, which he had
|
|
supposed to be thrown away long ago on the rubbish-heap outside the back door
|
|
of his mind; relevant passages will quote themselves to his mind from books he
|
|
scarcely remembers to have ever read; and he suddenly sees germane connections
|
|
where in his ordinary state of mind he would see nothing. The field of
|
|
consciousness has expanded again. People of strong social instinct often
|
|
derive the same experience from animated conversation; the exercise of their
|
|
own vivacity stirs latent powers of apprehension in them; the area upon which
|
|
they are able to draw for those piquant incongruities, which are the chief
|
|
material of wit, is for the moment widened; the field of comic consciousness
|
|
is enlarged.
|
|
</blockquote>
|
|
<p>Excerpted from
|
|
<cite> A Writer's Notes On His Trade, by C.E. Montague</cite>
|
|
|
|
<p>In Mr. Montague's day a writer had to rely on good memory and serendipity,
|
|
then hope for the best. The advent of the personal computer has provided
|
|
writers with new methods of recalling previously read texts which pertain to a
|
|
work-in-progress. After all, the pleasant state of being "happily
|
|
engaged ... and excited" is unpredictable and difficult to summon at
|
|
will.
|
|
|
|
One function of a computer is to act as an extension of human memory. The
|
|
serendipity of chance thoughts and ideas is missing, but searching and indexing
|
|
along with regular expressions can add a new dimension to the retrieval of
|
|
information.
|
|
|
|
<p>Scott Rosenberg, in a recent issue of the on-line magazine Salon, wrote
|
|
about a variety of PIM software which he and his readers have used to organize
|
|
personal notes and random saved text. (The article is available
|
|
<a href="http://www.salonmagazine.com/21st/rose/1999/03/23straight.html">here.</a>)
|
|
Rosenberg writes that several popular pieces of organizing software have
|
|
been orphaned by their parent companies (always a risk when using proprietary
|
|
programs), leaving users in the unfortunate position of depending on
|
|
unmaintained and static software. At the end of the article, after discussing
|
|
the pros and cons of various organizing methods, he mentions that several
|
|
free-software users had e-mailed him descriptions of how Emacs can be used to
|
|
simplify access to collections of textual information, which brings me to the
|
|
subject of this article, a system written by Bradley Rhodes and several
|
|
undergraduate students at MIT.
|
|
<hr>
|
|
|
|
|
|
<center><font color="maroon"><h3>The Remembrance Agent</h3>
|
|
</font></center>
|
|
|
|
<p>Imagine typing an essay or letter with an attentive and devoted servant
|
|
peering over your shoulder. Imagine as well that convenient to this servant's
|
|
reach is a well-ordered filing cabinet containing reams of documents which
|
|
concern your various interests. This servant's sole duty is to notice the
|
|
subject matter of the sentence you are typing, then to rapidly find all
|
|
documents which contain any mention of that subject and array them upon a
|
|
table convenient to your occasional glance.
|
|
|
|
<p>Aside from the distraction caused by this bustling servant hovering about,
|
|
the above fantasy is unlikely; no-one is as patient and nimble-fingered as
|
|
this servant would need to be. The Remembrance Agent is a software system
|
|
which in effect acts as this superhuman servant. It is composed of the
|
|
following components:
|
|
|
|
<ul>
|
|
<li>A compiled C program, <i>ra-index</i>, which constructs an indexed
|
|
binary database of the text files in a directory
|
|
<li>Another executable, <i>ra-retrieve</i>, which will find documents
|
|
matching keywords in the text being currently typed
|
|
<li>An Emacs-Lisp file, <i>remem.el</i>, which provides an Emacs or XEmacs
|
|
interface to the first two utilities
|
|
<li>Another Lisp file, <i>remem-custom.el</i>, which allows a user to easily
|
|
customize the directories to be searched, the default screen colors, and
|
|
the time intervals between searchs.
|
|
</ul>
|
|
|
|
<p>The ra-index program is the core of the system. It is run with this
|
|
syntax:<br>
|
|
|
|
<pre><kbd>
|
|
ra-index [-v] <base-dir> <source1> [<source2>] ...
|
|
[-e <excludee1> [<excludee2>] ...]
|
|
</kbd></pre>
|
|
|
|
<p>The "basedir" in the command is by default a subdirectory
|
|
of<kbd>~/RA-indexes</kbd>, while "source1", etc. can be either
|
|
individual files or entire directories. The files can be saved e-mail messages
|
|
or news-group postings, HTML files, or any other ASCII text files. The
|
|
optional "excludees" are subdirectories which ra-index will ignore.
|
|
|
|
<p>The <kbd>-v</kbd> switch at the beginning of the command turns
|
|
on verbose mode, letting you see just what the program is doing. The result
|
|
is a binary database of keywords along with some index files. As an example I
|
|
ran the command on a directory containing all of the back-issues of the
|
|
Gazette. Afterwards a listing of the contents of
|
|
<kbd>~/RA-indexes/linux_gazette</kbd> looked like this:<br>
|
|
|
|
<pre><kbd>
|
|
-rw-r--r-- 1 liatris liatris 41796 Mar 27 13:12 doclens
|
|
-rw-r--r-- 1 liatris liatris 34894 Mar 27 13:12 doclocs
|
|
-rw-r--r-- 1 liatris liatris 23220 Mar 27 13:12 doclocs_offs
|
|
-rw-r--r-- 1 liatris liatris 60270 Mar 27 13:12 titles
|
|
-rw-r--r-- 1 liatris liatris 4644 Mar 27 13:12 titles_offs
|
|
-rw-r--r-- 1 liatris liatris 2794112 Mar 27 13:12 wordvecs
|
|
-rw-r--r-- 1 liatris liatris 571152 Mar 27 13:12 wvoffs
|
|
</kbd></pre>
|
|
|
|
<p>As a rough idea of the relation of database to source, the directory listed
|
|
above occupies three and one-half mb., while the LG back-issue directory is
|
|
nearly forty mb.; in this case at least the database is nearly nine percent as
|
|
large as the source.
|
|
|
|
<p>Ra-index deals intelligently with several common file formats. Headers of
|
|
e-mail and usenet messages are ignored as well as the tags in HTML files.
|
|
|
|
<p>The corresponding retrieval program, ra-retrieve, is normally run by the
|
|
Emacs front-end to Remembrance Agent. Typing <b>C-c r r</b> in an Emacs
|
|
session activates RA; a new Agent window appears and the database is searched
|
|
for matches to a configurable number of words surrounding the point position
|
|
in the file being edited.
|
|
|
|
<p>Here is a screenshot which shows the Agent in action; this HTML file is
|
|
being edited in XEmacs, with the RA window showing files in a directory
|
|
containing past issues of LG:<br>
|
|
|
|
<p><img alt="Agent and XEmacs" src="./gx/ayers/remembrance.gif">
|
|
|
|
<p>Each file shown contains one or more keywords which match a word within
|
|
three hundred characters of the cursor position in the file being edited. The
|
|
number in the leftmost field can be selected with the mouse and that file will
|
|
be loaded in the main editing window; the decimal fractions in the next field
|
|
are the file's "score", with 1.00 the high end of the range.
|
|
Clicking the right mouse button on a file's line in the RA window will enable
|
|
the user to see (in a small pop-up box) what keywords relate the file to the
|
|
file in the main window. Every few seconds the listings in the RA window are
|
|
updated to correspond to what is currently being written. Depending on typing
|
|
speed this update interval can be a configurable number of seconds.
|
|
<hr>
|
|
|
|
<center><font color="maroon"><h3>Installing RA</h3></font></center>
|
|
|
|
<p>After the two executables have been compiled and moved to
|
|
<kbd>/usr/local/bin</kbd> or another binary directory, installation consists
|
|
of copying the two Lisp files to a location Emacs knows about, such as a
|
|
site-lisp directory. Then the <kbd>remem-custom.el</kbd> file needs to be
|
|
edited. In this file are options which set the location of the two
|
|
executables, databases to be searched, and font-lock highlighting colors.
|
|
The last step is insertion of these lines in your .emacs file:<br>
|
|
|
|
<kbd><pre>
|
|
(load "remem.el")
|
|
(load "remem-custom.el")
|
|
</pre></kbd>
|
|
|
|
<p>Before starting Emacs (after RA is installed) thought needs to be given to
|
|
choosing files and directories to be indexed. Too many files will result in
|
|
bulky databases and slower searches. I tried indexing some old mailbox files
|
|
and discovered that I save too many messages.
|
|
|
|
It all depends on the type of writing and subject matter planned. Run
|
|
ra-index on some varied material, creating several subdirectories in
|
|
<kbd>~/RA-indexes</kbd>, and try them separately, editing the remem-custom.el
|
|
in order to let the Emacs interface know which database to use.
|
|
<hr>
|
|
|
|
<center><font color="maroon"><h3>Conclusion</h3></font></center>
|
|
|
|
<p>It takes some practice to be able to use RA effectively. At first the
|
|
periodically-refreshed Agent window is distracting; I'd find myself wondering
|
|
just what connection a retrieved filename had to what I was writing and lose
|
|
my train of thought while investigating. One technique is to toggle RA off
|
|
when actively writing (the <b>C-c rr</b> command both starts and stops RA),
|
|
then toggle it back on while reviewing newly-typed material. Choosing
|
|
appropriate files to index makes a big difference in RA's usefulness and this
|
|
takes experimentation as well.
|
|
|
|
<p>I noticed some problems (mainly with the mouse) when running RA under
|
|
XEmacs. The current version (2.01) is the first to support XEmacs at all, so
|
|
possibly future releases will fix the aberrant behavior. Under GNU Emacs the
|
|
package is well-behaved. The documentation at this point is minimal but
|
|
adequate; enough is supplied to install and configure the package but little
|
|
explanation of internals and advanced usage, though the source is
|
|
well-commented.
|
|
|
|
<p>Aside from these minor complaints, the Remembrance Agent is that <i>rara
|
|
avis</i> in the software world, a software package which truly breaks new
|
|
ground. The concept is innovative enough to explain why using RA is rather
|
|
disconcerting at first and requires some user adaptation. Normal user-level software
|
|
responds to a command, then quiescently waits for the next one. RA resembles
|
|
a daemon process which goes about its work in the background, but daemons
|
|
don't display the fruits of this activity in an editor window!
|
|
|
|
<p>If you would like to give the Remembrance Agent a try, the current version
|
|
is available from the
|
|
<a href="http://www.media.mit.edu/~rhodes/RA/">RA home web-site.</a>
|
|
|
|
<hr>
|
|
|
|
<!-- hhmts start -->
|
|
Last modified: Sun Mar 28 10:26:43 CST 1999
|
|
<!-- hhmts end -->
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<center><H5>Copyright © 1999, Larry Ayers <BR>
|
|
Published in Issue 39 of <i>Linux Gazette</i>, April 1999</H5></center>
|
|
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<A HREF="./index.html"><IMG ALIGN=BOTTOM SRC="../gx/indexnew.gif"
|
|
ALT="[ TABLE OF CONTENTS ]"></A>
|
|
<A HREF="../index.html"><IMG ALIGN=BOTTOM SRC="../gx/homenew.gif"
|
|
ALT="[ FRONT PAGE ]"></A>
|
|
<A HREF="./ayers1.html"><IMG SRC="../gx/back2.gif"
|
|
ALT=" Back "></A>
|
|
<A HREF="./rogers.html"><IMG SRC="../gx/fwd.gif" ALT=" Next "></A>
|
|
<P> <hr> <P>
|
|
<!--startcut ==========================================================-->
|
|
</BODY>
|
|
</HTML>
|
|
<!--endcut ============================================================-->
|