893 lines
36 KiB
HTML
893 lines
36 KiB
HTML
<!--startcut ==========================================================-->
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
|
<HTML>
|
|
<HEAD>
|
|
<title>A Glimpse of Icon LG #27</title>
|
|
</HEAD>
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
|
|
ALINK="#FF0000">
|
|
<!--endcut ============================================================-->
|
|
|
|
<H4>
|
|
"Linux Gazette...<I>making Linux just a little more fun!</I>"
|
|
</H4>
|
|
|
|
<P> <HR> <P>
|
|
<!--===================================================================-->
|
|
|
|
<center>
|
|
<H1>A Glimpse of Icon</H1>
|
|
<H4>By <a href="mailto:jeffery@cs.utsa.edu">Clinton Jeffery</a>
|
|
and <A HREF="mailto:spm@drones.com">Shamim Mohamed</A></H4>
|
|
<img src="./gx/jeffery/cube192.gif">
|
|
</center>
|
|
<P> <HR> <P>
|
|
|
|
<h3> Motivation </h3>
|
|
<P> <HR> <P>
|
|
|
|
Many languages introduce special capabilities for specific kinds of
|
|
applications, but few present us with more powerful control structures or
|
|
programming paradigms. You may be comfortable sticking with a language
|
|
you already know, but if you are challenged to write complex programs and
|
|
are short on time, you need the best language for the job. Icon is a
|
|
high-level programming language that looks like many other programming
|
|
languages but offers many advanced features that add up to big gains
|
|
in productivity. Before we get to all that, let us write the canonical
|
|
first program:
|
|
<PRE>procedure main()
|
|
write("Hello, world!")
|
|
end</PRE>
|
|
If you've installed Linux Icon,
|
|
Save this in a file named <code>hello.icn</code> and run <I>icont</I>, the
|
|
Icon translator on it:
|
|
<PRE>icont hello -x</PRE>
|
|
<I> icont</I> performs some syntax checking on <code>hello.icn</code> and
|
|
transforms the code into
|
|
instructions for the Icon virtual machine, which will be saved in
|
|
<code>hello</code>. The <code>-x</code> option tells <code>icont</code> to
|
|
execute the program also.
|
|
|
|
<p>
|
|
We are introducing many concepts, so don't expect to understand
|
|
everything the first time through -- the only way to learn a language is to
|
|
write programs in it; so get Icon, and take it for a test drive.
|
|
|
|
<h4><u>Genealogy</u></h4>
|
|
|
|
Despite its name, Icon is not a visual programming language -- its
|
|
look-and-feel descends from Pascal and C. The important thing about
|
|
Icon is not that its syntax is easy to learn. Icon's semantics, which
|
|
generalize ideas from SNOBOL4, succeed in adding
|
|
considerable power to the familiar notation found in most
|
|
programming languages. This is noteworthy because most languages that add real
|
|
power (APL, Lisp, and SmallTalk are examples) do so with a syntax that
|
|
is so different that programmers must learn a new way of thinking to use
|
|
them. Icon adds power `under the hood' of a notation most programmers are
|
|
already comfortable with.<p>
|
|
|
|
Icon was developed over several years at the University of Arizona by a team
|
|
led by Ralph Griswold. Today, it runs on many platforms and is used by
|
|
researchers in algorithms, compilers, and linguistics as well as system
|
|
administrators and hobbyists. The implementation and source code are
|
|
in the public domain.
|
|
|
|
<P> <HR> <P>
|
|
<H3>Variables, Expressions, and Type</H3>
|
|
<P> <HR> <P>
|
|
|
|
Icon's expression syntax starts out much as do most languages.
|
|
For example, <code>i+j</code>
|
|
represents the arithmetic addition of the values stored in the variables
|
|
<code>i</code> and <code>j</code>, <code>f(x)</code> is a call
|
|
to <code>f</code> with argument <code>x</code>, variables may be global
|
|
or local to a procedure, and so on.
|
|
|
|
<p>
|
|
|
|
Variable declarations are not required, and variables can hold any type of
|
|
value. However, Icon is a strongly typed language; it knows the type of each
|
|
value and it does not allow you to mix invalid types in expressions. The
|
|
basic scalar types are integers, real numbers, strings, and sets of
|
|
characters (csets). Integers and strings can be arbitrarily large; and
|
|
strings can contain any characters. There are also structured types: lists,
|
|
associative arrays, sets and records. Icon performs automatic storage
|
|
management.
|
|
|
|
<h4><u>Goal-directed Expression Evaluation</u></h4>
|
|
|
|
Icon's major innovation is its expression evaluation mechanism. It avoids
|
|
certain problems widely found in conventional languages in which each
|
|
expression always computes one result. In such languages, if no valid
|
|
result is possible, a sentinel value such as -1, NULL, or EOF (end-of-file)
|
|
is returned instead. The program must check for such errors using boolean
|
|
logic and if-then tests, and the programmer must remember many different
|
|
sentinel values used in different circumstances. This is cumbersome.
|
|
Alternative ideas such as exceptions have been developed in
|
|
some languages, but they introduce complexity and problems of their own.
|
|
|
|
|
|
<h4><u>Success and Failure</u></h4>
|
|
|
|
In Icon, expressions are <em>goal-directed</em>. When it is evaluated, every
|
|
expression has a goal of producing results for the surrounding expression.
|
|
If an expression succeeds in producing a result, the surrounding expression
|
|
executes as intended, but if an expression cannot produce a result, it is
|
|
said to <em>fail</em> and the surrounding expression cannot be performed and
|
|
in turn fails. This powerful concept subsumes Boolean logic and the
|
|
use of sentinel values, and allows a host of further improvements.
|
|
As an example, consider the expression <code>i > 3</code>
|
|
-- if the value <code>i</code> is greater than 3 the expression succeeds,
|
|
otherwise it fails.
|
|
<P>
|
|
|
|
Control structures such as <code>if</code> check for success, so
|
|
<PRE> if i > 3 then ...</PRE> does the expected thing. Since the
|
|
expression semantics are not encumbered with the need to propagate
|
|
boolean (or 0 and 1) values, comparison operators can instead propagate
|
|
a useful value (their right operand), allowing expressions such as
|
|
<code>3 > i > 7</code>
|
|
which is standard in mathematics, but doesn't work in most languages.
|
|
<p>
|
|
|
|
Since functions that fail do not need to return an error code separately
|
|
from the results, detecting cases such as end-of-file is simpler, as in:
|
|
<PRE> if line := read() then write(process(line))</PRE>
|
|
On end-of-file, <code>read()</code> fails, causing the assignment expression
|
|
tested in the if-part to fail. When the test fails, the <i>then</i> branch
|
|
is not executed so the call to <code>write()</code> does not occur.
|
|
Since failure propagates through an expression, the above example is equivalent
|
|
to
|
|
<PRE> write(process(read())</PRE>
|
|
|
|
|
|
<h4><u>Generators</u></h4>
|
|
|
|
Some expressions can naturally produce more than one result. These
|
|
expressions are called <em>generators</em>.
|
|
Consider the task of searching for a substring within a string, and returning
|
|
the position at which the substring occurs, as in Icon's <code>find()</code>
|
|
function:
|
|
<PRE> find("or", "horror")</PRE>
|
|
In conventional languages, this would return one of the possible return
|
|
values, usually the first or the last. In Icon, this expression is capable
|
|
of returning all the values, depending on the execution context.
|
|
If the surrounding expression only needs one value, as in the case of an
|
|
<i>if</i> test or an assignment, only the first value of a generator is
|
|
produced. If a generator is part of a more complex expression, then
|
|
the return values are produced in sequence until the whole expression
|
|
produces a value. In the expression
|
|
<PRE> find("or", "horror") > 3 </PRE>
|
|
the first value produced by <code>find()</code>, a 2, causes the
|
|
<code>></code> operation to fail. Icon <em>resumes</em> the call to find(),
|
|
which produces a 4, and the expression succeeds.
|
|
<p>
|
|
The most obvious generator is the alternation operator |. The expression
|
|
<pre> expr<sub>1</sub> | expr<sub>2</sub></pre>
|
|
is a generator that produces its lefthand side followed by its righthand
|
|
side, if needed by the surrounding expression. Consider
|
|
<code>f(1|2)</code> -- <code>f</code> is first invoked with the value 1;
|
|
if that does not produce a value, the generator is resumed for another
|
|
result and <code>f</code> will be called again with the value 2.
|
|
As another example of the same operator,
|
|
|
|
<pre> x = (3 | 5)</pre>
|
|
|
|
is equivalent to but more concise than C's (x == 3) || (x == 5).
|
|
When more than
|
|
one generator is present in an expression, they are resumed in a LIFO manner.
|
|
|
|
<pre> (x | y) = (3 | 5)</pre>
|
|
|
|
is the Icon equivalent of C's
|
|
|
|
<pre> (x == 3) || (x == 5) || (y == 3) || (y == 5)</pre>
|
|
<p>
|
|
|
|
In addition to <code> | </code>, Icon has a <i>generate</i> operator
|
|
<code> ! </code> that generates elements of data structures, and a
|
|
generator <code>to</code> that produces ranges of integers. For example,
|
|
<code>!L</code> generates the elements of list L, and <code>1 to 10</code>
|
|
generates the first ten positive integers.
|
|
|
|
Besides these operators that generate results, most generators in Icon
|
|
take the form of calls to built-in and user-defined procedures.
|
|
Procedures are discussed below.
|
|
|
|
|
|
<h4><u>Iteration</u></h4>
|
|
|
|
Icon has the ordinary <code>while</code> loop where the control expression is
|
|
evaluated before each iteration. For generators, an alternative loop is
|
|
available
|
|
where the loop body executes once per result produced by a single evaluation
|
|
of the control expression. The alternative loop uses the reserved word
|
|
<code>every</code> and can be used in conjunction with the <code>to</code>
|
|
operator to provide the equivalent of a <code>for</code>-loop:
|
|
<pre> every i := 1 to 10 do ...</pre>
|
|
The point of <code>every</code> and <code>to</code> is not that you can use
|
|
them to implement a for-loop; Icon's generator mechanism is quite a
|
|
bit more general. The <code>every</code> loop lets you
|
|
walk through all the results of a generator giving you iterators
|
|
for free. And <code>every</code> isn't limited to sequences of numbers or
|
|
traversals of specific data structures like iterators in some languages; it
|
|
works on any expression that contains generators.
|
|
|
|
<pre> every f(1 | 1 | 2 | 3 | 5 | 8)</pre>
|
|
executes the function <code>f</code> with the first few
|
|
fibonacci numbers, and
|
|
the example could be generalized to a user-defined generator procedure
|
|
that produced the entire fibonacci sequence.
|
|
Using generators requires a bit of practice, but then it is fun!
|
|
<P> <HR> <P>
|
|
<H3>Procedures</H3>
|
|
<P> <HR> <P>
|
|
|
|
Procedures are a basic building block in most languages, including Icon.
|
|
Like C, an Icon program is organized as a collection of procedures and
|
|
execution starts from a procedure named <code>main()</code>.
|
|
Here is an example of an ordinary procedure. This one generates and sums
|
|
the elements of a list <code>L</code>, whose elements had better be numbers
|
|
(or convertible to numbers).
|
|
<PRE>procedure sum(L)
|
|
total := 0
|
|
every total +:= !L
|
|
return total
|
|
end</PRE>
|
|
<P>
|
|
|
|
A user can write her own generator by including a
|
|
<pre> suspend <em>expr</em></pre>
|
|
in a procedure where a result should be produced. When a procedure
|
|
suspends, it transfers a result to the caller, but remains available to
|
|
continue where it left off and generate more results. If the expression
|
|
from which it is called needs more or different results in order to succeed,
|
|
the procedure will be resumed.
|
|
The following example generates the elements from parameter <code>L</code>,
|
|
but filters out the zeros.
|
|
<PRE>procedure nonzero(L)
|
|
every i := !L do
|
|
if i ~= 0 then suspend i
|
|
end</PRE>
|
|
The <code>fail</code> expression makes the procedure fail, i.e. causes control
|
|
to go back to the calling procedure without returning a value. A procedure
|
|
also fails implicitly
|
|
if control flows off the end of the procedure's body.
|
|
|
|
<P> <HR> <P>
|
|
<h3> String Processing </h3>
|
|
<P><HR> <P>
|
|
|
|
Besides expression evaluation, Icon offers compelling features to reduce the
|
|
effort required to write complex programs. From Icon's ancestor SNOBOL4, the
|
|
granddaddy of all string processing languages, Icon inherits some of the
|
|
most flexible and readable built-in data structures found in any language.
|
|
|
|
<h4><u>Strings</u></h4>
|
|
|
|
Parts of Icon strings are accessed using the subscript operator. Indexes
|
|
denote the positions <em>between</em> characters, and pairs of indexes
|
|
are used to pick out substrings. If <code>s</code> is the string
|
|
<code>"hello, world"</code> then the expressions
|
|
<pre> s[7] := " linux "
|
|
s[14:19] := "journal"</pre>
|
|
change <code>s</code> into <code>"hello, linux journal"</code>,
|
|
illustrating
|
|
the ease with which insertions and substitutions are made. A myriad of
|
|
built-in functions operate on strings; among them are the operators for
|
|
concatenation (<code>s1 || s2</code>) and size
|
|
(<code>*s</code>).
|
|
|
|
<H4><u>String Scanning</u></H4>
|
|
|
|
The string analysis facility of Icon is called <em>scanning</em>. A scanning
|
|
environment is set up by the <code>?</code> operator:
|
|
<PRE> s ? expr</PRE>
|
|
A scanning environment has a
|
|
string and a current position in it. <em>Matching functions</em> change this
|
|
position, and return the substring between the old and new positions. Here
|
|
is a simple example:
|
|
<PRE> text ? {
|
|
while move(1) do
|
|
write(move(1))
|
|
}</PRE>
|
|
<code>move</code> is a function that advances the position by its argument; so
|
|
this code writes out every alternate character of the string in
|
|
<code>text</code>. Another matching function is <code>tab</code>, which sets
|
|
the position to its argument.
|
|
<em>String analysis</em> functions examine a string and generate the
|
|
interesting positions in it. We have already seen
|
|
<code>find</code>, which looks
|
|
for substrings. These functions default their subject to the string being
|
|
scanned. Here is a procedure that produces the words from the input:
|
|
<PRE>procedure getword()
|
|
while line := read() do
|
|
line ? while tab(upto(wchar)) do {
|
|
word := tab(many(wchar))
|
|
suspend word
|
|
}
|
|
end</PRE>
|
|
<code>upto(c)</code> returns the next position of a character from the cset
|
|
<code>c</code>; and <code>many(c)</code>
|
|
returns the position after a sequence of characters from <code>c</code>.
|
|
The expression <code>tab(upto(wchar))</code>
|
|
advances the position to a character
|
|
from <code>wchar</code>, the set of characters that make up words; then
|
|
<code>tab(many(wchar))</code> moves the position to the end of the word and
|
|
returns the word that is found.
|
|
|
|
<P> <HR> <P>
|
|
<H3>Regular Expressions</H3>
|
|
<P> <HR> <P>
|
|
|
|
The Icon Program Library (included with the distribution) provides regular
|
|
expression matching functions. To use it, include the line
|
|
<code>link regexp</code> at the top of the program. Here is an example of
|
|
`search-and-replace':
|
|
<PRE>procedure re_sub(str, re, repl)
|
|
result := ""
|
|
str ? {
|
|
while j := ReFind(re) do {
|
|
result ||:= tab(j) || repl
|
|
tab(ReMatch(re))
|
|
}
|
|
result ||:= tab(0)
|
|
}
|
|
return result
|
|
end</PRE>
|
|
<P>
|
|
|
|
<P> <HR> <P>
|
|
<H3>Structures</H3>
|
|
<P> <HR> <P>
|
|
|
|
Icon has several structured (or non-scalar) types as well that help organize
|
|
and store collections of arbitrary (and possibly mixed) types of values. A
|
|
<em>table</em> is an associative array, where values are stored indexed by
|
|
keys which may be of arbitrary type; a list is a group of values accessed
|
|
by integer indices as well as stack and queue operations; a set is an
|
|
unordered group of values, etc.
|
|
|
|
<H4><u>Tables</u></H4>
|
|
|
|
A table is created with the <code>table</code> function. It takes one argument:
|
|
the default value, i.e. the value to return when lookup fails. Here is a
|
|
code fragment to print a word count of the input (assuming the
|
|
<code>getword</code>
|
|
function generates words of interest):
|
|
<PRE> wordcount := table(0)
|
|
every word := getword() do
|
|
wordcount[word] +:= 1
|
|
every word := key(wordcount) do
|
|
write(word, "\t", wordcount[word])</PRE>
|
|
(The <code>key</code> function generates the keys with which values have been
|
|
stored.) Since the default value for the table is 0, when a new word is
|
|
inserted, the default value gets incremented and the new value (i.e. 1) is
|
|
stored with the new word. Tables grow automatically as new elements are
|
|
inserted.
|
|
|
|
<H4><u>Lists</u></H4>
|
|
|
|
A list can be created by enumerating its members:
|
|
<PRE> L := ["linux", 2.0, "unix"]</PRE>
|
|
Lists are dynamic; they grow or shrink through calls to list manipulation
|
|
routines like <code>pop()</code> etc. Elements of the list can be obtained either
|
|
through list manipulation functions or by subscripting:
|
|
<PRE> write(L[3])</PRE>
|
|
There is no restriction on the kinds of values
|
|
that may be stored in a list.
|
|
|
|
<H4><u>Records and Sets</u></H4>
|
|
|
|
A record is like a struct in C, except that there is no restriction
|
|
on the types that can be stored in the fields. After a record is declared:
|
|
<PRE>record complex(re, im)</pre>
|
|
instances of that record are created using a constructor procedure with
|
|
the name of the record type, and on such instances, fields are accessed
|
|
by name:
|
|
<pre> i := complex(0, 0)
|
|
j := complex(1, -1)
|
|
if a.re = b.re then ...</PRE>
|
|
<P>
|
|
A set is an unordered collection of values with the uniqueness property
|
|
i.e. an element can only be present in a set once.
|
|
<PRE> S := set(["rock lobster", 'B', 52])</PRE>
|
|
The functions <code>member</code>, <code>insert</code>, and
|
|
<code>delete</code> do what their
|
|
names suggest. Set intersection, union and difference are provided by
|
|
operators. A set can contain any value (including itself, thereby neatly
|
|
sidestepping Russell's paradox!).
|
|
|
|
<P> <HR> <P>
|
|
<H3>Graphs</H3>
|
|
<P> <HR> <P>
|
|
|
|
Since there is no restriction on the types of values in a list, they can be
|
|
other lists too. Here's an example
|
|
of a how a graph or tree may be implemented with lists:
|
|
<PRE>record node(label, links)
|
|
...
|
|
barney := node("Barney", list())
|
|
betty := node("Betty", list())
|
|
bambam := node("Bam-Bam", list())
|
|
put(bambam.links, barney, betty)</PRE>
|
|
|
|
<H4><u>An Example</u></H4>
|
|
|
|
Let us now do a little example to illustrate the above concepts. Here is
|
|
a program to read a file, and generate a concordance i.e. for every
|
|
word, print a list of the lines it occurs on. We want to skip short words
|
|
like `the' though, so we only count the words longer than 3 characters.
|
|
<PRE>global wchar
|
|
procedure main(args)
|
|
wchar := &ucase ++ &lcase
|
|
(*args = 1) | stop("Need a file!")
|
|
f := open(args[1]) | stop("Couldn't open ", args[1])
|
|
wordlist := table()
|
|
lineno := 0
|
|
while line := read(f) do {
|
|
lineno +:= 1
|
|
every word := getword(line) do
|
|
if *word > 3 then {
|
|
# if <em>word</em> isn't in the table, set entry to empty list
|
|
/wordlist[word] := list()
|
|
put(wordlist[word], lineno)
|
|
}
|
|
}
|
|
L := sort(wordlist)
|
|
every l := !L do {
|
|
writes(l[1], "\t")
|
|
linelist := ""
|
|
# Collect line numbers into a string
|
|
every linelist ||:= (!l[2] || ", ")
|
|
write(linelist[1:-2])
|
|
}
|
|
end
|
|
|
|
procedure getword(s)
|
|
s ? while tab(upto(wchar)) do {
|
|
word := tab(many(wchar))
|
|
suspend word
|
|
}
|
|
end</PRE>
|
|
If we run this program on this input:
|
|
<PRE>Sing, Mother, sing.
|
|
Can Mother sing?
|
|
Mother can sing.
|
|
Sing, Mother, sing!</PRE>
|
|
the program writes this output:
|
|
<PRE>Mother 1, 2, 3, 4
|
|
Sing 1, 4
|
|
sing 1, 2, 3, 4</PRE>
|
|
While we may not have covered all the features used in this program, it
|
|
should give you a feeling for the flavour of the language.
|
|
|
|
<P> <HR> <P>
|
|
<H3>Co-expressions</H3>
|
|
<P> <HR> <P>
|
|
|
|
Another novel control facility in Icon is the co-expression, which is an
|
|
expression encapsulated in a thread-like execution context where its results
|
|
can be picked apart one at a time. Co-expressions are are more portable
|
|
and more fine-grained than comparable facilities found in most languages.
|
|
Co-expressions let you `capture' generators and then use their results
|
|
from multiple places in your code. Co-expressions are created by
|
|
<pre> create <i>expr</i></pre>
|
|
and each result of the co-expression is requested using the <code>@</code>
|
|
operator.
|
|
<p>
|
|
|
|
As a small example, suppose you have a procedure <code>prime()</code> that
|
|
generates an infinite sequence of prime numbers, and want to number each
|
|
prime as you print them out, one per line. Icon's <code>seq()</code>
|
|
function will generate the numbers to precede the primes, but there is no
|
|
way to generate elements from the two generators in tandem; no way except
|
|
using co-expressions, as in the following:
|
|
<pre> numbers := create seq()
|
|
primes := create prime()
|
|
every write(@numbers, ": ", @primes)
|
|
</pre>
|
|
|
|
More information about co-expressions
|
|
can be found at <code> <a href="http://www.drones.com/coexp/">
|
|
http://www.drones.com/coexp/</a></code>
|
|
and a complete description is in the Icon language book mentioned below.
|
|
|
|
|
|
|
|
<P> <HR> <P>
|
|
<h3> Graphics </h3>
|
|
<P><HR> <P>
|
|
|
|
Icon features high-level graphics facilities that are portable across
|
|
platforms. The most robust implementations are X Window and Microsoft
|
|
Windows; Presentation Manager, Macintosh, and Amiga ports are in various
|
|
stages of progress. The most important characteristics of
|
|
the graphics facilities are:
|
|
<ul>
|
|
<li> simplicity, ease of learning
|
|
<li> windows are integrated with Icon's existing I/O functions
|
|
<li> straightforward input event model
|
|
</ul>
|
|
As a short example, the following program opens a window and allows the user
|
|
to type text and draw freehand on it using the left mouse button, until an
|
|
ESC char is pressed. Clicking the right button moves the text cursor to a
|
|
new location. Mode <code>"g"</code> in the call to open stands
|
|
for "graphics". <code>&window</code> is a special global
|
|
variable that serves as a default window for graphics functions.
|
|
<code>&lpress</code>, <code>&ldrag</code>, and
|
|
<code>&rpress</code> are special constants that denote left mouse button
|
|
press and drag, and right mouse button press, respectively.
|
|
<code>&x</code> and <code>&y</code> are special global variables
|
|
that hold the mouse position associated with the most recent user action
|
|
returned by Event(). <code>"\e"</code> is a one-letter Icon
|
|
string containing the escape character.
|
|
<pre>procedure main()
|
|
&window := open("LJ example","g")
|
|
repeat case e := Event() of {
|
|
&lpress | &ldrag : DrawPoint(&x,&y)
|
|
&rpress : GotoXY(&x,&y)
|
|
"\e" : break
|
|
default : if type(e)=="string" then writes(&window, e)
|
|
}
|
|
end
|
|
</pre>
|
|
A complete description of the graphics facilities is available on the web at
|
|
<A href="http://www.cs.arizona.edu/icon/docs/ipd281.html">
|
|
<code>http://www.cs.arizona.edu/icon/docs/ipd281.html</code></A>
|
|
|
|
<P> <HR> <P>
|
|
<H3>POSIX Functions</H3>
|
|
<P> <HR> <P>
|
|
|
|
An Icon program that uses the POSIX functions should include the header file
|
|
<code>posixdef.icn</code>. On error, the POSIX functions fail and set the
|
|
keyword
|
|
<code>&errno</code>; the corresponding printable error string is
|
|
obtained by
|
|
calling <code>sys_errstr()</code>.
|
|
<P>
|
|
Unix functions that return a C struct (or a list, in Perl) return records
|
|
in Icon. The
|
|
fields in the return values have names similar to the Unix counterparts:
|
|
<I>stat()</I> returns a record with fields <I>ino</I>, <I>
|
|
nlink</I>, <I>mode</I> etc.
|
|
<P>
|
|
A complete description of the POSIX interfaces is included in the
|
|
distribution; an HTML version is available on the web, at
|
|
<A HREF="http://www.drones.com/unicon/">http://www.drones.com/unicon/</A>.
|
|
We look at a few small examples here.
|
|
|
|
<P> <HR> <P>
|
|
<H3>An Implementation of <TT>ls</TT></H3>
|
|
<P> <HR> <P>
|
|
|
|
Let us look at how a simple version of the Unix <TT>ls</TT> command may be
|
|
written. What we need to do is to read the directory, and perform a
|
|
<TT>stat</TT> call on each name we find. In Icon, opening a directory is
|
|
exactly the same as opening a file for reading; every <I>read</I>
|
|
returns one filename.
|
|
<PRE> f := open(dir) | stop(name, ":", sys_errstr(&errno))
|
|
names := list()
|
|
while name := read(f) do
|
|
push(names, name)
|
|
every name := !sort(names) do
|
|
write(format(lstat(name), name, dir))</PRE>
|
|
The <I>lstat</I> function returns a record that has all the information
|
|
that <TT>lstat(2)</TT> returns. One difference between the Unix version and
|
|
the Icon version is
|
|
that the <TT>mode</TT> field is
|
|
converted to a human readable string -- not an integer on which you
|
|
have to do bitwise magic on. (And in Icon, string manipulation is
|
|
as natural as a bitwise operation.)
|
|
<P>
|
|
The function to format the information is simple; it also checks
|
|
to see if the name is a symbolic link, in which case it prints the value of
|
|
the link also.
|
|
<PRE>link printf
|
|
procedure format(p, name, dir)
|
|
s := sprintf("%7s %4s %s %3s %8s %8s %8s %s %s",
|
|
p.ino, p.blocks, p.mode, p.nlink,
|
|
p.uid, p.gid, p.size, ctime(p.mtime)[5:17], name)
|
|
|
|
if p.mode[1] == "l" then
|
|
s ||:= " -> " || readlink(dir || "/" || name)
|
|
|
|
return s
|
|
end</PRE>
|
|
|
|
<P> <HR> <P>
|
|
<H3>Polymorphism and other pleasant things</H3>
|
|
<P> <HR> <P>
|
|
|
|
It's not just <I>stat</I> that uses human-readable values --
|
|
<I>chmod</I> can accept an integer that represents a bit pattern
|
|
to set the file mode to, but it also takes a string just like the shell
|
|
command:
|
|
<PRE> chmod(f, "a+r")</PRE>
|
|
And the first argument: it can be either an opened file or a path to a file.
|
|
Since Icon values are typed, the function knows what kind of value it's
|
|
dealing with -- no more <TT>fchmod</TT> or <TT>fstat</TT>. The same applies to
|
|
other functions -- for example, the Unix functions <TT>getpwnam</TT>,
|
|
<TT>getpwuid</TT> and <TT>getpwent</TT> are all subsumed by the Icon function
|
|
<I>getpw</I> which does the appropriate thing depending on the type of
|
|
the argument:
|
|
<PRE> owner := getpw("ickenham")
|
|
root := getpw(0)
|
|
while u := getpw() do ...</PRE>
|
|
Similarly, <I>trap</I> and <I>kill</I> can accept a signal number
|
|
or name; <I>wait</I> returns human-readable status; <I>chown</I>
|
|
takes a username or uid; and <I>select</I> takes a list of files.
|
|
|
|
<P> <HR> <P>
|
|
<H3>Using <code>select</code></H3>
|
|
<P><HR> <P>
|
|
The <code>select()</code> function waits for input to become
|
|
available on a set of
|
|
files. Here is an example of the usage -- this program waits for typed
|
|
input or for a
|
|
window event, with a timeout of 1000 milliseconds:
|
|
<P>
|
|
<PRE> repeat {
|
|
while *(L := select([&input, &window], 1000)) = 0 do
|
|
... handle timeout
|
|
if &errno ~= 0 then
|
|
stop("Select failed: ", sys_errstr(&errno))
|
|
|
|
every f := !L do
|
|
case f of {
|
|
&input : handle_input()
|
|
&window : handle_evt()
|
|
}
|
|
}</PRE>
|
|
If called with no timeout value, select will wait forever. A timeout of 0
|
|
performs a poll.
|
|
|
|
<P> <HR> <P>
|
|
<H3>Networking</H3>
|
|
<P><HR> <P>
|
|
|
|
Icon provides a much simpler interface to BSD-style sockets.
|
|
Instead of the four different system calls that are required to
|
|
start a TCP/IP server using Perl, only one is needed in Icon--the
|
|
<I>open</I> function opens network connections as well as files. The first
|
|
argument to <I>open</I>
|
|
is the network address to connect to -- <I>host:port</I> for Internet
|
|
domain connections, and a filename for Unix domain sockets.
|
|
The second argument specifies the type of connection.
|
|
<P>
|
|
Here is an Internet domain TCP server listening on port 1888:
|
|
<PRE>procedure main()
|
|
while f := open(":1888", "na") do
|
|
if fork() = 0 then {
|
|
service_request(f)
|
|
exit()
|
|
} else
|
|
close(f)
|
|
stop("Open failed: ", sys_errstr(&errno))
|
|
end</PRE>
|
|
The <I>"na"</I> flags indicate that this is a <em>network
|
|
accept</em>. Each call to <code>open</code> waits for a network connection
|
|
and then returns a file for that connection.
|
|
To connect to this server, the <I>"n"</I> (network connect) flag is
|
|
used with <code>open</code>. Here's a function that connects to a
|
|
`finger' server:
|
|
<PRE>procedure finger(name, host)
|
|
static fserv
|
|
initial fserv := getserv("finger") |
|
|
stop("Couldn't get service: ", sys_errstr(&errno))
|
|
|
|
f := open(host || ":" || fserv.port, "n") | fail
|
|
write(f, name) | fail
|
|
while line := read(f) do
|
|
write(line)
|
|
end</PRE>
|
|
Nice and simple, isn't it? One might even call it Art! On the other hand,
|
|
writing socket code in Perl is not much
|
|
different from writing it in C, except that you have to perform weird
|
|
machinations with <TT>pack</TT>. No more! Eschew obfuscation, do it in Icon.
|
|
|
|
<H4><u>UDP</u></H4>
|
|
|
|
UDP networking is similar; using <I>"nu"</I> as the second
|
|
argument to <I>open</I>
|
|
signifies a UDP connection. A datagram is sent either with <I>write</I>
|
|
or <I>send</I>, and is received with <I>receive</I>.
|
|
Here is a simple client for the UDP `daytime' service, something like
|
|
<TT>rdate(1)</TT>:
|
|
<PRE> s := getserv("daytime", "udp")
|
|
|
|
f := open(host || ":" || s.port, "nu") |
|
|
stop("Open failed: ", sys_errstr(&errno))
|
|
|
|
writes(f, " ")
|
|
|
|
if *select([f], 5000) = 0 then
|
|
stop("Connection timed out.")
|
|
|
|
r := receive(f)
|
|
write("Time on ", host, " is ", r.msg)</PRE>
|
|
Since UDP is not reliable, the <I>receive</I>
|
|
is guarded with <I>select</I> (timeout of 5000 ms), or the program
|
|
might hang forever if the reply is lost.
|
|
<P>
|
|
|
|
<P> <HR> <P>
|
|
<h3> Icon and other languages </h3>
|
|
<P> <HR> <P>
|
|
|
|
The popular languages Perl and Java have been covered in LJ, and we think
|
|
it is worth discussing how Icon stacks up against these dreadnaughts.
|
|
|
|
<H4><u>Perl and Icon</u></H4>
|
|
|
|
Perl and Icon are both used for similar purposes. Both
|
|
languages offer high-level data structures like lists, associative arrays,
|
|
etc. Both make it easy to write short prototypes by not requiring
|
|
extensive declarations; and both were intended by their designers to be
|
|
`user friendly' i.e. intended to make programming easier for the user
|
|
rather than to prove some theoretical point.
|
|
<p>
|
|
|
|
But when it comes to language design, Perl and Icon are not at all
|
|
alike. Perl has been designed with very little structure -- or,
|
|
as Larry Wall puts it, it's more like a natural language than a programming
|
|
language.
|
|
Perl looks strange
|
|
but underneath the loose syntax its semantics are those of a conventional
|
|
imperative language. Icon,
|
|
on the other hand, looks more like a conventional imperative language but
|
|
has richer semantics.
|
|
|
|
<h4><u>Advantages of Perl</u></h4>
|
|
|
|
Perl's pattern matching, while not as general a mechanism as Icon's
|
|
string scanning, is more concise for recognizing those patterns that can
|
|
be expressed as regular expressions. Perl's syntax looks and feels natural
|
|
to long-time die-hard UNIX gurus and system administrators,
|
|
who have been using utilities such as <i>sh</i>, <i>sed</i>,
|
|
and <i>awk</i>. For
|
|
some people, Perl is especially appealing because mastering its idiosyncracies
|
|
places one in an elite group.
|
|
|
|
<H4><u>Some misfeatures of Perl</u></H4>
|
|
|
|
Let us look at some things that are (in our opinion) undesirable qualities
|
|
of Perl. These problems do not negate Perl's ingenious features, they
|
|
merely illustrate that Perl is no panacea.
|
|
<P>
|
|
|
|
Namespace confusion: it is a bad idea to allow scalar variables, vector
|
|
variables and functions to have the same name. This seems like a useful
|
|
thing to do, but it leads to write-only code. We think this is primarily
|
|
why it's hard to maintain Perl programs. A couple of things are
|
|
beyond belief -- <TT>$foo</TT> and <TT>%foo</TT> are different things, but
|
|
the expression <TT>$foo{bar}</TT> actually refers to an element of
|
|
<TT>%foo</TT>!
|
|
<P>
|
|
|
|
Parameter passing is a mess. Passing arrays by name is just too confusing!
|
|
Even after careful study and substantial practice, we still are not absolutely
|
|
certain about how to use <TT>*foo</TT> in Perl. As if to make up for the
|
|
difficulty of passing arrays by reference, <em>all</em> scalars are passed by
|
|
reference! That's very unaesthetic.
|
|
<P>
|
|
Why are there no formal parameters? Instead, one has to resort to something
|
|
that looks like a function call to declare local variables and assign
|
|
<TT>@_</TT> to it.
|
|
Allowing the parentheses to be left off subroutine calls is also unfortunate;
|
|
it is another `easy to write, hard to read' construct. And the distinction
|
|
between built-in functions and user-defined subroutines is ugly.
|
|
<P>
|
|
Variables like <TT>$`</TT> are a bad idea. We think of special
|
|
characters as punctuation, we don't expect them to be (borrowing
|
|
Wall's terminology) nouns. And the mnemonics that are required are evidence
|
|
that these variables place an additional burden of memorization upon
|
|
the programmer. (Quick, if you write Perl programs: What's `<TT>$(</TT>'?)
|
|
<P>
|
|
The distinction between array and scalar contexts also leads to obfuscated
|
|
code. Certainly after you've been writing Perl for a while, you get used to
|
|
it (and might even like it), but again, this is just confusing. All the
|
|
warnings in the Perl documentation about being certain you are evaluating in
|
|
the right context is evidence of this.
|
|
<P>
|
|
|
|
|
|
<h4><u>Java and Icon</u></h4>
|
|
|
|
Java takes the middle road in between C/C++ and the class of `very high level
|
|
languages' such as Icon and Perl. Java and Icon use a similar
|
|
virtual machine (VM) model. Java's VM is both lower-level and more
|
|
machine-independent than the Icon VM, but these are implementation artifacts
|
|
and it would be possible to implement Java on the Icon VM or Icon on the
|
|
Java VM. <p>
|
|
|
|
The important differences between Java and Icon are differences of philosophy.
|
|
The Java philosophy is that everything is an object, nothing is built-in to
|
|
the language, and programmers should learn class libraries for all non-trivial
|
|
structures and algorithms. Java's lack of operator overloading means that
|
|
its object-oriented notation allows no "shorthand" as does C++.
|
|
Java's
|
|
simplicity is a welcome relief after C++, but its expressive power is so
|
|
weak compared to Icon (and several other very high level languages) that
|
|
it is hard to argue that Java can supplant these languages. Most of Java's
|
|
market share is being carved out of the C and C++ industries.
|
|
|
|
<h4><u>How to Improve Java</u></h4>
|
|
|
|
Java has few bad mistakes. The Sumatra Project has itemized some of them in
|
|
The Java Hall of Shame at
|
|
<code>http://www.cs.arizona.edu/sumatra/hallofshame/</code>.
|
|
Most of Java's misfeatures are sins of omission, because the
|
|
language designers were trying to be elegant and minimal. We would
|
|
like to see a Java dialect with features such as Icon's goal-directed
|
|
evaluation, Perl's pattern matching, and APL's array-at-a-time numeric
|
|
operators; a description of such a dialect is at
|
|
<A HREF="http://segfault.cs.utsa.edu/godiva/">
|
|
http://segfault.cs.utsa.edu/godiva/</A>.
|
|
|
|
<P> <HR> <P>
|
|
<h3> Getting Icon </h3>
|
|
<P><HR> <P>
|
|
|
|
<p> Users who become serious
|
|
about the language will want a copy of `The Icon Programming Language', by
|
|
Ralph and Madge Griswold, Peer-to-Peer Communications 1997,
|
|
ISBN 1-57398-001-3.
|
|
|
|
<p>
|
|
Lots of documentation for Icon is available from the University of Arizona, at
|
|
<A HREF="http://www.cs.arizona.edu/icon/">http://www.cs.arizona.edu/icon/</A>
|
|
There is also a newsgroup on Usenet: comp.lang.icon.
|
|
|
|
<p>
|
|
The Icon source distribution is at:<br>
|
|
<A HREF="ftp://ftp.cs.arizona.edu/icon/packages/unix/unix.tgz">ftp://ftp.cs.arizona.edu/icon/packages/unix/unix.tgz</A><br>
|
|
The POSIX functions are in the following patch that you need to apply
|
|
if you wish to build from sources:
|
|
<br>
|
|
<A
|
|
HREF="ftp://ftp.drones.com/unicon-patches.tar.gz">ftp://ftp.drones.com/unicon-patches.tar.gz</A>
|
|
<P>
|
|
|
|
Linux binaries (kernel 2.0 ELF, libgdbm 2.0.0, libX11 6.0, libdl 1.7.14,
|
|
libm 5.0.0 and libc 5.2.18)
|
|
for Icon (with X11 and POSIX support) are available at
|
|
<P>
|
|
|
|
<table>
|
|
<tr><td>
|
|
</td></tr><tr><td>
|
|
<a
|
|
HREF="ftp://ftp.drones.com/icon-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-9.3-3.i386.rpm</A></td><td> Icon
|
|
<a
|
|
HREF="ftp://ftp.drones.com/icon-ipl-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-ipl-9.3-3.i386.rpm</A></td><td> Icon Program Library
|
|
</td></tr><tr><td>
|
|
<a HREF="ftp://ftp.drones.com/icon-idol-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-idol-9.3-3.i386.rpm</A></td><td> Idol: Object-oriented Icon
|
|
</td></tr><tr><td>
|
|
<a HREF="ftp://ftp.drones.com/icon-vib-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-vib-9.3-3.i386.rpm</A></td><td> VIB: The Visual Interface Builder
|
|
</td></tr><tr><td>
|
|
<a HREF="ftp://ftp.drones.com/icon-docs-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-docs-9.3-3.i386.rpm</A></td><td> Documentation
|
|
</td></tr>
|
|
</table>
|
|
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<center><H5>Copyright © 1998, Clinton Jeffery and Shamim Mohamed<BR>
|
|
Published in Issue 27 of <i>Linux Gazette</i>, April 1998</H5></center>
|
|
|
|
<!--===================================================================-->
|
|
<P> <hr> <P>
|
|
<A HREF="./index.html"><IMG ALIGN=BOTTOM SRC="../gx/indexnew.gif"
|
|
ALT="[ TABLE OF CONTENTS ]"></A>
|
|
<A HREF="../index.html"><IMG ALIGN=BOTTOM SRC="../gx/homenew.gif"
|
|
ALT="[ FRONT PAGE ]"></A>
|
|
<A HREF="./wagle.html"><IMG SRC="../gx/back2.gif"
|
|
ALT=" Back "></A>
|
|
<A HREF="./gm.html"><IMG SRC="../gx/fwd.gif" ALT=" Next "></A>
|
|
<P> <hr> <P>
|
|
<!--startcut ==========================================================-->
|
|
</BODY>
|
|
</HTML>
|
|
<!--endcut ============================================================-->
|