<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>A Glimpse of Icon LG #27</title>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
ALINK="#FF0000">
<!--endcut ============================================================-->
<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <HR> <P>
<!--===================================================================-->
<center>
<H1>A Glimpse of Icon</H1>
<H4>By <a href="mailto:jeffery@cs.utsa.edu">Clinton Jeffery</a>
and <A HREF="mailto:spm@drones.com">Shamim Mohamed</A></H4>
<img src="./gx/jeffery/cube192.gif">
</center>
<P> <HR> <P>
<h3> Motivation </h3>
<P> <HR> <P>
Many languages introduce special capabilities for specific kinds of
applications, but few present us with more powerful control structures or
programming paradigms. You may be comfortable sticking with a language
you already know, but if you are challenged to write complex programs and
are short on time, you need the best language for the job. Icon is a
high-level programming language that looks like many other programming
languages but offers many advanced features that add up to big gains
in productivity. Before we get to all that, let us write the canonical
first program:
<PRE>procedure main()
   write(&quot;Hello, world!&quot;)
end</PRE>
If you've installed Linux Icon,
save this in a file named <code>hello.icn</code> and run <I>icont</I>, the
Icon translator, on it:
<PRE>icont hello -x</PRE>
<I> icont</I> performs some syntax checking on <code>hello.icn</code> and
transforms the code into
instructions for the Icon virtual machine, which will be saved in
<code>hello</code>. The <code>-x</code> option tells <code>icont</code> to
execute the program also.
<p>
We are introducing many concepts, so don't expect to understand
everything the first time through -- the only way to learn a language is to
write programs in it; so get Icon, and take it for a test drive.
<h4><u>Genealogy</u></h4>
Despite its name, Icon is not a visual programming language -- its
look-and-feel descends from Pascal and C. The important thing about
Icon is not that its syntax is easy to learn. Icon's semantics, which
generalize ideas from SNOBOL4, succeed in adding
considerable power to the familiar notation found in most
programming languages. This is noteworthy because most languages that add real
power (APL, Lisp, and Smalltalk are examples) do so with a syntax that
is so different that programmers must learn a new way of thinking to use
them. Icon adds power `under the hood' of a notation most programmers are
already comfortable with.<p>
Icon was developed over several years at the University of Arizona by a team
led by Ralph Griswold. Today, it runs on many platforms and is used by
researchers in algorithms, compilers, and linguistics as well as system
administrators and hobbyists. The implementation and source code are
in the public domain.
<P> <HR> <P>
<H3>Variables, Expressions, and Type</H3>
<P> <HR> <P>
Icon's expression syntax starts out much like that of most languages.
For example, <code>i+j</code>
represents the arithmetic addition of the values stored in the variables
<code>i</code> and <code>j</code>, <code>f(x)</code> is a call
to <code>f</code> with argument <code>x</code>, variables may be global
or local to a procedure, and so on.
<p>
Variable declarations are not required, and variables can hold any type of
value. However, Icon is a strongly typed language; it knows the type of each
value and it does not allow you to mix invalid types in expressions. The
basic scalar types are integers, real numbers, strings, and sets of
characters (csets). Integers and strings can be arbitrarily large; and
strings can contain any characters. There are also structured types: lists,
associative arrays, sets and records. Icon performs automatic storage
management.
<h4><u>Goal-directed Expression Evaluation</u></h4>
Icon's major innovation is its expression evaluation mechanism. It avoids
certain problems widely found in conventional languages in which each
expression always computes one result. In such languages, if no valid
result is possible, a sentinel value such as -1, NULL, or EOF (end-of-file)
is returned instead. The program must check for such errors using boolean
logic and if-then tests, and the programmer must remember many different
sentinel values used in different circumstances. This is cumbersome.
Alternative ideas such as exceptions have been developed in
some languages, but they introduce complexity and problems of their own.
<h4><u>Success and Failure</u></h4>
In Icon, expressions are <em>goal-directed</em>. When it is evaluated, every
expression has a goal of producing results for the surrounding expression.
If an expression succeeds in producing a result, the surrounding expression
executes as intended, but if an expression cannot produce a result, it is
said to <em>fail</em> and the surrounding expression cannot be performed and
in turn fails. This powerful concept subsumes Boolean logic and the
use of sentinel values, and allows a host of further improvements.
As an example, consider the expression <code>i &gt; 3</code>
-- if the value <code>i</code> is greater than 3 the expression succeeds,
otherwise it fails.
<P>
Control structures such as <code>if</code> check for success, so
<PRE> if i &gt; 3 then ...</PRE> does the expected thing. Since the
expression semantics are not encumbered with the need to propagate
boolean (or 0 and 1) values, comparison operators can instead propagate
a useful value (their right operand), allowing expressions such as
<code>3 &lt; i &lt; 7</code>,
which is standard in mathematics but doesn't work in most languages.
<p>
Since functions that fail do not need to return an error code separately
from the results, detecting cases such as end-of-file is simpler, as in:
<PRE> if line := read() then write(process(line))</PRE>
On end-of-file, <code>read()</code> fails, causing the assignment expression
tested in the if-part to fail. When the test fails, the <i>then</i> branch
is not executed so the call to <code>write()</code> does not occur.
Since failure propagates through an expression, the above example is equivalent
to
<PRE> write(process(read()))</PRE>
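<P>
As a small, self-contained sketch of these two ideas (the 80-character limit
and the variable names are ours, chosen only for illustration), the following
program counts the non-empty input lines that are shorter than 80 characters.
The loop ends when <code>read()</code> fails, <code>*line</code> is the length
of the string, and the chained comparison works because a successful
<code>&lt;</code> produces its right operand:
<PRE>procedure main()
   local n, line
   n := 0
   # the loop ends when read() fails at end-of-file
   while line := read() do
      # 0 &lt; *line yields *line, which is then compared with 80
      if 0 &lt; *line &lt; 80 then n +:= 1
   write(n, &quot; short, non-empty lines&quot;)
end</PRE>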
<h4><u>Generators</u></h4>
Some expressions can naturally produce more than one result. These
expressions are called <em>generators</em>.
Consider the task of searching for a substring within a string, and returning
the position at which the substring occurs, as in Icon's <code>find()</code>
function:
<PRE> find(&quot;or&quot;, &quot;horror&quot;)</PRE>
In conventional languages, this would return one of the possible return
values, usually the first or the last. In Icon, this expression is capable
of returning all the values, depending on the execution context.
If the surrounding expression only needs one value, as in the case of an
<i>if</i> test or an assignment, only the first value of a generator is
produced. If a generator is part of a more complex expression, then
the return values are produced in sequence until the whole expression
produces a value. In the expression
<PRE> find(&quot;or&quot;, &quot;horror&quot;) &gt; 3 </PRE>
the first value produced by <code>find()</code>, a 2, causes the
<code>&gt;</code> operation to fail. Icon <em>resumes</em> the call to
<code>find()</code>, which produces a 5, and the expression succeeds.
<p>
The most obvious generator is the alternation operator |. The expression
<pre> expr<sub>1</sub> | expr<sub>2</sub></pre>
is a generator that produces its lefthand side followed by its righthand
side, if needed by the surrounding expression. Consider
<code>f(1|2)</code> -- <code>f</code> is first invoked with the value 1;
if that does not produce a value, the generator is resumed for another
result and <code>f</code> will be called again with the value 2.
As another example of the same operator,
<pre> x = (3 | 5)</pre>
is equivalent to, but more concise than, C's
<code>(x == 3) || (x == 5)</code>.
When more than
one generator is present in an expression, they are resumed in LIFO order:
the most recently suspended generator is resumed first.
<pre> (x | y) = (3 | 5)</pre>
is the Icon equivalent of C's
<pre> (x == 3) || (x == 5) || (y == 3) || (y == 5)</pre>
<p>
In addition to <code> | </code>, Icon has a <i>generate</i> operator
<code> ! </code> that generates elements of data structures, and a
generator <code>to</code> that produces ranges of integers. For example,
<code>!L</code> generates the elements of list L, and <code>1 to 10</code>
generates the first ten positive integers.
Besides these operators that generate results, most generators in Icon
take the form of calls to built-in and user-defined procedures.
Procedures are discussed below.
<h4><u>Iteration</u></h4>
Icon has the ordinary <code>while</code> loop where the control expression is
evaluated before each iteration. For generators, an alternative loop is
available
where the loop body executes once per result produced by a single evaluation
of the control expression. The alternative loop uses the reserved word
<code>every</code> and can be used in conjunction with the <code>to</code>
operator to provide the equivalent of a <code>for</code>-loop:
<pre> every i := 1 to 10 do ...</pre>
The point of <code>every</code> and <code>to</code> is not that you can use
them to implement a for-loop; Icon's generator mechanism is quite a
bit more general. The <code>every</code> loop lets you
walk through all the results of a generator giving you iterators
for free. And <code>every</code> isn't limited to sequences of numbers or
traversals of specific data structures like iterators in some languages; it
works on any expression that contains generators.
<pre> every f(1 | 1 | 2 | 3 | 5 | 8)</pre>
calls the function <code>f</code> with each of the first few
Fibonacci numbers, and
the example could be generalized to a user-defined generator procedure
that produces the entire Fibonacci sequence.
Using generators requires a bit of practice, but then it is fun!
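<P>
A minimal sketch that puts these pieces together, using only operations
described above; the comments give the order in which results appear:
<PRE>procedure main()
   # find() generates every position of the substring: 2, then 5
   every write(find(&quot;or&quot;, &quot;horror&quot;))
   # generators are resumed last-in first-out: 11, 12, 21, 22
   every write((10 | 20) + (1 | 2))
end</PRE>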
<P> <HR> <P>
<H3>Procedures</H3>
<P> <HR> <P>
Procedures are a basic building block in most languages, including Icon.
Like C, an Icon program is organized as a collection of procedures and
execution starts from a procedure named <code>main()</code>.
Here is an example of an ordinary procedure. This one generates and sums
the elements of a list <code>L</code>, whose elements had better be numbers
(or convertible to numbers).
<PRE>procedure sum(L)
   total := 0
   every total +:= !L
   return total
end</PRE>
<P>
A user can write her own generator by including a
<pre> suspend <em>expr</em></pre>
in a procedure where a result should be produced. When a procedure
suspends, it transfers a result to the caller, but remains available to
continue where it left off and generate more results. If the expression
from which it is called needs more or different results in order to succeed,
the procedure will be resumed.
The following example generates the elements from parameter <code>L</code>,
but filters out the zeros.
<PRE>procedure nonzero(L)
   every i := !L do
      if i ~= 0 then suspend i
end</PRE>
The <code>fail</code> expression makes the procedure fail, i.e. causes control
to go back to the calling procedure without returning a value. A procedure
also fails implicitly
if control flows off the end of the procedure's body.
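<P>
For instance, the Fibonacci generator hinted at in the Iteration section might
be sketched as follows. The procedure name <code>fib</code> is our own, and
the limitation operator <code>\</code>, which is not otherwise covered in this
article, caps the number of results taken from a generator:
<PRE>procedure fib()
   local a, b, t
   a := 1
   b := 1
   repeat {
      suspend a          # hand one number to the caller, then wait
      t := a + b
      a := b
      b := t
   }
end

procedure main()
   # without the limit, this loop would never end
   every write(fib() \ 8)
end</PRE>
This writes the first eight numbers of the sequence, 1 through 21, one per
line.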
<P> <HR> <P>
<h3> String Processing </h3>
<P><HR> <P>
Besides expression evaluation, Icon offers compelling features to reduce the
effort required to write complex programs. From Icon's ancestor SNOBOL4, the
granddaddy of all string processing languages, Icon inherits some of the
most flexible and readable built-in data structures found in any language.
<h4><u>Strings</u></h4>
Parts of Icon strings are accessed using the subscript operator. Indexes
denote the positions <em>between</em> characters, and pairs of indexes
are used to pick out substrings. If <code>s</code> is the string
<code>&quot;hello, world&quot;</code> then the expressions
<pre>   s[7] := &quot; linux &quot;
   s[14:19] := &quot;journal&quot;</pre>
change <code>s</code> into <code>&quot;hello, linux journal&quot;</code>,
illustrating
the ease with which insertions and substitutions are made. A myriad of
built-in functions and operators work on strings; the operators include
concatenation (<code>s1 || s2</code>) and size
(<code>*s</code>).
<H4><u>String Scanning</u></H4>
The string analysis facility of Icon is called <em>scanning</em>. A scanning
environment is set up by the <code>?</code> operator:
<PRE> s ? expr</PRE>
A scanning environment has a
string and a current position in it. <em>Matching functions</em> change this
position, and return the substring between the old and new positions. Here
is a simple example:
<PRE>   text ? {
      while move(1) do
         write(move(1))
   }</PRE>
<code>move</code> is a function that advances the position by its argument; so
this code writes out every alternate character of the string in
<code>text</code>. Another matching function is <code>tab</code>, which sets
the position to its argument.
<em>String analysis</em> functions examine a string and generate the
interesting positions in it. We have already seen
<code>find</code>, which looks
for substrings. These functions default their subject to the string being
scanned. Here is a procedure that produces the words from the input:
<PRE>procedure getword()
   while line := read() do
      line ? while tab(upto(wchar)) do {
         word := tab(many(wchar))
         suspend word
      }
end</PRE>
<code>upto(c)</code> returns the next position of a character from the cset
<code>c</code>; and <code>many(c)</code>
returns the position after a sequence of characters from <code>c</code>.
The expression <code>tab(upto(wchar))</code>
advances the position to a character
from <code>wchar</code>, the set of characters that make up words; then
<code>tab(many(wchar))</code> moves the position to the end of the word and
returns the word that is found.
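<P>
The same pattern works for other delimiters. Here is a sketch of a generator
(the name <code>fields</code> is ours) that splits a string at commas, using
only <code>tab</code>, <code>upto</code>, and <code>move</code>;
<code>tab(0)</code> moves to the end of the string:
<PRE>procedure fields(s)
   local f
   s ? {
      while f := tab(upto(',')) do {
         suspend f
         move(1)            # step over the comma
      }
      suspend tab(0)        # the text after the last comma
   }
end

procedure main()
   # writes red, green and blue on separate lines
   every write(fields(&quot;red,green,blue&quot;))
end</PRE>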
<P> <HR> <P>
<H3>Regular Expressions</H3>
<P> <HR> <P>
The Icon Program Library (included with the distribution) provides regular
expression matching functions. To use it, include the line
<code>link regexp</code> at the top of the program. Here is an example of
`search-and-replace':
<PRE>procedure re_sub(str, re, repl)
   result := &quot;&quot;
   str ? {
      while j := ReFind(re) do {
         result ||:= tab(j) || repl
         tab(ReMatch(re))
      }
      result ||:= tab(0)
   }
   return result
end</PRE>
<P>
<P> <HR> <P>
<H3>Structures</H3>
<P> <HR> <P>
Icon has several structured (or non-scalar) types as well that help organize
and store collections of arbitrary (and possibly mixed) types of values. A
<em>table</em> is an associative array, where values are stored indexed by
keys which may be of arbitrary type; a list is a group of values accessed
by integer indices as well as stack and queue operations; a set is an
unordered group of values, etc.
<H4><u>Tables</u></H4>
A table is created with the <code>table</code> function. It takes one argument:
the default value, i.e. the value to return when lookup fails. Here is a
code fragment to print a word count of the input (assuming the
<code>getword</code>
function generates words of interest):
<PRE>   wordcount := table(0)
   every word := getword() do
      wordcount[word] +:= 1
   every word := key(wordcount) do
      write(word, &quot;\t&quot;, wordcount[word])</PRE>
(The <code>key</code> function generates the keys with which values have been
stored.) Since the default value for the table is 0, when a new word is
inserted, the default value gets incremented and the new value (i.e. 1) is
stored with the new word. Tables grow automatically as new elements are
inserted.
<H4><u>Lists</u></H4>
A list can be created by enumerating its members:
<PRE> L := [&quot;linux&quot;, 2.0, &quot;unix&quot;]</PRE>
Lists are dynamic; they grow or shrink through calls to list manipulation
routines like <code>pop()</code> etc. Elements of the list can be obtained either
through list manipulation functions or by subscripting:
<PRE> write(L[3])</PRE>
There is no restriction on the kinds of values
that may be stored in a list.
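<P>
As a brief sketch of the stack and queue operations: <code>push</code> and
<code>pop</code> work at the left end of a list, while <code>put</code> and
<code>pull</code> work at the right end.
<PRE>procedure main()
   local L
   L := [&quot;linux&quot;, 2.0, &quot;unix&quot;]
   put(L, &quot;icon&quot;)     # add at the right end
   push(L, 27)         # add at the left end
   write(*L)           # the list now has 5 elements
   write(pop(L))       # removes and writes 27
   write(pull(L))      # removes and writes icon
   write(L[1])         # writes linux
end</PRE>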
<H4><u>Records and Sets</u></H4>
A record is like a struct in C, except that there is no restriction
on the types that can be stored in the fields. After a record is declared:
<PRE>record complex(re, im)</pre>
instances of that record are created using a constructor procedure with
the name of the record type, and on such instances, fields are accessed
by name:
<pre>   i := complex(0, 0)
   j := complex(1, -1)
   if i.re = j.re then ...</PRE>
<P>
A set is an unordered collection of values with the uniqueness property
i.e. an element can only be present in a set once.
<PRE> S := set([&quot;rock lobster&quot;, 'B', 52])</PRE>
The functions <code>member</code>, <code>insert</code>, and
<code>delete</code> do what their
names suggest. Set intersection, union and difference are provided by
operators. A set can contain any value (including itself, thereby neatly
sidestepping Russell's paradox!).
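<P>
A short sketch of these functions and operators (the command names are
arbitrary sample data): union is <code>++</code>, intersection is
<code>**</code>, and difference is <code>--</code>.
<PRE>procedure main()
   local A, B
   A := set([&quot;ls&quot;, &quot;cat&quot;, &quot;grep&quot;])
   B := set([&quot;grep&quot;, &quot;sed&quot;])
   insert(B, &quot;awk&quot;)
   insert(B, &quot;awk&quot;)           # no effect: already a member
   write(*(A ++ B))           # union: 5 members
   every write(!(A ** B))     # intersection: writes grep
   if not member(A -- B, &quot;grep&quot;) then
      write(&quot;grep is not in the difference&quot;)
end</PRE>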
<P> <HR> <P>
<H3>Graphs</H3>
<P> <HR> <P>
Since there is no restriction on the types of values in a list, the elements
can themselves be lists. Here's an example
of how a graph or tree may be implemented with lists:
<PRE>record node(label, links)
...
barney := node(&quot;Barney&quot;, list())
betty := node(&quot;Betty&quot;, list())
bambam := node(&quot;Bam-Bam&quot;, list())
put(bambam.links, barney, betty)</PRE>
<H4><u>An Example</u></H4>
Let us now do a little example to illustrate the above concepts. Here is
a program to read a file, and generate a concordance i.e. for every
word, print a list of the lines it occurs on. We want to skip short words
like `the' though, so we only count the words longer than 3 characters.
<PRE>global wchar

procedure main(args)
   wchar := &amp;ucase ++ &amp;lcase
   (*args = 1) | stop(&quot;Need a file!&quot;)
   f := open(args[1]) | stop(&quot;Couldn't open &quot;, args[1])
   wordlist := table()
   lineno := 0
   while line := read(f) do {
      lineno +:= 1
      every word := getword(line) do
         if *word &gt; 3 then {
            # if <em>word</em> isn't in the table, set entry to empty list
            /wordlist[word] := list()
            put(wordlist[word], lineno)
         }
   }
   L := sort(wordlist)
   every l := !L do {
      writes(l[1], &quot;\t&quot;)
      linelist := &quot;&quot;
      # Collect line numbers into a string
      every linelist ||:= (!l[2] || &quot;, &quot;)
      write(linelist[1:-2])
   }
end

procedure getword(s)
   s ? while tab(upto(wchar)) do {
      word := tab(many(wchar))
      suspend word
   }
end</PRE>
If we run this program on this input:
<PRE>Sing, Mother, sing.
Can Mother sing?
Mother can sing.
Sing, Mother, sing!</PRE>
the program writes this output:
<PRE>Mother 1, 2, 3, 4
Sing 1, 4
sing 1, 2, 3, 4</PRE>
While we may not have covered all the features used in this program, it
should give you a feeling for the flavour of the language.
<P> <HR> <P>
<H3>Co-expressions</H3>
<P> <HR> <P>
Another novel control facility in Icon is the co-expression, which is an
expression encapsulated in a thread-like execution context where its results
can be picked apart one at a time. Co-expressions are more portable
and more fine-grained than comparable facilities found in most languages.
Co-expressions let you `capture' generators and then use their results
from multiple places in your code. Co-expressions are created by
<pre> create <i>expr</i></pre>
and each result of the co-expression is requested using the <code>@</code>
operator.
<p>
As a small example, suppose you have a procedure <code>prime()</code> that
generates an infinite sequence of prime numbers, and want to number each
prime as you print it out, one per line. Icon's <code>seq()</code>
function will generate the numbers to precede the primes, but there is no
way to generate elements from the two generators in tandem; no way except
using co-expressions, as in the following:
<pre>   numbers := create seq()
   primes := create prime()
   while write(@numbers, ": ", @primes)
</pre>
More information about co-expressions
can be found at <code> <a href="http://www.drones.com/coexp/">
http://www.drones.com/coexp/</a></code>
and a complete description is in the Icon language book mentioned below.
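<P>
Here is a runnable variation on the same idea that needs no
<code>prime()</code> procedure; it pairs <code>seq()</code> with the elements
of a list. Each <code>@</code> activation produces a single result, so a
<code>while</code> loop is used to keep requesting pairs until one of the
co-expressions runs out:
<pre>procedure main()
   local numbers, names
   numbers := create seq()
   names := create ![&quot;Barney&quot;, &quot;Betty&quot;, &quot;Bam-Bam&quot;]
   # stops on the fourth pass, when @names fails
   while write(@numbers, &quot;: &quot;, @names)
end</pre>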
<P> <HR> <P>
<h3> Graphics </h3>
<P><HR> <P>
Icon features high-level graphics facilities that are portable across
platforms. The most robust implementations are X Window and Microsoft
Windows; Presentation Manager, Macintosh, and Amiga ports are in various
stages of progress. The most important characteristics of
the graphics facilities are:
<ul>
<li> simplicity, ease of learning
<li> windows are integrated with Icon's existing I/O functions
<li> straightforward input event model
</ul>
As a short example, the following program opens a window and allows the user
to type text and draw freehand on it using the left mouse button, until an
ESC char is pressed. Clicking the right button moves the text cursor to a
new location. Mode <code>&quot;g&quot;</code> in the call to open stands
for &quot;graphics&quot;. <code>&amp;window</code> is a special global
variable that serves as a default window for graphics functions.
<code>&amp;lpress</code>, <code>&amp;ldrag</code>, and
<code>&amp;rpress</code> are special constants that denote left mouse button
press and drag, and right mouse button press, respectively.
<code>&amp;x</code> and <code>&amp;y</code> are special global variables
that hold the mouse position associated with the most recent user action
returned by Event(). <code>&quot;\e&quot;</code> is a one-letter Icon
string containing the escape character.
<pre>procedure main()
   &amp;window := open(&quot;LJ example&quot;,&quot;g&quot;)
   repeat case e := Event() of {
      &amp;lpress | &amp;ldrag : DrawPoint(&amp;x,&amp;y)
      &amp;rpress : GotoXY(&amp;x,&amp;y)
      &quot;\e&quot; : break
      default : if type(e)==&quot;string&quot; then writes(&amp;window, e)
   }
end
</pre>
A complete description of the graphics facilities is available on the web at
<A href="http://www.cs.arizona.edu/icon/docs/ipd281.html">
<code>http://www.cs.arizona.edu/icon/docs/ipd281.html</code></A>.
<P> <HR> <P>
<H3>POSIX Functions</H3>
<P> <HR> <P>
An Icon program that uses the POSIX functions should include the header file
<code>posixdef.icn</code>. On error, the POSIX functions fail and set the
keyword
<code>&amp;errno</code>; the corresponding printable error string is
obtained by
calling <code>sys_errstr()</code>.
<P>
Unix functions that return a C struct (or a list, in Perl) return records
in Icon. The
fields in the return values have names similar to the Unix counterparts:
<I>stat()</I> returns a record with fields <I>ino</I>, <I>
nlink</I>, <I>mode</I> etc.
<P>
A complete description of the POSIX interfaces is included in the
distribution; an HTML version is available on the web, at
<A HREF="http://www.drones.com/unicon/">http://www.drones.com/unicon/</A>.
We look at a few small examples here.
<P> <HR> <P>
<H3>An Implementation of <TT>ls</TT></H3>
<P> <HR> <P>
Let us look at how a simple version of the Unix <TT>ls</TT> command may be
written. What we need to do is to read the directory, and perform a
<TT>stat</TT> call on each name we find. In Icon, opening a directory is
exactly the same as opening a file for reading; every <I>read</I>
returns one filename.
<PRE>   f := open(dir) | stop(dir, &quot;: &quot;, sys_errstr(&amp;errno))
   names := list()
   while name := read(f) do
      push(names, name)
   every name := !sort(names) do
      write(format(lstat(dir || &quot;/&quot; || name), name, dir))</PRE>
The <I>lstat</I> function returns a record that has all the information
that <TT>lstat(2)</TT> returns. One difference between the Unix version and
the Icon version is
that the <TT>mode</TT> field is
converted to a human readable string -- not an integer on which you
have to do bitwise magic. (And in Icon, string manipulation is
as natural as a bitwise operation.)
<P>
The function to format the information is simple; it also checks
to see if the name is a symbolic link, in which case it prints the value of
the link also.
<PRE>link printf

procedure format(p, name, dir)
   s := sprintf(&quot;%7s %4s %s %3s %8s %8s %8s %s %s&quot;,
                p.ino, p.blocks, p.mode, p.nlink,
                p.uid, p.gid, p.size, ctime(p.mtime)[5:17], name)
   if p.mode[1] == &quot;l&quot; then
      s ||:= &quot; -&gt; &quot; || readlink(dir || &quot;/&quot; || name)
   return s
end</PRE>
<P> <HR> <P>
<H3>Polymorphism and other pleasant things</H3>
<P> <HR> <P>
It's not just <I>stat</I> that uses human-readable values --
<I>chmod</I> can accept an integer that represents a bit pattern
to set the file mode to, but it also takes a string just like the shell
command:
<PRE> chmod(f, &quot;a+r&quot;)</PRE>
The first argument can be either an open file or a path to a file.
Since Icon values are typed, the function knows what kind of value it's
dealing with -- no more <TT>fchmod</TT> or <TT>fstat</TT>. The same applies to
other functions -- for example, the Unix functions <TT>getpwnam</TT>,
<TT>getpwuid</TT> and <TT>getpwent</TT> are all subsumed by the Icon function
<I>getpw</I> which does the appropriate thing depending on the type of
the argument:
<PRE> owner := getpw(&quot;ickenham&quot;)
root := getpw(0)
while u := getpw() do ...</PRE>
Similarly, <I>trap</I> and <I>kill</I> can accept a signal number
or name; <I>wait</I> returns human-readable status; <I>chown</I>
takes a username or uid; and <I>select</I> takes a list of files.
<P> <HR> <P>
<H3>Using <code>select</code></H3>
<P><HR> <P>
The <code>select()</code> function waits for input to become
available on a set of
files. Here is an example of the usage -- this program waits for typed
input or for a
window event, with a timeout of 1000 milliseconds:
<P>
<PRE>   repeat {
      while *(L := select([&amp;input, &amp;window], 1000)) = 0 do
         ... handle timeout
      if &amp;errno ~= 0 then
         stop(&quot;Select failed: &quot;, sys_errstr(&amp;errno))
      every f := !L do
         case f of {
            &amp;input : handle_input()
            &amp;window : handle_evt()
         }
   }</PRE>
If called with no timeout value, select will wait forever. A timeout of 0
performs a poll.
<P> <HR> <P>
<H3>Networking</H3>
<P><HR> <P>
Icon provides a much simpler interface to BSD-style sockets than C or Perl does.
Instead of the four different system calls that are required to
start a TCP/IP server using Perl, only one is needed in Icon--the
<I>open</I> function opens network connections as well as files. The first
argument to <I>open</I>
is the network address to connect to -- <I>host:port</I> for Internet
domain connections, and a filename for Unix domain sockets.
The second argument specifies the type of connection.
<P>
Here is an Internet domain TCP server listening on port 1888:
<PRE>procedure main()
   while f := open(&quot;:1888&quot;, &quot;na&quot;) do
      if fork() = 0 then {
         service_request(f)
         exit()
      } else
         close(f)
   stop(&quot;Open failed: &quot;, sys_errstr(&amp;errno))
end</PRE>
The <I>&quot;na&quot;</I> flags indicate that this is a <em>network
accept</em>. Each call to <code>open</code> waits for a network connection
and then returns a file for that connection.
To connect to this server, the <I>&quot;n&quot;</I> (network connect) flag is
used with <code>open</code>. Here's a function that connects to a
`finger' server:
<PRE>procedure finger(name, host)
   static fserv
   initial fserv := getserv(&quot;finger&quot;) |
      stop(&quot;Couldn't get service: &quot;, sys_errstr(&amp;errno))
   f := open(host || &quot;:&quot; || fserv.port, &quot;n&quot;) | fail
   write(f, name) | fail
   while line := read(f) do
      write(line)
end</PRE>
Nice and simple, isn't it? One might even call it Art! On the other hand,
writing socket code in Perl is not much
different from writing it in C, except that you have to perform weird
machinations with <TT>pack</TT>. No more! Eschew obfuscation, do it in Icon.
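<P>
The TCP server shown earlier calls <code>service_request(f)</code> without
defining it. A minimal sketch, assuming a line-oriented echo service (our
choice, purely for illustration), needs nothing more than the ordinary file
functions:
<PRE>procedure service_request(f)
   local line
   # echo each line back until the client closes the connection
   while line := read(f) do
      write(f, line)
   close(f)
end</PRE>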
<H4><u>UDP</u></H4>
UDP networking is similar; using <I>&quot;nu&quot;</I> as the second
argument to <I>open</I>
signifies a UDP connection. A datagram is sent either with <I>write</I>
or <I>send</I>, and is received with <I>receive</I>.
Here is a simple client for the UDP `daytime' service, something like
<TT>rdate(1)</TT>:
<PRE>   s := getserv(&quot;daytime&quot;, &quot;udp&quot;)
   f := open(host || &quot;:&quot; || s.port, &quot;nu&quot;) |
      stop(&quot;Open failed: &quot;, sys_errstr(&amp;errno))
   writes(f, &quot; &quot;)
   if *select([f], 5000) = 0 then
      stop(&quot;Connection timed out.&quot;)
   r := receive(f)
   write(&quot;Time on &quot;, host, &quot; is &quot;, r.msg)</PRE>
Since UDP is not reliable, the <I>receive</I>
is guarded with <I>select</I> (timeout of 5000 ms), or the program
might hang forever if the reply is lost.
<P>
<P> <HR> <P>
<h3> Icon and other languages </h3>
<P> <HR> <P>
The popular languages Perl and Java have been covered in LJ, and we think
it is worth discussing how Icon stacks up against these dreadnaughts.
<H4><u>Perl and Icon</u></H4>
Perl and Icon are both used for similar purposes. Both
languages offer high-level data structures like lists, associative arrays,
etc. Both make it easy to write short prototypes by not requiring
extensive declarations; and both were intended by their designers to be
`user friendly' i.e. intended to make programming easier for the user
rather than to prove some theoretical point.
<p>
But when it comes to language design, Perl and Icon are not at all
alike. Perl has been designed with very little structure -- or,
as Larry Wall puts it, it's more like a natural language than a programming
language.
Perl looks strange
but underneath the loose syntax its semantics are those of a conventional
imperative language. Icon,
on the other hand, looks more like a conventional imperative language but
has richer semantics.
<h4><u>Advantages of Perl</u></h4>
Perl's pattern matching, while not as general a mechanism as Icon's
string scanning, is more concise for recognizing those patterns that can
be expressed as regular expressions. Perl's syntax looks and feels natural
to long-time die-hard UNIX gurus and system administrators,
who have been using utilities such as <i>sh</i>, <i>sed</i>,
and <i>awk</i>. For
some people, Perl is especially appealing because mastering its idiosyncrasies
places one in an elite group.
<H4><u>Some misfeatures of Perl</u></H4>
Let us look at some things that are (in our opinion) undesirable qualities
of Perl. These problems do not negate Perl's ingenious features, they
merely illustrate that Perl is no panacea.
<P>
Namespace confusion: it is a bad idea to allow scalar variables, vector
variables and functions to have the same name. This seems like a useful
thing to do, but it leads to write-only code. We think this is primarily
why it's hard to maintain Perl programs. A couple of things are
beyond belief -- <TT>$foo</TT> and <TT>%foo</TT> are different things, but
the expression <TT>$foo{bar}</TT> actually refers to an element of
<TT>%foo</TT>!
<P>
Parameter passing is a mess. Passing arrays by name is just too confusing!
Even after careful study and substantial practice, we still are not absolutely
certain about how to use <TT>*foo</TT> in Perl. As if to make up for the
difficulty of passing arrays by reference, <em>all</em> scalars are passed by
reference! That's very unaesthetic.
<P>
Why are there no formal parameters? Instead, one has to resort to something
that looks like a function call to declare local variables and assign
<TT>@_</TT> to it.
Allowing the parentheses to be left off subroutine calls is also unfortunate;
it is another `easy to write, hard to read' construct. And the distinction
between built-in functions and user-defined subroutines is ugly.
<P>
Variables like <TT>$`</TT> are a bad idea. We think of special
characters as punctuation, we don't expect them to be (borrowing
Wall's terminology) nouns. And the mnemonics that are required are evidence
that these variables place an additional burden of memorization upon
the programmer. (Quick, if you write Perl programs: What's `<TT>$(</TT>'?)
<P>
The distinction between array and scalar contexts also leads to obfuscated
code. Certainly after you've been writing Perl for a while, you get used to
it (and might even like it), but again, this is just confusing. All the
warnings in the Perl documentation about being certain you are evaluating in
the right context are evidence of this.
<P>
<h4><u>Java and Icon</u></h4>
Java takes the middle road between C/C++ and the class of `very high level
languages' such as Icon and Perl. Java and Icon use a similar
virtual machine (VM) model. Java's VM is both lower-level and more
machine-independent than the Icon VM, but these are implementation artifacts
and it would be possible to implement Java on the Icon VM or Icon on the
Java VM. <p>
The important differences between Java and Icon are differences of philosophy.
The Java philosophy is that everything is an object, nothing is built into
the language, and programmers should learn class libraries for all non-trivial
structures and algorithms. Java's lack of operator overloading means that
its object-oriented notation allows no &quot;shorthand&quot; of the kind C++ offers.
Java's
simplicity is a welcome relief after C++, but its expressive power is so
weak compared to Icon (and several other very high level languages) that
it is hard to argue that Java can supplant these languages. Most of Java's
market share is being carved out of the C and C++ industries.
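<P>
As a small illustration of the missing shorthand, here is a sketch of
arbitrary-precision arithmetic using the standard <code>java.math.BigInteger</code>
class. The class name <code>NoOperatorOverloading</code> and the helper
<code>sumTimes</code> are our own hypothetical names, chosen only for this example.

```java
import java.math.BigInteger;

public class NoOperatorOverloading {
    // Computes (a + b) * c. Java has no operator overloading, so the
    // arithmetic on library objects must be spelled out as method calls.
    static BigInteger sumTimes(BigInteger a, BigInteger b, BigInteger c) {
        return a.add(b).multiply(c);
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("12345678901234567890");
        BigInteger b = BigInteger.valueOf(10);
        BigInteger c = BigInteger.valueOf(2);
        // Prints 24691357802469135800
        System.out.println(sumTimes(a, b, c));
    }
}
```

In a language with operator overloading, or with large integers built in, the
body of <code>sumTimes</code> could be written simply as <code>(a + b) * c</code>.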
<h4><u>How to Improve Java</u></h4>
Java has few bad mistakes. The Sumatra Project has itemized some of them in
The Java Hall of Shame at
<code>http://www.cs.arizona.edu/sumatra/hallofshame/</code>.
Most of Java's misfeatures are sins of omission, because the
language designers were trying to be elegant and minimal. We would
like to see a Java dialect with features such as Icon's goal-directed
evaluation, Perl's pattern matching, and APL's array-at-a-time numeric
operators; a description of such a dialect is at
<A HREF="http://segfault.cs.utsa.edu/godiva/">
http://segfault.cs.utsa.edu/godiva/</A>.
<P> <HR> <P>
<h3> Getting Icon </h3>
<P><HR> <P>
<p> Users who become serious
about the language will want a copy of `The Icon Programming Language', by
Ralph and Madge Griswold, Peer-to-Peer Communications 1997,
ISBN 1-57398-001-3.
<p>
Lots of documentation for Icon is available from the University of Arizona, at
<A HREF="http://www.cs.arizona.edu/icon/">http://www.cs.arizona.edu/icon/</A>.
There is also a newsgroup on Usenet: comp.lang.icon.
<p>
The Icon source distribution is at:<br>
<A HREF="ftp://ftp.cs.arizona.edu/icon/packages/unix/unix.tgz">ftp://ftp.cs.arizona.edu/icon/packages/unix/unix.tgz</A><br>
The POSIX functions are in the following patch that you need to apply
if you wish to build from sources:
<br>
<A
HREF="ftp://ftp.drones.com/unicon-patches.tar.gz">ftp://ftp.drones.com/unicon-patches.tar.gz</A>
<P>
Linux binaries (kernel 2.0 ELF, libgdbm 2.0.0, libX11 6.0, libdl 1.7.14,
libm 5.0.0 and libc 5.2.18)
for Icon (with X11 and POSIX support) are available at:
<P>
<table>
<tr><td>
<a
HREF="ftp://ftp.drones.com/icon-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-9.3-3.i386.rpm</A></td><td> Icon
</td></tr><tr><td>
<a
HREF="ftp://ftp.drones.com/icon-ipl-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-ipl-9.3-3.i386.rpm</A></td><td> Icon Program Library
</td></tr><tr><td>
<a HREF="ftp://ftp.drones.com/icon-idol-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-idol-9.3-3.i386.rpm</A></td><td> Idol: Object-oriented Icon
</td></tr><tr><td>
<a HREF="ftp://ftp.drones.com/icon-vib-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-vib-9.3-3.i386.rpm</A></td><td> VIB: The Visual Interface Builder
</td></tr><tr><td>
<a HREF="ftp://ftp.drones.com/icon-docs-9.3-3.i386.rpm">ftp://ftp.drones.com/icon-docs-9.3-3.i386.rpm</A></td><td> Documentation
</td></tr>
</table>
<!--===================================================================-->
<P> <hr> <P>
<center><H5>Copyright &copy; 1998, Clinton Jeffery and Shamim Mohamed<BR>
Published in Issue 27 of <i>Linux Gazette</i>, April 1998</H5></center>
<!--===================================================================-->
<P> <hr> <P>
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->