<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [
<!ENTITY version "0.7">
]>
<!-- $id:$ -->
<book xmlns="http://docbook.org/ns/docbook" version="5.0"
xmlns:xlink="http://www.w3.org/1999/xlink"
xml:lang="en"
xml:id="Assembly-HOWTO">
<!-- <?dbhtml filename="Assembly-HOWTO.html"?> -->
<info><title>Linux Assembly HOWTO</title>
<authorgroup>
<author>
<personname xml:id="lnoor">
<firstname>Leo</firstname>
<surname>Noordergraaf</surname>
</personname>
<affiliation>
<orgname>
<link xlink:href="http://asm.sourceforge.net">Linux Assembly</link>
</orgname>
<address>
<email>lnoor@users.sourceforge.net</email>
</address>
</affiliation>
</author>
<author>
<personname xml:id="konst">
<firstname>Konstantin</firstname>
<surname>Boldyshev</surname>
</personname>
<affiliation>
<orgname>
<link xlink:href="http://asm.sourceforge.net">Linux Assembly</link>
</orgname>
<address>
<email>konst@users.sourceforge.net</email>
</address>
</affiliation>
</author>
<author>
<personname xml:id="fare">
<firstname>Francois-Rene</firstname>
<surname>Rideau</surname>
</personname>
<affiliation>
<orgname>
<link xlink:href="http://tunes.org">Tunes project</link>
</orgname>
<address>
<email>fare@tunes.org</email>
</address>
</affiliation>
</author>
</authorgroup>
<copyright>
<year>2013</year>
<holder>Leo Noordergraaf</holder>
</copyright>
<copyright>
<year>1999-2006</year>
<holder>Konstantin Boldyshev</holder>
</copyright>
<copyright>
<year>1996-1999</year>
<holder>Francois-Rene Rideau</holder>
</copyright>
<legalnotice xml:id="legalnotice">
<para>
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1; with no
Invariant Sections, with no Front-Cover Texts, and no Back-Cover texts.
</para>
</legalnotice>
<releaseinfo>Version &version;</releaseinfo>
<edition>&version;</edition>
<pubdate role="subversion">$Date$</pubdate>
<abstract>
<para>
This is the Linux Assembly HOWTO, version &version;.
This document describes how to program in assembly language using
<emphasis>free</emphasis> programming tools, focusing on development for or
from the Linux Operating System, mostly on IA-32 (i386) platform. Included
material may or may not be applicable to other hardware and/or software
platforms.
</para>
</abstract>
<keywordset>
<keyword>assembly</keyword>
<keyword>assembler</keyword>
<keyword>asm</keyword>
<keyword>inline</keyword>
<keyword>32-bit</keyword>
<keyword>IA-32</keyword>
<keyword>i386</keyword>
<keyword>x86</keyword>
<keyword>nasm</keyword>
<keyword>gas</keyword>
<keyword>as</keyword>
<keyword>as86</keyword>
<keyword>yasm</keyword>
<keyword>fasm</keyword>
<keyword>shasm</keyword>
<keyword>osimpa</keyword>
<keyword>OS</keyword>
<keyword>Linux</keyword>
<keyword>Unix</keyword>
<keyword>kernel</keyword>
<keyword>system</keyword>
<keyword>libc</keyword>
<keyword>glibc</keyword>
<keyword>system call</keyword>
<keyword>interrupt</keyword>
<keyword>small</keyword>
<keyword>fast</keyword>
<keyword>embedded</keyword>
<keyword>hardware</keyword>
<keyword>port</keyword>
<keyword>macroprocessor</keyword>
<keyword>metaprogramming</keyword>
<keyword>preprocessor</keyword>
</keywordset>
</info>
<chapter xml:id="s-intro" xreflabel="Introduction">
<?dbhtml filename="introduction.html"?>
<title>Introduction</title>
<note>
<para>
You can skip this chapter if you are familiar with HOWTOs, or just hate to
read all this assembly-unrelated crap.
</para>
</note>
<simplesect>
<title>Legal Blurb</title>
<para>
Permission is granted to copy, distribute and/or modify this document under the
terms of the GNU <link xlink:href="http://www.gnu.org/copyleft/fdl.html">Free
Documentation License</link> Version 1.1; with no Invariant Sections, with no
Front-Cover Texts, and no Back-Cover texts. A copy of the license is included
in the <xref linkend="a-gfdl"/> appendix.
</para>
<para>
The most recent official version of this document is available from the
<link xlink:href="http://asm.sourceforge.net/howto.html">Linux Assembly</link>
and <link xlink:href="http://tldp.org/docs.html">LDP</link> sites. If you are
reading a few-months-old copy, consider checking the above URLs for a new
version.
</para>
</simplesect>
<simplesect>
<title>Foreword</title>
<para>
This document aims to answer the questions of those who program, or want to
program, 32-bit x86 assembly using <emphasis>free software</emphasis>,
particularly under the Linux operating system. In many places, Uniform Resource
Locators (<acronym>URL</acronym>) are given for some software or documentation
repository. This document also points to other documents about non-free,
non-x86, or non-32-bit assemblers, although this is not its primary goal. Also
note that there are FAQs and docs about programming on your favorite platform
(whatever it is), which you should consult for platform-specific issues, not
related directly to assembly programming.
</para>
<para>
Because the main interest of assembly programming is to build the guts of
operating systems, interpreters, compilers, and games, where a C compiler fails
to provide the needed expressiveness (performance is more and more seldom an
issue), we focus on the development of that kind of software.
</para>
<para>
If you don't know what <link xlink:href="http://www.gnu.org/philosophy/">
<emphasis>free software</emphasis></link> is, please do read
<emphasis>carefully</emphasis> the GNU
<link xlink:href="http://www.gnu.org/copyleft/gpl.html">
General Public License</link> (<acronym>GPL</acronym> or
<acronym>copyleft</acronym>), which is used in a lot of free software, and is
the model for most of their licenses. It generally comes in a file named
<filename>COPYING</filename> (or <filename>COPYING.LIB</filename>). Literature
from the <link xlink:href="http://www.fsf.org">Free Software Foundation</link>
(<acronym>FSF</acronym>) might help you too. Particularly, the interesting
feature of free software is that it comes with source code which you can
consult and correct, or sometimes even borrow from. Read your particular
license carefully and do comply with it.
</para>
</simplesect>
<simplesect>
<title>Contributions</title>
<para>
This is an interactively evolving document: you are especially invited to ask
questions, to answer questions, to correct given answers, to give pointers to
new software, to point the current maintainer to bugs or deficiencies in the
pages. In one word, contribute!
</para>
<para>
To contribute, please contact the <link linkend="lnoor">maintainer</link>.
</para>
<note>
<para>
At the time of writing, it is <link linkend="lnoor">Leo Noordergraaf</link>
taking over from <link linkend="konst">Konstantin Boldyshev</link> (since
version 0.6) and <link linkend="fare">Francois-Rene Rideau</link> (since
version 0.5).
</para>
</note>
</simplesect>
<simplesect>
<title>Translations</title>
<para>
A Korean translation of this HOWTO is available at
<link xlink:href="http://kldp.org/HOWTO/html/Assembly-HOWTO/">
http://kldp.org/HOWTO/html/Assembly-HOWTO/</link>.
A Turkish translation of this HOWTO is available at
<link xlink:href="http://belgeler.org/howto/assembly-howto.html">
http://belgeler.org/howto/assembly-howto.html</link>.
</para>
</simplesect>
</chapter>
<chapter xml:id="s-doyou" xreflabel="Do you need assembly?">
<?dbhtml filename="doyouneed.html"?>
<title>Do you need assembly?</title>
<para>
Well, I wouldn't want to interfere with what you're doing, but here is some
advice from hard-earned experience.
</para>
<section>
<title>Pros and Cons</title>
<section>
<title>The advantages of Assembly</title>
<para>
Assembly can express very low-level things:
<itemizedlist>
<listitem>
<para>
you can access machine-dependent registers and I/O
</para>
</listitem>
<listitem>
<para>
you can control the exact code behavior in critical sections that might
otherwise involve deadlock between multiple software threads or hardware
devices
</para>
</listitem>
<listitem>
<para>
you can break the conventions of your usual compiler, which might allow some
optimizations (like temporarily breaking rules about memory allocation,
threading, calling conventions, etc)
</para>
</listitem>
<listitem>
<para>
you can build interfaces between code fragments using incompatible conventions
(e.g. produced by different compilers, or separated by a low-level interface)
</para>
</listitem>
<listitem>
<para>
you can get access to unusual programming modes of your processor (e.g. 16 bit
mode to interface startup, firmware, or legacy code on Intel PCs)
</para>
</listitem>
<listitem>
<para>
you can produce reasonably fast code for tight loops to cope with a bad
non-optimizing compiler (but then, there are free optimizing compilers
available!)
</para>
</listitem>
<listitem>
<para>
you can produce hand-optimized code perfectly tuned for your particular
hardware setup, though not to someone else's
</para>
</listitem>
<listitem>
<para>
you can write some code for your new language's optimizing compiler (that is
something very few people will ever do, and even they not often)
</para>
</listitem>
<listitem>
<para>
i.e. you can be in complete control of your code
</para>
</listitem>
</itemizedlist>
</para>
</section>
<section>
<title>The disadvantages of Assembly</title>
<para>
Assembly is a very low-level language (the lowest above hand-coding the binary
instruction patterns). This means
<itemizedlist>
<listitem>
<para>
it is long and tedious to write initially
</para>
</listitem>
<listitem>
<para>
it is quite bug-prone
</para>
</listitem>
<listitem>
<para>
your bugs can be very difficult to chase
</para>
</listitem>
<listitem>
<para>
your code can be fairly difficult to understand and modify, i.e. to maintain
</para>
</listitem>
<listitem>
<para>
the result is non-portable to other architectures, existing or upcoming
</para>
</listitem>
<listitem>
<para>
your code will be optimized only for a certain implementation of the same
architecture: for instance, among Intel-compatible platforms each CPU design
and its variations (relative latency, throughput, and capacity of
processing units, caches, RAM, bus, disks, presence of FPU, MMX, 3DNOW, SIMD
extensions, etc) implies potentially completely different optimization
techniques. CPU designs already include: Intel 386, 486, Pentium, PPro, PII,
PIII, PIV; Cyrix 5x86, 6x86, M2; AMD K5, K6 (K6-2, K6-III), K7 (Athlon, Duron).
New designs keep popping up, so don't expect either this listing or your code
to be up-to-date.
</para>
</listitem>
<listitem>
<para>
you spend more time on a few details and can't focus on small and large
algorithmic design, which is known to bring the largest part of the speedup
(e.g. you might spend some time building very fast list/array manipulation
primitives in assembly, when a hash table would have sped up your program much
more; or, in another context, a binary tree; or some high-level structure
distributed over a cluster of CPUs)
</para>
</listitem>
<listitem>
<para>
a small change in algorithmic design might completely invalidate all your
existing assembly code. So that either you're ready (and able) to rewrite it
all, or you're tied to a particular algorithmic design
</para>
</listitem>
<listitem>
<para>
On code that ain't too far from what's in standard benchmarks, commercial
optimizing compilers outperform hand-coded assembly (well, that's less true on
the x86 architecture than on RISC architectures, and perhaps less true for
widely available/free compilers; anyway, for typical C code, GCC is fairly
good);
</para>
</listitem>
<listitem>
<para>
And in any case, as moderator John Levine says on
<link xlink:href="news:comp.compilers">comp.compilers</link>,
</para>
<literallayout>
"compilers make it a lot easier to use complex data structures,
and compilers don't get bored halfway through
and generate reliably pretty good code."
</literallayout>
<para>
They will also <emphasis>correctly</emphasis> propagate code transformations
throughout the whole (huge) program when optimizing code between procedures
and module boundaries.
</para>
</listitem>
</itemizedlist>
</para>
</section>
<section>
<title>Assessment</title>
<para>
All in all, you might find that though using assembly is sometimes needed, and
might even be useful in a few cases where it is not, you'll want to:
<itemizedlist>
<listitem>
<para>
minimize use of assembly code
</para>
</listitem>
<listitem>
<para>
encapsulate this code in well-defined interfaces
</para>
</listitem>
<listitem>
<para>
have your assembly code automatically generated from patterns expressed in a
higher-level language than assembly (e.g. GCC inline assembly macros)
</para>
</listitem>
<listitem>
<para>
have automatic tools translate these programs into assembly code
</para>
</listitem>
<listitem>
<para>
have this code be optimized if possible
</para>
</listitem>
<listitem>
<para>
All of the above, i.e. write (an extension to) an optimizing compiler back-end.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Even when assembly is needed (e.g. OS development), you'll find that not so
much of it is required, and that the above principles still hold.
</para>
<para>
See the Linux kernel sources concerning this: as little assembly as needed,
resulting in a fast, reliable, portable, maintainable OS. Even a successful
game like DOOM was written almost entirely in C, with only a tiny part
written in assembly for speed.
</para>
</section>
</section>
<section>
<title>How to NOT use Assembly</title>
<?dbhtml filename="howtonot.html"?>
<section>
<title>General procedure to achieve efficient code</title>
<para>
As Charles Fiterman says on <link xlink:href="news:comp.compilers">comp.compilers</link>
about human vs computer-generated assembly code:
</para>
<blockquote>
<literallayout>
The human should always win and here is why.
First the human writes the whole thing in a high level language.
Second he profiles it to find the hot spots where it spends its time.
Third he has the compiler produce assembly for those small sections of code.
Fourth he hand tunes them looking for tiny improvements over the machine
generated code.
The human wins because he can use the machine.
</literallayout>
</blockquote>
</section>
<section>
<title>Languages with optimizing compilers</title>
<para>
Languages like ObjectiveCAML, SML, CommonLISP, Scheme, ADA, Pascal, C, C++,
among others, all have free optimizing compilers that will optimize the bulk
of your programs, and often do better than hand-coded assembly even for tight
loops, while allowing you to focus on higher-level details, and without
forbidding you to grab a few percent of extra performance in the
above-mentioned way, once you've reached a stable design. Of course, there are
also commercial optimizing compilers for most of these languages, too!
</para>
<para>
Some languages have compilers that produce C code, which can be further
optimized by a C compiler: LISP, Scheme, Perl, and many others. Speed is fairly
good.
</para>
</section>
<section>
<title>General procedure to speed your code up</title>
<para>
As for speeding code up, you should do it only for parts of a program that a
profiling tool has consistently identified as being a performance bottleneck.
</para>
<para>
Hence, if you identify some code portion as being too slow, you should
<itemizedlist>
<listitem>
<para>
first try to use a better algorithm;
</para>
</listitem>
<listitem>
<para>
then try to compile it rather than interpret it;
</para>
</listitem>
<listitem>
<para>
then try to enable and tweak optimization from your compiler;
</para>
</listitem>
<listitem>
<para>
then give the compiler hints about how to optimize (typing information in LISP;
register usage with GCC; lots of options in most compilers, etc).
</para>
</listitem>
<listitem>
<para>
then possibly fall back to assembly programming
</para>
</listitem>
</itemizedlist>
</para>
<para>
Finally, before you end up writing assembly, you should inspect generated code,
to check that the problem really is with bad code generation, as this might
really not be the case: compiler-generated code might be better than what you'd
have written, particularly on modern multi-pipelined architectures! Slow parts
of a program might be intrinsically so. The biggest problems on modern
architectures with fast processors are due to delays from memory access,
cache-misses, TLB-misses, and page-faults; register optimization becomes
useless, and you'll more profitably re-think data structures and threading
to achieve better locality in memory access. Perhaps a completely different
approach to the problem might help, then.
</para>
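<para>
As a rough illustration of the locality point above, here is a minimal sketch
in C. The function names and the flat row-major layout are assumptions made
for the example; both functions compute the same sum, but only the first one
walks memory sequentially.
</para>
<programlisting language="c"><![CDATA[
#include <stddef.h>

/* Hypothetical helpers: `m` holds rows*cols ints in row-major order. */

/* Row-by-row traversal: consecutive accesses touch adjacent memory. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i * cols + j];
    return total;
}

/* Column-by-column traversal: same result, but each access jumps
 * `cols` ints ahead, which tends to defeat the cache on large arrays. */
long sum_col_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            total += m[i * cols + j];
    return total;
}
]]></programlisting>
<para>
Timing both on an array much larger than your cache, with a profiler or a
simple clock, is one way to see whether memory access rather than instruction
selection is the real bottleneck.
</para>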
</section>
<section>
<title>Inspecting compiler-generated code</title>
<para>
There are many reasons to inspect compiler-generated assembly code. Here is
what you'll do with such code:
<itemizedlist>
<listitem>
<para>
check whether generated code can be obviously enhanced with hand-coded assembly
(or by tweaking compiler switches)
</para>
</listitem>
<listitem>
<para>
when that's the case, start from generated code and modify it instead of
starting from scratch
</para>
</listitem>
<listitem>
<para>
more generally, use generated code as stubs to modify, which at least gets
right the way your assembly routines interface to the external world
</para>
</listitem>
<listitem>
<para>
track down bugs in your compiler (hopefully the rarest case)
</para>
</listitem>
</itemizedlist>
</para>
<para>
The standard way to have assembly code be generated is to invoke your compiler
with the <option>-S</option> flag. This works with most Unix compilers,
including the GNU C Compiler (GCC), but YMMV. As for GCC, it will produce more
understandable assembly code with the <option>-fverbose-asm</option> command-line
option. Of course, if you want to get good assembly code, don't forget your usual
optimization options and hints!
</para>
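<para>
For instance, a small file such as the following can be used to look at what
GCC generates. The file name <filename>sum.c</filename> and the function
<function>sum_array</function> are made up for this sketch; the command in the
comment uses only the options mentioned above.
</para>
<programlisting language="c"><![CDATA[
/* sum.c: a hypothetical file used only to inspect the compiler's output.
 *
 * Generate annotated assembly (written to sum.s) with:
 *     gcc -O2 -S -fverbose-asm sum.c
 */
int sum_array(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}
]]></programlisting>
<para>
Comparing the <filename>sum.s</filename> produced with and without
<option>-O2</option> is a quick way to see how much the optimizer already does
for you.
</para>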
</section>
</section>
<section>
<?dbhtml filename="landa.html"?>
<title>Linux and assembly</title>
<para>
As you probably noticed, in the general case you don't need to use assembly
language in Linux programming. Unlike DOS, you do not have to write Linux
drivers in assembly (well, actually you can do it if you really want). And
with modern optimizing compilers, if you care about speed optimization for
different CPUs, it's much simpler to write in C. However, if you're reading
this, you might have some reason to use assembly instead of C/C++.
</para>
<para>
You may <emphasis>need</emphasis> to use assembly, or you may
<emphasis>want</emphasis> to use assembly. In short, the main practical
(<emphasis>need</emphasis>) reasons for diving into the assembly realm are
<emphasis>small code</emphasis> and <emphasis><application>libc</application>
independence</emphasis>. The impractical (<emphasis>want</emphasis>), and most
frequent, reason is being just an old crazy hacker with a twenty-year-old habit
of doing everything in assembly language.
</para>
<para>
However, if you're porting Linux to some embedded hardware you can be quite
short on space for the whole system: you need to fit the kernel,
<application>libc</application> and all that stuff of
<application>(file|find|text|sh|etc.) utils</application> into several hundred
kilobytes, and every kilobyte is costly. So, one of the possible ways is to
rewrite some (or all) parts of the system in assembly, and this will really save
you a lot of space. For instance, a simple <command>httpd</command> written in
assembly can take less than 600 bytes; you can fit a server consisting of
kernel, httpd and ftpd in 400 KB or less... Think about it.
</para>
</section>
</chapter>
<chapter xml:id="s-assem" xreflabel="Assemblers">
<?dbhtml filename="assemblers.html"?>
<title>Assemblers</title>
<section xml:id="p-gcc">
<?dbhtml filename="gcc.html"?>
<title>GCC Inline Assembly</title>
<para>
The well-known GNU C/C++ Compiler (GCC), an optimizing 32-bit compiler at the
heart of the GNU project, supports the x86 architecture quite well, and
includes the ability to insert assembly code in C programs, in such a way that
register allocation can be either specified or left to GCC. GCC works on most
available platforms, notably Linux, *BSD, VSTa, OS/2, *DOS, Win*, etc.
</para>
<section><title>Where to find GCC</title>
<para>
GCC home page is <link xlink:href="http://gcc.gnu.org">http://gcc.gnu.org</link>.
</para>
<para>
<anchor xml:id="p-djgpp"/>
DOS port of GCC is called
<link xlink:href="http://www.delorie.com/djgpp/">DJGPP</link>.
</para>
<para>
There are two Win32 GCC ports:
<link xlink:href="http://www.cygwin.com">cygwin</link> and
<link xlink:href="http://www.mingw.org">mingw</link>.
</para>
<para>
There is also an OS/2 port of GCC called EMX;
it works under DOS too,
and includes lots of unix-emulation library routines.
Look around the following site:
<link xlink:href="ftp://ftp.leo.org/pub/comp/os/os2/leo/gnu/emx+gcc/">
ftp://ftp.leo.org/pub/comp/os/os2/leo/gnu/emx+gcc/</link>.
</para>
</section>
<section>
<title>Where to find docs for GCC Inline Asm</title>
<para>
The documentation of GCC includes documentation files in TeXinfo format.
You can compile them with TeX and print the result,
or convert them to <filename>.info</filename> and browse them with emacs,
or convert them to <filename>.html</filename> or (with the right tools) to
nearly whatever you like,
or just read them as is. The <filename>.info</filename> files
are generally found on any good installation for GCC.
</para>
<para>
The right section to look for is <literal>C Extensions::Extended Asm::</literal>
</para>
<para>
Section <literal>Invoking GCC::Submodel Options::i386 Options::</literal> might
help too. Particularly, it gives the i386 specific constraint names for
registers:
<literal>abcdSDB</literal> correspond to
<literal>%eax</literal>,
<literal>%ebx</literal>,
<literal>%ecx</literal>,
<literal>%edx</literal>,
<literal>%esi</literal>,
<literal>%edi</literal>
and
<literal>%ebp</literal>
respectively (no letter for <literal>%esp</literal>).
</para>
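<para>
As a small sketch of how such constraint letters are used, the following
function pins operands to <literal>%eax</literal> and <literal>%edx</literal>
because the <literal>mul</literal> instruction requires them. It assumes an
x86 target and a GCC-compatible compiler; the function name
<function>mul32x32</function> is invented for the example.
</para>
<programlisting language="c"><![CDATA[
#include <stdint.h>

/* Multiply two 32-bit values and return the full 64-bit product.
 * "a" pins the first input and the low half of the result to %eax,
 * "d" pins the high half of the result to %edx, as MUL requires.
 * "rm" lets GCC choose a register or a memory location for y. */
static inline uint64_t mul32x32(uint32_t x, uint32_t y)
{
    uint32_t lo, hi;
    __asm__("mull %3"
            : "=a"(lo), "=d"(hi)   /* outputs: %0, %1 */
            : "a"(x), "rm"(y)      /* inputs:  %2, %3 */
            : "cc");               /* MUL modifies the flags */
    return ((uint64_t)hi << 32) | lo;
}
]]></programlisting>
<para>
Where no particular register is required, the generic <literal>r</literal>
constraint is usually preferable, since it leaves the choice to GCC.
</para>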
<para>
The DJGPP Games resource (not only for game hackers) had a page specifically
about assembly, but it's down. Its data have nonetheless been recovered on the
<link linkend="p-djgpp">DJGPP site</link>, that contains a mine of other
useful information:
<link xlink:href="http://www.delorie.com/djgpp/doc/brennan/">
http://www.delorie.com/djgpp/doc/brennan/</link>.
</para>
<para>
GCC depends on GAS for assembling and follows its syntax (see below);
do mind that in extended inline asm, literal percent characters must be
doubled (<literal>%%</literal>) so that a single <literal>%</literal> is passed
to GAS. See the section about GAS below.
</para>
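<para>
A minimal sketch of percent quoting, assuming an x86 target and a
GCC-compatible compiler (the function name <function>copy_via_eax</function>
is hypothetical): operand placeholders use a single percent sign, while a
register named literally needs two.
</para>
<programlisting language="c"><![CDATA[
#include <stdint.h>

/* Copy a 32-bit value through %eax explicitly.
 * "%0" and "%1" are operand placeholders filled in by GCC, so a literal
 * register name must be written "%%eax" to reach GAS as "%eax". */
static inline uint32_t copy_via_eax(uint32_t value)
{
    uint32_t result;
    __asm__("movl %1, %%eax\n\t"
            "movl %%eax, %0"
            : "=r"(result)
            : "r"(value)
            : "eax");              /* tell GCC that %eax is overwritten */
    return result;
}
]]></programlisting>
<para>
Listing <literal>eax</literal> as clobbered keeps GCC from placing either
operand in the register that the code overwrites.
</para>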
<para>
Find <emphasis>lots</emphasis> of useful examples in the
<filename>linux/include/asm-i386/</filename>
subdirectory of the sources for the Linux kernel.
</para>
</section>
<section><title>Invoking GCC to build proper inline assembly code</title>
<para>
Because assembly routines from the kernel headers (and most likely your own
headers, if you try making your assembly programming as clean as it is in the
linux kernel) are embedded in <function>extern inline</function> functions,
GCC must be invoked with the <option>-O</option> flag (or <option>-O2</option>,
<option>-O3</option>, etc), for these routines to be available. If not, your
code may compile, but not link properly, since it will be looking for
non-inlined <function>extern</function> functions in the libraries against
which your program is being linked! Another way is to link against libraries
that include fallback versions of the routines.
</para>
<para>
Inline assembly can be disabled with <option>-fno-asm</option>, which will have
the compiler die when using extended inline asm syntax, or else generate calls
to an external function named <function>asm()</function> that the linker can't
resolve. To counter such a flag, <option>-fasm</option> restores treatment of the
<literal>asm</literal> keyword.
</para>
<para>
More generally, good compile flags for GCC on the x86 platform are
</para>
<para>
<command>gcc -O2 -fomit-frame-pointer -W -Wall</command>
</para>
<para>
<option>-O2</option> is the good optimization level in most cases. Optimizing
beyond it takes more time, and yields code that is much larger, but only a bit
faster; such over-optimization might be useful for tight loops only (if any),
which you may be doing in assembly anyway. In cases when you need really strong
compiler optimization for a few files, do consider <option>-O3</option>
(mainline GCC treats higher levels such as <option>-O6</option> as
<option>-O3</option>).
</para>
<para>
<option>-fomit-frame-pointer</option> allows generated code to skip the stupid
frame pointer maintenance, which makes code smaller and faster, and frees a
register for further optimizations. It precludes the easy use of debugging tools
(<command>gdb</command>), but when you use these, you just don't care about size
and speed anymore anyway.
</para>
<para>
<option>-W -Wall</option> enables all useful warnings and helps you to catch
obvious stupid errors.
</para>
<para>
You can add some CPU-specific <option>-m486</option> or such flag so that GCC
will produce code that is more adapted to your precise CPU. Note that modern
GCC has <option>-mpentium</option> and such flags (and
<link xlink:href="http://goof.com/pcg/">PGCC</link> has even more), whereas
GCC 2.7.x and older versions do not. A good choice of CPU-specific flags should
be in the Linux kernel. Check the TeXinfo documentation of your current GCC
installation for more.
</para>
<para>
<option>-m386</option> will help optimize for size, hence also for speed on
computers whose memory is tight and/or loaded, since big programs cause swap,
which more than counters any "optimization" intended by the larger code. In
such settings, it might be useful to stop using C, and use instead a language
that favors code factorization, such as a functional language and/or FORTH,
and use a bytecode- or wordcode- based implementation.
</para>
<para>
Note that you can vary code generation flags from file to file, so
performance-critical files will use maximum optimization, whereas other files
will be optimized for size.
</para>
<para>
To optimize even more, option <option>-mregparm=2</option> and/or corresponding
function attribute might help, but might pose lots of problems when linking to
foreign code, <emphasis>including <application>libc</application></emphasis>.
There are ways to correctly declare foreign functions so the right call
sequences be generated, or you might want to recompile the foreign libraries
to use the same register-based calling convention...
</para>
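<para>
The function attribute form can be sketched as follows. The macro name
<literal>FASTCALL2</literal> and the function <function>add_pair</function>
are invented for the example; the guard is there because the
<literal>regparm</literal> attribute only applies to 32-bit x86 targets.
</para>
<programlisting language="c"><![CDATA[
/* regparm only exists on 32-bit x86; elsewhere the macro expands to nothing. */
#if defined(__i386__)
#define FASTCALL2 __attribute__((regparm(2)))
#else
#define FASTCALL2
#endif

/* The declaration seen by every caller must carry the same attribute,
 * otherwise caller and callee disagree on where the arguments live. */
FASTCALL2 int add_pair(int a, int b);

FASTCALL2 int add_pair(int a, int b)
{
    /* On i386 with regparm(2), a arrives in %eax and b in %edx. */
    return a + b;
}
]]></programlisting>
<para>
Applying the attribute per function, rather than using
<option>-mregparm=2</option> globally, leaves calls into
<application>libc</application> untouched.
</para>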
<para>
Note that you can make these flags the default by editing file
<filename>/usr/lib/gcc-lib/i486-linux/2.7.2.3/specs</filename> or wherever that
is on your system (better not add <option>-W -Wall</option> there, though). The
exact location of the GCC specs files on your system can be found by
<command>gcc -v</command>.
</para>
</section>
<section>
<title>Macro support</title>
<para>
GCC allows (and requires) you to specify register constraints in your inline
assembly code, so the optimizer always knows about it; thus, inline assembly
code is really made of patterns, not necessarily exact code.
</para>
<para>
Thus, you can put your assembly into CPP macros and inline C functions, so
anyone can use it as any C function/macro. Inline functions resemble macros
very much, but are sometimes cleaner to use. Beware that in all those cases,
code will be duplicated, so only local labels (of <literal>1:</literal> style)
should be defined in that asm code. However, a macro would allow the name for
a non-local defined label to be passed as a parameter (or else, you should use
additional meta-programming methods). Also, note that propagating inline asm
code will spread potential bugs in it; so watch out doubly for register
constraints in such inline asm code.
</para>
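<para>
The following sketch shows local labels inside an inline function, assuming an
x86 target and a GCC-compatible compiler; the function name
<function>bit_length</function> is made up for the example. Because the labels
are numeric, the function can be inlined at several call sites without
producing duplicate symbols.
</para>
<programlisting language="c"><![CDATA[
/* Count how many times `value` can be shifted right before it becomes zero.
 * "1:" and "2:" are local labels; "1b" means the nearest "1:" looking
 * backwards and "2f" the nearest "2:" looking forwards. */
static inline unsigned bit_length(unsigned value)
{
    unsigned count = 0;
    __asm__("1:\n\t"
            "testl %1, %1\n\t"     /* is value zero yet? */
            "jz 2f\n\t"
            "shrl $1, %1\n\t"      /* value >>= 1 */
            "incl %0\n\t"          /* count++ */
            "jmp 1b\n"
            "2:"
            : "+r"(count), "+r"(value)   /* both are read and written */
            :
            : "cc");
    return count;
}
]]></programlisting>
<para>
A named label such as <literal>loop:</literal> in the same place would be
emitted once per inlined copy, and the assembler would reject the second
definition.
</para>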
<para>
Lastly, the C language itself may be considered as a good abstraction to
assembly programming, which relieves you from most of the trouble of assembling.
</para>
</section>
</section>
<section xml:id="p-gas" xreflabel="GAS">
<?dbhtml filename="gas.html"?>
<title>GAS</title>
<para>
GAS is the GNU Assembler, which GCC relies upon.
</para>
<section><title>Where to find it</title>
<para>
Find it at the same place where you've found GCC, in the binutils package.
The latest version of binutils is available from
<link xlink:href="http://sources.redhat.com/binutils/">
http://sources.redhat.com/binutils/</link>.
</para>
</section>
<section>
<title>What is this AT&amp;T syntax</title>
<para>
Because GAS was invented to support a 32-bit unix compiler, it uses standard
AT&amp;T syntax, which closely resembles the syntax for standard m68k assemblers,
and is standard in the UNIX world. This syntax is neither worse, nor better
than the Intel syntax. It's just different. When you get used to it, you find
it much more regular than the Intel syntax, though a bit boring.
</para>
<para>
Here are the major caveats about GAS syntax:
<itemizedlist>
<listitem>
<para>
Register names are prefixed with <literal>%</literal>, so that registers are
<literal>%eax</literal>, <literal>%dl</literal> and so on, instead of just
<literal>eax</literal>, <literal>dl</literal>, etc. This makes it possible to
include external C symbols directly in assembly source, without any risk of
confusion, or any need for ugly underscore prefixes.
</para>
</listitem>
<listitem>
<para>
The order of operands is source(s) first, and destination last, as opposed to the
Intel convention of destination first and sources last. Hence, what in Intel
syntax is <function>mov eax,edx</function> (move contents of register
<literal>edx</literal> into register <literal>eax</literal>) will be in GAS
syntax <function>mov %edx,%eax</function>.
</para>
</listitem>
<listitem>
<para>
The operand size is specified as a suffix to the instruction name. The suffix
is <literal>b</literal> for (8-bit) byte, <literal>w</literal> for (16-bit)
word, and <literal>l</literal> for (32-bit) long. For instance, the correct
syntax for the above instruction would have been
<function>movl %edx,%eax</function>. However, gas does not require strict
AT&amp;T syntax, so the suffix is optional when size can be guessed from
register operands, and else defaults to 32-bit (with a warning).
</para>
</listitem>
<listitem>
<para>
Immediate operands are marked with a <literal>$</literal> prefix, as in
<function>addl $5,%eax</function> (add immediate long value 5 to register
<literal>%eax</literal>).
</para>
</listitem>
<listitem>
<para>
A missing operand prefix indicates a memory operand (its contents); hence
<function>movl $foo,%eax</function> puts the <emphasis>address</emphasis> of
variable <literal>foo</literal> into register <literal>%eax</literal>, but
<function>movl foo,%eax</function> puts the <emphasis>contents</emphasis> of
variable <literal>foo</literal> into register <literal>%eax</literal>.
</para>
</listitem>
<listitem>
<para>
Indexing or indirection is done by enclosing the index register or indirection
memory cell address in parentheses, as in
<function>testb $0x80,17(%ebp)</function> (test the high bit of the byte value
at offset 17 from the cell pointed to by <literal>%ebp</literal>).
</para>
</listitem>
</itemizedlist>
</para>
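<para>
Several of these rules can be seen together in a short sketch written as GCC
inline assembly, assuming an x86 target. The variable <literal>foo</literal>
and the function <function>att_demo</function> are hypothetical names; inside
the template, <literal>%0</literal> to <literal>%2</literal> are operand
placeholders that GCC replaces with registers or memory references.
</para>
<programlisting language="c"><![CDATA[
#include <stdint.h>

static uint32_t foo = 7;   /* hypothetical variable, read as a memory operand */

/* Returns src + 5 + foo. */
static inline uint32_t att_demo(uint32_t src)
{
    uint32_t dst;
    __asm__("movl %1, %0\n\t"   /* source first, destination last; "l" = 32-bit */
            "addl $5, %0\n\t"   /* "$" marks an immediate operand */
            "addl %2, %0"       /* a memory operand contributes its contents */
            : "=&r"(dst)        /* early clobber: written before %2 is read */
            : "r"(src), "m"(foo)
            : "cc");
    return dst;
}
]]></programlisting>
<para>
Compiling this with <option>-S</option> shows the placeholders replaced by
<literal>%</literal>-prefixed register names and a memory reference to
<literal>foo</literal>.
</para>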
<para>
<anchor xml:id="p-convert"/>
Note: There are <link linkend="s-res">a few programs</link> which may help you to
convert source code between AT&amp;T and Intel assembler syntaxes; some of them
are capable of performing conversion in both directions.
</para>
<para>
GAS has comprehensive documentation in TeXinfo format, which comes at least
with the source distribution. Browse extracted <filename>.info</filename>
pages with Emacs or whatever. There used to be a file named gas.doc or as.doc
around the GAS source package, but it was merged into the TeXinfo docs. Of
course, in case of doubt, the ultimate documentation is the sources themselves!
A section that will particularly interest you is
<literal>Machine Dependencies::i386-Dependent::</literal>
</para>
<para>
Again, the sources for Linux (the OS kernel) come in as excellent examples;
see under <filename>linux/arch/i386/</filename> the following files:
<filename>kernel/*.S</filename>, <filename>boot/compressed/*.S</filename>,
<filename>math-emu/*.S</filename>.
</para>
<para>
If you are writing some kind of language, a thread package, etc., you might as
well see how other languages (<link xlink:href="http://para.inria.fr/">
OCaml</link>, <link xlink:href="http://www.jwdt.com/~paysan/gforth.html">
Gforth</link>, etc.), or thread packages (QuickThreads, MIT pthreads,
LinuxThreads, etc), or whatever else do it.
</para>
<para>
Finally, just compiling a C program to assembly might show you the syntax for
the kind of instructions you want. See section <xref linkend="s-doyou"/> above.
</para>
</section>
<section>
<title>Intel syntax</title>
<para>
The good news is that, starting with the binutils 2.10 release, GAS supports
Intel syntax too. It can be enabled with the <literal>.intel_syntax</literal>
directive. Unfortunately this mode is not documented (yet?) in the official
binutils manual, so if you want to use it, try to examine
<link xlink:href="http://www.lxhp.in-berlin.de/lhpas86.html">
http://www.lxhp.in-berlin.de/lhpas86.html</link>, which is an extract from the
AMD 64-bit port of binutils 2.11.
</para>
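<para>
A small sketch of switching syntaxes within one source file follows. The
<literal>noprefix</literal> argument and the <literal>.att_syntax</literal>
directive are not mentioned above; they are shown on the assumption that your
binutils version accepts them. <literal>foo</literal> is a hypothetical symbol
defined elsewhere.
</para>
<para>
<programlisting>
        .intel_syntax noprefix          # Intel operand order, no % on registers
        mov     eax, offset foo         # address of foo  (AT&amp;T: movl $foo,%eax)
        mov     ebx, dword ptr [foo]    # contents of foo (AT&amp;T: movl foo,%ebx)
        add     ebx, 5                  # destination first, immediate without $
        test    byte ptr [ebp+17], 0x80 # AT&amp;T: testb $0x80,17(%ebp)
        .att_syntax prefix              # switch back to AT&amp;T syntax
</programlisting>
</para>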
</section>
<section>
<title>16-bit mode</title>
<para>
Binutils (2.9.1.0.25+) now fully support 16-bit mode
(registers <emphasis>and</emphasis> addressing) on i386 PCs. Use
<literal>.code16</literal> and <literal>.code32</literal> to switch
between assembly modes.
</para>
<para>
Also, a neat trick used by several people (including the oskit authors) is to
force GCC to produce code for 16-bit real mode, using an inline assembly
statement <literal>asm(".code16\n")</literal>. GCC will still emit only 32-bit
addressing modes, but GAS will insert proper 32-bit prefixes for them.
</para>
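<para>
The mode switch can be sketched as follows. This is an illustrative fragment
only, not a complete real-mode program; the register choices are arbitrary.
</para>
<para>
<programlisting>
        .code16                         # assemble what follows for 16-bit mode
        movw    $0x1234, %ax            # native 16-bit operand, no prefix needed
        movb    (%bx,%si), %al          # native 16-bit addressing mode
        movl    $0x12345678, %eax       # 32-bit operand: GAS adds an operand-size prefix
        movl    (%eax), %ebx            # 32-bit addressing: GAS adds an address-size prefix
        .code32                         # back to 32-bit mode
</programlisting>
</para>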
</section>
<section>
<title>Macro support</title>
<para>
GAS has some macro capability included, as detailed in the texinfo docs.
Moreover, while GCC recognizes <filename>.s</filename> files as raw assembly
to send to GAS, it also recognizes <filename>.S</filename> files as files
to pipe through CPP before feeding them to GAS. Again and again, see Linux
sources for examples.
</para>
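<para>
As a rough illustration of the built-in macro capability, here is a sketch with
two made-up macros, <literal>save2</literal> and <literal>restore2</literal>;
the names and the choice of registers are assumptions for the example, not
something taken from the GAS documentation.
</para>
<para>
<programlisting>
# save2 pushes two registers (hypothetical helper macro)
        .macro  save2 r1, r2
        pushl   \r1
        pushl   \r2
        .endm

# restore2 pops them again, in the opposite order
        .macro  restore2 r1, r2
        popl    \r2
        popl    \r1
        .endm

        .text
        save2    %esi, %edi             # expands to pushl %esi; pushl %edi
        restore2 %esi, %edi             # expands to popl %edi; popl %esi
</programlisting>
</para>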
<para>
GAS also has GASP (GAS Preprocessor), which adds all the usual macroassembly
tricks to GAS. GASP comes together with GAS in the GNU binutils archive. It
works as a filter, like <xref linkend="p-cpp"/> and <xref linkend="p-m4"/>. I
do not know the details, but it comes with its own texinfo documentation,
which you may want to browse (<command>info gasp</command>), print, grok.
GAS with GASP looks like a regular macro-assembler to me.
</para>
</section>
</section>
<section xml:id="p-nasm" xreflabel="NASM">
<?dbhtml filename="nasm.html"?>
<title>NASM</title>
<para>
The Netwide Assembler project provides a cool i386 assembler, written in C, that
should be modular enough to eventually support all known syntaxes and object
formats.
</para>
<section xml:id="p-nasm-where">
<title>Where to find NASM</title>
<para>
<link xlink:href="http://www.nasm.us">http://www.nasm.us</link>,
<link xlink:href="http://sourceforge.net/projects/nasm/">
http://sourceforge.net/projects/nasm/</link>
</para>
<para>
Binary releases are on your usual metalab mirror, in the
<filename>devel/lang/asm/</filename> directory. NASM should also be available as
<filename>.rpm</filename> or <filename>.deb</filename> in your usual Linux
distribution.
</para>
</section>
<section>
<title>What it does</title>
<para>
The syntax is Intel-style. Comprehensive macroprocessing support is integrated.
</para>
<para>
Supported object file formats are <literal>bin</literal>,
<literal>aout</literal>, <literal>coff</literal>, <literal>elf</literal>,
<literal>as86</literal>, <literal>obj</literal> (DOS), <literal>win32</literal>,
<literal>rdf</literal> (their own format).
</para>
<para>
NASM can be used as a backend for the free LCC compiler (support files
included).
</para>
<para>
Unless you're using BCC as a 16-bit compiler (which is out of scope of this
32-bit HOWTO), you should definitely use NASM instead of say AS86 or MASM,
because it runs on all platforms.
</para>
<note>
<para>
NASM comes with a disassembler, NDISASM.
</para>
</note>
<para>
Its hand-written parser makes it much faster than GAS, though of course, it
doesn't support three bazillion different architectures. If you like
Intel-style syntax, as opposed to GAS syntax, then it should be the assembler
of choice...
</para>
<para>
Note: There are a <link linkend="s-res">few programs</link> which may help you
convert source code between AT&amp;T and Intel assembler syntaxes; some of
them are capable of performing conversion in both directions.
</para>
</section>
</section>
<section xml:id="p-other">
<?dbhtml filename="other.html"?>
<title>Other Assemblers</title>
<para>
There are other assemblers with various interesting and outstanding features
which may be of interest to you as well.
</para>
<note>
<para>
They can be in various stages of development, and can be
non-classic/high-level/whatever else.
</para>
</note>
<section>
<title>AS86</title>
<para>
AS86 is an 80x86 assembler (16-bit and 32-bit) with integrated macro support.
Its syntax is mostly Intel-style, though it differs slightly in addressing modes.
Some time ago it was used in several projects, including the Linux kernel,
but eventually most of those projects moved to GAS or NASM. AFAIK, only
ELKS continues to use it.
</para>
<para>
AS86 can be found at
<link xlink:href="http://www.debath.co.uk/dev86/">
http://www.debath.co.uk/dev86/</link>, in the bin86 package with linker (ld86),
or as a separate archive. Documentation is available as the man page and as.doc
from the source package. When in doubt, the source code itself is often a good
doc: though it is not very well commented, the programming style is
straightforward. AS86 is part of a number of BSD and Linux distributions.
</para>
<note>
<para>
AS86 is primarily a 16 bit assembler.
</para>
</note>
<note>
<title>Using AS86 with BCC</title>
<para>
Here's the GNU Makefile entry for using BCC to transform
<filename>.s</filename> asm into both a.out <filename>.o</filename> object
and <filename>.l</filename> listing:
</para>
<para>
<programlisting>
%.o %.l: %.s
	bcc -3 -G -c -A-d -A-l -A$*.l -o $*.o $&lt;
</programlisting>
</para>
<para>
Remove the <literal>%.l</literal>, <literal>-A-l</literal>, and
<literal>-A$*.l</literal>, if you don't want any listing. If you want something
other than a.out, you can examine BCC docs about the other supported formats,
and/or use the objcopy utility from the GNU binutils package.
</para>
</note>
</section>
<section>
<title>YASM</title>
<para>
YASM is a complete rewrite of the NASM assembler under the "new" BSD License.
It is designed from the ground up to allow for multiple syntaxes to be
supported (e.g., NASM, TASM, GAS, etc.) in addition to multiple output object
formats including COFF, Win32 and Mach-O. Another primary module of the overall
design is an optimizer module.
</para>
</section>
<section>
<title>FASM</title>
<para>
FASM (flat assembler) is a fast, efficient 80x86 assembler that runs in
'flat real mode'. Unlike many other 80x86 assemblers, FASM only requires the
source code to include the information it really needs. It is written in itself
and is very small and fast. It runs on DOS/Windows/Linux and can produce flat
binary, DOS EXE, Win32 PE, COFF and Linux ELF output. See
<link xlink:href="http://flatassembler.net">http://flatassembler.net</link>.
</para>
</section>
<section>
<title>OSIMPA (SHASM)</title>
<para>
osimpa is an assembler for Intel 80386 and later processors, written
entirely in the GNU Bash command interpreter shell. The predecessor of osimpa
was shasm. osimpa is much cleaned up, can create useful Linux ELF executables,
and has various HLL-like extensions and programmer convenience commands.
</para>
<para>
It is (of course) slower than other assemblers. It has its own syntax (and uses
its own names for x86 opcodes). Fairly good documentation is included. Check it
out: <link xlink:href="ftp://linux01.gwdg.de/pub/cLIeNUX/interim/">
ftp://linux01.gwdg.de/pub/cLIeNUX/interim/</link> (access is password
controlled). You will probably not use it on a regular basis, but at least it
deserves your attention as an interesting idea.
</para>
</section>
<section>
<title>AASM</title>
<para>
Aasm is an advanced assembler designed to support several target architectures.
It has been designed to be easily extended, and should be considered a good
alternative to developing a monolithic assembler for each new target CPU
and binary file format.
</para>
<para>
Aasm should make assembly programming easier for developers, by providing
a set of advanced features including symbol scopes, an expressions engine,
big integer support, macro capability, and numerous and accurate warning messages.
Its dynamic modular architecture enables Aasm to extend its set of features
with plug-ins by taking advantage of dynamic libraries.
</para>
<para>
The input module supports Intel syntax (like nasm, tasm, masm, etc.).
The x86 assembler module supports all opcodes up to P6 including MMX, SSE
and 3DNow! extensions. F-CPU and SPARC assembler modules are under development.
Several output modules are available for ELF, COFF, IntelHex, and raw binary
formats.
</para>
<para>
<link xlink:href="http://savannah.nongnu.org/projects/aasm/">
http://savannah.nongnu.org/projects/aasm/</link>
</para>
</section>
<section>
<title>TDASM</title>
<para>
The Table Driven Assembler (TDASM) is a <emphasis>free</emphasis> portable
cross assembler for any kind of assembly language. It should be possible to use
it as a compiler to any target microprocessor using a table that defines the
compilation process.
</para>
<para>
It is available from <link xlink:href="http://www.penguin.cz/~niki/tdasm/">
http://www.penguin.cz/~niki/tdasm/</link> but it seems it is no longer
actively maintained.
</para>
</section>
<section>
<title>HLA</title>
<para>
<link xlink:href="http://www.plantation-productions.com/Webster/HighLevelAsm/index.html">
HLA</link> is a <emphasis>H</emphasis>igh <emphasis>L</emphasis>evel
<emphasis>A</emphasis>ssembly language. It uses a high level language like
syntax (similar to Pascal, C/C++, and other HLLs) for variable declarations,
procedure declarations, and procedure calls. It uses a modified assembly
language syntax for the standard machine instructions. It also provides several
high level language style control structures (if, while, repeat..until, etc.)
that help you write much more readable code.
</para>
<para>
HLA is free and comes with source; Linux and Win32 versions are available. On
Win32 you need MASM and a 32-bit version of MS-link; on Linux you need GAS,
because HLA produces code for the specified assembler and uses that assembler
for final assembling and linking.
</para>
</section>
<section>
<title>TALC</title>
<para>
<link xlink:href="http://www.cs.cornell.edu/talc/">TALC</link> is another free
MASM/Win32 based compiler (however it supports ELF output, does it?).
</para>
<para>
TAL stands for <emphasis>T</emphasis>yped <emphasis>A</emphasis>ssembly
<emphasis>L</emphasis>anguage. It extends traditional untyped assembly
languages with typing annotations, memory management primitives, and a sound
set of typing rules, to guarantee the memory safety, control flow safety, and
type safety of TAL programs. Moreover, the typing constructs are expressive
enough to encode most source language programming features including records
and structures, arrays, higher-order and polymorphic functions, exceptions,
abstract data types, subtyping, and modules. Just as importantly, TAL is
flexible enough to admit many low-level compiler optimizations. Consequently,
TAL is an ideal target platform for type-directed compilers that want to
produce verifiably safe code for use in secure mobile code applications or
extensible operating system kernels.
</para>
</section>
<section>
<title>Free Pascal</title>
<para>
<link xlink:href="http://www.freepascal.org">Free Pascal</link> has an internal
32-bit assembler (based on NASM tables) and a switchable output that allows:
<itemizedlist>
<listitem>
<para>
Binary (ELF and coff when crosscompiled .o) output
</para>
</listitem>
<listitem>
<para>
NASM
</para>
</listitem>
<listitem>
<para>
MASM
</para>
</listitem>
<listitem>
<para>
TASM
</para>
</listitem>
<listitem>
<para>
AS (aout, coff, elf32)
</para>
</listitem>
</itemizedlist>
</para>
<para>
The MASM and TASM output are not as well debugged as the other two, but can be
handy sometimes.
</para>
<para>
The assembler's look and feel are based on Turbo Pascal's internal BASM, and
the IDE supports similar highlighting, and FPC can fully integrate with gcc
(on C level, not C++).
</para>
<para>
Using a dummy RTL, one can even generate pure assembler programs.
</para>
</section>
<section>
<title>Win32Forth assembler</title>
<para>
Win32Forth is a <emphasis>free</emphasis> 32-bit ANS FORTH system that
successfully runs under Win32s, Win95, Win/NT. It includes a free 32-bit
assembler (either prefix or postfix syntax) integrated into the reflective
FORTH language. Macro processing is done with the full power of the reflective
language FORTH; however, the only supported input and output context is
Win32For itself (no dumping of <filename>.obj</filename> file, but you could
add that feature yourself, of course). Find it at
<link xlink:href="ftp://ftp.forth.org/pub/Forth/Compilers/native/windows/Win32For/">
ftp://ftp.forth.org/pub/Forth/Compilers/native/windows/Win32For/</link>.
</para>
</section>
<section>
<title>Terse</title>
<para>
<link xlink:href="http://www.terse.com">Terse</link> is a programming tool that
provides <emphasis>THE</emphasis> most compact assembler syntax for the x86
family! However, it is evil proprietary software. It is said that there was a
project for a free clone somewhere, that was abandoned after worthless pretenses
that the syntax would be owned by the original author. Thus, if you're looking
for a nifty programming project related to assembly hacking, I invite you to
develop a terse-syntax frontend to NASM, if you like that syntax.
</para>
<para>
As an interesting historic remark, on
<link xlink:href="news:comp.compilers">comp.compilers</link>,
</para>
<para>
<literallayout>
1999/07/11 19:36:51, the moderator wrote:
"There's no reason that assemblers have to have awful syntax. About
30 years ago I used Niklaus Wirth's PL360, which was basically a S/360
assembler with Algol syntax and a little syntactic sugar like while
loops that turned into the obvious branches. It really was an
assembler, e.g., you had to write out your expressions with explicit
assignments of values to registers, but it was nice. Wirth used it to
write Algol W, a small fast Algol subset, which was a predecessor to
Pascal. As is so often the case, Algol W was a significant
improvement over many of its successors. -John"
</literallayout>
</para>
</section>
<section>
<title>Non-free and/or Non-32bit x86 assemblers</title>
<para>
You may find more about them, together with the basics of x86 assembly
programming, in
<link linkend="s-res-gen">Raymond Moon's x86 assembly FAQ</link>.
</para>
<para>
Note that all DOS-based assemblers should work inside the Linux DOS Emulator,
as well as other similar emulators, so that if you already own one, you can
still use it inside a real OS. Recent DOS-based assemblers also support COFF
and/or other object file formats that are supported by the GNU BFD library,
so that you can use them together with your free 32-bit tools, perhaps using
GNU objcopy (part of the binutils) as a conversion filter.
</para>
</section>
</section>
</chapter>
<chapter xml:id="s-meta" xreflabel="Metaprogramming">
<?dbhtml filename="metaprogramming.html"?>
<title>Metaprogramming</title>
<para>
Assembly programming is a bore, except for the critical parts of programs.
</para>
<para>
You should use the appropriate tool for the right task, so don't choose
assembly when it does not fit; C, OCaml, Perl, or Scheme might be a better
choice in most cases.
</para>
<para>
However, there are cases when these tools do not give fine enough control over
the machine, and assembly is useful or needed. In these cases you'll
appreciate a system of macroprocessing and metaprogramming that allows
recurring patterns to be factored each into one indefinitely reusable
definition, which allows safer programming, automatic propagation of pattern
modification, etc. Plain assembler often is not enough, even when one is doing
only small routines to link with C.
</para>
<section>
<?dbhtml filename="external.html"?>
<title>External filters</title>
<para>
Whatever macro support your assembler offers, and whatever language you
use (even C!), if the language is not expressive enough for you, you can have
files passed through an external filter with a Makefile rule like this:
</para>
<para>
<programlisting>
%.s: %.S other_dependencies
	$(FILTER) $(FILTER_OPTIONS) &lt; $&lt; &gt; $@
</programlisting>
</para>
<section xml:id="p-cpp" xreflabel="CPP">
<title>CPP</title>
<para>
CPP is truly not very expressive, but it's enough for easy things, it's
standard, and called transparently by GCC.
</para>
<para>
As an example of its limitations, you can't declare objects so that destructors
are automatically called at the end of the declaring block; you don't have
diversions or scoping, etc.
</para>
<para>
CPP comes with any C compiler. However, considering how mediocre it is, stay
away from it if by chance you can make it without C.
</para>
</section>
<section xml:id="p-m4" xreflabel="M4">
<title>M4</title>
<para>
M4 gives you the full power of macroprocessing, with a Turing equivalent
language, recursion, regular expressions, etc. You can do with it everything
that CPP cannot.
</para>
<para>
See
<link xlink:href="ftp://ftp.forth.org/pub/Forth/Compilers/native/unix/this4th.tar.gz">
macro4th (this4th)</link> as an example of advanced macroprogramming using m4.
</para>
<para>
However, its dysfunctional quoting and unquoting semantics force you to use
explicit continuation-passing tail-recursive macro style if you want to do
<emphasis>advanced</emphasis> macro programming (which is reminiscent of TeX
-- BTW, has anyone tried to use TeX as a macroprocessor for anything other than
typesetting?). This is no worse than CPP, which allows neither quoting nor
recursion anyway.
</para>
<para>
The right version of M4 to get is <literal>GNU m4</literal> which has the most
features and the fewest bugs or limitations of all. m4 is designed to be slow
for anything but the simplest uses, which might still be ok for most assembly
programming (you are not writing million-lines assembly programs, are you?).
</para>
</section>
<section>
<title>Macroprocessing with your own filter</title>
<para>
You can write your own simple macro-expansion filter with the usual tools:
perl, awk, sed, etc. It can be made rather quickly, and you control everything.
But, of course, power in macroprocessing implies "the hard way".
</para>
</section>
</section>
<section>
<?dbhtml filename="meta.html"?>
<title>Metaprogramming</title>
<para>
Instead of using an external filter that expands macros, one way to do things
is to write programs that write part or all of other programs.
</para>
<para>
For instance, you could use a program outputting source code
<itemizedlist>
<listitem>
<para>
to generate sine/cosine/whatever lookup tables,
</para>
</listitem>
<listitem>
<para>
to extract a source-form representation of a binary file,
</para>
</listitem>
<listitem>
<para>
to compile your bitmaps into fast display routines,
</para>
</listitem>
<listitem>
<para>
to extract documentation, initialization/finalization code,
description tables, as well as normal code from the same source files,
</para>
</listitem>
<listitem>
<para>
to have customized assembly code, generated from a perl/shell/scheme script
that does arbitrary processing,
</para>
</listitem>
<listitem>
<para>
to propagate data defined at one point only into several cross-referencing
tables and code chunks.
</para>
</listitem>
<listitem>
<para>
etc.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Think about it!
</para>
<section>
<title>Backends from compilers</title>
<para>
Compilers like GCC, SML/NJ, Objective CAML, MIT-Scheme, CMUCL, etc,
do have their own generic assembler backend,
which you might choose to use,
if you intend to generate code semi-automatically
from the corresponding languages,
or from a language you hack:
rather than write great assembly code,
you may instead modify a compiler so that it dumps great assembly code!
</para>
</section>
<section>
<title>The New-Jersey Machine-Code Toolkit</title>
<para>
There is a project, using the programming language Icon
(with an experimental ML version),
to build a basis for producing assembly-manipulating code.
See around
<link xlink:href="http://www.eecs.harvard.edu/~nr/toolkit/">
http://www.eecs.harvard.edu/~nr/toolkit/</link>
</para>
</section>
<section>
<title>TUNES</title>
<para>
The <link xlink:href="http://www.tunes.org">TUNES Project</link>
for a Free Reflective Computing System is developing its own assembler
as an extension to the Scheme language, as part of its development process.
It doesn't run at all yet, though help is welcome.
</para>
<para>
The assembler manipulates abstract syntax trees, so it could equally serve as
the basis for an assembly syntax translator, a disassembler, a common
assembler/compiler back-end, etc. Also, the full power of a real language,
Scheme, makes it unchallenged for macroprocessing/metaprogramming.
</para>
</section>
</section>
</chapter>
<chapter xml:id="s-call" xreflabel="Calling conventions">
<?dbhtml filename="conventions.html"?>
<title>Calling conventions</title>
<section>
<?dbhtml filename="linux.html"?>
<title>Linux</title>
<section>
<title>Linking to GCC</title>
<para>
This is the preferred way if you are developing a mixed C-asm project. Check GCC
docs and examples from Linux kernel <filename>.S</filename> files that go through
<application>gas</application> (not those that go through
<application>as86</application>).
</para>
<para>
32-bit arguments are pushed onto the stack in reverse syntactic order (hence
accessed/popped in the right order), above the 32-bit near return address.
<literal>%ebp</literal>, <literal>%esi</literal>, <literal>%edi</literal>,
<literal>%ebx</literal> are callee-saved, other registers are caller-saved;
<literal>%eax</literal> is to hold the result, or <literal>%edx:%eax</literal>
for 64-bit results.
</para>
<para>
FP stack: floating-point results are returned in <literal>st(0)</literal>;
the whole stack is caller-saved.
</para>
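<para>
To make the convention above concrete, here is a minimal sketch of a routine
meant to be called from C. The function name <function>sum3</function> and its
prototype are invented for the example; it assumes 32-bit ELF output and
GCC's default calling convention.
</para>
<para>
<programlisting>
# Hypothetical C prototype:  int sum3(int a, int b, int c);
        .text
        .globl  sum3
        .type   sum3, @function
sum3:
        pushl   %ebp                # %ebp is callee-saved
        movl    %esp, %ebp
        pushl   %ebx                # %ebx is callee-saved as well
        movl    8(%ebp), %eax       # a: first argument, just above the return address
        movl    12(%ebp), %ebx      # b
        addl    %ebx, %eax
        addl    16(%ebp), %eax      # c
        popl    %ebx                # restore callee-saved registers
        popl    %ebp
        ret                         # result in %eax; arguments stay for the caller to remove
</programlisting>
</para>
<para>
On a 64-bit host, building it together with a C caller would presumably need
something like <command>gcc -m32 main.c sum3.s</command>, where
<filename>main.c</filename> is your own file declaring and calling
<function>sum3()</function>.
</para>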
<para>
Note that GCC has options to modify the calling conventions by reserving
registers, having arguments in registers, not assuming the FPU, etc. Check the
i386 <filename>.info</filename> pages.
</para>
<para>
Beware that you must then declare the <literal>cdecl</literal> or
<literal>regparm(0)</literal> attribute for a function that will follow
standard GCC calling conventions. See
<literal>C Extensions::Extended Asm::</literal> section from the GCC info
pages. See also how Linux defines its <literal>asmlinkage</literal> macro.
</para>
</section>
<section>
<title>ELF vs a.out problems</title>
<para>
Some C compilers prepend an underscore before every symbol, while others do
not.
</para>
<para>
Particularly, Linux a.out GCC does such prepending, while Linux ELF GCC does
not.
</para>
<para>
If you need to cope with both behaviors at once, see how existing packages do.
For instance, get an old Linux source tree, the Elk, qthreads, or OCaml.
</para>
<para>
You can also override the implicit C->asm renaming by inserting statements like
<programlisting>
void foo(void) asm("bar");
</programlisting>
to be sure that the C function <function>foo()</function>
will really be called <function>bar</function> in assembly.
</para>
<para>
Note that the <command>objcopy</command> utility from the binutils package
should allow you to transform your a.out objects into ELF objects,
and perhaps the contrary too, in some cases.
More generally, it will do lots of file format conversions.
</para>
</section>
<section>
<title>Direct Linux syscalls</title>
<para>
Often you will be told that using the <application>C library</application>
(<acronym>libc</acronym>) is the only way, and that direct system calls are bad.
This is true, to some extent. In general, you should know that
<application>libc</application> is not sacred, and in <emphasis>most</emphasis>
cases it only does some checks, then calls the kernel, and then sets errno. You
can easily do this in your program as well (if you need to), and your program
will be a dozen times smaller, and this will result in improved performance as
well, just because you're not using shared libraries (static binaries are faster).
Using or not using <application>libc</application> in assembly programming is
more a question of taste/belief than something practical. Remember, Linux
aims to be POSIX compliant, and so does <application>libc</application>.
This means that the syntax of almost all <application>libc</application> "system
calls" exactly matches the syntax of real kernel system calls (and vice versa).
Besides, <application>GNU libc</application> (<application>glibc</application>)
becomes slower and slower from version to version, and eats more and more
memory; and so, cases of using direct system calls become quite usual.
However, the main drawback of throwing <application>libc</application> away
is that you will possibly need to implement several
<application>libc</application> specific functions (that are not just syscall
wrappers) on your own (<function>printf()</function> and Co.), and you are
ready for that, aren't you? :-)
</para>
<para>
Here is a summary of the pros and cons of direct system calls.
</para>
<para>
Pros:
<itemizedlist>
<listitem>
<para>
the smallest possible size; squeezing the last byte out of the system
</para>
</listitem>
<listitem>
<para>
the highest possible speed; squeezing cycles out of your favorite benchmark
</para>
</listitem>
<listitem>
<para>
full control: you can adapt your program/library to your specific language or
memory requirements or whatever
</para>
</listitem>
<listitem>
<para>
no pollution by libc cruft
</para>
</listitem>
<listitem>
<para>
no pollution by C calling conventions (if you're developing your own language
or environment)
</para>
</listitem>
<listitem>
<para>
static binaries make you independent from libc upgrades or crashes, or from
dangling <literal>#!</literal> path to an interpreter (and are faster)
</para>
</listitem>
<listitem>
<para>
just for the fun of it (don't you get a kick out of assembly programming?)
</para>
</listitem>
</itemizedlist>
</para>
<para>
Cons:
<itemizedlist>
<listitem>
<para>
If any other program on your computer uses the libc, then duplicating the libc
code will actually waste memory, not save it.
</para>
</listitem>
<listitem>
<para>
Services redundantly implemented in many static binaries are a waste of memory.
But you can make your libc replacement a shared library.
</para>
</listitem>
<listitem>
<para>
Size is much better saved by having some kind of bytecode, wordcode, or
structure interpreter than by writing everything in assembly. (The interpreter
itself could be written either in C or assembly.) The best way to keep multiple
binaries small is to not have multiple binaries, but instead to have an
interpreter process files with <literal>#!</literal> prefix. This is how OCaml
works when used in wordcode mode (as opposed to optimized native code mode),
and it is compatible with using the libc. This is also how Tom Christiansen's
Perl PowerTools reimplementation of unix utilities works. Finally, one last way
to keep things small, that doesn't depend on an external file with a hardcoded
path, be it library or interpreter, is to have only one binary, and have
multiply-named hard or soft links to it: the same binary will provide everything
you need in an optimal space, with no redundancy of subroutines or useless
binary headers; it will dispatch its specific behavior according to its
<parameter>argv[0]</parameter>; in case it isn't called with a recognized name,
it might default to a shell, and be possibly thus also usable as an interpreter!
</para>
</listitem>
<listitem>
<para>
You cannot benefit from the many facilities that libc provides besides mere
Linux syscalls: that is, functionality described in section 3 of the manual
pages, as opposed to section 2, such as malloc, threads, locale, password,
high-level network management, etc.
</para>
</listitem>
<listitem>
<para>
Therefore, you might have to reimplement large parts of libc, from
<function>printf()</function> to <function>malloc()</function> and
<function>gethostbyname()</function>. It's redundant with the libc effort, and
can be <emphasis>quite</emphasis> boring sometimes. Note that some people have
already reimplemented "light" replacements for parts of the libc, so check
them out! (Redhat's minilibc, Rick Hohensee's
<link xlink:href="ftp://linux01.gwdg.de/pub/cLIeNUX/interim/libsys.tgz">libsys</link>,
Felix von Leitner's <link xlink:href="http://www.fefe.de/dietlibc/">dietlibc</link>,
<!-- Christian Fowelin's <link xlink:href="http://www.fowelin.de/christian/computer/libASM/">libASM</link>, -->
<link xlink:href="http://asm.sourceforge.net/asmutils.html">asmutils</link>
project is working on pure assembly libc)
</para>
</listitem>
<listitem>
<para>
Static linking prevents you from benefiting from libc upgrades as well as from libc
add-ons such as the <application>zlibc</application> package, that does
on-the-fly transparent decompression of gzip-compressed files.
</para>
</listitem>
<listitem>
<para>
The few instructions added by the libc can be a
<emphasis>ridiculously</emphasis> small speed overhead as compared to the cost
of a system call. If speed is a concern, your main problem is in your usage of
system calls, not in their wrapper's implementation.
</para>
</listitem>
<listitem>
<para>
Using the standard assembly API for system calls is much slower than using the
libc API when running in micro-kernel versions of Linux such as L4Linux, that
have their own faster calling convention, and pay high convention-translation
overhead when using the standard one (L4Linux comes with libc recompiled with
their syscall API; of course, you could recompile your code with their API,
too).
</para>
</listitem>
<listitem>
<para>
See previous discussion for general speed optimization issue.
</para>
</listitem>
<listitem>
<para>
If syscalls are too slow for you, you might want to hack the kernel sources
(in C) instead of staying in userland.
</para>
</listitem>
</itemizedlist>
</para>
<para>
If you've pondered the above pros and cons, and still want to use direct
syscalls, then here is some advice.
</para>
<para>
<itemizedlist>
<listitem>
<para>
You can easily define your system calling functions in a portable way in C
(as opposed to unportably, in assembly), by including
<filename>asm/unistd.h</filename>, and using the provided macros.
</para>
</listitem>
<listitem>
<para>
Since you're trying to replace it, go get the sources for the libc, and
grok them. (And if you think you can do better, then send feedback to the
authors!)
</para>
</listitem>
<listitem>
<para>
As an example of pure assembly code that does everything you want, examine
<xref linkend="s-res"/>.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Basically, you issue an <function>int 0x80</function>, with the
<literal>__NR_</literal>syscallname number (from <filename>asm/unistd.h</filename>)
in <literal>eax</literal>, and parameters (up to <link linkend="six-arg">six</link>)
in <literal>ebx</literal>, <literal>ecx</literal>, <literal>edx</literal>,
<literal>esi</literal>, <literal>edi</literal>, <link linkend="six-arg">
<literal>ebp</literal></link> respectively.
</para>
<para>
The result is returned in <literal>eax</literal>, with a negative result being an
error, whose opposite is what libc would put into <literal>errno</literal>.
The user-stack is not touched, so you needn't have a valid one when doing a
syscall.
</para>
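<para>
As a minimal sketch of this convention, the following NASM program tries to
open a file that presumably does not exist and exits with the resulting
<literal>errno</literal> value as its status (2, <literal>ENOENT</literal>, if
the file is indeed missing). The path is purely illustrative, and the numbers
5 (<literal>__NR_open</literal>) and 1 (<literal>__NR_exit</literal>) are the
i386 values from <filename>asm/unistd.h</filename>:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        mov     eax,5           ; __NR_open on i386
        mov     ebx,path        ; first argument: pathname
        xor     ecx,ecx         ; second argument: flags (O_RDONLY = 0)
        xor     edx,edx         ; third argument: mode (unused here)
        int     0x80            ; call kernel
        test    eax,eax         ; a negative result means error
        jns     .ok
        neg     eax             ; eax = the value libc would store in errno
        mov     ebx,eax         ; use it as the exit status
        jmp     .exit
.ok:
        xor     ebx,ebx         ; success (a real program would use the fd)
.exit:
        mov     eax,1           ; __NR_exit
        int     0x80

section .data
path    db      "/nonexistent",0 ; hypothetical path, assumed to be missing
</programlisting>
</para>
<para>
A plain sign test is enough here because a successful open never returns a
negative descriptor. For system calls that return addresses, only results in
the range -4095 to -1 should be treated as errors.
</para>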
<note>
<para>
<anchor xml:id="six-arg"/>
Passing a sixth parameter in <literal>ebp</literal> appeared in Linux 2.4;
earlier Linux versions understand only five parameters in registers.
</para>
</note>
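<para>
To illustrate a call that uses all six registers, here is a small sketch that
maps one anonymous page with <literal>mmap2</literal>. It assumes the i386
system call number 192 (<literal>__NR_mmap2</literal>) and the usual Linux
values of the protection and mapping flags; treat it as an illustration rather
than a reference:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        mov     eax,192         ; __NR_mmap2 on i386
        xor     ebx,ebx         ; arg 1: address hint (0 = kernel chooses)
        mov     ecx,4096        ; arg 2: length in bytes
        mov     edx,3           ; arg 3: PROT_READ|PROT_WRITE
        mov     esi,0x22        ; arg 4: MAP_PRIVATE|MAP_ANONYMOUS
        mov     edi,-1          ; arg 5: no file descriptor
        xor     ebp,ebp         ; arg 6: offset, in 4096-byte units
        int     0x80            ; call kernel
        cmp     eax,-4096       ; unsigned results above -4096 are -errno
        ja      .fail
        mov     byte [eax],42   ; the mapping is usable
        xor     ebx,ebx         ; exit status 0
        jmp     .exit
.fail:
        mov     ebx,1           ; exit status 1
.exit:
        mov     eax,1           ; __NR_exit
        int     0x80
</programlisting>
</para>
<para>
The unsigned comparison is used because a valid mapping address may have its
top bit set and would look negative to a signed test.
</para>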
<para>
<link xlink:href="http://www.tldp.org/LDP/lki/">Linux Kernel Internals</link>,
and especially its chapter <link xlink:href="http://www.tldp.org/LDP/lki/lki-2.html#ss2.11">
How System Calls Are Implemented on i386 Architecture?</link>, will give
you a more thorough overview.
</para>
<para>
As for the invocation arguments passed to a process upon startup, the general
principle is that the stack originally contains the number of arguments
<parameter>argc</parameter>, then the list of pointers that constitute
<parameter>*argv</parameter> (terminated by a null pointer), then a
null-terminated list of pointers to the null-terminated
<literal>variable=value</literal> strings of the
<parameter>environ</parameter>ment. For more details, examine
<xref linkend="s-res"/>, read the sources of the C startup code from your libc
(<filename>crt0.S</filename> or <filename>crt1.S</filename>), or those from
the Linux kernel (<filename>exec.c</filename> and
<filename>binfmt_*.c</filename> in <filename>linux/fs/</filename>).
</para>
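<para>
The startup layout can be explored with a short NASM sketch that pops
<parameter>argc</parameter> and then each <parameter>argv</parameter> pointer
off the stack, printing every argument on its own line. It assumes the i386
numbers 4 (<literal>__NR_write</literal>) and 1 (<literal>__NR_exit</literal>),
and it takes the shortcut of overwriting each string's terminating null byte
with a newline, which is acceptable only because the program never reads the
strings again:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        pop     eax             ; eax = argc (not used further here)
.next:
        pop     ecx             ; ecx = next argv pointer
        test    ecx,ecx         ; a null pointer ends the argv list
        jz      .done
        xor     edx,edx         ; edx = length of the string so far
.len:
        cmp     byte [ecx+edx],0
        je      .print
        inc     edx
        jmp     .len
.print:
        mov     byte [ecx+edx],0xa ; replace the terminator with a newline
        inc     edx             ; and include it in the length
        mov     ebx,1           ; first argument: stdout
        mov     eax,4           ; __NR_write
        int     0x80
        jmp     .next
.done:
        xor     ebx,ebx         ; exit status 0
        mov     eax,1           ; __NR_exit
        int     0x80
</programlisting>
</para>
<para>
Continuing to pop after the null pointer would yield the environment
pointers in the same fashion.
</para>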
</section>
<section>
<title>Hardware I/O under Linux</title>
<para>
If you want to perform direct port I/O under Linux, either it's something very
simple that does not need OS arbitration, and you should see the
<literal>IO-Port-Programming</literal> mini-HOWTO; or it needs a kernel device
driver, and you should try to learn more about kernel hacking, device driver
development, kernel modules, etc., for which there are other excellent HOWTOs
and documents from the LDP.
</para>
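<para>
For the simple case, a rough sketch of direct port access from a user process
might look as follows. It assumes root privileges, the i386 system call number
101 (<literal>__NR_ioperm</literal>), and the conventional PC CMOS/RTC index
and data ports 0x70 and 0x71; none of these details come from this HOWTO, so
check them against your own headers and hardware before relying on them:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        mov     eax,101         ; __NR_ioperm on i386
        mov     ebx,0x70        ; first port of the range
        mov     ecx,2           ; number of ports (0x70 and 0x71)
        mov     edx,1           ; 1 = grant access
        int     0x80
        test    eax,eax
        jnz     .fail           ; nonzero result: permission was not granted
        xor     eax,eax         ; CMOS register 0 (RTC seconds)
        out     0x70,al         ; select the register
        in      al,0x71         ; read its value
        movzx   ebx,al          ; exit status = value read
        jmp     .exit
.fail:
        mov     ebx,255         ; exit status 255 on failure
.exit:
        mov     eax,1           ; __NR_exit
        int     0x80
</programlisting>
</para>
<para>
Nothing arbitrates between this program and the kernel's own use of the same
ports, which is exactly why anything less trivial belongs in a device driver.
</para>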
<para>
Particularly, if what you want is graphics programming, then do join one of the
<link xlink:href="http://www.ggi-project.org/">GGI</link> or
<link xlink:href="http://www.XFree86.org/">XFree86</link> projects.
</para>
<para>
Some people have even done better, writing small and robust XFree86 drivers in
an interpreted domain-specific language, GAL, and achieving the efficiency of
hand C-written drivers through partial evaluation (drivers not only not in asm,
but not even in C!). The problem is that the partial evaluator they used to
achieve efficiency is not free software. Any taker for a replacement?
</para>
<para>
Anyway, in all these cases, you'll be better off using GCC inline assembly
with the macros from <filename>linux/asm/*.h</filename> than writing full
assembly source files.
</para>
</section>
<section>
<title>Accessing 16-bit drivers from Linux/i386</title>
<para>
Such a thing is theoretically possible (proof: see how
<link xlink:href="http://www.dosemu.org">DOSEMU</link> can selectively grant
hardware port access to programs), and I've heard rumors that someone somewhere
did actually do it (in the PCI driver? Some VESA access stuff? ISA PnP? dunno).
If you have some more precise information on that, you'll be most welcome.
Anyway, good places to look for more information are the Linux kernel sources,
DOSEMU sources, and sources for various low-level programs under Linux
(perhaps GGI if it supports VESA).
</para>
<para>
Basically, you must either use 16-bit protected mode or vm86 mode.
</para>
<para>
The first is simpler to set up, but only works with well-behaved code that won't
do any kind of segment arithmetic or absolute segment addressing (particularly
addressing segment 0), unless by chance it happens that all segments used can
be set up in advance in the LDT.
</para>
<para>
The latter allows for more "compatibility" with vanilla 16-bit environments, but
requires more complicated handling.
</para>
<para>
In both cases, before you can jump to 16-bit code, you must
<itemizedlist>
<listitem>
<para>
mmap any absolute address used in the 16-bit code (such as ROM, video buffers,
DMA targets, and memory-mapped I/O) from <filename>/dev/mem</filename> to your
process' address space,
</para>
</listitem>
<listitem>
<para>
set up the LDT and/or vm86 mode monitor,
</para>
</listitem>
<listitem>
<para>
grab proper I/O permissions from the kernel (see the above section).
</para>
</listitem>
</itemizedlist>
</para>
<para>
Again, carefully read the source for the stuff contributed to the DOSEMU
project, particularly these mini-emulators for running ELKS and/or simple
<filename>.COM</filename> programs under Linux/i386.
</para>
</section>
</section>
<section>
<?dbhtml filename="dos.html"?>
<title>DOS and Windows</title>
<para>
Most DOS extenders come with some interface to DOS services. Read their docs
about that, but often, they just simulate <function>int 0x21</function> and
such, so you do "as if" you are in real mode (I doubt they have more than stubs
and extend things to work with 32-bit operands; they most likely will just
reflect the interrupt into the real-mode or vm86 handler).
</para>
<para>
Docs about DPMI (and much more) can be found at
<link xlink:href="http://en.wikipedia.org/wiki/DOS_Protected_Mode_Interface">
http://en.wikipedia.org/wiki/DOS_Protected_Mode_Interface</link>.
</para>
<para>
DJGPP comes with its own (limited) <application>glibc</application>
derivative/subset/replacement, too.
</para>
<para>
It is possible to cross-compile from Linux to DOS; see the
<filename>devel/msdos/</filename> directory of your local FTP mirror for
metalab.unc.edu. Also see the MOSS DOS-extender from the
<link xlink:href="http://www.cs.utah.edu/projects/flux/">Flux project</link>
at the University of Utah.
</para>
<para>
Other documents and FAQs are more DOS-centered; we do not recommend DOS
development.
</para>
<formalpara>
<title>Windows and Co.</title>
<para>
This document is not about Windows programming; you can find lots of documents
about it everywhere. The thing you should know is that there is the
<link xlink:href="http://www.cygwin.com">cygwin32.dll library</link>,
which lets GNU programs run on the Win32 platform; thus, you can use GCC, GAS,
all the GNU tools, and many other Unix applications.
</para>
</formalpara>
</section>
<section>
<?dbhtml filename="ownos.html"?>
<title>Your own OS</title>
<para>
Control is what attracts many OS developers to assembly, and it is often what
leads to or stems from assembly hacking. Note that any system that allows
self-development could be qualified as an "OS", even though it may run "on top"
of an underlying system (much like Linux over Mach or OpenGenera over Unix).
</para>
<para>
Hence, for easier debugging, you might like to develop your "OS" first
as a process running on top of Linux (despite the slowness), then use the
<link xlink:href="http://www.cs.utah.edu/projects/flux/oskit/">Flux OS kit</link>
(which grants use of Linux and BSD drivers in your own OS) to make it
stand-alone. When your OS is stable, it is time to write your own hardware
drivers if you really love that.
</para>
<para>
This HOWTO will not cover topics such as bootloader code, getting into 32-bit
mode, handling Interrupts, the basics about Intel protected mode or V86/R86
braindeadness, defining your object format and calling conventions.
</para>
<para>
The main place to find reliable information about all of that is the source
code of existing OSes and bootloaders. Lots of pointers are on the following
webpage: <link xlink:href="http://www.tunes.org/Review/OSes.html">
http://www.tunes.org/Review/OSes.html</link>
</para>
</section>
</chapter>
<chapter xml:id="s-quick" xreflabel="Quick Start">
<?dbhtml filename="quickstart.html"?>
<title>Quick start</title>
<section>
<title>Introduction</title>
<para>
Finally, if you still want to try this crazy idea and write something in
assembly (if you've reached this section -- you're a real assembly fan),
here's what you need to start.
</para>
<para>
As you've read before, you can write for Linux in different ways; I'll show
how to use <emphasis>direct</emphasis> kernel calls, since this is the fastest
way to call a kernel service; our code is not linked to any library, does not
use the ELF interpreter, and communicates with the kernel directly.
</para>
<para>
I will show the same sample program in two assemblers, <command>nasm</command>
and <command>gas</command>, thus showing Intel and AT&amp;T syntax.
</para>
<para>
You may also want to read the
<link xlink:href="http://asm.sourceforge.net/intro.html">
Introduction to UNIX assembly programming</link> tutorial; it contains sample
code for other UNIX-like OSes.
</para>
<section>
<title>Tools you need</title>
<para>
First of all you need an assembler (compiler) -- <command>nasm</command> or
<command>gas</command>.
</para>
<para>
Second, you need a linker -- <command>ld</command>, since the assembler produces
only object code. Almost all distributions have <application>gas</application>
and <application>ld</application>, in the binutils package.
</para>
<para>
As for <application>nasm</application>, you may have to download and install
binary packages for Linux and docs from the
<link linkend="p-nasm-where">nasm site</link>; note that several distributions
(Stampede, Debian, SuSe, Mandrake) already have <application>nasm</application>,
so check first.
</para>
<para>
If you're going to dig in, you should also install include files for your OS,
and if possible, kernel source.
</para>
</section>
</section>
<section>
<?dbhtml filename="hello.html"?>
<title>Hello, world!</title>
<section>
<title>Program layout</title>
<para>
On i386, Linux runs in 32-bit protected mode, has a flat memory model, and uses
the ELF format for binaries.
</para>
<para>
A program can be divided into sections: <literal>.text</literal> for your code
(read-only), <literal>.data</literal> for your data (read-write),
<literal>.bss</literal> for uninitialized data (read-write); there can actually
be a few other standard sections, as well as some user-defined sections, but
there's rarely a need to use them and they are of no interest to us here. A
program must have at least a <literal>.text</literal> section.
</para>
<para>
Now we will write our first program. Here is sample code:
</para>
</section>
<section>
<title>NASM (hello.asm)</title>
<para>
<programlisting>
section .text                   ;section declaration

                                ;we must export the entry point to the ELF linker or
    global  _start              ;loader. They conventionally recognize _start as their
                                ;entry point. Use ld -e foo to override the default.

_start:

;write our string to stdout

    mov     edx,len             ;third argument: message length
    mov     ecx,msg             ;second argument: pointer to message to write
    mov     ebx,1               ;first argument: file handle (stdout)
    mov     eax,4               ;system call number (sys_write)
    int     0x80                ;call kernel

;and exit

    mov     ebx,0               ;first syscall argument: exit code
    mov     eax,1               ;system call number (sys_exit)
    int     0x80                ;call kernel

section .data                   ;section declaration

msg     db      "Hello, world!",0xa ;our dear string
len     equ     $ - msg             ;length of our dear string
</programlisting>
</para>
</section>
<section>
<title>GAS (hello.S)</title>
<para>
<programlisting>
.text                           # section declaration

                                # we must export the entry point to the ELF linker or
    .global _start              # loader. They conventionally recognize _start as their
                                # entry point. Use ld -e foo to override the default.

_start:

# write our string to stdout

    movl    $len,%edx           # third argument: message length
    movl    $msg,%ecx           # second argument: pointer to message to write
    movl    $1,%ebx             # first argument: file handle (stdout)
    movl    $4,%eax             # system call number (sys_write)
    int     $0x80               # call kernel

# and exit

    movl    $0,%ebx             # first argument: exit code
    movl    $1,%eax             # system call number (sys_exit)
    int     $0x80               # call kernel

.data                           # section declaration

msg:
    .ascii  "Hello, world!\n"   # our dear string
    len = . - msg               # length of our dear string
</programlisting>
</para>
</section>
</section>
<section>
<?dbhtml filename="build.html"?>
<title>Building an executable</title>
<section>
<title>Producing object code</title>
<para>
The first step in building an executable is compiling (or assembling) an object
file from the source:
</para>
<para>
For the <application>nasm</application> example:
</para>
<para>
<screen>
$ nasm -f elf hello.asm
</screen>
</para>
<para>
For the <application>gas</application> example:
</para>
<para>
<screen>
$ as -o hello.o hello.S
</screen>
</para>
<para>
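This makes the <filename>hello.o</filename> object file. On an x86-64 host,
pass <literal>--32</literal> to <command>as</command> here and
<literal>-m elf_i386</literal> to <command>ld</command> in the next step,
since the examples are 32-bit code.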
</para>
</section>
<section>
<title>Producing executable</title>
<para>
The second step is producing the executable file itself from the object file by
invoking the linker:
</para>
<para>
<screen>
$ ld -s -o hello hello.o
</screen>
</para>
<para>
This will finally build the <filename>hello</filename> executable.
</para>
<para>
Hey, try to run it... Works? That's it. Pretty simple.
</para>
</section>
</section>
<section>
<?dbhtml filename="mips.html"?>
<title>MIPS Example</title>
<para>
As a demonstration of the fact that there's a universe other than x86, here comes
an example program for MIPS by Spencer Parkin.
</para>
<para>
<programlisting>
<![CDATA[
# hello.S by Spencer T. Parkin

# This is my first MIPS-RISC assembly program!
# To compile this program type:
# > gcc -o hello hello.S -non_shared

# This program compiles without errors or warnings
# on a PlayStation2 MIPS R5900 (EE Core).
# EE stands for Emotion Engine...lame!

# The -non_shared option tells gcc that we`re
# not interested in compiling relocatable code.
# If we were, we would need to follow the PIC-
# ABI calling conventions and other protocols.

#include <asm/regdef.h>         // ...for human readable register names
#include <asm/unistd.h>         // ...for system services

        .rdata                          # begin read-only data segment
        .align  2                       # because of the way memory is built
hello:  .asciz  "Hello, world!\n"       # a null terminated string
        .align  4                       # because of the way memory is built
length: .word   . - hello               # length = IC - (hello-addr)

        .text                           # begin code segment
        .globl  main                    # for gcc/ld linking
        .ent    main                    # for gdb debugging info.
main:   # We must specify -non_shared to gcc or we`ll need these 3 lines that follow.
#       .set    noreorder               # disable instruction reordering
#       .cpload t9                      # PIC ABI crap (function prologue)
#       .set    reorder                 # re-enable instruction reordering
        li      a0,1                    # load stdout fd (1)
        la      a1,hello                # load string address
        lw      a2,length               # load string length
        li      v0,__NR_write           # specify system write service
        syscall                         # call the kernel (write string)
        li      v0,0                    # load return code
        j       ra                      # return to caller
        .end    main                    # for gdb debugging info.

# That`s all folks!
]]>
</programlisting>
</para>
</section>
</chapter>
<chapter xml:id="s-res" xreflabel="Linux assembly resources">
<?dbhtml filename="resources.html"?>
<title>Resources</title>
<simplesect xml:id="s-res-url"><title>Pointers</title>
<para>
Your main resource for Linux/UNIX assembly programming material is:
</para>
<blockquote>
<para>
<link xlink:href="http://asm.sourceforge.net/resources.html">
http://asm.sourceforge.net/resources.html</link>
</para>
</blockquote>
<para>
Do visit it, and get plenty of pointers to assembly projects, tools, tutorials,
documentation, guides, etc, concerning different UNIX operating systems and
CPUs. Because it evolves quickly, I will no longer duplicate it here.
</para>
<para>
If you are new to assembly in general, here are a few starting pointers:
<anchor xml:id="s-res-gen"/>
<itemizedlist>
<listitem>
<para>
<link xlink:href="http://savannah.nongnu.org/projects/pgubook/">Programming from the ground up</link>
</para>
</listitem>
<listitem>
<para>
x86 assembly FAQ (use Google)
</para>
</listitem>
<listitem>
<para>
<link xlink:href="http://www.koth.org">CoreWars</link>,
a fun way to learn assembly in general
</para>
</listitem>
<listitem>
<para>
Usenet:
<link xlink:href="news://comp.lang.asm.x86">comp.lang.asm.x86</link>;
<link xlink:href="news://alt.lang.asm">alt.lang.asm</link>
</para>
</listitem>
</itemizedlist>
</para>
</simplesect>
<simplesect xml:id="s-res-list">
<title>Mailing list</title>
<para>
If you are interested in Linux/UNIX assembly programming (or have questions,
or are just curious) I especially invite you to join the Linux assembly
programming mailing list.
</para>
<para>
This is an open discussion of assembly programming under Linux, *BSD, BeOS,
or any other UNIX/POSIX like OS; also it is not limited to x86 assembly
(Alpha, Sparc, PPC and other hackers are welcome too!).
</para>
<para>
The mailing list address is <email>linux-assembly@vger.kernel.org</email>.
</para>
<para>
To subscribe, send a message to <email>majordomo@vger.kernel.org</email>
with the following line in the body of the message:
<programlisting>
subscribe linux-assembly
</programlisting>
</para>
<para>
Detailed information and list archives are available at
<link xlink:href="http://asm.sourceforge.net/list.html">
http://asm.sourceforge.net/list.html</link>.
</para>
</simplesect>
</chapter>
<chapter xml:id="s-faq" xreflabel="FAQ">
<?dbhtml filename="faq.html"?>
<title>Frequently Asked Questions</title>
<para>
Here are frequently asked questions (with answers)
about Linux assembly programming.
Some of the questions (and the answers) were taken from
the <link linkend="s-res-list">linux-assembly mailing list</link>.
</para>
<qandaset defaultlabel="number">
<qandaentry>
<question>
<para>
How do I do graphics programming in Linux?
</para>
</question>
<answer>
<para>
An answer from <link xlink:href="mailto:paulf@gam.co.za">Paul Furber</link>:
</para>
<para>
<screen>
Ok you have a number of options to graphics in Linux. Which one you use
depends on what you want to do. There isn't one Web site with all the
information but here are some tips:

SVGALib: This is a C library for console SVGA access.
Pros: very easy to learn, good coding examples, not all that different
from equivalent gfx libraries for DOS, all the effects you know from DOS
can be converted with little difficulty.
Cons: programs need superuser rights to run since they write directly to
the hardware, doesn't work with all chipsets, can't run under X-Windows.
Search for svgalib-1.4.x on http://ftp.is.co.za

Framebuffer: do it yourself graphics at SVGA res
Pros: fast, linear mapped video access, ASM can be used if you want :)
Cons: has to be compiled into the kernel, chipset-specific issues, must
switch out of X to run, relies on good knowledge of linux system calls
and kernel, tough to debug
Examples: asmutils (http://www.linuxassembly.org) and the leaves example
and my own site for some framebuffer code and tips in asm
(http://ma.verick.co.za/linux4k/)

Xlib: the application and development libraries for XFree86.
Pros: Complete control over your X application
Cons: Difficult to learn, horrible to work with and requires quite a bit
of knowledge as to how X works at the low level.
Not recommended but if you're really masochistic go for it. All the
include and lib files are probably installed already so you have what
you need.

Low-level APIs: include PTC, SDL, GGI and Clanlib
Pros: very flexible, run under X or the console, generally abstract away
the video hardware a little so you can draw to a linear surface, lots of
good coding examples, can link to other APIs like OpenGL and sound libs,
Windows DirectX versions for free
Cons: Not as fast as doing it yourself, often in development so versions
can (and do) change frequently.
Examples: PTC and GGI have excellent demos, SDL is used in sdlQuake,
Myth II, Civ CTP and Clanlib has been used for games as well.

High-level APIs: OpenGL - any others?
Pros: clean api, tons of functionality and examples, industry standard
so you can learn from SGI demos for example
Cons: hardware acceleration is normally a must, some quirks between
versions and platforms
Examples: loads - check out www.mesa3d.org under the links section.

To get going try looking at the svgalib examples and also install SDL
and get it working. After that, the sky's the limit.
</screen>
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>
How do I debug pure assembly code under Linux?
</para>
</question>
<answer>
<para>
There's an early version of the
<link xlink:href="http://ald.sourceforge.net">Assembly Language Debugger</link>,
which is designed to work with assembly code,
and is portable enough to run on Linux and *BSD.
It is already functional and should be the right choice; check it out!
</para>
<para>
You can also try <command>gdb</command> ;).
Although it is a source-level debugger, it can be used to debug
pure assembly code, and with some trickery you can make
<command>gdb</command> do what you need
(unfortunately, the nasm '-g' switch does not generate
proper debug info for gdb; this is a nasm bug, I think).
Here's an answer from <link xlink:href="mailto:dl@gazeta.ru">Dmitry Bakhvalov</link>:
</para>
<para>
<screen>
Personally, I use gdb for debugging asmutils. Try this:

1) Use the following stuff to compile:
   $ nasm -f elf -g smth.asm
   $ ld -o smth smth.o

2) Fire up gdb:
   $ gdb smth

3) In gdb:
   (gdb) disassemble _start
   Place a breakpoint at _start+1 (If placed at _start the breakpoint
   wouldn't work, dunno why)
   (gdb) b *0x8048075

   To step thru the code I use the following macro:
   (gdb) define n
   >ni
   >printf "eax=%x ebx=%x ...etc...",$eax,$ebx,...etc...
   >disassemble $pc,$pc+15
   >end

   Then start the program with r command and debug with n.

Hope this helps.
</screen>
</para>
<para>
An additional note from ???:
</para>
<para>
<screen>
I have such a macro in my .gdbinit for quite some time now, and it
for sure makes life easier. A small difference : I use "x /8i $pc",
which guarantee a fixed number of disassembled instructions. Then,
with a well chosen size for my xterm, gdb output looks like it is
refreshed, and not scrolling.
</screen>
</para>
<para>
If you want to set breakpoints across your code, you can just use the
<function>int 3</function> instruction as a breakpoint
(instead of entering the address manually in <command>gdb</command>).
</para>
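<para>
A minimal sketch of such a hard-coded breakpoint is shown below; it uses the
one-byte <literal>int3</literal> mnemonic, which NASM provides for this
purpose, and the i386 number 1 for <literal>__NR_exit</literal>:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        int3                    ; breakpoint: a debugger stops here (SIGTRAP)
        mov     ebx,0           ; exit code
        mov     eax,1           ; __NR_exit
        int     0x80            ; call kernel
</programlisting>
</para>
<para>
Run under <command>gdb</command>, the program stops right after the
breakpoint instruction and can be continued from there; run on its own, it is
terminated by the trap signal, so remove such breakpoints before release.
</para>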
<para>
If you're using <application>gas</application>, you should consult
<application>gas</application> and <application>gdb</application> related
<link xlink:href="http://asm.sourceforge.net/resources.html#tutorials">tutorials</link>.
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>Any other useful debugging tools?</para>
</question>
<answer>
<para>
Definitely <command>strace</command> can help a lot
(<command>ktrace</command> and <command>kdump</command>
on FreeBSD),
it is used to trace system calls and signals.
Read its manual page (<command>man strace</command>) and
<command>strace --help</command> output for details.
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>
How do I access BIOS functions from Linux (BSD, BeOS, etc)?
</para>
</question>
<answer>
<para>
The short answer is: no way. This is protected mode; use OS services instead.
Again, you can't use <function>int 0x10</function>,
<function>int 0x13</function>, etc.
Fortunately almost everything can be implemented
by means of system calls or library functions.
In the worst case you may go through direct port access,
or make a kernel patch to implement the needed functionality,
or use the LRMI library to access BIOS functions.
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>
Is it possible to write kernel modules in assembly?
</para>
</question>
<answer>
<para>
Yes, indeed it is. While in general it is not a good idea
(it will hardly speed anything up), there may be a need for such wizardry.
The process of writing a module itself is not that hard --
a module must have some predefined global functions, and
it may also need to call some external functions from the kernel.
Examine kernel source code (that can be built as a module) for details.
</para>
<para>
Meanwhile, here's an example of a minimal dumb kernel module
(<filename>module.asm</filename>)
(source is based on example by mammon_ from APJ #8):
</para>
<para>
<programlisting>
section .text

        global  init_module
        global  cleanup_module
        global  kernel_version

        extern  printk

init_module:
        push    dword str1
        call    printk
        pop     eax
        xor     eax,eax
        ret

cleanup_module:
        push    dword str2
        call    printk
        pop     eax
        ret

str1            db      "init_module done",0xa,0
str2            db      "cleanup_module done",0xa,0

kernel_version  db      "2.2.18",0
</programlisting>
</para>
<para>
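The only thing this example does is report its actions. It was written for
the 2.2 kernel series and should not be expected to load unmodified on later
kernels. Modify <filename>kernel_version</filename> to match yours, and build
the module with: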
</para>
<para>
<screen>
$ nasm -f elf -o module.m module.asm
</screen>
</para>
<para>
<screen>
$ ld -r -o module.o module.m
</screen>
</para>
<para>
Now you can play with it using <command>insmod/rmmod/lsmod</command>
(root privileges are required); a lot of fun, huh?
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>
How do I allocate memory dynamically?
</para>
</question>
<answer>
<para>
A laconic answer from <link xlink:href="mailto:phpr@snafu.de">H-Peter Recktenwald</link>:
</para>
<para>
<programlisting>
ebx := 0 (in fact, any value below .bss seems to do)
sys_brk
eax := current top (of .bss section)
ebx := [ current top &lt; ebx &lt; (esp - 16K) ]
sys_brk
eax := new top of .bss
</programlisting>
</para>
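<para>
Spelled out as a complete NASM program, the recipe above might look like the
following sketch. It assumes the i386 system call number 45
(<literal>__NR_brk</literal>) and an arbitrary request of 4096 bytes; the raw
system call reports failure by returning the unchanged break, which is what
the comparison checks:
</para>
<para>
<programlisting>
section .text
global  _start

_start:
        xor     ebx,ebx         ; ebx = 0: just ask for the current break
        mov     eax,45          ; __NR_brk on i386
        int     0x80
        mov     esi,eax         ; esi = old break = start of the new block
        lea     ebx,[eax+4096]  ; request 4096 more bytes
        mov     eax,45          ; __NR_brk
        int     0x80
        cmp     eax,ebx         ; the kernel returns the resulting break
        jb      .fail           ; lower than requested: allocation failed
        mov     dword [esi],0x12345678 ; the new memory is usable
        xor     ebx,ebx         ; exit status 0
        jmp     .exit
.fail:
        mov     ebx,1           ; exit status 1
.exit:
        mov     eax,1           ; __NR_exit
        int     0x80
</programlisting>
</para>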
</answer>
<answer>
<para>
An extensive answer from <link xlink:href="mailto:ee97034@fe.up.pt">Tiago Gasiba</link>:
</para>
<para>
<programlisting>
section .bss
var1    resb    1

section .text
;
;sys_brk, erro1 and erro2 are assumed to be defined elsewhere;
;sys_brk is taken to wrap the brk system call and return -1 on failure
;
;allocate memory
;
%define LIMIT   0x4000000       ; 64 MiB
        mov     ebx,0           ; get current end of data segment
        call    sys_brk
        cmp     eax,-1          ; ok?
        je      erro1
        add     eax,LIMIT       ; allocate +LIMIT memory
        mov     ebx,eax
        call    sys_brk
        cmp     eax,-1          ; ok?
        je      erro1
        cmp     eax,var1+1      ; has the data segment grown?
        je      erro1
;
;use allocated memory
;
                                ; now eax contains the new end of
                                ; the data segment
        mov     ebx,eax         ; save end
        mov     eax,var1        ; eax=beginning of data segment
repeat:
        mov     byte [eax],1    ; fill up with 1's, one byte at a time
        inc     eax
        cmp     ebx,eax         ; current pos = end?
        jne     repeat
;
;free memory
;
        mov     ebx,var1        ; deallocate memory
        call    sys_brk         ; by forcing the end back to var1
        cmp     eax,-1          ; ok?
        je      erro2
</programlisting>
</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>
I can't understand how to use <function>select</function> system call!
</para>
</question>
<answer>
<para>
An answer from <link xlink:href="mailto:mochel@transmeta.com">Patrick Mochel</link>:
</para>
<para>
<programlisting>
When you call sys_open, you get back a file descriptor, which is simply an
index into a table of all the open file descriptors that your process has.
stdin, stdout, and stderr are always 0, 1, and 2, respectively, because
that is the order in which they are always open for your process from there.
Also, notice that the first file descriptor that you open yourself (w/o first
closing any of those magic three descriptors) is always 3, and they increment
from there.
Understanding the index scheme will explain what select does. When you
call select, you are saying that you are waiting on certain file descriptors
to read from, certain ones to write to, and certain ones to watch for
exceptions on. Your process can have up to 1024 file descriptors open,
so an fd_set is just a bit mask describing which file descriptors are valid
for each operation. Make sense?
Since each fd that you have open is just an index, and it only needs to be
on or off for each fd_set, you need only 1024 bits for an fd_set structure.
1024 / 32 = 32 longs needed to represent the structure.
Now, for the loose example.
Suppose you want to read from a file descriptor (w/o timeout).
- Allocate the equivalent to an fd_set.
        section .data
my_fds: times 32 dd 0
- open the file descriptor that you want to read from.
- set that bit in the fd_set structure.
First, you need to figure out which of the 32 dwords the bit is in.
Then, use bts to set the bit in that dword. bts will do a modulo 32
when setting the bit. That's why you need to first figure out which
dword to start with.
        ; on entry, eax holds the file descriptor
        mov edx, 0
        mov ebx, 32
        div ebx                         ; eax = dword index, edx = bit number
        bts [my_fds + eax * 4], edx     ; set that bit
- repeat the last step for any file descriptors you want to read from.
- repeat the entire exercise for either of the other two fd_sets if you want action from them.
That leaves two other parts of the equation - the n parameter and the timeout
parameter. I'll leave the timeout parameter as an exercise for the reader
(yes, I'm lazy), but I'll briefly talk about the n parameter.
It is the value of the largest file descriptor you are selecting from (from
any of the fd_sets), plus one. Why plus one? Well, because it's easy to
determine a mask from that value. Suppose that there is data available on
x file descriptors, but the highest one you care about is (n - 1). Since
an fd_set is just a bitmask, the kernel needs some efficient way for
determining whether to return or not from select. So, it masks off the bits
that you care about, checks if anything is available from the bits that are
still set, and returns if there is (pause as I rummage through kernel source).
Well, it's not as easy as I fantasized it would be. To see how the kernel
determines that mask, look in fs/select.c in the kernel source tree.
Anyway, you need to know that number, and the easiest way to do it is to save
the value of the last file descriptor open somewhere so you don't lose it.
Ok, that's what I know. A warning about the code above (as always) is that
it is not tested. I think it should work, but if it doesn't let me know.
But, if it starts a global nuclear meltdown, don't call me. ;-)
</programlisting>
</para>
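<para>
Putting the pieces together, a hedged sketch of a complete call might look
like this: it waits up to five seconds for input on stdin and exits with the
number of ready descriptors. It assumes the i386 system call number 142
(<literal>__NR__newselect</literal>), which takes its five arguments in
registers; number 82 (<literal>__NR_select</literal>) on i386 is an older
variant that expects a pointer to an argument block instead. The names
<literal>readfds</literal> and <literal>timeout</literal> are made up for the
example:
</para>
<para>
<programlisting>
section .bss
readfds resd    32              ; fd_set: 1024 bits, initially zero

section .data
timeout dd      5,0             ; struct timeval: 5 seconds, 0 microseconds

section .text
global  _start

_start:
        xor     eax,eax         ; descriptor to watch: 0 (stdin)
        bts     [readfds],eax   ; set its bit in the read set
        lea     ebx,[eax+1]     ; n = highest descriptor + 1
        mov     ecx,readfds     ; read set
        xor     edx,edx         ; no write set
        xor     esi,esi         ; no exception set
        mov     edi,timeout     ; give up after 5 seconds
        mov     eax,142         ; __NR__newselect on i386
        int     0x80            ; call kernel
        mov     ebx,eax         ; exit status: number of ready descriptors
        mov     eax,1           ; __NR_exit
        int     0x80
</programlisting>
</para>
<para>
With a memory operand and a register bit index, <literal>bts</literal>
addresses the whole bit string, so the same instruction works for any
descriptor number below 1024 without computing the dword by hand.
</para>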
</answer>
</qandaentry>
</qandaset>
<para>
<emphasis>That's all for now, folks</emphasis>.
</para>
</chapter>
<appendix xml:id="a-history"><title>History</title>
<?dbhtml filename="history.html"?>
<para>
Each version includes a few fixes and minor corrections
that need not be mentioned every time.
</para>
<para>
<revhistory xml:id="revhistory">
<revision>
<revnumber>0.7</revnumber>
<date>3 Mar 2013</date>
<authorinitials>lnoor</authorinitials>
<revremark>
New maintainer,
Reformatted as DocBook XML,
Checked, updated or replaced dead links.
</revremark>
</revision>
<revision>
<revnumber>0.6g</revnumber>
<date>11 Feb 2006</date>
<authorinitials>konst</authorinitials>
<revremark>
Added AASM,
updated FASM,
added MIPS example to <link linkend="s-quick">Quick Start</link> section,
added URLs to Turkish and Russian translations,
misc URL updates
</revremark>
</revision>
<revision>
<revnumber>0.6f</revnumber>
<date>17 Aug 2002</date>
<authorinitials>konst</authorinitials>
<revremark>
Added FASM,
added URL to Korean translation,
added URL to SVR4 i386 ABI specs,
update on HLA/Linux,
small fix in hello.S example,
misc URL updates
</revremark>
</revision>
<revision>
<revnumber>0.6e</revnumber>
<date>12 Jan 2002</date>
<authorinitials>konst</authorinitials>
<revremark>
Added URL describing GAS Intel syntax;
Added OSIMPA(former SHASM);
Added YASM;
FAQ update.
</revremark>
</revision>
<revision>
<revnumber>0.6d</revnumber>
<date>18 Mar 2001</date>
<authorinitials>konst</authorinitials>
<revremark>
Added Free Pascal;
new NASM URL again
</revremark>
</revision>
<revision>
<revnumber>0.6c</revnumber>
<date>15 Feb 2001</date>
<authorinitials>konst</authorinitials>
<revremark>
Added SHASM;
new answer in FAQ, new NASM URL, new mailing list address
</revremark>
</revision>
<revision>
<revnumber>0.6b</revnumber>
<date>21 Jan 2001</date>
<authorinitials>konst</authorinitials>
<revremark>
new questions in FAQ, corrected few URLs
</revremark>
</revision>
<revision>
<revnumber>0.6a</revnumber>
<date>10 Dec 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
Remade section on AS86 (thanks to Holluby Istvan for pointing out
obsolete information).
Fixed several URLs that can be incorrectly rendered from sgml to html.
</revremark>
</revision>
<revision>
<revnumber>0.6</revnumber>
<date>11 Nov 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
HOWTO is completely rewritten using DocBook DTD.
Layout is totally rearranged;
too many changes to list them here.
</revremark>
</revision>
<revision>
<revnumber>0.5n</revnumber>
<date>07 Nov 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
Added question regarding kernel modules to <link linkend="s-faq">FAQ</link>,
fixed NASM URLs, GAS has Intel syntax too
</revremark>
</revision>
<revision>
<revnumber>0.5m</revnumber>
<date>22 Oct 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
Linux 2.4 system calls can have 6 args,
Added ALD note to <link linkend="s-faq">FAQ</link>,
fixed mailing list subscribe address
</revremark>
</revision>
<revision>
<revnumber>0.5l</revnumber>
<date>23 Aug 2000</date>
<authorinitials>konst</authorinitials>
<revremark>Added TDASM, updates on NASM</revremark>
</revision>
<revision>
<revnumber>0.5k</revnumber>
<date>11 Jul 2000</date>
<authorinitials>konst</authorinitials>
<revremark>A few additions to FAQ</revremark>
</revision>
<revision>
<revnumber>0.5j</revnumber>
<date>14 Jun 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
Complete rearrangement of <link linkend="s-intro">Introduction</link> and <link linkend="s-res">Resources</link> sections.
<link linkend="s-faq">FAQ</link> added to <link linkend="s-res">Resources</link>,
misc cleanups and additions.
</revremark>
</revision>
<revision>
<revnumber>0.5i</revnumber>
<date>04 May 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
Added HLA, TALC;
rearrangements in <link linkend="s-res">Resources</link> and <link linkend="s-quick">Quick Start</link> sections. A few new pointers.
</revremark>
</revision>
<revision>
<revnumber>0.5h</revnumber>
<date>09 Apr 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
finally managed to state LDP license on document,
new resources added, misc fixes
</revremark>
</revision>
<revision>
<revnumber>0.5g</revnumber>
<date>26 Mar 2000</date>
<authorinitials>konst</authorinitials>
<revremark>new resources on different CPUs</revremark>
</revision>
<revision>
<revnumber>0.5f</revnumber>
<date>02 Mar 2000</date>
<authorinitials>konst</authorinitials>
<revremark>new resources, misc corrections</revremark>
</revision>
<revision>
<revnumber>0.5e</revnumber>
<date>10 Feb 2000</date>
<authorinitials>konst</authorinitials>
<revremark>URL updates, changes in GAS example</revremark>
</revision>
<revision>
<revnumber>0.5d</revnumber>
<date>01 Feb 2000</date>
<authorinitials>konst</authorinitials>
<revremark>
<link linkend="s-res">Resources</link> (former "Pointers") section completely redone,
various URL updates.
</revremark>
</revision>
<revision>
<revnumber>0.5c</revnumber>
<date>05 Dec 1999</date>
<authorinitials>konst</authorinitials>
<revremark>
New pointers, updates and some rearrangements.
Rewrite of sgml source.
</revremark>
</revision>
<revision>
<revnumber>0.5b</revnumber>
<date>19 Sep 1999</date>
<authorinitials>konst</authorinitials>
<revremark>
Discussion about libc or not libc continues.
New web pointers and overall updates.
</revremark>
</revision>
<revision>
<revnumber>0.5a</revnumber>
<date>01 Aug 1999</date>
<authorinitials>konst</authorinitials>
<revremark>
<link linkend="s-quick">Quick Start</link> section rearranged, added GAS example.
Several new web pointers.
</revremark>
</revision>
<revision>
<revnumber>0.5</revnumber>
<date>01 Aug 1999</date>
<authorinitials>konst</authorinitials>
<authorinitials>fare</authorinitials>
<revremark>
GAS has 16-bit mode.
New maintainer (at last): Konstantin Boldyshev.
Discussion about libc or not libc.
Added <link linkend="s-quick">Quick Start</link> section with examples of assembly code.
</revremark>
</revision>
<revision>
<revnumber>0.4q</revnumber>
<date>22 Jun 1999</date>
<authorinitials>fare</authorinitials>
<revremark>
process argument passing (argc, argv, environ) in assembly.
This is yet another
"last release by Fare before new maintainer takes over".
Nobody knows who might be the new maintainer.
</revremark>
</revision>
<revision>
<revnumber>0.4p</revnumber>
<date>06 Jun 1999</date>
<authorinitials>fare</authorinitials>
<revremark>clean up and updates</revremark>
</revision>
<revision>
<revnumber>0.4o</revnumber>
<date>01 Dec 1998</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.4m</revnumber>
<date>23 Mar 1998</date>
<authorinitials>fare</authorinitials>
<revremark>corrections about gcc invocation</revremark>
</revision>
<revision>
<revnumber>0.4l</revnumber>
<date>16 Nov 1997</date>
<authorinitials>fare</authorinitials>
<revremark>release for LSL 6th edition</revremark>
</revision>
<revision>
<revnumber>0.4k</revnumber>
<date>19 Oct 1997</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.4j</revnumber>
<date>07 Sep 1997</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.4i</revnumber>
<date>17 Jul 1997</date>
<authorinitials>fare</authorinitials>
<revremark>info on 16-bit mode access from Linux</revremark>
</revision>
<revision>
<revnumber>0.4h</revnumber>
<date>19 Jun 1997</date>
<authorinitials>fare</authorinitials>
<revremark>
still more on "how not to use assembly";
updates on NASM, GAS.
</revremark>
</revision>
<revision>
<revnumber>0.4g</revnumber>
<date>30 Mar 1997</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.4f</revnumber>
<date>20 Mar 1997</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.4e</revnumber>
<date>13 Mar 1997</date>
<authorinitials>fare</authorinitials>
<revremark>Release for DrLinux</revremark>
</revision>
<revision>
<revnumber>0.4d</revnumber>
<date>28 Feb 1997</date>
<authorinitials>fare</authorinitials>
<revremark>Vapor announcement of a new Assembly-HOWTO maintainer</revremark>
</revision>
<revision>
<revnumber>0.4c</revnumber>
<date>09 Feb 1997</date>
<authorinitials>fare</authorinitials>
<revremark>Added section <link linkend="s-doyou">Do you need assembly?</link>.</revremark>
</revision>
<revision>
<revnumber>0.4b</revnumber>
<date>03 Feb 1997</date>
<authorinitials>fare</authorinitials>
<revremark>NASM moved: it now comes before AS86</revremark>
</revision>
<revision>
<revnumber>0.4a</revnumber>
<date>20 Jan 1997</date>
<authorinitials>fare</authorinitials>
<revremark>CREDITS section added</revremark>
</revision>
<revision>
<revnumber>0.4</revnumber>
<date>20 Jan 1997</date>
<authorinitials>fare</authorinitials>
<revremark>first release of the HOWTO as such</revremark>
</revision>
<revision>
<revnumber>0.4pre1</revnumber>
<date>13 Jan 1997</date>
<authorinitials>fare</authorinitials>
<revremark>
text mini-HOWTO transformed into a full linuxdoc-sgml HOWTO,
to see what the SGML tools are like
</revremark>
</revision>
<revision>
<revnumber>0.3l</revnumber>
<date>11 Jan 1997</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.3k</revnumber>
<date>19 Dec 1996</date>
<authorinitials>fare</authorinitials>
<revremark>What? I had forgotten to point to terse???</revremark>
</revision>
<revision>
<revnumber>0.3j</revnumber>
<date>24 Nov 1996</date>
<authorinitials>fare</authorinitials>
<revremark>point to French translated version</revremark>
</revision>
<revision>
<revnumber>0.3i</revnumber>
<date>16 Nov 1996</date>
<authorinitials>fare</authorinitials>
<revremark>NASM is getting pretty slick</revremark>
</revision>
<revision>
<revnumber>0.3h</revnumber>
<date>06 Nov 1996</date>
<authorinitials>fare</authorinitials>
<revremark>
more about cross-compiling; see devel/msdos/ on sunsite
</revremark>
</revision>
<revision>
<revnumber>0.3g</revnumber>
<date>02 Nov 1996</date>
<authorinitials>fare</authorinitials>
<revremark>
Created the History. Added pointers in cross-compiling section.
Added section about I/O programming under Linux (particularly video).
</revremark>
</revision>
<revision>
<revnumber>0.3f</revnumber>
<date>17 Oct 1996</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.3c</revnumber>
<date>15 Jun 1996</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.2</revnumber>
<date>04 May 1996</date>
<authorinitials>fare</authorinitials>
</revision>
<revision>
<revnumber>0.1</revnumber>
<date>23 Apr 1996</date>
<authorinitials>fare</authorinitials>
<revremark>
Francois-Rene "Fare" Rideau creates and publishes the first mini-HOWTO,
because "I'm sick of answering ever the same questions
on comp.lang.asm.x86"
</revremark>
</revision>
</revhistory>
</para>
</appendix>
<appendix xml:id="a-ack"><title>Acknowledgements</title>
<?dbhtml filename="acknowledgements.html"?>
<para>
I would like to thank all the people who have contributed ideas,
answers, remarks, and moral support, and additionally
the following persons, in order of appearance:
<itemizedlist>
<listitem>
<para>
<link xlink:href="mailto:buried.alive@in.mail">Linus Torvalds</link>
for Linux
</para>
</listitem>
<listitem>
<para>
<anchor xml:id="bde"/>
<link xlink:href="mailto:bde@zeta.org.au">Bruce Evans</link>
for bcc from which as86 is extracted
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:anakin@pobox.com">Simon Tatham</link> and
<link xlink:href="mailto:jules@earthcorp.com">Julian Hall</link>
for NASM
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:gregh@metalab.unc.edu">Greg Hankins</link> and now
<link xlink:href="mailto:linux-howto@metalab.unc.edu">Tim Bynum</link>
for maintaining HOWTOs
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:raymoon@moonware.dgsys.com">Raymond Moon</link>
for his FAQ
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:dumas@linux.eu.org">Eric Dumas</link>
for his translation of the mini-HOWTO into French
(sad thing for the original author to be French and write in English)
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:paul@geeky1.ebtech.net">Paul Anderson</link> and
<link xlink:href="mailto:rahim@megsinet.net">Rahim Azizarab</link>
for helping me, if not for taking over the HOWTO
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:pcg@goof.com">Marc Lehman</link>
for his insight on GCC invocation
</para>
</listitem>
<listitem>
<para>
<link xlink:href="mailto:ams@wiw.org">Abhijit Menon-Sen</link>
for helping me figure out the argument passing convention
</para>
</listitem>
</itemizedlist>
</para>
</appendix>
<appendix xml:id="a-endor">
<?dbhtml filename="endorsements.html"?>
<title>Endorsements</title>
<para>
This version of the document is endorsed by
<link linkend="lnoor">Leo Noordergraaf</link>.
</para>
<para>
Modifications (including translations) must remove this appendix
according to the <link linkend="a-gfdl">license agreement</link>.
</para>
<para>
<literal>
$Id$
</literal>
</para>
</appendix>
<appendix xml:id="a-gfdl" xreflabel="GNU Free Documentation License">
<?dbhtml filename="fdl.html"?>
<title>GNU Free Documentation License</title>
<para><literallayout>
GNU Free Documentation License
Version 1.1, March 2000
Copyright (C) 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
</literallayout></para>
<para><variablelist>
<varlistentry><term>0. PREAMBLE</term><listitem>
<para>The purpose of this License is to make a manual, textbook,
or other written document "free" in the sense of freedom: to
assure everyone the effective freedom to copy and redistribute it,
with or without modifying it, either commercially or
noncommercially. Secondarily, this License preserves for the
author and publisher a way to get credit for their work, while not
being considered responsible for modifications made by
others.</para>
<para>This License is a kind of "copyleft", which means that
derivative works of the document must themselves be free in the
same sense. It complements the GNU General Public License, which
is a copyleft license designed for free software.</para>
<para>We have designed this License in order to use it for manuals
for free software, because free software needs free documentation:
a free program should come with manuals providing the same
freedoms that the software does. But this License is not limited
to software manuals; it can be used for any textual work,
regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works
whose purpose is instruction or reference.</para>
</listitem></varlistentry>
<varlistentry><term>1. APPLICABILITY AND DEFINITIONS</term><listitem>
<para>This License applies to any manual or other work that
contains a notice placed by the copyright holder saying it can be
distributed under the terms of this License. The "Document",
below, refers to any such manual or work. Any member of the
public is a licensee, and is addressed as "you".</para>
<para>A "Modified Version" of the Document means any work
containing the Document or a portion of it, either copied
verbatim, or with modifications and/or translated into another
language.</para>
<para>A "Secondary Section" is a named appendix or a front-matter
section of the Document that deals exclusively with the
relationship of the publishers or authors of the Document to the
Document's overall subject (or to related matters) and contains
nothing that could fall directly within that overall subject.
(For example, if the Document is in part a textbook of
mathematics, a Secondary Section may not explain any mathematics.)
The relationship could be a matter of historical connection with
the subject or with related matters, or of legal, commercial,
philosophical, ethical or political position regarding
them.</para>
<para>The "Invariant Sections" are certain Secondary Sections
whose titles are designated, as being those of Invariant Sections,
in the notice that says that the Document is released under this
License.</para>
<para>The "Cover Texts" are certain short passages of text that
are listed, as Front-Cover Texts or Back-Cover Texts, in the
notice that says that the Document is released under this
License.</para>
<para>A "Transparent" copy of the Document means a
machine-readable copy, represented in a format whose specification
is available to the general public, whose contents can be viewed
and edited directly and straightforwardly with generic text
editors or (for images composed of pixels) generic paint programs
or (for drawings) some widely available drawing editor, and that
is suitable for input to text formatters or for automatic
translation to a variety of formats suitable for input to text
formatters. A copy made in an otherwise Transparent file format
whose markup has been designed to thwart or discourage subsequent
modification by readers is not Transparent. A copy that is not
"Transparent" is called "Opaque".</para>
<para>Examples of suitable formats for Transparent copies include
plain ASCII without markup, Texinfo input format, LaTeX input
format, SGML or XML using a publicly available DTD, and
standard-conforming simple HTML designed for human modification.
Opaque formats include PostScript, PDF, proprietary formats that
can be read and edited only by proprietary word processors, SGML
or XML for which the DTD and/or processing tools are not generally
available, and the machine-generated HTML produced by some word
processors for output purposes only.</para>
<para>The "Title Page" means, for a printed book, the title page
itself, plus such following pages as are needed to hold, legibly,
the material this License requires to appear in the title page.
For works in formats which do not have any title page as such,
"Title Page" means the text near the most prominent appearance of
the work's title, preceding the beginning of the body of the
text.</para>
</listitem></varlistentry>
<varlistentry><term>2. VERBATIM COPYING</term><listitem>
<para>You may copy and distribute the Document in any medium,
either commercially or noncommercially, provided that this
License, the copyright notices, and the license notice saying this
License applies to the Document are reproduced in all copies, and
that you add no other conditions whatsoever to those of this
License. You may not use technical measures to obstruct or
control the reading or further copying of the copies you make or
distribute. However, you may accept compensation in exchange for
copies. If you distribute a large enough number of copies you
must also follow the conditions in section 3.</para>
<para>You may also lend copies, under the same conditions stated
above, and you may publicly display copies.</para>
</listitem></varlistentry>
<varlistentry><term>3. COPYING IN QUANTITY</term><listitem>
<para>If you publish printed copies of the Document numbering more
than 100, and the Document's license notice requires Cover Texts,
you must enclose the copies in covers that carry, clearly and
legibly, all these Cover Texts: Front-Cover Texts on the front
cover, and Back-Cover Texts on the back cover. Both covers must
also clearly and legibly identify you as the publisher of these
copies. The front cover must present the full title with all
words of the title equally prominent and visible. You may add
other material on the covers in addition. Copying with changes
limited to the covers, as long as they preserve the title of the
Document and satisfy these conditions, can be treated as verbatim
copying in other respects.</para>
<para>If the required texts for either cover are too voluminous to
fit legibly, you should put the first ones listed (as many as fit
reasonably) on the actual cover, and continue the rest onto
adjacent pages.</para>
<para>If you publish or distribute Opaque copies of the Document
numbering more than 100, you must either include a
machine-readable Transparent copy along with each Opaque copy, or
state in or with each Opaque copy a publicly-accessible
computer-network location containing a complete Transparent copy
of the Document, free of added material, which the general
network-using public has access to download anonymously at no
charge using public-standard network protocols. If you use the
latter option, you must take reasonably prudent steps, when you
begin distribution of Opaque copies in quantity, to ensure that
this Transparent copy will remain thus accessible at the stated
location until at least one year after the last time you
distribute an Opaque copy (directly or through your agents or
retailers) of that edition to the public.</para>
<para>It is requested, but not required, that you contact the
authors of the Document well before redistributing any large
number of copies, to give them a chance to provide you with an
updated version of the Document.</para>
</listitem></varlistentry>
<varlistentry><term>4. MODIFICATIONS</term><listitem>
<para>You may copy and distribute a Modified Version of the
Document under the conditions of sections 2 and 3 above, provided
that you release the Modified Version under precisely this
License, with the Modified Version filling the role of the
Document, thus licensing distribution and modification of the
Modified Version to whoever possesses a copy of it. In addition,
you must do these things in the Modified Version:</para>
<orderedlist numeration="upperalpha">
<listitem><para>Use in the Title Page
(and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if
there were any, be listed in the History section of the
Document). You may use the same title as a previous version if
the original publisher of that version gives permission.</para>
</listitem>
<listitem><para>List on the Title Page,
as authors, one or more persons or entities responsible for
authorship of the modifications in the Modified Version,
together with at least five of the principal authors of the
Document (all of its principal authors, if it has less than
five).</para>
</listitem>
<listitem><para>State on the Title page
the name of the publisher of the Modified Version, as the
publisher.</para>
</listitem>
<listitem><para>Preserve all the
copyright notices of the Document.</para>
</listitem>
<listitem><para>Add an appropriate
copyright notice for your modifications adjacent to the other
copyright notices.</para>
</listitem>
<listitem><para>Include, immediately
after the copyright notices, a license notice giving the public
permission to use the Modified Version under the terms of this
License, in the form shown in the Addendum below.</para>
</listitem>
<listitem><para>Preserve in that license
notice the full lists of Invariant Sections and required Cover
Texts given in the Document's license notice.</para>
</listitem>
<listitem><para>Include an unaltered
copy of this License.</para>
</listitem>
<listitem><para>Preserve the section
entitled "History", and its title, and add to it an item stating
at least the title, year, new authors, and publisher of the
Modified Version as given on the Title Page. If there is no
section entitled "History" in the Document, create one stating
the title, year, authors, and publisher of the Document as given
on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.</para>
</listitem>
<listitem><para>Preserve the network
location, if any, given in the Document for public access to a
Transparent copy of the Document, and likewise the network
locations given in the Document for previous versions it was
based on. These may be placed in the "History" section. You
may omit a network location for a work that was published at
least four years before the Document itself, or if the original
publisher of the version it refers to gives permission.</para>
</listitem>
<listitem><para>In any section entitled
"Acknowledgements" or "Dedications", preserve the section's
title, and preserve in the section all the substance and tone of
each of the contributor acknowledgements and/or dedications
given therein.</para>
</listitem>
<listitem><para>Preserve all the
Invariant Sections of the Document, unaltered in their text and
in their titles. Section numbers or the equivalent are not
considered part of the section titles.</para>
</listitem>
<listitem><para>Delete any section
entitled "Endorsements". Such a section may not be included in
the Modified Version.</para>
</listitem>
<listitem><para>Do not retitle any
existing section as "Endorsements" or to conflict in title with
any Invariant Section.</para>
</listitem>
</orderedlist>
<para>If the Modified Version includes new front-matter sections
or appendices that qualify as Secondary Sections and contain no
material copied from the Document, you may at your option
designate some or all of these sections as invariant. To do this,
add their titles to the list of Invariant Sections in the Modified
Version's license notice. These titles must be distinct from any
other section titles.</para>
<para>You may add a section entitled "Endorsements", provided it
contains nothing but endorsements of your Modified Version by
various parties--for example, statements of peer review or that
the text has been approved by an organization as the authoritative
definition of a standard.</para>
<para>You may add a passage of up to five words as a Front-Cover
Text, and a passage of up to 25 words as a Back-Cover Text, to the
end of the list of Cover Texts in the Modified Version. Only one
passage of Front-Cover Text and one of Back-Cover Text may be
added by (or through arrangements made by) any one entity. If the
Document already includes a cover text for the same cover,
previously added by you or by arrangement made by the same entity
you are acting on behalf of, you may not add another; but you may
replace the old one, on explicit permission from the previous
publisher that added the old one.</para>
<para>The author(s) and publisher(s) of the Document do not by
this License give permission to use their names for publicity for
or to assert or imply endorsement of any Modified Version.</para>
</listitem></varlistentry>
<varlistentry><term>5. COMBINING DOCUMENTS</term><listitem>
<para>You may combine the Document with other documents released
under this License, under the terms defined in section 4 above for
modified versions, provided that you include in the combination
all of the Invariant Sections of all of the original documents,
unmodified, and list them all as Invariant Sections of your
combined work in its license notice.</para>
<para>The combined work need only contain one copy of this
License, and multiple identical Invariant Sections may be replaced
with a single copy. If there are multiple Invariant Sections with
the same name but different contents, make the title of each such
section unique by adding at the end of it, in parentheses, the
name of the original author or publisher of that section if known,
or else a unique number. Make the same adjustment to the section
titles in the list of Invariant Sections in the license notice of
the combined work.</para>
<para>In the combination, you must combine any sections entitled
"History" in the various original documents, forming one section
entitled "History"; likewise combine any sections entitled
"Acknowledgements", and any sections entitled "Dedications". You
must delete all sections entitled "Endorsements."</para>
</listitem></varlistentry>
<varlistentry><term>6. COLLECTIONS OF DOCUMENTS</term><listitem>
<para>You may make a collection consisting of the Document and
other documents released under this License, and replace the
individual copies of this License in the various documents with a
single copy that is included in the collection, provided that you
follow the rules of this License for verbatim copying of each of
the documents in all other respects.</para>
<para>You may extract a single document from such a collection,
and distribute it individually under this License, provided you
insert a copy of this License into the extracted document, and
follow this License in all other respects regarding verbatim
copying of that document.</para>
</listitem></varlistentry>
<varlistentry><term>7. AGGREGATION WITH INDEPENDENT WORKS</term><listitem>
<para>A compilation of the Document or its derivatives with other
separate and independent documents or works, in or on a volume of
a storage or distribution medium, does not as a whole count as a
Modified Version of the Document, provided no compilation
copyright is claimed for the compilation. Such a compilation is
called an "aggregate", and this License does not apply to the
other self-contained works thus compiled with the Document, on
account of their being thus compiled, if they are not themselves
derivative works of the Document.</para>
<para>If the Cover Text requirement of section 3 is applicable to
these copies of the Document, then if the Document is less than
one quarter of the entire aggregate, the Document's Cover Texts
may be placed on covers that surround only the Document within the
aggregate. Otherwise they must appear on covers around the whole
aggregate.</para>
</listitem></varlistentry>
<varlistentry><term>8. TRANSLATION</term><listitem>
<para>Translation is considered a kind of modification, so you may
distribute translations of the Document under the terms of section
4. Replacing Invariant Sections with translations requires
special permission from their copyright holders, but you may
include translations of some or all Invariant Sections in addition
to the original versions of these Invariant Sections. You may
include a translation of this License provided that you also
include the original English version of this License. In case of
a disagreement between the translation and the original English
version of this License, the original English version will
prevail.</para>
</listitem></varlistentry>
<varlistentry><term>9. TERMINATION</term><listitem>
<para>You may not copy, modify, sublicense, or distribute the
Document except as expressly provided for under this License. Any
other attempt to copy, modify, sublicense or distribute the
Document is void, and will automatically terminate your rights
under this License. However, parties who have received copies, or
rights, from you under this License will not have their licenses
terminated so long as such parties remain in full
compliance.</para>
</listitem></varlistentry>
<varlistentry><term>10. FUTURE REVISIONS OF THIS LICENSE</term><listitem>
<para>The Free Software Foundation may publish new, revised
versions of the GNU Free Documentation License from time to time.
Such new versions will be similar in spirit to the present
version, but may differ in detail to address new problems or
concerns. See <link xlink:href="http://www.gnu.org/copyleft/">
http://www.gnu.org/copyleft/</link>.</para>
<para>Each version of the License is given a distinguishing
version number. If the Document specifies that a particular
numbered version of this License "or any later version" applies to
it, you have the option of following the terms and conditions
either of that specified version or of any later version that has
been published (not as a draft) by the Free Software Foundation.
If the Document does not specify a version number of this License,
you may choose any version ever published (not as a draft) by the
Free Software Foundation.</para>
</listitem></varlistentry>
<varlistentry><term>How to use this License for your documents</term>
<listitem>
<para>To use this License in a document you have written, include
a copy of the License in the document and put the following
copyright and license notices just after the title page:</para>
<para><literallayout>
Copyright (c) YEAR YOUR NAME.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation;
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
A copy of the license is included in the section entitled "GNU
Free Documentation License".
</literallayout></para>
<para>If you have no Invariant Sections, write "with no Invariant
Sections" instead of saying which ones are invariant. If you have
no Front-Cover Texts, write "no Front-Cover Texts" instead of
"Front-Cover Texts being LIST"; likewise for Back-Cover
Texts.</para>
<para>If your document contains nontrivial examples of program
code, we recommend releasing these examples in parallel under your
choice of free software license, such as the GNU General Public
License, to permit their use in free software.</para>
</listitem></varlistentry>
</variablelist></para>
</appendix>
</book>