<!doctype linuxdoc system>
|
|
|
|
<!-- $Id$ -->
|
|
|
|
<!--
|
|
This is (probably) the last release of the HOWTO with linuxdoc dtd.
|
|
Following releases (0.6+) will be in docbook dtd.
|
|
Translators (if any), get ready.
|
|
-->
|
|
|
|
<article>
|
|
|
|
<title>Linux Assembly HOWTO
|
|
|
|
<author>
|
|
<url url="mailto:konst@linuxassembly.org" name="Konstantin Boldyshev"> and
|
|
<url url="mailto:fare@tunes.org" name="Francois-Rene Rideau">
|
|
|
|
<date>v0.5m, October 22, 2000
|
|
|
|
<abstract>
|
|
This is the Linux Assembly HOWTO.
|
|
|
|
This document describes how to program in assembly language
|
|
using FREE programming tools,
|
|
focusing on development for or from the Linux Operating System,
|
|
mostly on IA-32 (i386) platform.
|
|
|
|
Included material may or may not be applicable
|
|
to other hardware and/or software platforms.
|
|
Contributions about them are gladly accepted.
|
|
|
|
<bf/Keywords/:
|
|
<tt/assembly, assembler, asm, inline asm, macroprocessor, preprocessor,
|
|
32-bit, IA-32, i386, x86, nasm, gas, as86, OS, kernel, system, libc,
|
|
system call, interrupt, small, fast, embedded, hardware, port/
|
|
</abstract>
|
|
|
|
<toc>
|
|
|
|
<sect>INTRODUCTION
|
|
<p>
|
|
|
|
You can skip this section if you are familiar with HOWTOs,
|
|
or just hate to read all this assembly-nonrelated crap.
|
|
|
|
<sect1>Legal Blurb
|
|
<p>Copyright © 1999-2000 Konstantin Boldyshev.
|
|
<p>Copyright © 1996-1999 Francois-Rene Rideau.
|
|
|
|
This document may be distributed only subject to the terms and conditions set
|
|
forth in the <url url="http://linuxdoc.org/COPYRIGHT.html" name="LDP License">.
|
|
It may be reproduced and distributed in whole or in part,
|
|
in any medium physical or electronic,
|
|
provided that this license notice is displayed in the reproduction.
|
|
Commercial redistribution is permitted and encouraged.
|
|
|
|
All modified documents, including translations, anthologies,
|
|
and partial documents, must meet the following requirements:
|
|
|
|
<itemize>
|
|
<item>The modified version must be labeled as such
|
|
<item>The person making the modifications must be identified
|
|
<item>Acknowledgement of the original author must be retained
|
|
<item>The location of the original unmodified document be identified
|
|
<item>The original author's (or authors') name(s) may not be used to
|
|
assert or imply endorsement of the resulting document without
|
|
the original author's (or authors') permission
|
|
</itemize>
|
|
|
|
The most recent official version of this document is available from
|
|
<url url="http://linuxassembly.org" name="Linux Assembly"> and
|
|
<url url="http://linuxdoc.org" name="LDP"> sites.
|
|
If you are reading a few-months-old copy,
|
|
consider checking urls above for a new version.
|
|
|
|
<sect1>Foreword
|
|
<p>
|
|
This document aims to answer the questions of those
|
|
who program or want to program 32-bit x86 assembly using
|
|
<em><url url="http://www.gnu.org/philosophy/" name="free software"></em>,
|
|
particularly under the Linux operating system.
|
|
In many places, Uniform Resource Locators (URLs) are given for some
|
|
software or documentation repository.
|
|
This document also points to other documents about
|
|
non-free, non-x86, or non-32-bit assemblers,
|
|
although this is not its primary goal.
|
|
Also note that there are FAQs and docs about programming
|
|
on your favorite platform (whatever it is), which you should consult
|
|
for platform-specific issues, not related directly to assembly programming.
|
|
|
|
Because the main interest of assembly programming is to build
|
|
the guts of operating systems, interpreters, compilers, and games,
|
|
where a C compiler fails to provide the needed expressiveness
|
|
(performance is less and less often the issue),
|
|
we focus on the development of this kind of software.
|
|
|
|
If you don't know what <em/free/ software is,
|
|
please do read <em/carefully/ the GNU General Public License,
|
|
which is used in a lot of free software,
|
|
and is the model for most of their licenses.
|
|
It generally comes in a file named <tt/COPYING/ (or <tt/COPYING.LIB/).
|
|
Literature from the <url url="http://www.fsf.org" name="FSF">
|
|
(free software foundation) might help you, too.
|
|
Particularly, the interesting feature of free software
|
|
is that it comes with source code which you can consult and correct,
|
|
or sometimes even borrow from.
|
|
Read your particular license carefully and do comply with it.
|
|
|
|
<sect1>Contributions
|
|
<p>
|
|
This is an interactively evolving document: you are especially invited
|
|
to ask questions,
|
|
to answer questions,
|
|
to correct given answers,
|
|
to give pointers to new software,
|
|
to point the current maintainer to bugs or deficiencies in the pages.
|
|
In one word, contribute!
|
|
|
|
To contribute, please contact the Assembly-HOWTO maintainer.
|
|
At the time of this writing, it is
|
|
<url url="mailto:konst@linuxassembly.org" name="Konstantin Boldyshev">
|
|
and no longer
|
|
<url url="mailto:fare@tunes.org" name="Francois-Rene Rideau">.
|
|
I (Fare) had been looking for some time for a serious hacker
|
|
to replace me as maintainer of this document,
|
|
and am pleased to announce Konstantin as my worthy successor.
|
|
|
|
<sect1>Credits
|
|
<p>
|
|
I would like to thank the following persons, in order of appearance:
|
|
<itemize>
|
|
<item><url url="mailto:buried.alive@in.mail" name="Linus Torvalds">
|
|
for Linux
|
|
<item><url url="mailto:bde@zeta.org.au" name="Bruce Evans">
|
|
for bcc from which as86 is extracted
|
|
<item><url url="mailto:anakin@pobox.com" name="Simon Tatham"> and
|
|
<url url="mailto:jules@earthcorp.com" name="Julian Hall">
|
|
for NASM
|
|
<item><url url="mailto:gregh@metalab.unc.edu" name="Greg Hankins">
|
|
and now
|
|
<url url="mailto:linux-howto@metalab.unc.edu" name="Tim Bynum">
|
|
for maintaining HOWTOs
|
|
<item><url url="mailto:raymoon@moonware.dgsys.com" name="Raymond Moon">
|
|
for his FAQ
|
|
<item><url url="mailto:dumas@linux.eu.org" name="Eric Dumas">
|
|
for his translation of the mini-HOWTO into French
|
|
(sad thing for the original author to be French and write in English)
|
|
<item><url url="mailto:paul@geeky1.ebtech.net" name="Paul Anderson">
|
|
and <url url="mailto:rahim@megsinet.net" name="Rahim Azizarab">
|
|
for helping me, if not for taking over the HOWTO.
|
|
<item><url url="mailto:pcg@goof.com" name="Marc Lehman">
|
|
for his insight on GCC invocation.
|
|
<item><url url="mailto:ams@wiw.org" name="Abhijit Menon-Sen">
|
|
for helping me figure out the argument passing convention
|
|
<item>All the people who have contributed ideas, answers, remarks, and moral support.
|
|
</itemize>
|
|
|
|
<sect1>History
|
|
<p>
|
|
Each version includes a few fixes and minor corrections,
|
|
that need not be mentioned every time.
|
|
<descrip>
|
|
<tag/Version 0.5m 22 Oct 2000/
|
|
Linux 2.4 system calls can have 6 args,
|
|
Added ALD note to FAQ,
|
|
fixed mailing list subscribe address
|
|
|
|
<tag/Version 0.5l 23 Aug 2000/
|
|
Added TDASM, updates on NASM
|
|
|
|
<tag/Version 0.5k 11 Jul 2000/
|
|
Few additions to FAQ
|
|
|
|
<tag/Version 0.5j 14 Jun 2000/
|
|
Complete rearrangement of INTRODUCTION and RESOURCES;
|
|
FAQ added to RESOURCES, misc cleanups and additions
|
|
(and more to come)
|
|
|
|
<tag/Version 0.5i 04 May 2000/
|
|
Added HLA, TALC;
|
|
rearrangements in RESOURCES, QUICK START, ASSEMBLERS;
|
|
few new pointers
|
|
|
|
<tag/Version 0.5h 09 Apr 2000/
|
|
finally managed to state LDP license on document,
|
|
new resources added, misc fixes
|
|
|
|
<tag/Version 0.5g 26 Mar 2000/
|
|
new resources on different CPUs
|
|
|
|
<tag/Version 0.5f 02 Mar 2000/
|
|
new resources, misc corrections
|
|
|
|
<tag/Version 0.5e 10 Feb 2000/
|
|
url updates, changes in GAS example
|
|
|
|
<tag/Version 0.5d 01 Feb 2000/
|
|
RESOURCES (former POINTERS) section completely redone,
|
|
various url updates.
|
|
|
|
<tag/Version 0.5c 05 Dec 1999/
|
|
New pointers, updates and some rearrangements.
|
|
Rewrite of sgml source.
|
|
|
|
<tag/Version 0.5b 19 Sep 1999/
|
|
Discussion about libc or not libc continues.
|
|
New web pointers and overall updates.
|
|
|
|
<tag/Version 0.5a 01 Aug 1999/
|
|
"QUICK START" section rearranged, added GAS example.
|
|
Several new web pointers.
|
|
|
|
<tag/Version 0.5 25 July 1999/
|
|
GAS has 16-bit mode.
|
|
New maintainer (at last): Konstantin Boldyshev.
|
|
Discussion about libc or not libc.
|
|
Added section "QUICK START" with examples of using assembly.
|
|
|
|
<tag/Version 0.4q 22 June 1999/
|
|
process argument passing (argc,argv,environ) in assembly.
|
|
This is yet another
|
|
"last release by Fare before new maintainer takes over".
|
|
Nobody knows who might be the new maintainer.
|
|
|
|
<tag/Version 0.4p 6 June 1999/
|
|
clean up and updates.
|
|
|
|
<tag/Version 0.4o 1 December 1998/ *
|
|
|
|
<tag/Version 0.4m 23 March 1998/
|
|
corrections about gcc invocation
|
|
|
|
<tag/Version 0.4l 16 November 1997/
|
|
release for LSL 6th edition.
|
|
|
|
<tag/Version 0.4k 19 October 1997/ *
|
|
|
|
<tag/Version 0.4j 7 September 1997/ *
|
|
|
|
<tag/Version 0.4i 17 July 1997/
|
|
info on 16-bit mode access from Linux.
|
|
|
|
<tag/Version 0.4h 19 Jun 1997/
|
|
still more on "how not to use assembly";
|
|
updates on NASM, GAS.
|
|
|
|
<tag/Version 0.4g 30 Mar 1997/ *
|
|
|
|
<tag/Version 0.4f 20 Mar 1997/ *
|
|
|
|
<tag/Version 0.4e 13 Mar 1997/
|
|
Release for DrLinux
|
|
|
|
<tag/Version 0.4d 28 Feb 1997/
|
|
Vapor announce of a new Assembly-HOWTO maintainer.
|
|
|
|
<tag/Version 0.4c 9 Feb 1997/
|
|
Added section "DO YOU NEED ASSEMBLY?"
|
|
|
|
<tag/Version 0.4b 3 Feb 1997/
|
|
NASM moved: now is before AS86
|
|
|
|
<tag/Version 0.4a 20 Jan 1997/
|
|
CREDITS section added
|
|
|
|
<tag/Version 0.4 20 Jan 1997/
|
|
first release of the HOWTO as such.
|
|
|
|
<tag/Version 0.4pre1 13 Jan 1997/
|
|
text mini-HOWTO transformed into a full linuxdoc-sgml HOWTO,
|
|
to see what the SGML tools are like.
|
|
|
|
<tag/Version 0.3l 11 Jan 1997/ *
|
|
|
|
<tag/Version 0.3k 19 Dec 1996/
|
|
What? I had forgotten to point to terse???
|
|
|
|
<tag/Version 0.3j 24 Nov 1996/
|
|
point to French translated version
|
|
|
|
<tag/Version 0.3i 16 Nov 1996/
|
|
NASM is getting pretty slick
|
|
|
|
<tag/Version 0.3h 6 Nov 1996/
|
|
more about cross-compiling -- See on sunsite: devel/msdos/
|
|
|
|
<tag/Version 0.3g 2 Nov 1996/
|
|
Created the History. Added pointers in cross-compiling section.
|
|
Added section about I/O programming under Linux (particularly video).
|
|
|
|
<tag/Version 0.3f 17 Oct 1996/ *
|
|
|
|
<tag/Version 0.3c 15 Jun 1996/ *
|
|
|
|
<tag/Version 0.2 04 May 1996/ *
|
|
|
|
<tag/Version 0.1 23 Apr 1996/
|
|
Francois-Rene "Fare" Rideau <fare@tunes.org>
|
|
creates and publishes the first mini-HOWTO,
|
|
because "I'm sick of answering ever the same questions
|
|
on comp.lang.asm.x86"
|
|
</descrip>
|
|
|
|
|
|
|
|
|
|
|
|
<sect>DO YOU NEED ASSEMBLY?<label id="doyouneedasm">
|
|
<p>
|
|
Well, I wouldn't want to interfere with what you're doing,
|
|
but here is some advice from hard-earned experience.
|
|
|
|
|
|
<sect1>Pros and Cons
|
|
<p>
|
|
|
|
<sect2>The advantages of Assembly
|
|
<p>
|
|
Assembly can express very low-level things:
|
|
<itemize>
|
|
<item>you can access machine-dependent registers and I/O.
|
|
<item>you can control the exact behavior of code
|
|
in critical sections that might otherwise involve deadlock
|
|
between multiple software threads or hardware devices.
|
|
<item>you can break the conventions of your usual compiler,
|
|
which might allow some optimizations
|
|
(like temporarily breaking rules about memory allocation,
|
|
threading, calling conventions, etc).
|
|
<item>you can build interfaces between code fragments
|
|
using such incompatible conventions
|
|
(e.g. produced by different compilers,
|
|
or separated by a low-level interface).
|
|
<item>you can get access to unusual programming modes of your processor
|
|
(e.g. 16-bit mode to interface with startup, firmware, or legacy code
|
|
on Intel PCs)
|
|
<item>you can produce reasonably fast code for tight loops
|
|
to cope with a bad non-optimizing compiler
|
|
(but then, there are free optimizing compilers available!)
|
|
<item>you can produce hand-optimized code
|
|
perfectly tuned for your particular hardware setup,
|
|
though not to anyone else's.
|
|
<item>you can write some code for your new language's
|
|
optimizing compiler
|
|
(that's something few will ever do, and even they, not often).
|
|
</itemize>
|
|
<p>
|
|
|
|
|
|
<sect2>The disadvantages of Assembly
|
|
<p>
|
|
Assembly is a very low-level language
|
|
(the lowest above hand-coding the binary instruction patterns).
|
|
This means
|
|
<itemize>
|
|
<item>it's long and tedious to write initially,
|
|
<item>it's quite bug-prone,
|
|
<item>your bugs can be very difficult to chase,
|
|
<item>it's very difficult to understand and modify,
|
|
i.e. to maintain.
|
|
<item>the result is very non-portable to other architectures,
|
|
existing or future,
|
|
<item>your code will be optimized only for a certain implementation
|
|
of the same architecture:
|
|
for instance, among Intel-compatible platforms,
|
|
each CPU design and its variations
|
|
(relative latency, throughput, and capacity,
|
|
of processing units, caches, RAM, bus, disks,
|
|
presence of FPU, MMX, 3DNOW, SIMD extensions, etc)
|
|
implies potentially completely different optimization techniques.
|
|
CPU designs already include:
|
|
Intel 386, 486, Pentium, PPro, Pentium II, Pentium III;
|
|
Cyrix 5x86, 6x86; AMD K5, K6 (K6-2, K6-III), K7 (Athlon).
|
|
New designs keep popping up, so don't expect either this listing
|
|
or your code to be up-to-date.
|
|
<item>you spend more time on a few details,
|
|
and can't focus on small and large algorithmic design,
|
|
which is known to bring the largest part of the speedup.
|
|
[e.g. you might spend some time building very fast
|
|
list/array manipulation primitives in assembly;
|
|
only a hash table would have sped up your program much more;
|
|
or, in another context, a binary tree;
|
|
or some high-level structure distributed over a cluster of CPUs]
|
|
<item>a small change in algorithmic design might completely
|
|
invalidate all your existing assembly code.
|
|
So that either you're ready (and able) to rewrite it all,
|
|
or you're tied to a particular algorithmic design;
|
|
<item>On code that ain't too far from what's in standard benchmarks,
|
|
commercial optimizing compilers outperform hand-coded assembly
|
|
(well, that's less true on the x86 architecture
|
|
than on RISC architectures,
|
|
and perhaps less true for widely available/free compilers;
|
|
anyway, for typical C code, GCC is fairly good);
|
|
<item>And in any case, as moderator John Levine says on
|
|
<url url="news:comp.compilers" name="comp.compilers">,
|
|
"compilers make it a lot easier to use complex data structures,
|
|
and compilers don't get bored halfway through
|
|
and generate reliably pretty good code."
|
|
They will also <em/correctly/ propagate code transformations
|
|
throughout the whole (huge) program
|
|
when optimizing code between procedures and module boundaries.
|
|
</itemize>
|
|
<p>
|
|
|
|
<sect2>Assessment
|
|
<p>
|
|
All in all, you might find that though using assembly is sometimes needed,
|
|
and might even be useful in a few cases where it is not, you'll want to:
|
|
<itemize>
|
|
<item>minimize the use of assembly code,
|
|
<item>encapsulate this code in well-defined interfaces
|
|
<item>have your assembly code automatically generated
|
|
from patterns expressed in a higher-level language
|
|
than assembly (e.g. GCC inline assembly macros).
|
|
<item>have automatic tools translate these programs
|
|
into assembly code
|
|
<item>have this code be optimized if possible
|
|
<item>All of the above,
|
|
i.e. write (an extension to) an optimizing compiler back-end.
|
|
</itemize>
|
|
|
|
Even in cases when assembly is needed (e.g. OS development),
|
|
you'll find that not so much of it is,
|
|
and that the above principles hold.
|
|
|
|
See the Linux kernel sources concerning this:
|
|
as little assembly as needed,
|
|
resulting in a fast, reliable, portable, maintainable OS.
|
|
Even a successful game like DOOM was written almost entirely in C,
|
|
with only a tiny part written in assembly for speed.
|
|
|
|
|
|
<sect1>How to NOT use Assembly
|
|
<p>
|
|
|
|
<sect2>General procedure to achieve efficient code
|
|
<p>
|
|
As Charles Fiterman says on
|
|
<url url="news:comp.compilers" name="comp.compilers">
|
|
about human vs computer-generated assembly code,
|
|
|
|
"
|
|
The human should always win and here is why.
|
|
<itemize>
|
|
<item>First the human writes the whole thing in a high level language.
|
|
<item>Second he profiles it to find the hot spots where it spends its time.
|
|
<item>Third he has the compiler produce assembly for those small
|
|
sections of code.
|
|
<item>Fourth he hand tunes them looking for tiny improvements over
|
|
the machine generated code.
|
|
</itemize>
|
|
The human wins because he can use the machine.
|
|
"
|
|
|
|
|
|
<sect2>Languages with optimizing compilers
|
|
<p>
|
|
Languages like ObjectiveCAML, SML, CommonLISP, Scheme, ADA, Pascal, C, C++,
|
|
among others, all have free optimizing compilers
|
|
that will optimize the bulk of your programs,
|
|
and often do better than hand-coded assembly even for tight loops,
|
|
while allowing you to focus on higher-level details,
|
|
and without forbidding you to grab
|
|
a few percent of extra performance in the above-mentioned way,
|
|
once you've reached a stable design.
|
|
Of course, there are also commercial optimizing compilers
|
|
for most of these languages, too!
|
|
|
|
Some languages have compilers that produce C code,
|
|
which can be further optimized by a C compiler:
|
|
LISP, Scheme, Perl, and many others.
|
|
Speed is fairly good.
|
|
|
|
<sect2>General procedure to speed your code up
|
|
<p>
|
|
As for speeding code up,
|
|
you should do it only for parts of a program
|
|
that a profiling tool has consistently identified
|
|
as being a performance bottleneck.
|
|
|
|
Hence, if you identify some code portion as being too slow, you should
|
|
<itemize>
|
|
<item>first try to use a better algorithm;
|
|
<item>then try to compile it rather than interpret it;
|
|
<item>then try to enable and tweak optimization from your compiler;
|
|
<item>then give the compiler hints about how to optimize
|
|
(typing information in LISP; register usage with GCC;
|
|
lots of options in most compilers, etc).
|
|
<item>then possibly fall back to assembly programming
|
|
</itemize>
|
|
|
|
Finally, before you end up writing assembly,
|
|
you should inspect generated code,
|
|
to check that the problem really is with bad code generation,
|
|
as this might really not be the case:
|
|
compiler-generated code might be better than what you'd have written,
|
|
particularly on modern multi-pipelined architectures!
|
|
Slow parts of a program might be intrinsically so.
|
|
The biggest problems on modern architectures with fast processors
|
|
are due to delays from memory access, cache-misses, TLB-misses,
|
|
and page-faults;
|
|
in such cases register optimization is of little use,
|
|
and you'll more profitably re-think data structures and threading
|
|
to achieve better locality in memory access.
|
|
Perhaps a completely different approach to the problem might help, then.
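
As an illustration of the locality point above, here is a minimal sketch in C.
The names <tt/ROWS/, <tt/COLS/, <tt/sum_row_major/ and <tt/sum_col_major/
are made up for this example; both functions compute the same sum,
and only the order of memory accesses differs.
Which one is faster, and by how much, depends on your cache sizes,
so measure before drawing conclusions.

```c
#include <stddef.h>

#define ROWS 512   /* hypothetical matrix size, chosen for illustration */
#define COLS 512

/* Visits elements in the order they are laid out in memory,
 * so consecutive accesses tend to hit the same cache line. */
long sum_row_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}

/* Same arithmetic, but each access lands COLS * sizeof(int) bytes
 * after the previous one, which is much less cache-friendly. */
long sum_col_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += m[r][c];
    return total;
}
```

No amount of register tuning inside the second loop changes its access pattern;
swapping the loops does.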
|
|
|
|
|
|
<sect2>Inspecting compiler-generated code
|
|
<p>
|
|
There are many reasons to inspect compiler-generated assembly code.
|
|
Here is what you'll do with such code:
|
|
<itemize>
|
|
<item>check whether generated code
|
|
can be obviously enhanced with hand-coded assembly
|
|
(or by tweaking compiler switches)
|
|
<item>when that's the case,
|
|
start from generated code and modify it
|
|
instead of starting from scratch
|
|
<item>more generally, use generated code as stubs to modify,
|
|
which at least gets right the way
|
|
your assembly routines interface to the external world
|
|
<item>track down bugs in your compiler (hopefully rarer)
|
|
</itemize>
|
|
|
|
The standard way to have assembly code be generated
|
|
is to invoke your compiler with the <tt/-S/ flag.
|
|
This works with most Unix compilers,
|
|
including the GNU C Compiler (GCC), but YMMV.
|
|
As for GCC, it will produce more understandable assembly code with
|
|
the <tt/-fverbose-asm/ command-line option.
|
|
Of course, if you want to get good assembly code,
|
|
don't forget your usual optimization options and hints!
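
As a small sketch of this workflow, the function below is short enough
that its generated assembly is easy to read.
The file name <tt/sum.c/ and the function name <tt/sum_bytes/ are assumptions
made for the example; the command in the comment uses only the flags discussed above.

```c
/* sum.c (hypothetical file name).
 * To look at what the compiler makes of it, one might run:
 *     gcc -O2 -S -fverbose-asm sum.c
 * which should leave the assembly listing in sum.s.
 */

/* Adds up len bytes starting at buf. */
unsigned int sum_bytes(const unsigned char *buf, unsigned int len)
{
    unsigned int total = 0;
    while (len--)
        total += *buf++;
    return total;
}
```

Comparing the listings produced with and without <tt/-O2/
is a quick way to see how much the optimizer already does for you.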
|
|
|
|
|
|
<sect1>Linux and assembly
|
|
<p>
|
|
In general, you don't need to use assembly language in Linux programming.
|
|
Unlike DOS, you do not have to write Linux drivers in assembly
|
|
(well, actually you can do it if you really want).
|
|
And with modern optimizing compilers,
|
|
if you care about speed optimization for different CPUs,
|
|
it's much simpler to write in C.
|
|
However, if you're reading this,
|
|
you might have some reason to use assembly instead of C/C++.
|
|
|
|
You may <em/need/ to use assembly, or you may <em/want/ to use assembly.
|
|
In short, the main practical reasons why you may need to get into Linux assembly
|
|
are <em/small code/ and <em/libc independence/.
|
|
The non-practical (and most frequent) reason is being just an old crazy hacker,
|
|
who has a twenty-year-old habit of doing everything in assembly language.
|
|
|
|
Also, if you're porting Linux to some embedded hardware
|
|
you can be quite short on space for the whole system:
|
|
you need to fit the kernel, libc
|
|
and all that stuff of (file|find|text|sh|etc.) utils
|
|
into several hundreds of kilobytes,
|
|
and every kilobyte costs much.
|
|
So, one of the ways you've got is to rewrite some
|
|
(or all) parts of the system in assembly,
|
|
and this will really save you a lot of space.
|
|
For instance, a simple <tt/httpd/ written in assembly
|
|
can take less than 600 bytes;
|
|
you can fit a webserver, consisting of kernel and httpd,
|
|
in 400 KB or less... Think about it.
|
|
|
|
<sect>ASSEMBLERS
|
|
<p>
|
|
|
|
<sect1>GCC Inline Assembly
|
|
<p>
|
|
The well-known GNU C/C++ Compiler (GCC),
|
|
an optimizing 32-bit compiler at the heart of the GNU project,
|
|
supports the x86 architecture quite well,
|
|
and includes the ability to insert assembly code in C programs,
|
|
in such a way that register allocation can be either specified or left to GCC.
|
|
GCC works on most available platforms,
|
|
notably Linux, *BSD, VSTa, OS/2, *DOS, Win*, etc.
|
|
|
|
<sect2>Where to find GCC
|
|
<p>
|
|
The original GCC site is the GNU FTP site
|
|
<url url="ftp://prep.ai.mit.edu/pub/gnu/gcc/">
|
|
together with all released application software from the GNU project.
|
|
Linux-configured and precompiled versions can be found in
|
|
<url url="ftp://metalab.unc.edu/pub/Linux/GCC/">
|
|
There are a lot of FTP mirrors of both sites,
|
|
everywhere around the world, as well as CD-ROM copies.
|
|
|
|
GCC development split into two branches some time ago (GCC 2.8 and EGCS),
|
|
but they merged back, and the current GCC webpage is <url url="http://gcc.gnu.org">.
|
|
|
|
Sources adapted to your favorite OS and precompiled binaries
|
|
should be found at your usual FTP sites.
|
|
|
|
The most popular DOS port of GCC is named DJGPP,
|
|
and can be found in directories of such name in FTP sites.
|
|
See <url url="http://www.delorie.com/djgpp/">.
|
|
|
|
There are two Win32 GCC ports:
|
|
<url url="http://sourceware.cygnus.com/cygwin/" name="cygwin"> and
|
|
<url url="http://www.mingw.org" name="mingw">
|
|
|
|
There is also a port of GCC to OS/2 named EMX,
|
|
that also works under DOS,
|
|
and includes lots of unix-emulation library routines.
|
|
See around the following site:
|
|
<url url="ftp://ftp-os2.cdrom.com/pub/os2/emx09c/">.
|
|
|
|
<!-- broken url url="http://www.leo.org/pub/comp/os/os2/gnu/emx+gcc/"-->
|
|
<!-- broken url url="http://warp.eecs.berkeley.edu/os2/software/shareware/emx.html"-->
|
|
|
|
<sect2>Where to find docs for GCC Inline Asm
|
|
<p>
|
|
The documentation of GCC includes documentation files in TeXinfo format.
|
|
You can compile them with TeX and print the result,
|
|
or convert them to .info, and browse them with emacs,
|
|
or convert them to .html, or (with the right tools) to nearly whatever you like,
|
|
or just read as is.
|
|
The .info files are generally found on any good installation for GCC.
|
|
|
|
The right section to look for is <tt/C Extensions::Extended Asm::/
|
|
|
|
Section <tt/Invoking GCC::Submodel Options::i386 Options::/ might help too.
|
|
Particularly, it gives the i386 specific constraint names for registers:
|
|
<tt/abcdSDB/ correspond to
|
|
<tt/%eax/,
|
|
<tt/%ebx/,
|
|
<tt/%ecx/,
|
|
<tt/%edx/,
|
|
<tt/%esi/,
|
|
<tt/%edi/
|
|
and
|
|
<tt/%ebp/
|
|
respectively (no letter for <tt/%esp/).
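
To make the register constraints concrete, here is a hedged sketch
that uses the <tt/a/ and <tt/d/ letters with GCC extended asm.
The function name <tt/mul32x32/ is made up for this example,
and the <tt/#if/ guard with its plain C fallback is an addition
so that the file still builds on non-x86 machines.

```c
#include <stdint.h>

/* Full 32x32 -> 64 bit multiply. */
uint64_t mul32x32(uint32_t x, uint32_t y)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    /* mull multiplies %eax by its operand and leaves the 64-bit
     * product in %edx:%eax.  "a" and "d" pin operands to those
     * registers; "rm" lets GCC pick a register or memory for y. */
    __asm__ ("mull %3"
             : "=a" (lo), "=d" (hi)
             : "a" (x), "rm" (y)
             : "cc");
    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable fallback for other architectures. */
    return (uint64_t)x * y;
#endif
}
```

Only the operands that the instruction forces into fixed registers are pinned;
everything else is left to GCC's register allocator.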
|
|
|
|
The DJGPP Games resource (not only for game hackers) had a page
|
|
specifically about assembly, but it's down.
|
|
Its data have nonetheless been recovered on the
|
|
<url url="http://www.delorie.com/djgpp/" name="DJGPP site">,
|
|
that contains a mine of other useful information:
|
|
<url url="http://www.delorie.com/djgpp/doc/brennan/">,
|
|
and in the <url url="http://www.castle.net/~avly/djasm.html"
|
|
name="DJGPP Quick ASM Programming Guide">.
|
|
|
|
<!-- broken url url="http://www.rt66.com/˜brennan/djgpp/djgpp_asm.html"-->
|
|
|
|
GCC depends on GAS for assembling, and follows its syntax (see below);
|
|
do mind that inline asm needs percent characters to be quoted
|
|
so that they are passed through to GAS.
|
|
See the section about GAS below.
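
The quoting of percent characters can be sketched as follows.
In extended asm (the form with operand lists), <tt/%0/, <tt/%1/ and so on
name operands, so a literal register has to be written with a doubled percent sign.
The function name <tt/add_five/ is invented for the example,
and the non-x86 fallback is an addition to keep the file portable.

```c
/* Returns x + 5, going through %eax on purpose to show the quoting. */
unsigned int add_five(unsigned int x)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int result;
    __asm__ ("movl %1, %%eax\n\t"   /* %1 is operand x; %%eax is the literal register */
             "addl $5, %%eax\n\t"   /* immediate 5 added to %eax */
             "movl %%eax, %0"       /* %0 is operand result */
             : "=r" (result)
             : "r" (x)
             : "eax", "cc");        /* tell GCC that %eax and the flags are modified */
    return result;
#else
    /* Portable fallback for other architectures. */
    return x + 5;
#endif
}
```

Listing <tt/eax/ as clobbered keeps GCC from placing either operand in that register.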
|
|
|
|
Find <em/lots/ of useful examples in the <tt>linux/include/asm-i386/</tt>
|
|
subdirectory of the sources for the Linux kernel.
|
|
|
|
|
|
|
|
<sect2>Invoking GCC to build proper inline assembly code
|
|
<p>
|
|
Because assembly routines from the kernel headers
|
|
(and most likely your own headers,
|
|
if you try making your assembly programming as clean
|
|
as it is in the linux kernel)
|
|
are embedded in <tt/extern inline/ functions,
|
|
GCC must be invoked with the <tt/-O/ flag
|
|
(or <tt/-O2/, <tt/-O3/, etc),
|
|
for these routines to be available.
|
|
If not, your code may compile, but not link properly,
|
|
since it will be looking for non-inlined <tt/extern/ functions
|
|
in the libraries against which your program is being linked!
|
|
Another way is to link against libraries that include fallback
|
|
versions of the routines.
|
|
|
|
Inline assembly can be disabled with <tt/-fno-asm/,
|
|
which will have the compiler die when using extended inline asm syntax,
|
|
or else generate calls to an external function named <tt/asm()/
|
|
that the linker can't resolve.
|
|
To counter such flag, <tt/-fasm/ restores treatment
|
|
of the <tt/asm/ keyword.
|
|
|
|
More generally, good compile flags for GCC on the x86 platform are
|
|
<code>
|
|
gcc -O2 -fomit-frame-pointer -W -Wall
|
|
</code>
|
|
|
|
<tt/-O2/ is a good optimization level in most cases.
|
|
Optimizing beyond it takes longer, and yields code that is a lot larger,
|
|
but only a bit faster;
|
|
such overoptimization might be useful for tight loops only (if any),
|
|
which you may be doing in assembly anyway.
|
|
In cases when you need really strong compiler optimization for a few files,
|
|
do consider using up to <tt/-O6/ (though stock GCC treats any level above <tt/-O3/ the same as <tt/-O3/).
|
|
|
|
<tt/-fomit-frame-pointer/ allows generated code to skip the stupid
|
|
frame pointer maintenance, which makes code smaller and faster,
|
|
and frees a register for further optimizations.
|
|
It precludes the easy use of debugging tools (<tt/gdb/),
|
|
but when you use these,
|
|
you just don't care about size and speed anymore anyway.
|
|
|
|
<tt/-W -Wall/ enables a broad set of warnings
|
|
and helps you catch obvious stupid errors.
|
|
|
|
You can add some CPU-specific <tt/-m486/ or such flag so that
|
|
GCC will produce code that is more adapted to your precise computer.
|
|
Note that modern GCC has <tt/-mpentium/ and such flags
|
|
(and <url url="http://goof.com/pcg/" name="PGCC"> has even more),
|
|
whereas GCC 2.7.x and older versions do not.
|
|
A good choice of CPU-specific flags should be in the Linux kernel.
|
|
Check the TeXinfo documentation of your current GCC installation for more.
|
|
|
|
<tt/-m386/ will help optimize for size,
|
|
hence also for speed on computers whose memory is tight and/or loaded,
|
|
since big programs cause swap, which more than counters
|
|
any "optimization" intended by the larger code.
|
|
In such settings, it might be useful to stop using C,
|
|
and use instead a language that favors code factorization,
|
|
such as a functional language and/or FORTH,
|
|
and use a bytecode- or wordcode- based implementation.
|
|
|
|
Note that you can vary code generation flags from file to file,
|
|
so performance-critical files will use maximum optimization,
|
|
whereas other files will be optimized for size.
|
|
|
|
To optimize even more, option <tt/-mregparm=2/
|
|
and/or corresponding function attribute might help,
|
|
but might pose lots of problems when linking to foreign code,
|
|
<em/including libc/.
|
|
There are ways to correctly declare foreign functions
|
|
so the right call sequences be generated,
|
|
or you might want to recompile the foreign libraries
|
|
to use the same register-based calling convention...
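
The function attribute mentioned above can be sketched like this.
It applies the register-based convention to a single function,
so calls into libc keep the standard convention.
The macro name <tt/REGPARM2/ and the function name <tt/add_regparm/ are invented
for this example; the attribute is only meaningful on 32-bit x86,
so the macro expands to nothing elsewhere.

```c
#if defined(__i386__)
/* On i386, pass the first two integer arguments in registers
 * instead of on the stack. */
#define REGPARM2 __attribute__((regparm(2)))
#else
/* The attribute is specific to 32-bit x86; do nothing elsewhere. */
#define REGPARM2
#endif

/* The prototype carries the attribute, so every caller that sees
 * this declaration generates the matching call sequence. */
REGPARM2 int add_regparm(int a, int b);

REGPARM2 int add_regparm(int a, int b)
{
    return a + b;
}
```

Keeping the attribute on the declaration in a shared header is what keeps callers and callee in agreement.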
|
|
|
|
Note that you can make these flags the default by editing file
|
|
<tt>/usr/lib/gcc-lib/i486-linux/2.7.2.3/specs</tt>
|
|
or wherever that is on your system
|
|
(better not add <tt/-W -Wall/ there, though).
|
|
The exact location of the GCC specs files on <em/your/ system
|
|
can be found by asking <tt/gcc -v/.
|
|
|
|
|
|
<sect1>GAS
|
|
<p>
|
|
GAS is the GNU Assembler, which GCC relies upon.
|
|
|
|
|
|
<sect2>Where to find it
|
|
<p>
|
|
Find it at the same place where you found GCC,
|
|
in a package named binutils.
|
|
|
|
The latest version is available from H.J. Lu at
|
|
<url url="ftp://ftp.varesearch.com/pub/support/hjl/binutils/">.
|
|
|
|
|
|
<sect2>What is this AT&T syntax
|
|
<p>
|
|
Because GAS was invented to support a 32-bit unix compiler,
|
|
it uses standard AT&T syntax,
|
|
which closely resembles the syntax for standard m68k assemblers,
|
|
and is standard in the UNIX world.
|
|
This syntax is neither worse nor better than the Intel syntax.
|
|
It's just different.
|
|
When you get used to it,
|
|
you find it much more regular than the Intel syntax,
|
|
though a bit boring.
|
|
|
|
Here are the major caveats about GAS syntax:
|
|
<itemize>
|
|
<item>
|
|
Register names are prefixed with <tt/%/, so that
|
|
registers are <tt/%eax/, <tt/%dl/ and so on,
|
|
instead of just <tt/eax/, <tt/dl/, etc.
|
|
This makes it possible to include external C symbols directly
|
|
in assembly source, without any risk of confusion, or any need
|
|
for ugly underscore prefixes.
|
|
<item>
|
|
The order of operands is source(s) first, and destination last,
|
|
as opposed to the Intel convention of destination first and sources last.
|
|
Hence, what in Intel syntax is <tt/mov ax,dx/ (move contents of
|
|
register <tt/dx/ into register <tt/ax/) will be in GAS syntax
|
|
<tt/mov %dx, %ax/.
|
|
<item>
|
|
The operand length is specified as a suffix to the instruction name.
|
|
The suffix is <tt/b/ for (8-bit) byte,
|
|
<tt/w/ for (16-bit) word,
|
|
and <tt/l/ for (32-bit) long.
|
|
For instance, the correct syntax for the above instruction
|
|
would have been <tt/movw %dx,%ax/.
|
|
However, gas does not require strict AT&T syntax,
|
|
so the suffix is optional when length can be guessed from register operands,
|
|
and otherwise defaults to 32-bit (with a warning).
|
|
<item>
|
|
Immediate operands are marked with a <tt/$/ prefix,
|
|
as in <tt/addl $5,%eax/
|
|
(add immediate long value 5 to register <tt/%eax/).
|
|
<item>
|
|
No prefix to an operand indicates it is a memory-address;
|
|
hence <tt/movl $foo,%eax/
|
|
puts the <em/address/ of variable <tt/foo/
|
|
in register <tt/%eax/,
|
|
but <tt/movl foo,%eax/
|
|
puts the <em/contents/ of variable <tt/foo/
|
|
in register <tt/%eax/.
|
|
<item>
|
|
Indexing or indirection is done by enclosing the index register
|
|
or indirection memory cell address in parentheses,
|
|
as in <tt/testb $0x80,17(%ebp)/
|
|
(test the high bit of the byte value at offset 17
|
|
from the cell pointed to by <tt/%ebp/).
|
|
</itemize>
|
|
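
As a hedged illustration of the caveats above, here is a minimal sketch
that embeds a single AT&T-syntax instruction in C through GCC inline assembly.
The function name <tt/add_five/ is made up for this example,
and the plain C fallback branch is an assumption added
so that the sketch also builds on non-x86 machines.

```c
/* Hypothetical helper: adds 5 to x with one AT&T-syntax instruction. */
int add_five(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "addl": the l suffix selects a 32-bit operand;
       "$5":   the $ prefix marks an immediate value;
       operand order is source first, destination last
       (%0 is a register that GCC chooses for x). */
    __asm__("addl $5, %0" : "+r"(x) : : "cc");
#else
    x += 5;  /* fallback for non-x86 targets (assumption of this sketch) */
#endif
    return x;
}
```

Compiling this with <tt/gcc -S/ shows the instruction with the actual register
name substituted, prefixed with <tt/%/.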
|
|
|
|
A program exists to help you convert programs
|
|
from TASM syntax to AT&T syntax. See
|
|
<url url="ftp://x2ftp.oulu.fi/pub/msdos/programming/convert/ta2asv08.zip">.
|
|
<!--
|
|
(Since the original x2ftp site is closing (no more?), use a
|
|
<url url="ftp://ftp.lip6.fr/pub/pc/x2ftp/README.mirror_sites"
|
|
name="mirror site">).
|
|
-->
|
|
There also exists a program for the reverse conversion:
|
|
<url url="http://www.multimania.com/placr/a2i.html">.
|
|
|
|
|
|
GAS has comprehensive documentation in TeXinfo format,
|
|
which comes at least with the source distribution.
|
|
Browse extracted .info pages with Emacs or whatever.
|
|
There used to be a file named gas.doc or as.doc
|
|
around the GAS source package, but it was merged into the TeXinfo docs.
|
|
Of course, in case of doubt, the ultimate documentation
|
|
is the sources themselves!
|
|
A section that will particularly interest you is
|
|
<tt/Machine Dependencies::i386-Dependent::/
|
|
|
|
|
|
Again, the sources for Linux (the OS kernel) come in as excellent examples;
|
|
see under <tt>linux/arch/i386/</tt> the following files:
<tt>kernel/*.S</tt>, <tt>boot/compressed/*.S</tt>, <tt>math-emu/*.S</tt>.
|
|
|
|
If you are writing some kind of language, a thread package, etc.,
you might as well see how other languages
(<url url="http://para.inria.fr/" name="OCaml">,
<url url="http://www.jwdt.com/~paysan/gforth.html" name="Gforth">,
etc.),
or thread packages (QuickThreads, MIT pthreads, LinuxThreads, etc.),
or whatever, do it.
|
|
|
|
Finally, just compiling a C program to assembly
|
|
might show you the syntax for the kind of instructions you want.
|
|
See section <ref id="doyouneedasm" name="Do you need Assembly?"> above.
|
|
|
|
|
|
<sect2>16-bit mode
|
|
<p>
|
|
The current stable release of binutils (2.9.1.0.25)
|
|
now fully supports 16-bit mode (registers <em/and/ addressing) on i386 PCs.
|
|
Still with its peculiar AT&T syntax, of course.
|
|
Use <tt/.code16/ and <tt/.code32/
|
|
to switch between assembly modes.
|
|
|
|
Also, a neat trick used by some (including the oskit authors)
|
|
is to have GCC produce code for 16-bit real mode,
|
|
using an inline assembly statement
|
|
<tt/asm(&dquot;.code16\n&dquot;)/.
|
|
GCC will still emit only 32-bit addressing modes,
|
|
but GAS will insert proper 32-bit prefixes for them.
|
|
|
|
|
|
<sect2>GASP
|
|
<p>
|
|
GASP is the GAS Preprocessor.
|
|
It adds macros and some nice syntax to GAS.
|
|
GASP comes together with GAS in the GNU binutils archive.
|
|
It works as a filter, much like cpp and the like.
|
|
I don't know the details, but it comes with its own texinfo documentation,
so just browse it (in .info), print it, grok it.
|
|
GAS with GASP looks like a regular macro-assembler to me.
|
|
|
|
|
|
<sect1>NASM
|
|
<p>
|
|
The Netwide Assembler project provides a cool i386 assembler,
|
|
written in C, that should be modular enough
|
|
to eventually support all known syntaxes and object formats.
|
|
|
|
<sect2>Where to find NASM<label id="findnasm">
|
|
<p>
|
|
<url url="http://nasm.sourceforge.net">
|
|
|
|
Binary release on your usual metalab mirror in <tt>devel/lang/asm/</tt>.
|
|
Should also be available as .rpm or .deb in your usual RedHat/Debian
|
|
distributions' contrib.
|
|
|
|
<sect2>What it does
|
|
<p>
|
|
The syntax is Intel-style.
|
|
Fairly good macroprocessing support is integrated.
|
|
|
|
Supported object file formats are
|
|
<tt/bin, aout, coff, elf, as86,/ (DOS) <tt/obj, win32,/ (their own format) <tt/rdf/.
|
|
|
|
NASM can be used as a backend for the free LCC compiler
|
|
(support files included).
|
|
|
|
Unless you're using BCC as a 16-bit compiler
|
|
(which is out of scope of this 32-bit HOWTO),
|
|
you should definitely use NASM instead of say AS86 or MASM,
|
|
because it is actively supported online,
|
|
and runs on all platforms.
|
|
|
|
Note: NASM also comes with a disassembler, NDISASM.
|
|
|
|
Its hand-written parser makes it much faster than GAS,
|
|
though of course, it doesn't support three bazillion different architectures.
|
|
If you like Intel-style syntax, as opposed to GAS syntax,
|
|
then it should be the assembler of choice...
|
|
|
|
Note: There are <ref id="res" name="converters between GAS AT&T and Intel assembler syntax">,
|
|
which perform conversion in both directions.
|
|
|
|
<sect1>AS86
|
|
<p>
|
|
AS86 is an 80x86 assembler, both 16-bit and 32-bit,
part of Bruce Evans' C Compiler (BCC).
It has mostly Intel syntax,
though it differs slightly in its addressing modes.
|
|
|
|
|
|
<sect2>Where to get AS86
|
|
<p>
|
|
A completely outdated version of AS86 is distributed by HJLu
|
|
just to compile the Linux kernel,
|
|
in a package named bin86 (current version 0.4),
|
|
available in any Linux GCC repository.
|
|
But I advise no one to use it for anything other than compiling Linux.
This version supports only a hacked minix object file format,
which is not supported by the GNU binutils or anything else,
and it has a few bugs in 32-bit mode,
so you really had better keep it only for compiling Linux.
|
|
|
|
The most recent versions by Bruce Evans (bde@zeta.org.au)
|
|
are published together with the FreeBSD distribution.
|
|
Well, they were: I could not find the sources from distribution 2.1 on :(
|
|
Hence, I put the sources at my place:
|
|
<url url="http://www.tunes.org/~fare/files/asm/bcc-95.3.12.src.tgz">
|
|
|
|
The Linux/8086 (aka ELKS) project is somehow maintaining bcc
|
|
(though I don't think they included the 32-bit patches).
|
|
See around
|
|
<url url="http://www.linux.org.uk/ELKS-Home/">
|
|
(or <url url="http://www.elks.ecs.soton.ac.uk">)
|
|
and <url url="ftp://linux.mit.edu/pub/linux/ELKS/">.
|
|
I haven't followed these developments,
|
|
and would appreciate a reader contributing on this topic.
|
|
|
|
Among other things, these more recent versions, unlike HJLu's,
support the Linux GNU a.out format,
so you can link your code to Linux programs, and/or use the usual
tools from the GNU binutils package to manipulate your data.
This version can co-exist without any harm with the previous one
(see the corresponding question below).
|
|
|
|
BCC versions from 12 March 1995 and earlier have a misfeature
that makes all segment pushing/popping 16-bit,
which is quite annoying when programming in 32-bit mode.
|
|
I wrote a patch at a time when the TUNES Project used as86:
|
|
<url url="http://www.tunes.org/~fare/files/asm/as86.bcc.patch.gz">.
|
|
Bruce Evans accepted this patch,
|
|
but since as far as I know he hasn't published a new release of bcc,
|
|
the ones to ask about integrating it (if not done yet)
|
|
are the ELKS developers.
|
|
|
|
|
|
<sect2>How to invoke the assembler?
|
|
<p>
|
|
Here's the GNU Makefile entry for using bcc
|
|
to transform <tt/.s/ asm
|
|
into both GNU a.out <tt/.o/ object
|
|
and <tt/.l/ listing:
|
|
|
|
<code>
|
|
%.o %.l: %.s
|
|
bcc -3 -G -c -A-d -A-l -A$*.l -o $*.o $<
|
|
</code>
|
|
|
|
Remove the <tt/%.l/, <tt/-A-l/, and <tt/-A$*.l/,
|
|
if you don't want any listing.
|
|
If you want something other than GNU a.out,
|
|
you can see the docs of bcc about the other supported formats,
|
|
and/or use the objcopy utility from the GNU binutils package.
|
|
|
|
|
|
<sect2>Where to find docs
|
|
<p>
|
|
The docs are what is included in the bcc package.
|
|
I salvaged the man pages that used to be available from the FreeBSD site at
|
|
<url url="http://www.tunes.org/~fare/files/asm/bcc-95.3.12.src.tgz">.
|
|
Maybe ELKS developers know better.
|
|
When in doubt, the sources themselves are often good documentation:
they are not very well commented, but the programming style is straightforward.
|
|
You might try to see how as86 is used in ELKS or Tunes 0.0.0.25...
|
|
|
|
|
|
<sect2>What if I can't compile Linux anymore with this new version ?
|
|
<p>
|
|
Linus is buried alive in mail,
|
|
and since HJLu (official bin86 maintainer)
|
|
chose to write hacks around an obsolete version of as86
|
|
instead of building clean code around the latest version,
|
|
I don't think my patch for compiling Linux with a modern as86
|
|
has any chance to be accepted if resubmitted.
|
|
Now, this shouldn't matter: just keep your as86 from the bin86 package
|
|
in <tt>/usr/bin/</tt>, and let bcc install the good as86 as
|
|
<tt>/usr/local/libexec/i386/bcc/as</tt>
|
|
where it should be. You never need to call this "good" as86 explicitly,
|
|
because bcc does everything right, including conversion to Linux a.out,
|
|
when invoked with the right options;
|
|
so assemble files exclusively with bcc as a frontend, not directly with as86.
|
|
|
|
Since GAS now supports 16-bit code,
|
|
and since H. Peter Anvin, well-known linux hacker, works on NASM,
|
|
maybe Linux will get rid of AS86, anyway? Who knows!
|
|
|
|
|
|
<sect1>OTHER ASSEMBLERS
|
|
<p>
|
|
These are other non-regular options,
|
|
in case the previous didn't satisfy you (why?),
|
|
that I don't recommend in the usual (?) case,
|
|
but that could be quite useful if the assembler must be integrated
|
|
in the software you're designing (i.e. an OS or development environment).
|
|
|
|
<sect2>Win32Forth assembler
|
|
<p>
|
|
Win32Forth is a <em/free/ 32-bit ANS FORTH system
|
|
that successfully runs under Win32s, Win95, Win/NT.
|
|
It includes a free 32-bit assembler (either prefix or postfix syntax)
|
|
integrated into the reflective FORTH language.
|
|
Macro processing is done with
|
|
the full power of the reflective language FORTH;
|
|
however, the only supported input and output context is Win32Forth itself
|
|
(no dumping of .obj file, but you could add that feature yourself, of course).
|
|
Find it at
|
|
<url url="ftp://ftp.forth.org/pub/Forth/Compilers/native/windows/Win32For/">.
|
|
|
|
|
|
<sect2>TDASM
|
|
<p>
|
|
The Table Driven Assembler (TDASM) is a <em/free/ portable
|
|
cross assembler for any kind of assembly language.
|
|
It should be possible to use it as a compiler to any target microprocessor
|
|
using a table that defines the compilation process.
|
|
|
|
It is available from <url url="http://www.penguin.cz/~niki/tdasm/">.
|
|
|
|
|
|
<sect2>Terse
|
|
<p>
|
|
<url url="http://www.terse.com" name="Terse">
|
|
is a programming tool that provides
|
|
<em/THE/ most compact assembler syntax for the x86 family!
|
|
However, it is evil proprietary software.
|
|
It is said that there was a project for a free clone somewhere,
|
|
that was abandoned after worthless pretenses that the syntax
|
|
would be owned by the original author.
|
|
Thus, if you're looking for
|
|
a nifty programming project related to assembly hacking,
|
|
I invite you to develop a terse-syntax frontend to NASM,
|
|
if you like that syntax.
|
|
|
|
As an interesting historic remark, on
|
|
<url url="news:comp.compilers" name="comp.compilers">,
|
|
1999/07/11 19:36:51, the moderator wrote:
|
|
"There's no reason that assemblers have to have awful syntax. About
|
|
30 years ago I used Niklaus Wirth's PL360, which was basically a S/360
|
|
assembler with Algol syntax and a little syntactic sugar like while
|
|
loops that turned into the obvious branches. It really was an
|
|
assembler, e.g., you had to write out your expressions with explicit
|
|
assignments of values to registers, but it was nice. Wirth used it to
|
|
write Algol W, a small fast Algol subset, which was a predecessor to
|
|
Pascal. As is so often the case, Algol W was a significant
|
|
improvement over many of its successors. -John"
|
|
|
|
<sect2>HLA
|
|
<p>
|
|
<url url="http://webster.cs.ucr.edu" name="HLA">
|
|
is a <bf/H/igh <bf/L/evel <bf/A/ssembly language.
|
|
It uses a high level language like syntax
|
|
(similar to Pascal, C/C++, and other HLLs) for variable declarations,
|
|
procedure declarations, and procedure calls. It uses a modified
|
|
assembly language syntax for the standard machine instructions.
|
|
It also provides several high level language style control structures
|
|
(if, while, repeat..until, etc.) that help you write much more readable code.
|
|
|
|
HLA is free, but runs only under Win32.
|
|
You need MASM and a 32-bit version of MS-link,
|
|
because HLA produces MASM code and uses MASM for final
|
|
assembling and linking. However it comes with <tt/m2t/ (MASM to TASM)
|
|
post-processor program that converts the HLA MASM output to a form
|
|
that will compile under TASM. Unfortunately, NASM is not supported.
|
|
|
|
|
|
<sect2>TALC
|
|
<p>
|
|
<url url="http://www.cs.cornell.edu/talc/" name="TALC">
|
|
is another free MASM/Win32 based compiler
|
|
(however it supports ELF output, does it?).
|
|
|
|
TAL stands for <bf/T/yped <bf/A/ssembly <bf/L/anguage.
|
|
It extends traditional untyped assembly languages with typing annotations,
|
|
memory management primitives, and a sound set of typing rules, to guarantee
|
|
the memory safety, control flow safety, and type safety of TAL programs.
|
|
Moreover, the typing constructs are expressive enough to encode
|
|
most source language programming features including records and structures,
|
|
arrays, higher-order and polymorphic functions, exceptions, abstract data types,
|
|
subtyping, and modules.
|
|
Just as importantly, TAL is flexible enough to admit many low-level compiler optimizations.
|
|
Consequently, TAL is an ideal target platform for type-directed compilers
|
|
that want to produce verifiably safe code
|
|
for use in secure mobile code applications
|
|
or extensible operating system kernels.
|
|
|
|
<sect2>Non-free and/or Non-32bit x86 assemblers.
|
|
<p>
|
|
You may find more about them,
|
|
together with the basics of x86 assembly programming,
|
|
in <ref id="res-general" name="Raymond Moon's x86 assembly FAQ">.
|
|
|
|
Note that all DOS-based assemblers should work inside the Linux DOS Emulator,
|
|
as well as other similar emulators, so that if you already own one,
|
|
you can still use it inside a real OS.
|
|
Recent DOS-based assemblers also support COFF and/or other object file formats
|
|
that are supported by the GNU BFD library,
|
|
so that you can use them together with your free 32-bit tools,
|
|
perhaps using GNU objcopy (part of the binutils) as a conversion filter.
|
|
|
|
|
|
|
|
<sect>METAPROGRAMMING/MACROPROCESSING
|
|
<p>
|
|
Assembly programming is a bore,
except for critical parts of programs.
|
|
|
|
You should use the appropriate tool for the right task,
|
|
so don't choose assembly when it's not fit;
|
|
C, OCaml, perl, Scheme, might be a better choice for most
|
|
of your programming.
|
|
|
|
However, there are cases when these tools do not give
|
|
a fine enough control on the machine, and assembly is useful or needed.
|
|
In those cases, you'll appreciate a system of macroprocessing and
metaprogramming that allows each recurring pattern to be factored
into a single, indefinitely reusable definition,
which allows safer programming, automatic propagation of pattern modification,
etc.
|
|
Plain assembler often is not enough,
|
|
even when one is doing only small routines to link with C.
|
|
|
|
|
|
<sect1>What's integrated into the above
|
|
<p>
|
|
|
|
Yes I know this section does not contain much useful up-to-date information.
|
|
Feel free to contribute what you discover the hard way...
|
|
|
|
|
|
<sect2>GCC
|
|
<p>
|
|
GCC allows (and requires) you to specify register constraints
in your inline assembly code, so the optimizer always knows about it;
thus, inline assembly code is really made of patterns,
not necessarily exact code.
|
|
|
|
Thus, you can put your assembly into CPP macros and inline C functions,
so anyone can use it as any C function/macro.
Inline functions resemble macros very much, but are sometimes cleaner to use.
Beware that in all those cases, code will be duplicated,
so only local labels (of <tt/1:/ style)
should be defined in that asm code.
However, a macro would allow the name of a non-local label
to be passed as a parameter
(or else, you should use additional meta-programming methods).
Also, note that propagating inline asm code will spread any bugs it contains;
so watch out doubly for register constraints in such inline asm code.
|
|
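
The pattern described above can be sketched as follows:
an inline C function wraps a short asm pattern that uses a local label
of the <tt/1:/ style, so the code can be duplicated at every inlining site
without label clashes.
This is only a sketch; the name <tt/asm_abs/ is hypothetical,
and the plain C fallback for non-x86 targets is an assumption of the example.

```c
/* Hypothetical helper: absolute value through a short asm pattern. */
static inline int asm_abs(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("testl %0, %0\n\t"   /* set the sign flag from x */
            "jns 1f\n\t"         /* non-negative: jump forward to local label 1 */
            "negl %0\n"          /* negative: negate in place */
            "1:"
            : "+r"(x)            /* x is read and written, in any register */
            :
            : "cc");             /* the condition codes are clobbered */
#else
    if (x < 0)                   /* fallback for non-x86 targets */
        x = -x;
#endif
    return x;
}
```

The <tt/"+r"/ constraint and the <tt/"cc"/ clobber are what tell the optimizer
which register and flags the pattern touches;
getting them wrong is exactly the kind of bug that spreads with each inlined copy.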
|
|
Lastly, the C language itself may be considered as a good abstraction
|
|
to assembly programming,
|
|
which relieves you from most of the trouble of assembling.
|
|
|
|
|
|
<sect2>GAS
|
|
<p>
|
|
GAS has some macro capability included, as detailed in the texinfo docs.
|
|
Moreover, while GCC recognizes .s files as raw assembly to send to GAS,
it also recognizes .S files as files to pipe through CPP
before feeding them to GAS.
|
|
Again and again, see Linux sources for examples.
|
|
|
|
|
|
<sect2>GASP
|
|
<p>
|
|
It adds all the usual macroassembly tricks to GAS.
|
|
See its texinfo docs.
|
|
|
|
|
|
<sect2>NASM
|
|
<p>
|
|
NASM has comprehensive macro support, too.
|
|
See the corresponding docs.
|
|
If you have some bright idea,
|
|
you might wanna contact the authors,
|
|
as they are actively developing it.
|
|
Meanwhile, see about external filters below.
|
|
|
|
|
|
<sect2>AS86
|
|
<p>
|
|
It has some simple macro support, but I couldn't find docs.
|
|
Now the sources are very straightforward,
|
|
so if you're interested, you should understand them easily.
|
|
If you need more than the basics, you should use an external filter
|
|
(see below).
|
|
|
|
|
|
<sect2>OTHER ASSEMBLERS
|
|
<p>
|
|
<itemize>
|
|
<item>
|
|
Win32FORTH:
|
|
CODE and END-CODE are normal words that do not switch from interpretation mode
|
|
to compilation mode, so you have access to the full power of FORTH
|
|
while assembling.
|
|
<item>
|
|
TUNES:
|
|
it doesn't work yet, but the Scheme language is a real high-level language
|
|
that allows arbitrary meta-programming.
|
|
</itemize>
|
|
|
|
|
|
<sect1>External Filters
|
|
<p>
|
|
Whatever the macro support in your assembler,
or whatever language you use (even C!),
if the language is not expressive enough for you,
you can have files passed through an external filter
with a Makefile rule like this:
|
|
|
|
<code>
|
|
%.s: %.S other_dependencies
|
|
$(FILTER) $(FILTER_OPTIONS) < $< > $@
|
|
</code>
|
|
|
|
|
|
<sect2>CPP
|
|
<p>
|
|
CPP is truly not very expressive, but it's enough for easy things,
|
|
it's standard, and called transparently by GCC.
|
|
|
|
As an example of its limitations, you can't declare objects so that
|
|
destructors are automatically called at the end of the declaring block;
|
|
you don't have diversions or scoping, etc.
|
|
|
|
CPP comes with any C compiler.
|
|
However, considering how mediocre it is,
stay away from it if by chance you can do without C.
|
|
|
|
|
|
<sect2>M4
|
|
<p>
|
|
M4 gives you the full power of macroprocessing,
|
|
with a Turing equivalent language, recursion, regular expressions, etc.
|
|
You can do with it everything that CPP cannot.
|
|
|
|
See <url url="ftp://ftp.forth.org/pub/Forth/Compilers/native/unix/this4th.tar.gz" name="macro4th (this4th)">
|
|
or
|
|
<url url="ftp://ftp.tunes.org/pub/tunes/obsolete/dist/tunes.0.0.0/tunes.0.0.0.25.src.zip"
|
|
name="the Tunes 0.0.0.25 sources">
|
|
as examples of advanced macroprogramming using m4.
|
|
|
|
However, its dysfunctional quoting and unquoting semantics force you to use
explicit continuation-passing tail-recursive macro style if
you want to do <em/advanced/ macro programming
(which is reminiscent of TeX -- BTW, has anyone tried to use TeX as
a macroprocessor for anything other than typesetting?).
This is NOT worse than CPP, which does not allow quoting and recursion anyway.
|
|
|
|
The right version of m4 to get is GNU m4 1.4 (or later, if one exists),
which has the most features and the fewest bugs or limitations of all.
m4 is designed to be slow for anything but the simplest uses,
which might still be ok for most assembly programming
(you're not writing million-line assembly programs, are you?).
|
|
|
|
|
|
<sect2>Macroprocessing with your own filter
|
|
<p>
|
|
You can write your own simple macro-expansion filter
|
|
with the usual tools: perl, awk, sed, etc.
|
|
That's quick to do, and you control everything.
|
|
But of course, any power in macroprocessing must be earned the hard way.
|
|
|
|
|
|
<sect2>Metaprogramming
|
|
<p>
|
|
Instead of using an external filter that expands macros,
|
|
one way to do things is to write programs that write part
|
|
or all of other programs.
|
|
|
|
For instance, you could use a program outputting source code
|
|
<itemize>
|
|
<item>
|
|
to generate sine/cosine/whatever lookup tables,
|
|
<item>
|
|
to extract a source-form representation of a binary file,
|
|
<item>
|
|
to compile your bitmaps into fast display routines,
|
|
<item>
|
|
to extract documentation, initialization/finalization code,
|
|
description tables, as well as normal code from the same source files,
|
|
<item>
|
|
to have customized assembly code, generated from a perl/shell/scheme script
|
|
that does arbitrary processing,
|
|
<item>
|
|
to propagate data defined at one point only
|
|
into several cross-referencing tables and code chunks.
|
|
<item>
|
|
etc.
|
|
</itemize>
|
|
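
As one possible sketch of the second item in the list above
(a source-form representation of a binary file),
the following C function turns a block of bytes into GAS <tt/.byte/ directives.
The function name <tt/emit_byte_table/ and its buffer-based interface
are assumptions made for this example, not part of any existing tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical generator: writes `len` bytes from `data` into `out`
   as a labelled list of GAS .byte directives.
   Returns the number of characters written, or -1 if `out` is too small. */
int emit_byte_table(char *out, size_t outsz, const char *label,
                    const unsigned char *data, size_t len)
{
    size_t used;
    int n = snprintf(out, outsz, "%s:\n", label);

    if (n < 0 || (size_t)n >= outsz)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        n = snprintf(out + used, outsz - used, "\t.byte 0x%02x\n", data[i]);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;           /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

A small driver that reads a file and prints the result to a <tt/.s/ file
would fit the Makefile filter rule pattern shown in the External Filters section.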
|
|
Think about it!
|
|
|
|
|
|
<sect3>Backends from compilers
|
|
<p>
|
|
Compilers like GCC, SML/NJ, Objective CAML, MIT-Scheme, CMUCL, etc,
|
|
do have their own generic assembler backend,
|
|
which you might choose to use,
|
|
if you intend to generate code semi-automatically
|
|
from the according languages,
|
|
or from a language you hack:
|
|
rather than write great assembly code,
|
|
you may instead modify a compiler so that it dumps great assembly code!
|
|
|
|
|
|
<sect3>The New-Jersey Machine-Code Toolkit
|
|
<p>
|
|
There is a project, using the programming language Icon
|
|
(with an experimental ML version),
|
|
to build a basis for producing assembly-manipulating code.
|
|
See around
<url url="http://www.eecs.harvard.edu/~nr/toolkit/">
|
|
|
|
|
|
<sect3>TUNES<p>
|
|
|
|
The <url url="http://www.tunes.org" name="TUNES Project">
|
|
for a Free Reflective Computing System
|
|
is developing its own assembler
|
|
as an extension to the Scheme language,
|
|
as part of its development process.
|
|
It doesn't run at all yet, though help is welcome.
|
|
|
|
The assembler manipulates abstract syntax trees,
so it could equally serve as the basis for an assembly syntax translator,
a disassembler, a common assembler/compiler back-end, etc.
Also, the full power of a real language, Scheme,
makes it unchallenged for macroprocessing/metaprogramming.
|
|
|
|
|
|
|
|
|
|
<sect>CALLING CONVENTIONS
|
|
<p>
|
|
|
|
|
|
<sect1>Linux
|
|
<p>
|
|
|
|
<sect2>Linking to GCC
|
|
<p>
|
|
This is the preferred way if you are developing a mixed C-asm project.
Check the GCC docs and examples from Linux kernel <tt/.S/ files
that go through gas (not those that go through as86).
|
|
|
|
32-bit arguments are pushed onto the stack in reverse syntactic order
(hence accessed/popped in the right order),
above the 32-bit near return address.
<tt/%ebp, %esi, %edi, %ebx/ are callee-saved,
other registers are caller-saved;
<tt/%eax/ is to hold the result,
or <tt/%edx:%eax/ for 64-bit results.
|
|
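
To make the stack layout concrete, here is a hedged sketch
of an assembly routine callable from C under this convention,
written as a top-level asm statement so that it fits in one C file.
The name <tt/asm_sub/ is hypothetical;
the assembly branch assumes an i386 Linux ELF target (no underscore prefix),
and the plain C branch is only there so the sketch builds elsewhere.

```c
#if defined(__i386__) && defined(__linux__)
/* On entry: 0(%esp) = return address, 4(%esp) = a, 8(%esp) = b. */
__asm__(
    ".pushsection .text\n"
    ".globl asm_sub\n"
    ".type asm_sub, @function\n"
    "asm_sub:\n\t"
    "movl 4(%esp), %eax\n\t"    /* first argument, just above the return address */
    "subl 8(%esp), %eax\n\t"    /* subtract the second argument */
    "ret\n"                     /* result is left in %eax; caller pops the arguments */
    ".popsection\n"
);
int asm_sub(int a, int b);      /* C prototype for the routine above */
#else
/* Fallback for other targets (assumption of this sketch). */
int asm_sub(int a, int b)
{
    return a - b;
}
#endif
```

Only <tt/%eax/ is modified, so none of the callee-saved registers
needs to be preserved here.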
|
|
FP stack: I'm not sure,
but I think the result is returned in <tt/st(0)/, and the whole stack is caller-saved.
|
|
|
|
Note that GCC has options to modify the calling conventions
|
|
by reserving registers, having arguments in registers,
|
|
not assuming the FPU, etc. Check the i386 .info pages.
|
|
|
|
Beware that you must then declare the <tt/cdecl/ or <tt/regparm(0)/
|
|
attribute for a function that will follow standard GCC calling conventions.
|
|
See in the GCC info pages the section:
|
|
<tt/C Extensions::Extended Asm::/.
|
|
See also how Linux defines its asmlinkage macro...
|
|
|
|
|
|
|
|
<sect2>ELF vs a.out problems
|
|
<p>
|
|
Some C compilers prepend an underscore before every symbol,
|
|
while others do not.
|
|
|
|
Particularly, Linux a.out GCC does such prepending,
|
|
while Linux ELF GCC does not.
|
|
|
|
If you need to cope with both behaviors at once,
see how existing packages do it.
For instance, get an old Linux source tree,
or the sources of Elk, qthreads, or OCaml...
|
|
|
|
You can also override the implicit C→asm renaming
|
|
by inserting statements like
|
|
<code>
|
|
void foo(void) asm(&dquot;bar&dquot;);
|
|
</code>
|
|
to be sure that the C function <tt/foo/
will really be called <tt/bar/ in assembly.
|
|
|
|
Note that the utility <tt/objcopy/, from the <tt/binutils/ package,
|
|
should allow you to transform your a.out objects into ELF objects,
|
|
and perhaps the contrary too, in some cases.
|
|
More generally, it will do lots of file format conversions.
|
|
|
|
|
|
|
|
<sect2>Direct Linux syscalls
|
|
<p>
|
|
Often you will be told that using libc is the only way,
and that direct system calls are bad. This is true. To some extent.
You should know that libc is not sacred, and that in <em/most/ cases
libc only does some checks, then calls the kernel, and then sets errno.
You can easily do this in your program as well (if you need to),
and your program will be a dozen times smaller;
this will also result in improved performance, just because
you're not using shared libraries (static binaries are faster).
Using or not using libc in assembly programming is more a question of
taste/belief than something practical.
Remember, Linux is aiming to be POSIX compliant, and so
is libc. This means that the syntax of almost all libc "system calls" exactly
matches the syntax of real kernel system calls (and vice versa). Besides, modern
libc becomes slower and slower, and eats more and more memory, so cases
of using direct system calls become quite usual.
But the main drawback of throwing libc away is that you will possibly need to
implement several libc-specific functions (that are not just syscall wrappers)
on your own (printf and Co.), and you are ready for that, aren't you? :)
|
|
|
|
|
|
Here is a summary of the pros and cons of direct system calls.
|
|
|
|
Pros:
|
|
<itemize>
|
|
<item>smallest possible size; squeezing the last byte out of the system.
|
|
<item>highest possible speed; squeezing cycles out of your favorite benchmark.
|
|
<item>full control: you can adapt your program/library
|
|
to your specific language or memory requirements or whatever
|
|
<item>no pollution by libc cruft.
|
|
<item>no pollution by C calling conventions
|
|
(if you're developing your own language or environment).
|
|
<item>static binaries make you independent from libc upgrades or crashes,
|
|
or from a dangling <tt/#!/ path to an interpreter (and are faster).
|
|
<item>just for the fun out of it
|
|
(don't you get a kick out of assembly programming?)
|
|
</itemize>
|
|
|
|
Cons:
|
|
<itemize>
|
|
<item>If any other program on your computer uses the libc,
|
|
then duplicating the libc code will actually
|
|
waste memory, not save it.
|
|
<item>Services redundantly implemented in many static binaries
|
|
are a waste of memory.
|
|
But you can make your libc replacement a shared library.
|
|
<item>Size is much better saved by having some kind
|
|
of bytecode, wordcode, or structure interpreter
|
|
than by writing everything in assembly.
|
|
(the interpreter itself could be written either in C or assembly.)
|
|
The best way to keep multiple binaries small is
|
|
to not have multiple binaries, but instead
|
|
to have an interpreter process files with <tt/#!/ prefix.
|
|
This is how OCaml works when used in wordcode mode
|
|
(as opposed to optimized native code mode),
|
|
and it is compatible with using the libc.
|
|
This is also how Tom Christiansen's
|
|
<url name="Perl PowerTools"
|
|
url="http://language.perl.com/ppt/">
|
|
reimplementation of unix utilities works.
|
|
Finally, one last way to keep things small,
|
|
that doesn't depend on an external file with a hardcoded path,
|
|
be it library or interpreter,
|
|
is to have only one binary,
|
|
and have multiply-named hard or soft links to it:
|
|
the same binary will provide everything you need in an optimal space,
|
|
with no redundancy of subroutines or useless binary headers;
|
|
it will dispatch its specific behavior
|
|
according to its <tt/argv[0]/;
|
|
in case it isn't called with a recognized name,
|
|
it might default to a shell,
|
|
and be possibly thus also usable as an interpreter!
|
|
<item>You cannot benefit from the many functionalities that libc provides
|
|
besides mere linux syscalls:
|
|
that is, functionality described in section 3 of the manual pages,
|
|
as opposed to section 2,
|
|
such as malloc, threads, locale, password,
|
|
high-level network management, etc.
|
|
<item>Consequently, you might have to reimplement large parts of libc,
|
|
from <tt/printf/ to <tt/malloc/ and <tt/gethostbyname/.
|
|
It's redundant with the libc effort,
|
|
and can be <em/quite/ boring sometimes.
|
|
Note that some people have already reimplemented "light"
|
|
replacements for parts of the libc -- check them out!
|
|
(Redhat's minilibc,
|
|
Rick Hohensee's <url url="ftp://linux01.gwdg.de/pub/cLIeNUX/interim/libsys.tgz" name="libsys">,
|
|
Felix von Leitner's <url url="http://www.fefe.de/dietlibc/" name="dietlibc">,
|
|
Christian Fowelin's <ref id="res" name="libASM">,
|
|
<ref id="res" name="asmutils"> project is working on pure assembly libc)
|
|
|
|
<item>Static libraries prevent your benefitting from libc upgrades
|
|
as well as from libc add-ons such as the <tt/zlibc/ package,
|
|
that does on-the-fly transparent decompression
|
|
of gzip-compressed files.
|
|
<item>The few instructions added by the libc are
|
|
a <em/ridiculously/ small speed overhead as compared
|
|
to the cost of a system call.
|
|
If speed is a concern, your main problem is in
|
|
your usage of system calls, not in their wrapper's implementation.
|
|
<item>Using the standard assembly API for system calls is much slower
|
|
than using the libc API when running in micro-kernel versions
|
|
of Linux such as L4Linux,
|
|
that have their own faster calling convention,
|
|
and pay high convention-translation overhead
|
|
when using the standard one
|
|
(L4Linux comes with libc recompiled with their syscall API;
|
|
of course, you could recompile your code with their API, too).
|
|
<item>See previous discussion for general speed optimization issue.
|
|
<item>If syscalls are too slow to you,
|
|
you might want to hack the kernel sources (in C)
|
|
instead of staying in userland.
|
|
</itemize>
|
|
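
The single-binary approach mentioned among the cons above
(one program with several hard or soft links, dispatching on <tt/argv[0]/)
can be sketched in C as follows.
The function names and the two applet names are made up for this example;
a real tool would have many more entries and might start a shell
for unrecognized names.

```c
#include <string.h>

/* Return the last path component of argv0,
   i.e. the name under which the binary was invoked. */
const char *applet_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Hypothetical dispatcher: returns the exit status of the chosen applet,
   or 2 when the name is not recognized. */
int dispatch(const char *argv0)
{
    const char *name = applet_name(argv0);

    if (strcmp(name, "true") == 0)
        return 0;
    if (strcmp(name, "false") == 0)
        return 1;
    return 2;   /* unrecognized: a real tool might default to a shell here */
}
```

A <tt/main/ function would simply return <tt/dispatch(argv[0])/,
and each link to the binary then behaves as a different utility.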
|
|
If you've pondered the above pros and cons,
|
|
and still want to use direct syscalls
|
|
(as documented in section 2 of the manual pages),
|
|
then here is some advice.
|
|
|
|
<itemize>
|
|
<item>You can easily define your system calling functions
|
|
in a portable way in C (as opposed to unportable using assembly),
|
|
by including <tt>&lt;asm/unistd.h&gt;</tt>,
|
|
and using provided macros.
|
|
<item>Since you're trying to replace it,
|
|
go get the sources for the libc, and grok them.
|
|
(And if you think you can do better,
|
|
then send feedback to the authors!)
|
|
<item>As an example of pure assembly code that does everything you want,
|
|
examine <ref id="res" name="Linux Assembly resources">.
|
|
</itemize>
<p>
Basically, you issue an <tt/int 0x80/,
with the <tt/__NR_/syscallname number
(from <tt>asm/unistd.h</tt>)
in <tt/eax/, and up to six parameters in
<tt/ebx, ecx, edx, esi, edi, ebp/ respectively
(the sixth parameter, in <tt/ebp/, is supported by Linux 2.4 only;
previous versions take at most five parameters).
The result is returned in <tt/eax/; a negative result is an error,
whose opposite is what libc would put in <tt/errno/.
The user stack is not touched,
so you needn't have a valid one when doing a syscall.
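<p>
The convention just described can be sketched in C with GCC inline assembly.
This is only an illustration: <tt/raw_syscall3/ and <tt/libc_style/ are hypothetical names,
the <tt/int 0x80/ path is compiled on i386 only
(elsewhere the sketch falls back on the <tt/syscall()/ function of libc and undoes its error translation),
and the error range of -4095 to -1 is an assumption borrowed from current C libraries,
not something stated in this HOWTO.

<tscreen><code>
#define _GNU_SOURCE         /* for the declaration of syscall() */
#include <errno.h>
#include <unistd.h>         /* syscall(), used by the fallback only */
#include <asm/unistd.h>     /* __NR_close and friends */

/* Hypothetical helper: system call with up to three parameters.
 * Returns the raw kernel result: a negative value is minus the error code. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__i386__)
    long ret;

    /* eax = call number; ebx, ecx, edx = first three parameters */
    __asm__ volatile ("int $0x80"
                      : "=a" (ret)
                      : "0" (nr), "b" (a1), "c" (a2), "d" (a3)
                      : "memory");
    return ret;
#else
    /* Not i386: let libc issue the call, then turn its (-1, errno)
     * answer back into the raw form described in the text. */
    long ret = syscall(nr, a1, a2, a3);

    return (ret == -1) ? -errno : ret;
#endif
}

/* What a libc wrapper does with a raw result. */
static long libc_style(long raw)
{
    if (raw < 0) {
        if (raw >= -4095) {     /* assumed range of error values */
            errno = (int) -raw;
            return -1;
        }
    }
    return raw;
}
</code></tscreen>

For instance, closing an invalid descriptor with
<tt/raw_syscall3(__NR_close, -1, 0, 0)/ should yield <tt/-EBADF/,
which <tt/libc_style()/ turns into a return value of -1 with <tt/errno/ set to <tt/EBADF/.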
<p>
<url url="http://www.linuxdoc.org/LDP/lki/" name="Linux Kernel Internals">,
and especially its
<url url="http://www.linuxdoc.org/LDP/lki/Linux-Kernel-Internals-2.html#ss2.11"
name="How System Calls Are Implemented on i386 Architecture?">
chapter, will give you a more thorough overview.

As for the invocation arguments passed to a process upon startup,
the general principle is that the stack
originally contains the number of arguments <tt/argc/,
then the list of pointers that constitute <tt/*argv/,
terminated by a null pointer,
then a null-terminated sequence of pointers to null-terminated
variable=value strings for the <tt/environ/ment.
For more details,
do examine <ref id="res" name="Linux assembly resources">,
read the sources of C startup code from your libc
(<tt/crt0.S/ or <tt/crt1.S/),
or those from the Linux kernel
(<tt/exec.c/ and <tt/binfmt_*.c/ in <tt>linux/fs/</tt>).
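<p>
The layout described above can be walked with a few lines of C.
This is a sketch only: the structure and function names are hypothetical,
and the function expects the value that the stack pointer had at process entry,
which a real program would have to capture in its assembly entry code.

<tscreen><code>
#include <stdint.h>     /* uintptr_t */

/* Hypothetical view of what the kernel leaves on the initial stack. */
struct startup_args {
    long   argc;    /* number of arguments */
    char **argv;    /* argc pointers, then a null pointer */
    char **envp;    /* pointers to variable=value strings, then a null pointer */
};

/* sp: the initial stack pointer, seen as an array of pointer-sized slots. */
static struct startup_args parse_initial_stack(char **sp)
{
    struct startup_args s;

    s.argc = (long) (uintptr_t) sp[0];  /* first slot holds argc */
    s.argv = sp + 1;                    /* then argv[0] .. argv[argc-1], null */
    s.envp = s.argv + s.argc + 1;       /* skip argv and its terminating null */
    return s;
}
</code></tscreen>

The environment needs no count of its own:
the caller just follows <tt/envp/ until it meets the null pointer.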
<sect2>Hardware I/O under Linux
<p>
If you want to do direct I/O under Linux,
either it's something very simple that doesn't need OS arbitration,
and you should see the <tt/IO-Port-Programming/ mini-HOWTO;
or it needs a kernel device driver, and you should try to learn more about
kernel hacking, device driver development, kernel modules, etc.,
for which there are other excellent HOWTOs and documents from the LDP.

Particularly, if what you want is graphics programming,
then do join one of the
<url url="http://www.ggi-project.org/" name="GGI">
or <url url="http://www.XFree86.org/" name="XFree86">
projects.

Some people have even done better,
writing small and robust XFree86 drivers
in an interpreted domain-specific language,
<url url="http://www.irisa.fr/compose/gal/" name="GAL">,
and achieving the efficiency of hand-written C drivers
through partial evaluation (drivers not only not in asm, but not even in C!).
The problem is that the partial evaluator they used
to achieve efficiency is not free software.
Any takers for a replacement?

Anyway, in all these cases, you'll be better off using GCC inline assembly
with the macros from <tt>linux/asm/*.h</tt>
than writing full assembly source files.

<sect2>Accessing 16-bit drivers from Linux/i386
<p>
Such a thing is theoretically possible
(proof: see how <url url="http://www.dosemu.org" name="DOSEMU">
can selectively grant hardware port access to programs),
and I've heard rumors that someone somewhere did actually do it
(in the PCI driver? Some VESA access stuff? ISA PnP? dunno).
If you have some more precise information on that,
you'll be most welcome.
Anyway, good places to look for more information are the Linux kernel sources,
DOSEMU sources (and other programs in the
<url url="ftp://tsx-11.mit.edu/pub/linux/ALPHA/dosemu/"
name="DOSEMU repository">),
and sources for various low-level programs under Linux
(perhaps GGI if it supports VESA).

Basically, you must either use 16-bit protected mode or vm86 mode.

The first is simpler to set up, but only works with well-behaved code
that won't do any kind of segment arithmetic
or absolute segment addressing (particularly addressing segment 0),
unless by chance it happens that all segments used can be set up in advance
in the LDT.

The latter allows for more "compatibility" with vanilla 16-bit environments,
but requires more complicated handling.

In both cases, before you can jump to 16-bit code,
you must
<itemize>
<item>mmap any absolute address used in the 16-bit code
(such as ROM, video buffers, DMA targets, and memory-mapped I/O)
from <tt>/dev/mem</tt> to your process' address space,
<item>set up the LDT and/or vm86 mode monitor,
<item>grab proper I/O permissions from the kernel (see the above section).
</itemize>
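<p>
The first of these steps might look as follows in C.
This is a hedged sketch, not the way DOSEMU does it:
<tt/map_phys/ is a hypothetical helper,
the device name is a parameter only so that the function can be tried on an ordinary file,
and the kernel is left free to choose the address of the mapping,
whereas real 16-bit code may need it at a fixed address.

<tscreen><code>
#include <fcntl.h>      /* open() */
#include <stddef.h>     /* NULL, size_t */
#include <sys/mman.h>   /* mmap() */
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* close() */

/* Hypothetical helper: map len bytes found at the page-aligned offset phys
 * of the file dev (normally "/dev/mem") into our address space.
 * Returns NULL on failure. */
static void *map_phys(const char *dev, off_t phys, size_t len)
{
    void *p;
    int fd = open(dev, O_RDWR | O_SYNC);

    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys);
    close(fd);          /* the mapping stays valid after the close */
    return (p == MAP_FAILED) ? NULL : p;
}
</code></tscreen>

A call such as <tt>map_phys("/dev/mem", 0xA0000, 0x10000)</tt>
would ask for the legacy VGA window
(the address is an assumption about the hardware, and the call requires root privileges).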
<p>
Again, carefully read the source for the stuff contributed
to the DOSEMU project,
particularly these mini-emulators
for running ELKS and/or simple .COM programs under Linux/i386.

<sect1>DOS
<p>
Most DOS extenders come with some interface to DOS services.
Read their docs about that,
but often, they just simulate <tt/int 0x21/ and such,
so you proceed as if you were in real mode
(I doubt they have more than stubs
and extend things to work with 32-bit operands;
they most likely will just reflect the interrupt
into the real-mode or vm86 handler).

Docs about DPMI (and much more) can be found on
<url url="ftp://x2ftp.oulu.fi/pub/msdos/programming/">
(again, the original x2ftp site is closing (no more?), so use a
<url url="ftp://ftp.lip6.fr/pub/pc/x2ftp/README.mirror_sites"
name="mirror site">).

DJGPP comes with its own (limited) glibc derivative/subset/replacement, too.

It is possible to cross-compile from Linux to DOS;
see the <tt>devel/msdos/</tt> directory of your local FTP mirror for metalab.unc.edu.
Also see the MOSS DOS extender from the
<url url="http://www.cs.utah.edu/projects/flux/" name="Flux project">
from the University of Utah.

Other documents and FAQs are more DOS-centered.
We do not recommend DOS development.

<sect1>Windows and Co.
<p>
This HOWTO is not about Windows programming;
you can find lots of documents about it everywhere.
The thing you should know is that
<url url="http://www.cygnus.com" name="Cygnus Solutions">
developed the
<url url="http://sourceware.cygnus.com/cygwin/" name="cygwin32.dll library">,
for GNU programs to run on the Win32 platform; thus, you can use GCC, GAS,
all the GNU tools, and many other Unix applications.

<sect1>Your own OS
<p>
Control is what attracts many OS developers to assembly,
and is often what leads to or stems from assembly hacking.
Note that any system that allows self-development
could be qualified as an "OS",
though it can run "on top" of an underlying system
(much like Linux over Mach or OpenGenera over Unix).

Hence, for easier debugging,
you might like to develop your "OS" first as a process running
on top of Linux (despite the slowness), then use the
<url url="http://www.cs.utah.edu/projects/flux/oskit/" name="Flux OS kit">
(which grants use of Linux and BSD drivers in your own OS)
to make it standalone.
When your OS is stable, it is time to write your own
hardware drivers if you really love that.

This HOWTO will not cover topics such as
boot loader code and getting into 32-bit mode,
handling interrupts,
the basics of Intel protected mode or V86/R86 braindeadness,
or defining your object format and calling conventions.

The main place to find reliable information about all that
is the source code of existing OSes and bootloaders.
Lots of pointers are on the following webpage:
<url url="http://www.tunes.org/Review/OSes.html">

<sect>QUICK START
<p>
Finally, if you still want to try this crazy idea and write something in
assembly (if you've reached this section, you're a real assembly fan),
I'll herein provide what you will need to get started.

As you've read before, you can write for Linux in different ways;
I'll show an example of using pure system calls.
This means that we will not use libc at all; the only thing required for
our program to run is the kernel.
Our code will not be linked to any library and will not use the ELF interpreter:
it will communicate directly with the kernel.

I will show the same sample program in two assemblers, <tt/nasm/ and <tt/gas/,
thus showing Intel and AT&amp;T syntax.

You may also want to read the <url url="http://linuxassembly.org/intro.html"
name="Introduction to UNIX assembly programming"> tutorial;
it contains sample code for other UNIX-like OSes.

<sect1>Tools you need
<p>
First of all you need an assembler (compiler): <tt/nasm/ or <tt/gas/.
Second, you need a linker, <tt/ld/, since the assembler produces only object code.
Almost all distributions include <tt/gas/ and <tt/ld/ in the binutils package.
As for <tt/nasm/, you may have to download and install binary packages
for Linux and docs from the <ref id="findnasm" name="nasm webpage">;
however, several distributions (Stampede, Debian, SuSE)
already include it, so check first.
<p>
If you are going to dig in, you should also install the kernel source.
I assume that you are using at least Linux 2.0 and ELF.

<sect1>Hello, world!
<p>
Linux on IA-32 is 32-bit and has a flat memory model.
A program can be divided into sections.
The main sections are <em/.text/ for your code,
<em/.data/ for your data, and <em/.bss/ for uninitialized data.
A program must have at least a <em/.text/ section.
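<p>
For readers coming from C, here is a rough sketch of where familiar objects end up,
assuming a typical ELF toolchain such as GCC with binutils.
The names are made up, and the exact placement is up to the compiler and linker.

<tscreen><code>
int  exit_code = 3;     /* has an initializer: stored in .data */
char scratch[4096];     /* no initializer: space is only reserved in .bss,
                           and is zero-filled when the program starts */

/* The instructions of this function are stored in .text.
 * It counts the non-zero bytes of scratch: there should be none
 * as long as nobody has written to it. */
int scratch_dirty_bytes(void)
{
    int i, dirty = 0;

    for (i = 0; i < 4096; i++)
        if (scratch[i] != 0)
            dirty++;
    return dirty;
}
</code></tscreen>

The practical difference is that <em/.data/ occupies room in the executable file,
while <em/.bss/ only records how much memory to reserve.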
<p>
Now we will write our first program. Here is the sample code:

<sect2>NASM (hello.asm)
<p>
<tscreen><code>
section .data                           ;section declaration

msg     db      "Hello, world!",0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

section .text                           ;section declaration

                        ;we must export the entry point to the ELF linker or
    global _start       ;loader. They conventionally recognize _start as their
                        ;entry point. Use ld -e foo to override the default.

_start:

;write our string to stdout

        mov     edx,len ;third argument: message length
        mov     ecx,msg ;second argument: pointer to message to write
        mov     ebx,1   ;first argument: file handle (stdout)
        mov     eax,4   ;system call number (sys_write)
        int     0x80    ;call kernel

;and exit

        mov     ebx,0   ;first syscall argument: exit code
        mov     eax,1   ;system call number (sys_exit)
        int     0x80    ;call kernel

</code></tscreen>

<sect2>GAS (hello.S)
<p>
<tscreen><code>
.data                                   # section declaration

msg:
        .ascii  "Hello, world!\n"       # our dear string (.ascii, unlike .string,
                                        # appends no zero byte, so len is exact)
        len = . - msg                   # length of our dear string

.text                                   # section declaration

                        # we must export the entry point to the ELF linker or
    .global _start      # loader. They conventionally recognize _start as their
                        # entry point. Use ld -e foo to override the default.

_start:

# write our string to stdout

        movl    $len,%edx       # third argument: message length
        movl    $msg,%ecx       # second argument: pointer to message to write
        movl    $1,%ebx         # first argument: file handle (stdout)
        movl    $4,%eax         # system call number (sys_write)
        int     $0x80           # call kernel

# and exit

        movl    $0,%ebx         # first argument: exit code
        movl    $1,%eax         # system call number (sys_exit)
        int     $0x80           # call kernel
</code></tscreen>

<sect1>Producing object code
<p>
The first step in building a binary is producing an object file from the source
by invoking the assembler; we must issue the following:
<p>
For the <tt/nasm/ example:

<tt/$ nasm -f elf hello.asm/
<p>
For the <tt/gas/ example:

<tt/$ as -o hello.o hello.S/
<p>
This will produce the <tt/hello.o/ object file.

<sect1>Producing executable
<p>
The second step is producing the executable file itself from the object file
by invoking the linker:
<p>
<tt/$ ld -s -o hello hello.o/

This will finally build the <tt/hello/ executable.
<p>
Hey, try to run it... Works? That's it. Pretty simple.

<sect>RESOURCES<label id="res">
<p>

Your main resource for Linux/UNIX assembly programming material
is <bf><url url="http://linuxassembly.org/resources.html"
name="Linux Assembly resources page"></bf>.
Do visit it, and get plenty of pointers to assembly projects,
tools, tutorials, documentation, guides, etc.,
concerning different UNIX operating systems and CPUs.
Because it evolves quickly, I will no longer duplicate it in this HOWTO.

If you are new to assembly in general, here are a few starting pointers:

<label id="res-general">
<itemize>
<item><url url="http://webster.cs.ucr.edu/Page_asm/ArtOfAsm.html"
name="The Art Of Assembly">
<item><url url="http://www2.dgsys.com/&tilde;raymoon/faq/"
name="x86 assembly FAQ">
<item><url url="ftp://ftp.luth.se/pub/msdos/"
name="ftp.luth.se"> mirrors the hornet and x2ftp
former archives of msdos assembly coding stuff
<item><url url="http://www.koth.org" name="CoreWars">,
a fun way to learn assembly in general
<item>Usenet:
<url url="news://comp.lang.asm.x86" name="comp.lang.asm.x86">;
<url url="news://alt.lang.asm" name="alt.lang.asm">
</itemize>

<sect1>Mailing list<label id="res-list">
<p>
If you are interested in Linux/UNIX assembly programming
(or have questions, or are just curious)
I especially invite you to join the Linux assembly programming mailing list.

This is an open discussion of assembly programming under Linux, *BSD, BeOS,
or any other UNIX/POSIX-like OS; also it is not limited to x86 assembly
(Alpha, Sparc, PPC and other hackers are welcome too!).

To subscribe, send a blank message to <url url="mailto:linux-assembly-subscribe@egroups.com">.

The list address is <url url="mailto:linux-assembly@egroups.com">.

List archives are available at <url url="http://www.egroups.com/list/linux-assembly/">.

<sect1>Frequently asked questions (with answers)<label id="faq">
<p>
Here are frequently asked questions. Answers are taken
from the <ref id="res-list" name="linux-assembly mailing list">.

<sect2>How do I do graphics programming in Linux?
<p>
An answer from <url url="mailto:paulf@icom.co.za" name="Paul Furber">:

<verb>
Ok you have a number of options to graphics in Linux. Which one you use
depends on what you want to do. There isn't one Web site with all the
information but here are some tips:

SVGALib: This is a C library for console SVGA access.
Pros: very easy to learn, good coding examples, not all that different
from equivalent gfx libraries for DOS, all the effects you know from DOS
can be converted with little difficulty.
Cons: programs need superuser rights to run since they write directly to
the hardware, doesn't work with all chipsets, can't run under X-Windows.
Search for svgalib-1.4.x on http://ftp.is.co.za

Framebuffer: do it yourself graphics at SVGA res
Pros: fast, linear mapped video access, ASM can be used if you want :)
Cons: has to be compiled into the kernel, chipset-specific issues, must
switch out of X to run, relies on good knowledge of linux system calls
and kernel, tough to debug
Examples: asmutils (http://www.linuxassembly.org) and the leaves example
and my own site for some framebuffer code and tips in asm
(http://ma.verick.co.za/linux4k/)

Xlib: the application and development libraries for XFree86.
Pros: Complete control over your X application
Cons: Difficult to learn, horrible to work with and requires quite a bit
of knowledge as to how X works at the low level.
Not recommended but if you're really masochistic go for it. All the
include and lib files are probably installed already so you have what
you need.

Low-level APIs: include PTC, SDL, GGI and Clanlib
Pros: very flexible, run under X or the console, generally abstract away
the video hardware a little so you can draw to a linear surface, lots of
good coding examples, can link to other APIs like OpenGL and sound libs,
Windows DirectX versions for free
Cons: Not as fast as doing it yourself, often in development so versions
can (and do) change frequently.
Examples: PTC and GGI have excellent demos, SDL is used in sdlQuake,
Myth II, Civ CTP and Clanlib has been used for games as well.

High-level APIs: OpenGL - any others?
Pros: clean api, tons of functionality and examples, industry standard
so you can learn from SGI demos for example
Cons: hardware acceleration is normally a must, some quirks between
versions and platforms
Examples: loads - check out www.mesa3d.org under the links section.

To get going try looking at the svgalib examples and also install SDL
and get it working. After that, the sky's the limit.
</verb>

<sect2>How do I debug pure assembly code under Linux?
<p>

There's an early version of the
<url url="http://www.ellipse.magenet.com/ald.html"
name="Assembly Language Debugger">,
which is designed to work with assembly code,
and is portable enough to run on Linux and *BSD.
It is already functional and should be the right choice, check it out!

You can also try <tt/gdb/ ;).
Although it is a source-level debugger, it can be used to debug
pure assembly code, and with some trickery you can make <tt/gdb/ do what you need.
Here's an answer from <url url="mailto:dl@gazeta.ru" name="Dmitry Bakhvalov">:

<verb>
Personally, I use gdb for debugging asmutils. Try this:

1) Use the following stuff to compile:
$nasm -f elf -g smth.asm
$ld -o smth smth.o

2) Fire up gdb:
$gdb smth

3) In gdb:
(gdb) disassemble _start
Place a breakpoint at <_start+1> (If placed at _start the breakpoint
wouldnt work, dunno why)
(gdb) b *0x8048075

To step thru the code I use the following macro:
(gdb)define n
>ni
>printf "eax=%x ebx=%x ...etc...",$eax,$ebx,...etc...
>disassemble $pc $pc+15
>end

Then start the program with r command and debug with n.

Hope this helps.
</verb>

An additional note from ???:

<verb>
I have such a macro in my .gdbinit for quite some time now, and it
for sure makes life easier. A small difference : I use "x /8i $pc",
which guarantee a fixed number of disassembled instructions. Then,
with a well chosen size for my xterm, gdb output looks like it is
refreshed, and not scrolling.
</verb>

If you want to set breakpoints across your code, you can just use
the <tt/int 3/ instruction as a breakpoint (instead of entering the address
manually in <tt/gdb/).
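<p>
The same trick is available from C through GCC inline assembly.
The following is a sketch under assumptions: the names are hypothetical,
the instruction is spelled <tt/int3/ as <tt/gas/ accepts it,
the <tt/raise()/ fallback for non-x86 machines is added only for portability,
and the <tt/SIGTRAP/ handler exists only so that the code can also run outside a debugger.

<tscreen><code>
#include <signal.h>

static volatile sig_atomic_t traps_seen = 0;

/* Only needed outside a debugger: without a handler (or a debugger),
 * the default action of SIGTRAP terminates the process. */
static void on_trap(int signo)
{
    (void) signo;
    traps_seen = traps_seen + 1;
}

/* Hypothetical helper: a breakpoint compiled into the program. */
static void breakpoint(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile ("int3");
#else
    raise(SIGTRAP);     /* no int 3 here: request the same signal instead */
#endif
}
</code></tscreen>

Under <tt/gdb/, execution stops at each <tt/breakpoint()/ call with a <tt/SIGTRAP/ report.
Outside a debugger, install the handler first with <tt/signal(SIGTRAP, on_trap)/,
or remove the calls from production code.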
<p>
If you're using <tt/gas/, you should consult <tt/gas/ and <tt/gdb/ related
<url url="http://linuxassembly.org/resources.html#tutorials" name="tutorials">.

<sect2>Any other useful debugging tools?
<p>
Definitely <tt/strace/ can help a lot (<tt/ktrace/ and <tt/kdump/ on FreeBSD);
it is used to trace system calls and signals.
Read its manual page (<tt/man strace/) and <tt/strace --help/ output for details.

<sect2>How do I access BIOS functions from Linux (BSD, BeOS, etc)?
<p>
No way. This is protected mode, use OS services instead.
Again, you can't use <tt/int 0x10/, <tt/int 0x13/, etc.
Fortunately almost everything can be implemented
by means of system calls or library functions.
In the worst case you may go through direct port access,
or make a kernel patch to implement the needed functionality.

<em/That's all for now, folks/.

$Id$

</article>