<!--startcut ==============================================-->
<!-- *** BEGIN HTML header *** -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML><HEAD>
<title>Introduction to UNIX Assembly Programming LG #53</title>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
ALINK="#FF0000">
<!-- *** END HTML header *** -->

<IMG ALT="LINUX GAZETTE" SRC="../gx/lglogo.jpg"
WIDTH="600" HEIGHT="124" border="0"><BR CLEAR="all">
<!-- *** BEGIN navbar *** -->
<A HREF="baptista.html"><IMG ALT="[ Prev ]" SRC="../gx/navbar/prev.jpg" WIDTH="16" HEIGHT="45" BORDER="0" ALIGN="bottom"></A>
<IMG ALT=""
SRC="../gx/navbar/left.jpg" WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="bottom" >
<A HREF="index.html"><IMG ALT="[ Table of Contents ]"
SRC="../gx/navbar/toc.jpg" WIDTH="220" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A>
<A HREF="../index.html"><IMG ALT="[ Front Page ]"
SRC="../gx/navbar/frontpage.jpg" WIDTH="137" HEIGHT="45" BORDER="0" ALIGN="bottom"></A>
<A HREF="../faq/index.html"><IMG ALT="[ FAQ ]"
SRC="./../gx/navbar/faq.jpg" WIDTH="62" HEIGHT="45" BORDER="0" ALIGN="bottom"></A>
<A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue53/boldyshev.html"><IMG ALT="[ Talkback ]" SRC="../gx/navbar/talkback.jpg" WIDTH="121" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A>
<IMG ALT=""
SRC="../gx/navbar/right.jpg" WIDTH="15" HEIGHT="45" ALIGN="bottom" >
<A HREF="collinge.html"><IMG ALT="[ Next ]" SRC="../gx/navbar/next.jpg" WIDTH="15" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A>
<!-- *** END navbar *** -->
<P>
<!-- A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue53/boldyshev.html">
<FONT SIZE="+2"><EM>Talkback:</EM> Discuss this article with peers</FONT></A -->

<!--endcut ============================================================-->

<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>

<P> <HR> <P>
<!--===================================================================-->

<center>
<H1><font color="maroon">Introduction to UNIX Assembly Programming</font></H1>
<H4>By <a href="mailto:konst@linuxassembly.org">Konstantin Boldyshev</a></H4>
</center>
<P> <HR> <P>

<!-- END header -->

<P>
<EM>This document is intended to be a tutorial, showing how to write
a simple assembly program for
several UNIX operating systems on the IA32 (i386) platform.
The material may or may not be applicable
to other hardware and/or software platforms.
The document explains program layout, system call conventions,
and the build process.
It accompanies the Linux Assembly HOWTO, which may also be of interest,
though it is more Linux-specific.</EM>
<P>
v0.3, April 09, 2000
<HR>

<H2><A NAME="s1">1. Introduction</A></H2>

<H2><A NAME="ss1.1">1.1 Legal blurb</A>
</H2>

<P>Copyright © 1999-2000 Konstantin Boldyshev.
Permission is granted to copy, distribute and/or modify
this document under the terms of the GNU
<A HREF="http://www.gnu.org/copyleft/fdl.html">Free Documentation License</A>,
Version 1.1 or any later version published by the Free Software Foundation.
<P>
<H2><A NAME="ss1.2">1.2 Obtaining this document</A>
</H2>

<P>The latest version of this document is available from
<A HREF="http://linuxassembly.org/intro.html">http://linuxassembly.org/intro.html</A>.
If you are reading a copy that is a few months old,
please check the URL above for a newer version.
<P>
<H2><A NAME="ss1.3">1.3 Tools you need</A>
</H2>

<P>You will need several tools to play with the programs included in this tutorial.
<P>First of all, you need an assembler.
As a rule, a modern UNIX distribution includes <CODE>gas</CODE> (the GNU Assembler),
but all examples here use another assembler, <CODE>nasm</CODE> (the Netwide Assembler).
You can download it from the
<A HREF="http://www.cryogen.com/Nasm/">nasm page</A>;
it comes with full source code.
Compile it, or try to find a precompiled binary for your OS;
note that several distributions (at least Linux ones)
already include <CODE>nasm</CODE>, so check first.
<P>Second, you need a linker, <CODE>ld</CODE>, since <CODE>nasm</CODE> produces only object code.
Any distribution should include <CODE>ld</CODE>.
<P>If you're going to dig in, you should also install the include files for your OS
and, if possible, the kernel source.
<P>Now you should be ready to start. Welcome.
<P>
<HR>
<H2><A NAME="s2">2. Hello, world!</A></H2>

<P>
<P>Now we will write our program, the classic "Hello, world" (hello.asm).
You can download its sources and binaries
<A HREF="http://linuxassembly.org/intro/hello.tgz">here</A>.
But first, let me explain a few basics.
<P>
<H2><A NAME="ss2.1">2.1 System call</A>
</H2>

<P>Unless a program only implements some math algorithm in assembly,
it will have to deal with getting input, producing output,
and exiting. For these it needs to call OS services.
In fact, programming in assembly language is much the same across OSes
until OS services are touched.
<P>There are two common ways of performing a system call in a UNIX OS:
through the C library (libc) wrapper, or directly.
<P>Whether or not to use libc in assembly programming is more a question
of taste/belief than of practicality.
Libc wrappers exist to protect programs from possible changes in the system call convention,
and to provide a POSIX-compatible interface where the kernel lacks one for some call.
However, a UNIX kernel is usually more or less POSIX-compliant,
which means that the syntax of most libc "system calls" exactly
matches the syntax of the real kernel system calls (and vice versa).
The main drawback of throwing libc away is that you lose several functions
that are not just syscall wrappers, such as printf(), malloc() and similar.
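<P>For contrast with the direct calls used in the rest of this tutorial,
here is a minimal sketch of the libc route. It is not part of the original hello.tgz,
and it assumes a 32-bit system whose C library follows the usual C calling convention
(arguments pushed right to left, caller cleans the stack) and whose startup code calls <CODE>main</CODE>.
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .text
    global main                         ;entry point expected by the C startup code
    extern write                        ;libc wrapper around the write system call

main:
        push    dword len               ;message length
        push    dword msg               ;message to write
        push    dword 1                 ;file descriptor (stdout)
        call    write                   ;write(1, msg, len) through libc
        add     esp,12                  ;caller cleans stack (3 arguments * 4)

        xor     eax,eax                 ;return 0 from main
        ret

section .data
msg     db      'Hello, world!',0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string
</PRE>
<HR>
</CODE></BLOCKQUOTE>
<P>Such a program has to be linked through the C compiler driver rather than bare <CODE>ld</CODE>,
so that libc and its startup code are pulled in, for example <CODE>gcc -o hello_libc hello_libc.o</CODE>
(the file name is hypothetical, and a 64-bit host will likely need extra options such as <CODE>-m32</CODE>).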
<P>This tutorial will show how to use <B>direct</B> kernel calls,
since this is the fastest way to call a kernel service;
our code is not linked to any library,
it communicates with the kernel directly.
<P>What differs between UNIX kernels
is the set of system calls and the system call convention
(however, as they strive for POSIX compliance, they have a lot in common).
<P><EM>Note for (former) DOS programmers: so, what is a system call?
Perhaps the best way to explain it is this:
if you ever wrote a DOS assembly program (and most IA32 assembly programmers did),
you remember DOS services such as <CODE>int 0x21, int 0x25, int 0x26</CODE> etc.
These are what can be called system calls.
However, the actual implementation is completely different,
and system calls are not necessarily done via an interrupt.
Also, DOS programmers quite often confuse OS services with BIOS services
like <CODE>int 0x10</CODE> or <CODE>int 0x16</CODE>, and are very surprised when they fail
to perform them in UNIX, since these are not OS services.</EM>
<P>
<H2><A NAME="ss2.2">2.2 Program layout</A>
</H2>

<P>As a rule, modern IA32 UNIXes are 32-bit (*grin*), run in protected mode,
have a flat memory model, and use the ELF format for binaries.
<P>A program can be divided into sections (or segments):
<CODE>.text</CODE> for your code (read-only),
<CODE>.data</CODE> for your data (read-write),
<CODE>.bss</CODE> for uninitialized data (read-write);
there can be a few others, as well as user-defined sections,
but they are rarely needed and are of no interest here.
A program must have at least a <CODE>.text</CODE> section.
<P>OK, now we'll dive into OS-specific details.
<P>
<H2><A NAME="ss2.3">2.3 Linux</A>
</H2>

<P>System calls in Linux are done through <CODE>int 0x80</CODE>
(there is also a kernel patch allowing system calls to be done
via the <EM>syscall (sysenter)</EM> instruction on newer CPUs, but it
is still experimental).
<P>Linux differs from the usual UNIX calling convention
and features a "fastcall" convention
for system calls (it resembles DOS).
The system function number is passed in <CODE>eax</CODE>,
and arguments are passed through registers, not the stack.
There can be up to five arguments, in <CODE>ebx, ecx, edx, esi, edi</CODE> respectively.
If there are more than five arguments, they are simply passed through a
structure as the first argument.
The result is returned in <CODE>eax</CODE>; the stack is not touched at all.
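<P>As a sketch of the more-than-five-arguments case, consider the classic <CODE>mmap</CODE> call
(number 90 on i386 Linux), which takes six arguments and therefore receives
a pointer to a block of six dwords in <CODE>ebx</CODE>.
The label <CODE>mmap_args</CODE> and the particular values below are assumptions chosen for illustration;
the example also relies on Linux reporting errors as small negative values (-errno) in <CODE>eax</CODE>.
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .data
mmap_args:                              ;the six arguments, in order
        dd      0                       ;addr   (0 = let the kernel choose)
        dd      4096                    ;length (one page)
        dd      3                       ;prot   (PROT_READ|PROT_WRITE)
        dd      0x22                    ;flags  (MAP_PRIVATE|MAP_ANONYMOUS)
        dd      -1                      ;fd     (none, anonymous mapping)
        dd      0                       ;offset

section .text
    global _start                       ;must be declared for linker (ld)

_start:
        mov     ebx,mmap_args           ;the only register argument: the structure
        mov     eax,90                  ;system call number (sys_mmap)
        int     0x80                    ;call kernel

        mov     ebx,1                   ;assume failure (exit code 1)
        cmp     eax,-4096               ;-4095..-1 in eax means -errno
        ja      done
        mov     byte [eax],'!'          ;mapping is usable: touch it
        xor     ebx,ebx                 ;success (exit code 0)
done:
        mov     eax,1                   ;system call number (sys_exit)
        int     0x80                    ;call kernel
</PRE>
<HR>
</CODE></BLOCKQUOTE>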
<P>System call function numbers are nominally in sys/syscall.h,
but are actually defined in asm/unistd.h;
some documentation is in section 2 of the manual
(e.g. to find info on the <CODE>write</CODE> system call, issue <CODE>man 2 write</CODE>).
<P>There have been several attempts to write up-to-date documentation of Linux system calls;
see the URLs in the
<A HREF="boldyshev.html#references">references</A>.
<P>So, our Linux program will look like this:
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .text
    global _start                       ;must be declared for linker (ld)

msg     db      'Hello, world!',0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

_start:                                 ;we tell linker where is entry point

        mov     edx,len                 ;message length
        mov     ecx,msg                 ;message to write
        mov     ebx,1                   ;file descriptor (stdout)
        mov     eax,4                   ;system call number (sys_write)
        int     0x80                    ;call kernel

        xor     ebx,ebx                 ;exit code 0 (ebx still held 1 from above)
        mov     eax,1                   ;system call number (sys_exit)
        int     0x80                    ;call kernel
</PRE>
<HR>
</CODE></BLOCKQUOTE>
<P>As you will see further, the Linux syscall convention is the most compact one.
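<P>The register convention, including the result in <CODE>eax</CODE>, can be sketched with a slightly larger program.
This is an illustration only, not part of the original hello.tgz:
it reads one chunk from stdin and writes it back, using the byte count that <CODE>sys_read</CODE> (number 3)
returns in <CODE>eax</CODE> as the length for <CODE>sys_write</CODE>.
The buffer name <CODE>buf</CODE> and its size of 64 bytes are arbitrary choices for this sketch.
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .bss
buf     resb    64                      ;uninitialized input buffer

section .text
    global _start                       ;must be declared for linker (ld)

_start:
        mov     edx,64                  ;buffer size
        mov     ecx,buf                 ;where to store input
        mov     ebx,0                   ;file descriptor (stdin)
        mov     eax,3                   ;system call number (sys_read)
        int     0x80                    ;call kernel

        test    eax,eax                 ;eax = bytes read, 0 on EOF, negative on error
        jle     .exit                   ;nothing to echo

        mov     edx,eax                 ;length = number of bytes just read
        mov     ecx,buf                 ;data to write
        mov     ebx,1                   ;file descriptor (stdout)
        mov     eax,4                   ;system call number (sys_write)
        int     0x80                    ;call kernel

.exit:
        xor     ebx,ebx                 ;exit code 0
        mov     eax,1                   ;system call number (sys_exit)
        int     0x80                    ;call kernel
</PRE>
<HR>
</CODE></BLOCKQUOTE>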
<P>Kernel source references:
<UL>
<LI>arch/i386/kernel/entry.S</LI>
<LI>include/asm-i386/unistd.h</LI>
<LI>include/linux/sys.h</LI>
</UL>
<P>
<H2><A NAME="ss2.4">2.4 FreeBSD</A>
</H2>

<P>FreeBSD has the "usual" calling convention:
the syscall number is in <CODE>eax</CODE>, and parameters are on the stack
(the first argument is pushed last).
A system call is performed through a <B>function call</B> to a
function containing <CODE>int 0x80</CODE> and <CODE>ret</CODE>, not just <CODE>int 0x80</CODE> itself
(the return address MUST be on the stack before <CODE>int 0x80</CODE> is issued!).
The caller must clean up the stack after the call.
The result is returned, as usual, in <CODE>eax</CODE>.
<P>There is also an alternate way: using the <CODE>call 7:0</CODE> gate instead of <CODE>int 0x80</CODE>.
The end result is the same, apart from an increase in program size,
since you also need to <CODE>push eax</CODE> first,
and these two instructions occupy more bytes.
<P>System call function numbers are in sys/syscall.h;
documentation is in section 2 of the manual.
<P>OK, I think the source will explain this better:
<P><EM>Note: the included code may run on other *BSDs as well, I think.</EM>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .text
    global _start                       ;must be declared for linker (ld)

msg     db      "Hello, world!",0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

_syscall:
        int     0x80                    ;system call
        ret

_start:                                 ;tell linker entry point

        push    dword len               ;message length
        push    dword msg               ;message to write
        push    dword 1                 ;file descriptor (stdout)
        mov     eax,0x4                 ;system call number (sys_write)
        call    _syscall                ;call kernel

                                        ;actually there's an alternate
                                        ;way to call kernel:
                                        ;push eax
                                        ;call 7:0

        add     esp,12                  ;clean stack (3 arguments * 4)

        push    dword 0                 ;exit code
        mov     eax,0x1                 ;system call number (sys_exit)
        call    _syscall                ;call kernel

                                        ;we do not return from sys_exit,
                                        ;there's no need to clean stack
</PRE>
<HR>
</CODE></BLOCKQUOTE>
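<P>Since the kernel only requires that <EM>some</EM> dword occupy the return-address slot,
a common variation is to push a dummy dword and issue <CODE>int 0x80</CODE> inline instead of calling a helper.
The following is a sketch of that idiom, not the code shipped in hello.tgz;
note that the stack cleanup must then account for the extra dword.
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .text
    global _start                       ;must be declared for linker (ld)

msg     db      "Hello, world!",0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

_start:                                 ;tell linker entry point

        push    dword len               ;message length
        push    dword msg               ;message to write
        push    dword 1                 ;file descriptor (stdout)
        mov     eax,0x4                 ;system call number (sys_write)
        push    eax                     ;dummy dword in place of return address
        int     0x80                    ;call kernel
        add     esp,16                  ;clean stack (3 arguments + dummy) * 4

        push    dword 0                 ;exit code
        mov     eax,0x1                 ;system call number (sys_exit)
        push    eax                     ;dummy dword in place of return address
        int     0x80                    ;call kernel, does not return
</PRE>
<HR>
</CODE></BLOCKQUOTE>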
<P>Kernel source references:
<UL>
<LI>i386/i386/exception.s</LI>
<LI>i386/i386/trap.c</LI>
<LI>sys/syscall.h</LI>
</UL>
<P>
<H2><A NAME="ss2.5">2.5 BeOS</A>
</H2>

<P>The BeOS kernel uses the "usual" UNIX calling convention too.
The difference from the FreeBSD example is that you call <CODE>int 0x25</CODE>.
<P>For information on where to find system call function numbers and other
interesting details, examine
<A HREF="boldyshev.html#references">asmutils</A>,
especially the os_beos.inc file.
<P><EM>Note: to make <CODE>nasm</CODE> compile correctly on BeOS you need
to insert <CODE>#include "nasm.h"</CODE> into <CODE>float.h</CODE>,
and <CODE>#include &lt;stdio.h&gt;</CODE> into <CODE>nasm.h</CODE>.</EM>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
section .text
    global _start                       ;must be declared for linker (ld)

msg     db      "Hello, world!",0xa     ;our dear string
len     equ     $ - msg                 ;length of our dear string

_syscall:                               ;system call
        int     0x25
        ret

_start:                                 ;tell linker entry point

        push    dword len               ;message length
        push    dword msg               ;message to write
        push    dword 1                 ;file descriptor (stdout)
        mov     eax,0x3                 ;system call number (sys_write)
        call    _syscall                ;call kernel
        add     esp,12                  ;clean stack (3 * 4)

        push    dword 0                 ;exit code
        mov     eax,0x3f                ;system call number (sys_exit)
        call    _syscall                ;call kernel
                                        ;no need to clean stack
</PRE>
<HR>
</CODE></BLOCKQUOTE>
<P>
<H2><A NAME="ss2.6">2.6 Building binary</A>
</H2>

<P>
<P>Building the binary is the usual two-step process of compiling and linking.
To make a binary from our hello.asm we must do the following:
<P>
<HR>
<PRE>
$ nasm -f elf hello.asm                 # this will produce hello.o object file
$ ld -s -o hello hello.o                # this will produce hello executable
</PRE>
<HR>
<P>That's it. Simple.
Now you can launch the hello program by entering <CODE>./hello</CODE>; it should work.
Look at the binary size -- surprised?
<P>
<HR>
<H2><A NAME="references"></A> <A NAME="s3">3. References</A></H2>

<P>I hope you enjoyed the journey. If you have become interested in assembly
programming for UNIX, I strongly encourage you to visit
<A HREF="http://linuxassembly.org">Linux Assembly</A>
for more information, and to download the
<A HREF="http://linuxassembly.org/asmutils.html">asmutils</A> package;
it contains a lot of sample code.
For a comprehensive overview of Linux/UNIX assembly programming, refer to the
<A HREF="http://linuxassembly.org/howto.html">Linux Assembly HOWTO</A>.
<P>Thank you for your interest!

<!-- *** BEGIN copyright *** -->
<P> <hr> <!-- P -->
<H5 ALIGN=center>

Copyright © 2000, Konstantin Boldyshev<BR>
Published in Issue 53 of <i>Linux Gazette</i>, May 2000</H5>
<!-- *** END copyright *** -->

<!--startcut ==========================================================-->
<!-- P --> <HR> <!-- P -->
</BODY></HTML>
<!--endcut ============================================================-->