<!--startcut ==============================================-->
|
|
<!-- *** BEGIN HTML header *** -->
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
|
<HTML><HEAD>
|
|
<title>From C To Assembly Language LG #94</title>
|
|
</HEAD>
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
|
|
ALINK="#FF0000">
|
|
<!-- *** END HTML header *** -->
|
|
|
|
<!-- *** BEGIN navbar *** -->
|
|
<A HREF="ecol.html"><< Prev</A> | <A HREF="index.html">TOC</A> | <A HREF="../index.html">Front Page</A> | <A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue94/ramankutty.html">Talkback</A> | <A HREF="../faq/index.html">FAQ</A> | <A HREF="kolp.html">Next >></A>
|
|
<!-- *** END navbar *** -->
|
|
|
|
<!--endcut ============================================================-->
|
|
|
|
<TABLE BORDER><TR><TD WIDTH="200">
|
|
<A HREF="http://www.linuxgazette.com/">
|
|
<IMG ALT="LINUX GAZETTE" SRC="../gx/2002/lglogo_200x41.png"
|
|
WIDTH="200" HEIGHT="41" border="0"></A>
|
|
<BR CLEAR="all">
|
|
<SMALL>...<I>making Linux just a little more fun!</I></SMALL>
|
|
</TD><TD WIDTH="380">
|
|
|
|
|
|
<CENTER>
|
|
<BIG><BIG><STRONG><FONT COLOR="maroon">From C To Assembly Language</FONT></STRONG></BIG></BIG>
|
|
<BR>
|
|
<STRONG>By <A HREF="../authors/ramankutty.html">Hiran Ramankutty</A></STRONG>
|
|
</CENTER>
|
|
|
|
</TD></TR>
|
|
</TABLE>
|
|
<P>
|
|
|
|
<!-- END header -->
|
|
|
|
|
|
|
|
<html>
|
|
<body>
|
|
<h2><b>1. Overview</b></h2>
|
|
<p>
|
|
What is a microcomputer system made up of? A typical answer is: a
<i>microprocessor unit</i> (MPU), a bus system, a memory subsystem, an
I/O subsystem and the interfaces among these components.
|
|
</p>
|
|
<p>
|
|
This is only the hardware side. Every microcomputer system also
requires software to direct each of the hardware components while they
perform their respective tasks. Computer software can be divided into
system software and user software.
|
|
</p>
|
|
<p>
|
|
The user software may include built-in libraries and user-created
libraries, in the form of subroutines, which may be needed in preparing
programs for execution.
|
|
</p>
|
|
<p>
|
|
The system software may encompass a variety of high-level language
translators, an assembler, a text editor, and several other programs
that aid in the preparation of other programs. There are three levels
of programming: machine language, assembly language and high-level
language.
|
|
</p>
|
|
<p>
|
|
Machine language programs are programs that the computer can
understand and execute directly (think of programming in any
microprocessor kit). Assembly language instructions match machine
language instructions on a more or less one-for-one basis, but are
written as character strings so that they are more easily understood.
High-level language instructions are much closer to the English
language and are structured so that they naturally correspond to the
way programmers think. Ultimately, an assembly language or high-level
language program must be converted into machine language by programs
called translators: an <i>assembler</i> for assembly language, and a
<i>compiler</i> or <i>interpreter</i> for a high-level language.
|
|
</p>
|
|
<p>
|
|
Compilers for high-level languages like C and C++ can translate a
source program into assembly code. The <i>-S</i> option of the GNU C
and C++ compilers generates the assembly code equivalent to the
corresponding source program. Knowing how the most rudimentary
constructs like loops, function calls and variable declarations are
mapped into assembly language is one way to master C internals. Before
proceeding further, you should be familiar with computer architecture
and Intel x86 assembly language; this will help you follow the
material presented here.
|
|
</p>
|
|
<h2><b>2. Getting Started</b></h2>
|
|
<p>
|
|
To begin with, write a small program in C to print <i>hello world</i>
and compile it with the <i>-S</i> option. The output is an assembly
language file for the input file specified. By default, GCC makes the
assembler file name by replacing the suffix `.c' with `.s'. Try to
interpret the few lines at the end of the assembler file.
|
|
</p>
|
|
<p>
|
|
The 80386 and above family of processors have myriads of registers,
|
|
instructions and addressing modes. A basic knowledge about only a few
|
|
simple instructions is sufficient to understand the code generated by
|
|
the GNU compiler.
|
|
</p>
|
|
<p>
|
|
Generally, an assembly language instruction consists of an optional
<i>label</i>, a <i>mnemonic</i>, and <i>operands</i>. An operand's
notation is sufficient to decipher the operand's addressing mode. The
<i>mnemonic</i> names the operation performed on the operands, which
are registers, memory locations or immediate values. The 80386 family
has 32-bit general purpose registers called <i>eax</i>, <i>ebx</i>,
<i>ecx</i> etc. Two registers, <i>ebp</i> and <i>esp</i>, are used for
manipulating the stack. A typical instruction, written in GNU
Assembler (GAS) syntax, would look like this:
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
movl $10, %eax
|
|
</pre>
|
|
<p>
|
|
This instruction stores the value 10 in the <i>eax</i> register. The
|
|
prefix `%' to the register name and `$' to the immediate value are
|
|
essential assembler syntax. It is to be noted that not all assemblers
|
|
follow the same syntax.
|
|
</p>
|
|
<p>
|
|
Our first assembly language program, stored in a file named
|
|
<i>first.s</i> is shown in <b>Listing 1</b>.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 1</i>
|
|
.globl main
|
|
main:
|
|
movl $20, %eax
|
|
ret
|
|
</pre>
|
|
<p>
|
|
This file can be assembled and linked to generate an <i>a.out</i> by
giving the command <i>cc first.s</i>. The GNU compiler front end
<i>cc</i> recognizes files with the extension `.s' as assembly language
files and invokes the assembler and linker, skipping the compilation
phase.
|
|
</p>
|
|
<p>
|
|
The first line of the program is a comment. The <i>.globl</i>
|
|
assembler directive serves to make the symbol <i>main</i> visible to
|
|
the linker. This is vital as your program will be linked with the C
|
|
startup library which will contain a call to <i>main</i>. The linker
|
|
will complain about 'undefined reference to symbol main' if that line
|
|
is omitted (try it). The program simply stores the value 20 in register
|
|
<i>eax</i> and returns to the caller.
|
|
</p>
|
|
|
|
<h2><b>3. Arithmetic, Comparison, Looping</b></h2>
|
|
<p>
|
|
Our next program is <b>Listing 2</b> which computes the factorial of a
|
|
number stored in <i>eax</i>. The factorial is stored in <i>ebx</i>.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 2</i>
|
|
.globl main
main:
        movl  $5, %eax
        movl  $1, %ebx
L1:     cmpl  $0, %eax      # compare 0 with the value in eax
        je    L2            # jump to L2 if eax == 0 (je - jump if equal)
        imull %eax, %ebx    # ebx = ebx * eax
        decl  %eax          # decrement eax
        jmp   L1            # unconditional jump to L1
L2:     ret
|
|
</pre>
|
|
<p>
|
|
<i>L1</i> and <i>L2</i> are labels. When control flow reaches
<i>L2</i>, <i>ebx</i> contains the factorial of the number originally
stored in <i>eax</i> (5 in this example); <i>eax</i> itself has been
counted down to zero.
|
|
</p>
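<p>
For readers who find the jumps easier to follow in C, the loop of
Listing 2 can be written out with <i>goto</i>. This is only a sketch:
the function name <i>factorial_like_listing2</i> is made up for
illustration, the C variables <i>eax</i> and <i>ebx</i> merely stand in
for the registers, and the range check is an added assumption so that
the result fits in a 32-bit integer.
</p>

```c
#include <assert.h>

/* C rendering of Listing 2. The variables eax and ebx stand in for the
   registers, and the labels L1 and L2 match the assembly labels. */
int factorial_like_listing2(int n)
{
    assert(n >= 0 && n <= 12);   /* 13! no longer fits in 32 bits    */
    int eax = n;                 /* movl $5, %eax (here: any n)      */
    int ebx = 1;                 /* movl $1, %ebx                    */
L1:
    if (eax == 0)                /* cmpl $0, %eax followed by je L2  */
        goto L2;
    ebx = ebx * eax;             /* imull %eax, %ebx                 */
    eax--;                       /* decl %eax                        */
    goto L1;                     /* jmp L1                           */
L2:
    return ebx;                  /* the listing leaves the result in ebx */
}
```

<p>
Calling <i>factorial_like_listing2(5)</i> walks through the same
sequence of multiplications as the assembly version.
</p>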
|
|
<h2><b>4. Subroutines</b></h2>
|
|
<p>
|
|
When implementing complicated programs, we split the work into smaller
tasks. We write a subroutine or function for each task and call it
whenever required. <b>Listing 3</b> illustrates subroutine call and
return in assembly language programs.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 3</i>
|
|
.globl main
|
|
main:
|
|
movl $10, %eax
|
|
call foo
|
|
ret
|
|
foo:
|
|
addl $5, %eax
|
|
ret
|
|
</pre>
|
|
<p>
|
|
The instruction <i>call</i> transfers control to subroutine <i>foo</i>.
|
|
The <i>ret</i> instruction in <i>foo</i> transfers control back to the
|
|
instruction after the call in <i>main</i>.
|
|
</p>
|
|
<p>
|
|
Generally, each call of a function has its own set of local variables,
and space is needed to hold them. The stack can be used to maintain
the values of the variables in each call of the routine. It is
important to know the basics of how activation records are maintained
for repeated, recursive or any other calls during the execution of the
program. Knowing how to manipulate registers like <i>esp</i> and
<i>ebp</i>, and how to use instructions like <i>push</i> and
<i>pop</i> which operate on the stack, is central to understanding the
subroutine call and return mechanism.
|
|
</p>
|
|
<h2><b>5. Using The Stack</b></h2>
|
|
<p>
|
|
A section of your program's memory is reserved for use as a stack. The
Intel 80386 and later microprocessors contain a register called the
stack pointer, <i>esp</i>, which stores the address of the top of
stack. <b>Figure 1</b> below shows three integer values, 49, 30 and 72,
stored on the stack (each integer occupying four bytes) with the
<i>esp</i> register holding the address of the top of stack.
|
|
</p>
|
|
<a href="misc/ramankutty/stack1.bmp">Figure 1</a>
|
|
<p>
|
|
Unlike a pile of bricks, which grows upwards, the stack on Intel
machines grows downwards, towards lower addresses. <b>Figure 2</b>
shows the stack layout after the execution of the instruction
<i>pushl $15</i>.
|
|
</p>
|
|
<a href="misc/ramankutty/stack2.bmp">Figure 2</a>
|
|
<p>
|
|
The stack pointer register is decremented by four and the number 15 is
|
|
stored as four bytes at locations 1988, 1989, 1990 and 1991.
|
|
</p>
|
|
<p>
|
|
The instruction <i>popl %eax</i> copies the value at top of stack (four
|
|
bytes) to the <i>eax</i> register and increments <i>esp</i> by four.
|
|
What if you do not want to copy the value at top of stack to any
|
|
register? You just execute the instruction <i>addl $4, %esp</i> which
|
|
simply increments the stack pointer.
|
|
</p>
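<p>
The behaviour of <i>pushl</i>, <i>popl</i> and <i>addl $4, %esp</i>
described above can be modelled in a few lines of C. The sketch below
is only a toy model, not how the processor is implemented: the names
<i>toy_stack</i>, <i>toy_pushl</i>, <i>toy_popl</i> and
<i>toy_discard</i> are hypothetical, and the memory is just a small
byte array indexed by address.
</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BYTES 2048u        /* hypothetical size of the simulated memory */

typedef struct {
    uint8_t  mem[TOY_BYTES];   /* simulated memory, addressed by byte   */
    uint32_t esp;              /* simulated stack pointer (an address)  */
} toy_stack;

/* pushl: decrement esp by four, then store the value at the new top. */
void toy_pushl(toy_stack *s, uint32_t value)
{
    assert(s->esp >= 4 && s->esp <= TOY_BYTES);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* popl: copy the four bytes at the top, then increment esp by four. */
uint32_t toy_popl(toy_stack *s)
{
    uint32_t value;
    assert(s->esp + 4 <= TOY_BYTES);
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}

/* addl $4, %esp: drop the top value without copying it anywhere. */
void toy_discard(toy_stack *s)
{
    assert(s->esp + 4 <= TOY_BYTES);
    s->esp += 4;
}
```

<p>
Starting with <i>esp</i> at 1992 (an assumption chosen so that the
pushed value lands at locations 1988 to 1991, as in the text),
<i>toy_pushl</i> leaves <i>esp</i> at 1988 and <i>toy_popl</i> brings
it back to 1992.
</p>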
|
|
<p>
|
|
In <b>Listing 3</b>, the instruction <i>call foo</i> pushes the
|
|
address of the instruction after the call in the calling program on to
|
|
the stack and branches to <i>foo</i>. The subroutine ends with
|
|
<i>ret</i> which transfers control to the instruction whose address is
|
|
taken from the top of stack. Obviously, the top of stack must contain
|
|
a valid return address.
|
|
</p>
|
|
<h2><b>6. Allocating Space for Local Variables</b></h2>
|
|
<p>
|
|
A C program may manipulate hundreds or even thousands of variables.
The assembly code for the corresponding C program will give you an
idea of how the variables are accommodated and how the registers are
used for manipulating them without one value clobbering another.
|
|
</p>
|
|
<p>
|
|
The registers are few in number and cannot be used for holding all the
|
|
variables in a program. Local variables are allotted space within the
|
|
stack. <b>Listing 4</b> shows how it is done.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 4</i>
|
|
.globl main
|
|
main:
|
|
call foo
|
|
ret
|
|
foo:
|
|
pushl %ebp
|
|
movl %esp, %ebp
|
|
subl $4, %esp
|
|
movl $10, -4(%ebp)
|
|
movl %ebp, %esp
|
|
popl %ebp
|
|
ret
|
|
</pre>
|
|
<p>
|
|
First, the value of the stack pointer is copied to <i>ebp</i>, the base
|
|
pointer register. The base pointer is used as a fixed reference to
|
|
access other locations on the stack. In the program, <i>ebp</i> may be
|
|
used by the caller of <i>foo</i> also, and hence its value is copied
|
|
to the stack before it is overwritten with the value of <i>esp</i>.
|
|
The instruction <i>subl $4, %esp</i> creates enough space (four bytes)
|
|
to hold an integer by decrementing the stack pointer. In the next line,
|
|
the value 10 is copied to the four bytes whose address is obtained by
|
|
subtracting four from the contents of <i>ebp</i>. The instruction
|
|
<i>movl %ebp, %esp</i> restores the stack pointer to the value it had
|
|
after executing the first line of <i>foo</i> and <i>popl %ebp</i>
|
|
restores the base pointer register. The stack pointer now has the same
|
|
value which it had before executing the first line of <i>foo</i>. The
|
|
table below displays the contents of registers <i>ebp</i>, <i>esp</i>
|
|
and stack locations from 3988 to 3999 at the point of entry into
|
|
<i>main</i> and after the execution of every instruction in
|
|
<b>Listing 4</b> (except the return from main). We assume that
|
|
<i>ebp</i> and <i>esp</i> have values 7000 and 4000 stored in them and
|
|
stack locations 3988 to 3999 contain some arbitrary values 219986,
|
|
1265789 and 86 before the first instruction in <i>main</i> is executed.
|
|
It is also assumed that the address of the instruction after
|
|
<i>call foo</i> in <i>main</i> is 30000.
|
|
</p>
|
|
<p></p>
|
|
<a href="misc/ramankutty/table.bmp">Table 1</a>
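<p>
The same walk-through can be replayed in C. The sketch below is a toy
model under the assumptions stated in the text (<i>ebp</i> = 7000,
<i>esp</i> = 4000, return address 30000); the names <i>toy_cpu</i>,
<i>cpu_push</i>, <i>cpu_pop</i> and <i>run_listing4</i> are made up for
illustration, and memory is modelled as an array of 32-bit words.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state. mem[] is indexed by (byte address / 4). */
typedef struct {
    uint32_t ebp, esp;
    uint32_t mem[1024];                 /* covers byte addresses 0..4095 */
} toy_cpu;

void cpu_push(toy_cpu *c, uint32_t v)
{
    assert(c->esp >= 4 && c->esp <= 4096 && c->esp % 4 == 0);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

uint32_t cpu_pop(toy_cpu *c)
{
    assert(c->esp <= 4092 && c->esp % 4 == 0);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Replays Listing 4 from "call foo" up to and including foo's "ret".
   Returns the address that "ret" would jump to. */
uint32_t run_listing4(toy_cpu *c, uint32_t return_address)
{
    cpu_push(c, return_address);        /* call foo             */
    cpu_push(c, c->ebp);                /* pushl %ebp           */
    c->ebp = c->esp;                    /* movl %esp, %ebp      */
    c->esp -= 4;                        /* subl $4, %esp        */
    c->mem[(c->ebp - 4) / 4] = 10;      /* movl $10, -4(%ebp)   */
    c->esp = c->ebp;                    /* movl %ebp, %esp      */
    c->ebp = cpu_pop(c);                /* popl %ebp            */
    return cpu_pop(c);                  /* ret                  */
}
```

<p>
After the call returns, both registers hold their original values,
while the words at 3988, 3992 and 3996 still contain 10, 7000 and
30000: nothing erases them, the stack pointer has simply moved past
them.
</p>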
|
|
<h2><b>7. Parameter Passing and Value Return</b></h2>
|
|
<p>
|
|
The stack can be used for passing parameters to functions. We will
|
|
follow a convention (which is used by our C compiler) that the value
|
|
stored by a function in the <i>eax</i> register is taken to be the
|
|
return value of the function. The calling program passes a parameter to
|
|
the callee by pushing its value on the stack. <b>Listing 5</b>
|
|
demonstrates this with a simple function called <i>sqr</i>.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 5</i>
|
|
.globl main
main:
        movl  $12, %ebx
        pushl %ebx
        call  sqr
        addl  $4, %esp          # adjust esp to its value before the push
        ret
sqr:
        movl  4(%esp), %eax
        imull %eax, %eax        # compute eax * eax, store result in eax
        ret
|
|
</pre>
|
|
<p>
|
|
Read the first line of <i>sqr</i> carefully. The calling function
pushes the content of <i>ebx</i> on the stack and then executes a
<i>call</i> instruction. The call pushes the return address on the
stack. So inside <i>sqr</i>, the return address is at the top of stack
and the parameter is accessible at an offset of four bytes from the
top of stack.
|
|
</p>
|
|
|
|
<h2><b>8. Mixing C and Assembler</b></h2>
|
|
<p>
|
|
<b>Listing 6</b> shows a C program and an assembly language function.
|
|
The C function is defined in a file called <i>main.c</i> and the
|
|
assembly language function in <i>sqr.s</i>. You compile and link the
|
|
files together by typing <i>cc main.c sqr.s</i>.
|
|
</p>
|
|
<p>
|
|
The reverse is also pretty simple. <b>Listing 7</b> shows a C
function, <i>print</i>, and its assembly language caller.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 6</i>
|
|
//main.c
|
|
main()
|
|
{
|
|
int i = sqr(11);
|
|
printf("%d\n",i);
|
|
}
|
|
|
|
//sqr.s
|
|
.globl sqr
|
|
sqr:
|
|
movl 4(%esp), %eax
|
|
imull %eax, %eax
|
|
ret
|
|
</pre>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 7</i>
|
|
//print.c
|
|
print(int i)
|
|
{
|
|
printf("%d\n",i);
|
|
}
|
|
|
|
//main.s
|
|
.globl main
|
|
main:
|
|
movl $123, %eax
|
|
pushl %eax
|
|
call print
|
|
addl $4, %esp
|
|
ret
|
|
</pre>
|
|
|
|
<h2><b>9. Assembler Output Generated by GNU C</b></h2>
|
|
<p>
|
|
I guess this much reading is sufficient for understanding the
|
|
assembler output produced by <i>gcc</i>. <b>Listing 8</b> shows the
|
|
file <i>add.s</i> generated by <i>gcc -S add.c</i>. Note that
|
|
<i>add.s</i> has been edited to remove many assembler directives
|
|
(mostly for alignments and other things of that sort).
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 8</i>
|
|
//add.c
|
|
int add(int i,int j)
|
|
{
|
|
int p = i + j;
|
|
return p;
|
|
}
|
|
|
|
//add.s
|
|
.globl add
add:
        pushl %ebp
        movl  %esp, %ebp
        subl  $4, %esp          # create space for integer p
        movl  8(%ebp), %edx     # 8(%ebp) refers to i
        addl  12(%ebp), %edx    # 12(%ebp) refers to j
        movl  %edx, -4(%ebp)    # -4(%ebp) refers to p
        movl  -4(%ebp), %eax    # store return value in eax
        leave                   # same as movl %ebp, %esp; popl %ebp
        ret
|
|
</pre>
|
|
<p>
|
|
The offsets used in <i>add.s</i> make sense once you see how the C
statement <b>add(10,20)</b> is translated into assembler code:
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
pushl $20
|
|
pushl $10
|
|
call add
|
|
</pre>
|
|
<p>
|
|
Note that the second parameter is pushed first, so the first parameter
ends up at the lower address, next to the return address.
|
|
</p>
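<p>
One way to picture the resulting stack frame is as a C structure whose
members are listed from the lowest address upwards. This is only an
illustrative sketch for the unoptimized 32-bit code in Listing 8; the
names <i>add_frame</i> and <i>EBP_OFFSET</i> are hypothetical, and a
real compiler is free to lay out a frame differently.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack frame of add() as seen in Listing 8, lowest address first. */
struct add_frame {
    int32_t  p;               /* -4(%ebp): local variable              */
    uint32_t saved_ebp;       /*  0(%ebp): stored by "pushl %ebp"      */
    uint32_t return_address;  /*  4(%ebp): stored by "call add"        */
    int32_t  i;               /*  8(%ebp): pushed last by the caller   */
    int32_t  j;               /* 12(%ebp): pushed first by the caller  */
};

/* Five 4-byte members, so no padding is expected. */
static_assert(sizeof(struct add_frame) == 20, "unexpected padding");

/* Offset of a member relative to the location that ebp points at. */
#define EBP_OFFSET(field) \
    ((long)offsetof(struct add_frame, field) - \
     (long)offsetof(struct add_frame, saved_ebp))
```

<p>
With this picture, <i>EBP_OFFSET(i)</i> evaluates to 8,
<i>EBP_OFFSET(j)</i> to 12 and <i>EBP_OFFSET(p)</i> to -4, matching the
operands in <i>add.s</i>.
</p>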
|
|
<h2><b>10. Global Variables</b></h2>
|
|
<p>
|
|
Space is created for local variables on the stack by decrementing the
|
|
stack pointer and the allotted space is reclaimed by simply
|
|
incrementing the stack pointer. So what is the equivalent GNU C
|
|
generated code for global variables? <b>Listing 9</b> provides the
|
|
answer.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 9</i>
|
|
//glob.c
|
|
int foo = 10;
|
|
main()
|
|
{
|
|
int p = foo;
|
|
}
|
|
|
|
//glob.s
|
|
.globl foo
|
|
foo:
|
|
.long 10
|
|
.globl main
|
|
main:
|
|
pushl %ebp
|
|
movl %esp,%ebp
|
|
subl $4,%esp
|
|
movl foo,%eax
|
|
movl %eax,-4(%ebp)
|
|
leave
|
|
ret
|
|
</pre>
|
|
<p>
|
|
The statement <i>foo: .long 10</i> defines a block of 4 bytes named
foo and initializes the block with the value 10. The <i>.globl foo</i>
directive makes foo accessible from other files. Now try this out.
Change the statement <b>int foo</b> to <b>static int foo</b>. See how
it is represented in the assembly code. You will notice that the
assembler directive <i>.globl</i> is missing. Try this out for
different types and qualifiers (double, long, short, const etc.).
</p>
|
|
<h2><b>11. System Calls</b></h2>
|
|
<p>
|
|
Unless a program is just implementing some math algorithms in
|
|
assembly, it will deal with such things as getting input, producing
|
|
output, and exiting. For this it will need to call on OS services. In
|
|
fact, programming in assembly language is quite the same in different
|
|
OSes, unless OS services are touched.
|
|
</p>
|
|
<p>
|
|
There are two common ways of performing a system call in Linux:
|
|
through the C library (libc) wrapper, or directly.
|
|
</p>
|
|
<p>
|
|
Libc wrappers are made to protect programs from possible system call
convention changes, and to provide a POSIX compatible interface if the
kernel lacks it for some call. However, the UNIX kernel is usually
more-or-less POSIX compliant: this means that the syntax of most libc
"system calls" exactly matches the syntax of real kernel system calls
(and vice versa). The main drawback of throwing libc away is that one
loses several functions that are not just syscall wrappers, like
printf(), malloc() and similar.
|
|
</p>
|
|
<p>
|
|
System calls in Linux are done through int 0x80. Linux differs from
the usual Unix calling convention, and features a "fastcall"
convention for system calls. The system function number is passed in
eax, and arguments are passed through registers, not the stack. There
can be up to six arguments, in ebx, ecx, edx, esi, edi and ebp
respectively. If there are more arguments, they are passed through a
structure whose address is given as the first argument. The result is
returned in eax, and the stack is not touched at all.
|
|
</p>
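<p>
From C, a system call can be issued by number without the usual
per-call wrapper by using the generic <i>syscall()</i> function that
the C library on Linux provides. The sketch below is a hedged
illustration, not a replacement for the wrappers: <i>raw_getpid</i> and
<i>raw_write</i> are hypothetical names, and <i>syscall()</i> uses
whatever entry mechanism the C library chooses for the machine, which
need not be int 0x80.
</p>

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid, SYS_write */
#include <sys/types.h>
#include <unistd.h>        /* syscall() */

/* Ask the kernel for the process ID by system call number,
   without going through the getpid() wrapper. */
pid_t raw_getpid(void)
{
    long result = syscall(SYS_getpid);
    assert(result > 0);
    return (pid_t)result;
}

/* The write system call takes three arguments after its number:
   file descriptor, buffer address and byte count. */
ssize_t raw_write(int fd, const void *buf, size_t len)
{
    return (ssize_t)syscall(SYS_write, fd, buf, len);
}
```

<p>
<i>raw_getpid()</i> should return the same value as <i>getpid()</i>,
and <i>raw_write(1, "ok\n", 3)</i> should behave like the corresponding
<i>write()</i> call.
</p>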
|
|
<p>Consider Listing 10 given below.</p>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 10
|
|
#fork.c</i>
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
#include <sys/types.h>
|
|
#include <unistd.h>
|
|
|
|
int main()
|
|
{
|
|
fork();
|
|
printf("Hello\n");
|
|
return 0;
|
|
}
|
|
</pre>
|
|
<p>
|
|
Compile this program with the command <i>cc -g fork.c -static</i>. Use
|
|
the <i>gdb</i> tool and type the command <i>disassemble fork</i>.
|
|
You can see the assembly code used for fork in the program. The
|
|
<i>-static</i> is the static linker option of GCC (see man page). You
|
|
can test this for other system calls and see how the actual functions
|
|
work.
|
|
</p>
|
|
<p>
|
|
There have been several attempts to write an up-to-date documentation
|
|
of the Linux system calls and I am not making this another of them.
|
|
</p>
|
|
<h2><b>12. Inline Assembly Programming</b></h2>
|
|
<p>
|
|
GNU C supports the x86 architecture quite well, and includes the
ability to insert assembly code within C programs, such that register
allocation can be either specified or left to GCC. Of course, the
assembly instructions are architecture dependent.
|
|
</p>
|
|
<p>
|
|
The <i>asm</i> statement allows you to insert assembly instructions
into your C or C++ programs. For example, the statement:
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
asm ("fsin" : "=t" (answer) : "0" (angle));
|
|
</pre>
|
|
<p>
|
|
is an x86-specific way of coding this C statement:
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
answer = sin(angle);
|
|
</pre>
|
|
<p>
|
|
Notice that, unlike ordinary assembly code instructions, <i>asm</i>
statements permit you to specify input and output operands using C
syntax. <i>Asm</i> statements should not be used indiscriminately. So,
when should we use them?
|
|
</p>
|
|
<p></p>
|
|
<ul>
|
|
<li> <i>Asm</i> statements allow your programs to access the computer
|
|
hardware directly. This can produce programs that execute quickly. You
|
|
can use them when writing operating system code that directly needs to
|
|
interact with the hardware. For example, <i>/usr/include/asm/io.h</i>
|
|
contains assembly instructions to access input/output ports directly.
|
|
<li> Inline assembly instructions can also speed up the innermost loops
of programs. For instance, the <i>sine</i> and <i>cosine</i> of the
same angle can be found with a single <i>fsincos</i> x86 instruction.
The two listings given below should help you understand this point
better.
|
|
</ul>
|
|
<pre>
|
|
<i>#Listing 11
|
|
#Name : bit-pos-loop.c
|
|
#Description : Find bit position using a loop</i>
|
|
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
|
|
int main (int argc, char *argv[])
|
|
{
|
|
long max = atoi (argv[1]);
|
|
long number;
|
|
long i;
|
|
unsigned position;
|
|
volatile unsigned result;
|
|
|
|
for (number = 1; number <= max; ++number) {
|
|
for (i=(number>>1), position=0; i!=0; ++position)
|
|
i >>= 1;
|
|
result = position;
|
|
}
|
|
return 0;
|
|
}
|
|
</pre>
|
|
<p></p>
|
|
<pre>
|
|
<i>#Listing 12
|
|
#Name : bit-pos-asm.c
|
|
#Description : Find bit position using bsrl</i>
|
|
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
|
|
int main(int argc, char *argv[])
|
|
{
|
|
long max = atoi(argv[1]);
|
|
long number;
|
|
unsigned position;
|
|
volatile unsigned result;
|
|
|
|
for (number = 1; number <= max; ++number) {
|
|
asm("bsrl %1, %0" : "=r" (position) : "r" (number));
|
|
result = position;
|
|
}
|
|
return 0;
|
|
}
|
|
</pre>
|
|
<p>
|
|
Compile the two versions with full optimizations as given below:
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
$ cc -O2 -o bit-pos-loop bit-pos-loop.c
|
|
$ cc -O2 -o bit-pos-asm bit-pos-asm.c
|
|
</pre>
|
|
<p>
|
|
Measure the running time of each version by using the <i>time</i>
command and specifying a large value as the command-line argument, to
make sure that each version takes at least a few seconds to run.
|
|
</p>
|
|
<p></p>
|
|
<pre>
|
|
$ time ./bit-pos-loop 250000000
|
|
</pre>
|
|
<p>and</p>
|
|
<p></p>
|
|
<pre>
|
|
$ time ./bit-pos-asm 250000000
|
|
</pre>
|
|
<p>
|
|
The results will vary from machine to machine. However, you should
notice that the version that uses the inline assembly executes a great
deal faster.
|
|
</p>
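<p>
Before comparing timings it is worth checking that both approaches
compute the same value. The sketch below does this in portable C, with
one substitution that should be stated plainly: instead of the
<i>bsrl</i> instruction it uses GCC's <i>__builtin_clz</i> builtin, so
it is not the code from Listing 12. The function names
<i>bit_pos_loop</i> and <i>bit_pos_builtin</i> are made up for
illustration.
</p>

```c
#include <assert.h>
#include <limits.h>

/* Position of the highest set bit, found by shifting as in Listing 11. */
unsigned bit_pos_loop(unsigned number)
{
    unsigned position = 0;
    for (number >>= 1; number != 0; number >>= 1)
        ++position;
    return position;
}

/* The same position computed from the count of leading zero bits. */
unsigned bit_pos_builtin(unsigned number)
{
    assert(number != 0);   /* like bsrl, the result is undefined for zero */
    return (unsigned)(sizeof number * CHAR_BIT - 1)
           - (unsigned)__builtin_clz(number);
}
```

<p>
For any non-zero argument the two functions should agree; for example
both give 0 for 1, 7 for 255 and 8 for 256.
</p>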
|
|
<p>
|
|
GCC's optimizer attempts to rearrange and rewrite a program's code to
minimize execution time even in the presence of <i>asm</i> expressions.
If the optimizer determines that the output values of an <i>asm</i>
are not used, the instruction will be omitted unless the keyword
<i>volatile</i> occurs between <i>asm</i> and its arguments. (As a
special case, GCC will not move an <i>asm</i> without any output
operands outside a loop.) Any <i>asm</i> can be moved in ways that are
difficult to predict, even across jumps. The only way to guarantee a
particular assembly instruction ordering is to include all the
instructions in the same <i>asm</i>.
|
|
</p>
|
|
<p>
|
|
Using <i>asm</i> statements can restrict the optimizer's effectiveness
because the compiler does not know their semantics. GCC is forced to
make conservative guesses that may prevent some optimizations.
|
|
</p>
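<p>
To make the operand syntax concrete, here is a smaller example than the
<i>fsin</i> statement. It is a sketch only: the function name
<i>add_with_asm</i> is hypothetical, the inline assembly is used only
when compiling for x86 with a GNU-compatible compiler, and a plain C
fallback is provided for every other case.
</p>

```c
#include <assert.h>

/* addl works on 32-bit operands, so this sketch assumes a 32-bit int. */
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

int add_with_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int sum = a;
    /* %0 is sum: "+r" means a register that is both read and written.
       %1 is b:   "r" means an input in any general purpose register.
       "cc" tells GCC that the condition flags are modified. */
    __asm__ ("addl %1, %0" : "+r" (sum) : "r" (b) : "cc");
    return sum;
#else
    return a + b;   /* fallback for other architectures and compilers */
#endif
}
```

<p>
Because the operands are described with constraints rather than fixed
register names, GCC remains free to choose the registers.
</p>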
|
|
<h2><b>13. Exercises</b></h2>
|
|
<ol>
|
|
<li>Interpret the assembly code for the C program in Listing 6. Modify
the program to eliminate the warnings reported when generating
assembly code with the -Wall option. Compare the two assembly codes.
What changes do you observe?
|
|
<li>Compile several small C programs with and without optimization
|
|
options (like -O2). Read the resulting assembly codes and find out
|
|
some common optimization tricks used by the compiler.
|
|
<li>Interpret assembly code for switch statement.
|
|
<li>Compile several small C programs with inline asm statements. What
differences do you observe in the assembly codes for such programs?
|
|
<li>A nested function is defined inside another function (the
|
|
"enclosing function"), such that:
|
|
<ul>
|
|
<li> the nested function has access to the enclosing function's
|
|
variables; and
|
|
<li> the nested function is local to the enclosing function, that is,
it cannot be called from elsewhere unless the enclosing function gives
you a pointer to the nested function.
|
|
</ul>
|
|
<p></p>
|
|
<p>
|
|
Nested functions can be useful because they help control the
|
|
visibility of a function.
|
|
</p>
|
|
<p>
|
|
Consider <b>Listing 13</b> given below:
|
|
</p>
|
|
<p></p>
|
|
<pre>
<i>#Listing 13</i>
/* myprint.c */
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i;
    void my_print(int k)
    {
        printf("%d\n",k);
    }
    scanf("%d",&i);
    my_print(i);
    return 0;
}
</pre>
|
|
<p>
|
|
Compile this program with <i>cc -S myprint.c</i> and interpret the
|
|
assembly code. Also try compiling the program with the command
|
|
<i>cc -pedantic myprint.c</i>. What do you observe?
|
|
</p>
|
|
</ol>
|
|
</body>
|
|
</html>
|
|
|
|
|
|
|
|
|
|
<!-- *** BEGIN author bio *** -->
|
|
<P>
|
|
<P>
|
|
<!-- *** BEGIN bio *** -->
|
|
<P>
|
|
<img ALIGN="LEFT" ALT="[BIO]" SRC="../gx/2002/note.png">
|
|
<em>
|
|
I have just given my final year B.Tech examinations in Computer Science and
Engineering and am a native of Kerala, India.
|
|
</em>
|
|
<br CLEAR="all">
|
|
<!-- *** END bio *** -->
|
|
|
|
<!-- *** END author bio *** -->
|
|
|
|
|
|
<!-- *** BEGIN copyright *** -->
|
|
<hr>
|
|
<CENTER><SMALL><STRONG>
|
|
Copyright © 2003, Hiran Ramankutty.
|
|
Copying license <A HREF="../copying.html">http://www.linuxgazette.com/copying.html</A><BR>
|
|
Published in Issue 94 of <i>Linux Gazette</i>, September 2003
|
|
</STRONG></SMALL></CENTER>
|
|
<!-- *** END copyright *** -->
|
|
<HR>
|
|
|
|
<!--startcut ==========================================================-->
|
|
<CENTER>
|
|
<!-- *** BEGIN navbar *** -->
|
|
<A HREF="ecol.html"><< Prev</A> | <A HREF="index.html">TOC</A> | <A HREF="../index.html">Front Page</A> | <A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue94/ramankutty.html">Talkback</A> | <A HREF="../faq/index.html">FAQ</A> | <A HREF="kolp.html">Next >></A>
|
|
<!-- *** END navbar *** -->
|
|
</CENTER>
|
|
</BODY></HTML>
|
|
<!--endcut ============================================================-->
|