old-www/HOWTO/GCC-Frontend-HOWTO-4.html

97 lines
4.5 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
<TITLE> GCC Frontend HOWTO: GCC Front End </TITLE>
<LINK HREF="GCC-Frontend-HOWTO-5.html" REL=next>
<LINK HREF="GCC-Frontend-HOWTO-3.html" REL=previous>
<LINK HREF="GCC-Frontend-HOWTO.html#toc4" REL=contents>
</HEAD>
<BODY>
<A HREF="GCC-Frontend-HOWTO-5.html">Next</A>
<A HREF="GCC-Frontend-HOWTO-3.html">Previous</A>
<A HREF="GCC-Frontend-HOWTO.html#toc4">Contents</A>
<HR>
<H2><A NAME="s4">4. GCC Front End </A></H2>
<P>GCC (GNU Compiler Collection) is essentially divided into
a front end and a back end. The main reason for this division is
for providing code reusability. As all of us know GCC supports a
variety of languages. This includes C, C++, Java etc.
<P>
If you
want to introduce a new language into GCC, you can. The only
thing you have to do is to create a new front end for that
language.
<P>The back end is same for all the languages. Different
front ends exist for different languages. So creating a compiler
in GCC means creating a new front end. Experimentally let us
try to introduce a new language into the GCC.
<P>We have to keep certain things in mind before we introduce
a language. The first thing is that we are adding a segment to
a huge code. For perfect working we have to add some routines,
declare some variables etc. which may be required by the other
code segments. Secondly some errors produced by us may take
the back end into an inconsistent state. Little information
will be available to us about the mistake. So we have to go
through our code again and again and understand what went wrong.
Sometimes this can be accomplished only through trial and error
method.
<P>
<H2><A NAME="ss4.1">4.1 Tree and rtl </A>
</H2>
<P>Let me give a general introduction to tree structure and rtl.
From my experience a person developing a new small front end need
not have a thorough idea regarding tree structure and rtl. But you
should have a general idea regarding these.
<P>
Tree is the central data structure of the gcc front end. It is
a perfect one for the compiler development. A tree node is capable
of holding integers, reals ,complex, strings etc. Actually
a tree is a pointer type. But the object to which it points may vary.
If we are just taking the example of an integer node of a tree, it is a
structure containing an integer value, some flags and some space
for rtl (These flags are common to all the nodes). There are a number
of flags. Some of them are flags indicating whether the node is
read only , whether it is of unsigned type etc. The complete
information about trees can be obtained from the files - tree.c,
tree.h and tree.def.
But for our requirement there are a large number of functions and
macros provided by the GCC, which helps us to manipulate the tree.
In our program we won't be directly handling the trees. Instead we
use function calls.
<P>RTL stands for register transfer language. It is the
intermediate code handled by GCC. Each and every tree structure
that we are building has to be changed to rtl so that the back
end can work with it perfectly. I am not trying in anyway to
explain about rtl. Interested readers are recommended to see the
manual of GCC. As with the trees GCC provides us a number of
functions
to produce the rtl code from the trees. So in our compiler we
are trying to build the tree and convert it to rtl for each
and every line of the program.
Most of the rtl routines take the trees as arguments and emit
the rtl statements.
<P>So I hope that you are having a vague idea about what
we are going to do. The tree building and rtl generation can
be considered as two different phases. But they are intermixed
and can't be considered separately. There is one more phase that the
front end is expected to do. It is the preliminary optimization phase.
It includes techniques like constant folding, arithmetic simplification
etc. which we can handle in the front end. But I am totally ignoring
that phase to simplify our front end. Our optimization is completely
dependent on the back end. I also assume that you are perfect
programmers. It is to avoid error routines from front end.
All the steps are taken to simplify the code as much as possible.
<P>From now onwards our compiler has only three phases - lexical,
syntax and intermediate code generation. Rest is the head ache of the
back end.
<P>
<HR>
<A HREF="GCC-Frontend-HOWTO-5.html">Next</A>
<A HREF="GCC-Frontend-HOWTO-3.html">Previous</A>
<A HREF="GCC-Frontend-HOWTO.html#toc4">Contents</A>
</BODY>
</HTML>