<!doctype linuxdoc system>
<ARTICLE>
<TITLE>From Power Up To Bash Prompt
<AUTHOR>Greg O'Keefe, <tt>gcokeefe@postoffice.utas.edu.au</tt>
<DATE>v0.7, April 2000
<ABSTRACT>
This is a brief description of what happens in a Linux system, from the time
that you turn on the power, to the time that you log in and get a bash prompt.
It is organised by package to make it easier for people who want to build a
system from source code. Understanding this will be helpful when you need to
solve problems or configure your system.
</ABSTRACT>
<TOC>
<SECT>Introduction
<P>
I find it frustrating that many things happen inside my Linux machine that I do
not understand. If, like me, you want to really understand your system rather
than just knowing how to use it, this document should be a good place to start.
This kind of background knowledge is also needed if you want to be a top notch
Linux problem solver.
<P>
I assume that you have a working Linux box, and understand some basic things
about Unix and PC hardware. If not, an excellent place to start learning is
Eric S. Raymond's
<URL
URL="http://www.linuxdoc.org/HOWTO/Unix-and-Internet-Fundamentals-HOWTO.html"
NAME ="The Unix and Internet Fundamentals HOWTO" >
It is short, very readable and covers all the basics.
<P>
The main thread in this document is how Linux starts itself up.
But it also tries to be a more comprehensive learning resource.
I have included exercises in each section. If you actually do some of these,
you will learn much more than you could by just reading.
<P>
There are also links to source code downloads. The reason for this is that
I hope some readers will undertake the best Linux learning exercise that I
know of, which is building a system from source code.
Giambattista Vico, an Italian philosopher (1668-1744) said
``verum ipsum factum'', which means ``understanding arises through making''.
Thanks to Alex (see <REF ID="acknowledge" NAME="Acknowledgements">)
for this quote.
<P>
If you want to ``roll your own'', you should also see Gerard Beekmans'
<URL URL="http://www.linuxfromscratch.org" NAME="Linux From Scratch HOWTO"> (LFS).
LFS has detailed instructions on building a complete usable system from
source code. On the LFS website, you will also find a mailing list for people
building systems this way. What I have included in this document is
instructions
(see <REF ID="building" NAME="Building a Minimal Linux System From Source">)
for building a ``toy'' system, purely as a learning exercise.
<P>
Packages are presented in the order in which they appear in the system
startup process. This means that if you install the packages in this order
you can reboot after each installation, and see the system get a little closer
to giving you a bash prompt each time. There is a reassuring sense of
progress in this.
<P>
I recommend that you first read the main text of each section, skipping the
exercises and references. Then decide how deep an understanding you want to
develop, and how much effort you are prepared to put in. Then start at the
beginning again, doing the exercises and additional reading as you go.
<SECT>Hardware
<P>
When you first turn on your computer it tests itself to make sure everything is
in working order. This is called the ``Power On Self Test'' (POST). Then a program
called the bootstrap loader, located in the ROM BIOS, looks for
a boot sector. A boot sector is the first sector of a disk and has a small
program that can load an operating system. Boot sectors are marked with the magic
number 0xAA55 = 43605 in the last two bytes of the sector, at offset
0x1FE = 510. The value is stored low byte first, so the bytes on disk read
0x55 0xAA. This is how the hardware can tell
whether the sector is a boot sector or not.
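<P>
The signature test is easy to try for yourself. The following is a minimal
shell sketch, not a copy of what any BIOS actually does. The function name
<TT>is_boot_sector</TT> is made up for this example, and it assumes that you
already have a sector image in an ordinary file.

```shell
# is_boot_sector FILE
# Succeeds if bytes 510 and 511 of FILE are 0x55 0xAA, which is how the
# 16 bit value 0xAA55 is laid out on a little-endian PC.
# (The function name is hypothetical, chosen for this illustration.)
is_boot_sector() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "$sig" = "55aa" ]
}
```

<P>
Called as <TT>is_boot_sector somefile</TT>, it returns exit status 0 only when
the signature is present, so it can be used directly in an <TT>if</TT> statement.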
<P>
The bootstrap loader has a list of places to look for a boot sector. My old
machine looks in the primary floppy drive, then the primary hard drive.
More modern machines can also look for a boot sector on a CD-ROM.
If it finds a boot sector, it loads it into memory and passes control to the
program that loads the operating system.
On a typical Linux system, this program will be LILO's first stage
boot loader. There are many different ways of setting your system up to boot
though. See the <EM>LILO User's Guide</EM> for details. See section
<REF ID="lilo-links" NAME="LILO"> for a URL.
<P>
Obviously there is a lot more to say about what PC hardware does. But this is
not the place to say it. See one of the many good books about PC hardware.
<SECT1>Configuration
<P>
The machine stores some information about itself in its CMOS. This
includes what disks and RAM are in the system. The machine's BIOS contains a
program to let you modify these settings. Check the messages on your screen as
the machine is turned on to see how to access it. On my machine, you press the
delete key before it begins loading its operating system.
<SECT1>Exercises
<LABEL ID="hardware-ex">
<P>
A good way to learn about PC hardware is to build a machine out of second hand
parts. Get at least a 386 so you can easily run Linux on it. It won't cost much.
Ask around, someone might give you some of the parts you need.
<P>
Download, compile and make a boot disk for
<URL URL="http://learning.taslug.org.au/resources" NAME="Unios">.
(They used to have a home page at <URL URL="http://www.unios.org">,
but it disappeared.)
This is just a bootable ``Hello World!'' program, consisting of just over 100
lines of assembler code. It would be good to see it converted to a format
that the GNU assembler <TT>as</TT> can understand.
<P>
Open the boot disk image for unios with a hex editor. This image is 512 bytes
long, exactly one sector. Find the magic number 0xAA55. Do the same for
the boot sector from a bootable floppy disk or your own computer. You can
use the <TT>dd</TT> command to copy it to a file:
<TT>dd if=/dev/fd0 of=boot.sector bs=512 count=1</TT>.
The <TT>bs=512 count=1</TT> arguments make <TT>dd</TT> stop after the first
sector instead of copying the whole disk.
Be <EM>very</EM> careful to get <TT>if</TT> (input file) and <TT>of</TT>
(output file) the right way round!
<P>
Check out the source code for LILO's boot loader.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM>
<URL URL="http://www.linuxdoc.org/HOWTO/Unix-and-Internet-Fundamentals-HOWTO.html" NAME="The Unix and Internet Fundamentals HOWTO">
by Eric S. Raymond,
especially section 3, <EM>What happens when you switch on a computer?</EM>
<ITEM>The first chapter of <EM>The LILO User's Guide</EM> gives an excellent
explanation of PC disk partitions and booting.
See section <REF ID="lilo-links" NAME="LILO"> for a URL.
<ITEM><EM>The NEW Peter Norton Programmer's Guide to the IBM PC &amp; PS/2</EM>,
by Peter Norton and Richard Wilton, Microsoft Press 1988.
There is a newer Norton book, which looks good, but I can't afford it right now!
<ITEM>One of the many books available on upgrading PCs
</ITEMIZE>
<SECT>Lilo
<P>
When the computer loads a boot sector on a normal Linux system, what it loads is actually a part of lilo, called the ``first stage boot loader''. This is a tiny program whose only job in life is to load and run the ``second stage boot loader''.
<P>
The second stage loader gives you a prompt (if it was installed that way) and loads the operating system you choose.
<P>
When your system is up and running, and you run <TT>lilo</TT>, what you are actually running is the ``map installer''. This reads the configuration file <TT>/etc/lilo.conf</TT> and writes the boot loaders, and information about the operating systems it can load, to the hard disk.
<P>
There are lots of different ways to set your system up to boot. What I have just explained is the most obvious and ``normal'' way, at least for a system whose main operating system is Linux. The <EM>LILO User's Guide</EM> explains several examples of ``boot concepts''. It is worth reading these, and trying some of them out.
<SECT1>Configuration
<P>
The configuration file for lilo is <TT>/etc/lilo.conf</TT>. There is a manual
page for it: type <TT>man lilo.conf</TT> into a shell to see it. The main thing
in <TT>lilo.conf</TT> is one entry for each thing that lilo is set up to boot. For a
Linux entry, this includes where the kernel is, and what disk partition to
mount as the root filesystem. For other operating systems, the main piece of
information is which partition to boot from.
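<P>
To make the shape of the file concrete, here is a small sketch that writes an
example <TT>lilo.conf</TT> to a scratch file and lists the entries in it. The
device names, kernel path and labels are placeholders rather than
recommendations, and the helper <TT>lilo_labels</TT> is a made-up name for this
example. Nothing here runs <TT>lilo</TT> or touches a boot sector.

```shell
# Write a small example lilo.conf to a scratch file. Every path and
# device name below is a placeholder chosen for illustration.
conf=$(mktemp)
printf '%s\n' \
    'boot=/dev/hda' \
    'prompt' \
    'timeout=50' \
    'image=/boot/vmlinuz' \
    '    label=linux' \
    '    root=/dev/hda2' \
    '    read-only' \
    'other=/dev/hda1' \
    '    label=dos' > "$conf"

# lilo_labels FILE (hypothetical helper)
# Print the label of every boot entry, one per line.
lilo_labels() {
    sed -n 's/^[[:space:]]*label[[:space:]]*=[[:space:]]*//p' "$1"
}

lilo_labels "$conf"
rm -f "$conf"
```

<P>
For the example file this prints <TT>linux</TT> and <TT>dos</TT>, which are the
names you could type at the lilo prompt.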
<SECT1>Exercises
<P>
<EM>DANGER:</EM> take care with these exercises. It is easy enough to get
something wrong and screw up your master boot record and make your system
unusable. Make sure you have a working rescue disk, and know how to use it to
fix things up again. See below for a link to tomsrtbt, the rescue disk I use
and recommend. The best precaution is to use a machine that doesn't matter.
<P>
Set up lilo on a floppy disk. It doesn't matter if there is nothing other than
a kernel on the floppy - you will get a ``kernel panic'' when the kernel is
ready to load init, but at least you will know that lilo is working.
<P>
If you like you can press on and see how much of a system you can get going on
the floppy. This is probably the second best Linux learning activity around.
See the Bootdisk HOWTO (url below), and tomsrtbt (url below) for clues.
<P>
Get lilo to boot unios (see section <REF ID="hardware-ex" NAME="hardware
exercises"> for a URL). As an extra challenge, see if you can do this on a
floppy disk.
<P>
Make a boot-loop. Get lilo in the master boot record to boot lilo in one of the
primary partition boot sectors, and have that boot lilo in the master boot
record... Or perhaps use the master boot record and all four primary partitions
to make a five point loop. Fun!
<SECT1>More Information
<P>
<LABEL ID="lilo-links">
<ITEMIZE>
<ITEM>The lilo man page.
<ITEM>The Lilo package (see <REF ID="downloads" NAME="downloads">)
contains the ``LILO User's Guide''
<TT>lilo-u-21.ps.gz</TT> (or a later version).
You may already have this document though.
Check <TT>/usr/doc/lilo</TT> or thereabouts.
The postscript version is better than the plain text,
since it contains diagrams and tables.
<ITEM><URL URL="http://www.toms.net/rb" NAME="tomsrtbt"> the coolest single
floppy Linux. Makes a great rescue disk.
<ITEM><URL URL="http://www.linuxdoc.org/HOWTO/Bootdisk-HOWTO/"
NAME="The Bootdisk HOWTO">
</ITEMIZE>
<SECT>The Linux Kernel
<P>
The kernel does quite a lot really. I think a fair way of summing it up is that it makes the hardware do what the programs want, fairly and efficiently.
<P>
The processor can only execute one instruction at a time, but Linux systems
appear to be running lots of things simultaneously. The kernel achieves this by
switching from task to task really quickly. It makes the best use of the processor
by keeping track of which processes are ready to go, and which ones are waiting
for something like a record from a hard disk file, or some keyboard input.
This kernel task is called scheduling.
<P>
If a program isn't doing anything, then it doesn't need to be in RAM. Even a
program that is doing something might have parts that aren't doing anything.
The address space of each process is divided into pages. The kernel keeps track of
which pages of which processes are being used the most. The pages that aren't
used so much can be moved out to the swap partition. When they are needed again,
another unused page can be paged out to make way for them. This is virtual memory
management.
<P>
If you have ever compiled your own kernel, you will have noticed that there are
a great many options for specific devices. The kernel contains a lot of specific
code to talk to diverse kinds of hardware, and present it all in a nice uniform
way to the application programs.
<P>
The kernel also manages the filesystem, interprocess communication, and a lot
of networking stuff.
<P>
Once the kernel is loaded and has finished setting itself up, including mounting the root filesystem, it looks for an <TT>init</TT> program to run.
<SECT1>Configuration
<P>
Most of the configuration of the kernel is done when you build it, using
<TT>make menuconfig</TT>, or <TT>make xconfig</TT> in <TT>/usr/src/linux/</TT>
(or wherever your Linux kernel source is). You can reset the default video
mode, root filesystem, swap device and RAM disk size using <TT>rdev</TT>. These
parameters and more can also be passed to the kernel from lilo. You can give lilo
parameters to pass to the kernel either in lilo.conf, or at the lilo prompt.
For example if you wanted to use hda3 as your root file system instead of hda2,
you might type
<VERB>
LILO: linux root=/dev/hda3
</VERB>
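<P>
On a running system you can usually read back the command line that the kernel
was given from <TT>/proc/cmdline</TT>, provided <TT>/proc</TT> is mounted. The
sketch below picks one parameter out of such a line. The function name
<TT>cmdline_value</TT> and the sample command line are made up for this
example.

```shell
# cmdline_value NAME "COMMAND LINE" (hypothetical helper)
# Print the value of the first NAME=value word on a kernel command line.
# Returns non-zero if the parameter is not present.
cmdline_value() {
    for word in $2; do
        case $word in
            "$1"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Illustrative command line, similar to what the lilo example above produces.
cmdline_value root "BOOT_IMAGE=linux ro root=/dev/hda3"
```

<P>
This prints <TT>/dev/hda3</TT>. To inspect the real thing, pass the contents of
<TT>/proc/cmdline</TT> as the second argument.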
<P>
If you are building a system from source, you can make life a lot simpler by
creating a ``monolithic'' kernel. That is one with no modules. Then you don't
have to copy kernel modules to the target system.
<P>
NOTE: The <TT>System.map</TT> file is used by the kernel logger to determine
the module names generating messages. The program <TT>top</TT> also uses this
information. When you copy the kernel to the target system, copy
<TT>System.map</TT> too.
<SECT1>Exercises
<P>Think about this: <TT>/dev/hda3</TT> is a special type of file that
describes a hard disk partition. But it lives on a file system just like all
other files. The kernel wants to know which partition to mount as the root
filesystem - it doesn't have a file system yet. So how can it read
<TT>/dev/hda3</TT> to find out which partition to mount?
<P>
If you haven't already: build your own kernel. Read all the help information
for each option.
<P>
See how small a kernel you can make that still works. You can learn a lot by leaving the wrong things out!
<P>
Read ``The Linux Kernel'' (URL below) and as you do, find the parts of the source code that it refers to. The book (as I write) refers to kernel version 2.0.33, which is pretty out of date. It might be easier to follow if you download this old version and read the source there. It's amazing to find bits of C code called ``process'' and ``page''.
<P>
Hack! See if you can make it spit out some extra messages or something.
<SECT1>More Information
<LABEL ID="Kernel">
<P>
<ITEMIZE>
<ITEM><TT>/usr/src/linux/README</TT> and the contents of
<TT>/usr/src/linux/Documentation/</TT>
(These may be in some other place on your system)
<ITEM> <URL URL="http://mirror.aarnet.edu.au/linux/LDP/HOWTO/Kernel-HOWTO.html"
NAME="The Kernel HOWTO">
<ITEM>The help available when you configure a kernel using
<TT>make menuconfig</TT> or <TT>make xconfig</TT>
<ITEM> <URL URL="http://mirror.aarnet.edu.au/linux/LDP/LDP/"
NAME="The Linux Kernel (and other LDP Guides)">
<ITEM> Kernel source download see <REF ID="downloads" NAME="downloads">
</ITEMIZE>
<SECT>The GNU C Library
<P>
The next thing that happens as your computer starts up is that init is loaded
and run. However, init, like almost all programs, uses functions from libraries.
<P>
You may have seen an example C program like this:
<P>
<VERB>
#include <stdio.h>

int main(void) {
    printf("Hello World!\n");   /* printf is declared in stdio.h, not defined here */
    return 0;
}
</VERB>
The program contains no definition of <TT>printf</TT>, so where does it come from?
It comes from the standard C library, which on a GNU/Linux system is glibc.
If you compile it under Visual C++, then it comes from a Microsoft
implementation of the same standard functions. There are zillions of
these standard functions, for math, strings, dates and times, memory allocation
and so on. Everything in Unix (including Linux) is either written in C
or has to try hard to pretend it is, so everything uses these functions.
<P>
If you look in <TT>/lib</TT> on your Linux system you will see lots of files called
<TT>libsomething.so</TT> or <TT>libsomething.a</TT> etc. They are libraries of these functions.
Glibc is just the GNU implementation of these functions.
<P>
There are two ways programs can use these library functions. If you <EM>statically</EM>
link a program, these library functions are copied into the executable that gets
created. This is what the <TT>libsomething.a</TT> libraries are for. If you
<EM>dynamically</EM> link a program (and this is the default), then when the program
is running and needs the library code, it is called from the <TT>libsomething.so</TT>
file.
<P>
The command <TT>ldd</TT> is your friend when you want to work out which
libraries are needed by a particular program. For example, here are the
libraries that <TT>bash</TT> uses:
<P>
<VERB>
[greg@Curry power2bash]$ ldd /bin/bash
libtermcap.so.2 => /lib/libtermcap.so.2 (0x40019000)
libc.so.6 => /lib/libc.so.6 (0x4001d000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
</VERB>
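<P>
If you are copying a program to a system you are building, you need the
library files themselves, not the whole <TT>ldd</TT> listing. The following
sketch pulls just the paths out of <TT>ldd</TT> output. The function name
<TT>ldd_paths</TT> is made up for this example, and the parsing assumes the
``name => path (address)'' layout shown above, which may differ on other
systems.

```shell
# ldd_paths (hypothetical helper)
# Read ldd output on standard input and print just the library paths.
# Lines without an absolute path after "=>" are skipped.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Sample input taken from the ldd listing for bash shown above.
printf '%s\n' \
    'libtermcap.so.2 => /lib/libtermcap.so.2 (0x40019000)' \
    'libc.so.6 => /lib/libc.so.6 (0x4001d000)' \
    '/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)' | ldd_paths
```

<P>
On a real system you would run <TT>ldd /bin/bash | ldd_paths</TT> instead of
feeding it the sample text.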
<SECT1>Configuration
<P>
Some of the functions in the libraries depend on where you are. For example, in Australia we write dates as dd/mm/yy, but Americans write mm/dd/yy. There is a program that comes with the <TT>glibc</TT> distribution called <TT>localedef</TT> which enables you to set this up.
<SECT1>Exercises
<P>
Use <TT>ldd</TT> to find out what libraries your favourite applications use.
<P>
Use <TT>ldd</TT> to find out what libraries <TT>init</TT> uses.
<P>
Make a toy library, with just one or two functions in it. The program
<TT>ar</TT> is used to create them, the man page for <TT>ar</TT> might be a
good place to start investigating how this is done. Write, compile and link
a program that uses this library.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM>source code, see section <REF ID="downloads" NAME="downloads">
</ITEMIZE>
<SECT>Init
<P>
I will only talk about the ``System V'' style of init that Linux systems mostly
use. There are alternatives. In fact, you can put any program you like in
<TT>/sbin/init</TT>, and the kernel will run it when it has finished loading.
<P>
It is <TT>init</TT>'s job to get everything running the way it should be.
It checks that
the file systems are ok and mounts them. It starts up ``daemons'' to log system
messages, do networking, serve web pages, listen to your mouse and so on. It
also starts the getty processes that put the login prompts on your virtual
terminals.
<P>
There is a whole complicated story about switching ``run-levels'', but I'm
going to mostly skip that, and just talk about system start up.
<P>
Init reads the file <TT>/etc/inittab</TT>, which tells it what to do.
Typically, the first thing it is told to do is to run an initialisation script.
The program that executes (or interprets) this script is <TT>bash</TT>,
the same program that gives you a command prompt.
In Debian systems, the initialisation script is <TT>/etc/init.d/rcS</TT>, on Red Hat,
<TT>/etc/rc.d/rc.sysinit</TT>. This is where the filesystems get checked and
mounted, the clock set, swap space enabled, hostname gets set etc.
<P>
Next, another script is called to take us into the default run-level. This just
means a set of subsystems to start up. There is a set of directories
<TT>/etc/rc.d/rc0.d</TT>,
<TT>/etc/rc.d/rc1.d</TT>, ..., <TT>/etc/rc.d/rc6.d</TT> in Red Hat, or
<TT>/etc/rc0.d</TT>,
<TT>/etc/rc1.d</TT>, ..., <TT>/etc/rc6.d</TT> in Debian, which correspond to the
run-levels. If we are going into runlevel 3 on a Debian system, then the script
runs all the scripts in <TT>/etc/rc3.d</TT> that start with `S' (for start).
These scripts are really just links to scripts in another directory usually
called <TT>init.d</TT>.
<P>
So our run-level script was called by <TT>init</TT>, and it is looking in a directory for scripts starting with `S'. It might find <TT>S10syslog</TT> first. The numbers tell the run-level script which order to run them in. So in this case <TT>S10syslog</TT> gets run first, since there were no scripts starting with S00 ... S09. But <TT>S10syslog</TT> is really a link to <TT>/etc/init.d/syslog</TT> which is a script to start and stop the system logger. Because the link starts with an `S', the run-level script knows to execute the <TT>syslog</TT> script with a ``start'' parameter. There are corresponding links starting with `K' (for kill), which specify what to shut down and in what order when leaving the run-level.
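<P>
The ordering rule can be tried out safely in a scratch directory. This is only
a sketch of the idea, not the script that any distribution ships: the function
name <TT>run_level_start</TT> and the file names are made up, and it prints
what it would run instead of running anything.

```shell
# run_level_start DIR (hypothetical helper)
# Mimic the "start" half of a run-level script: visit every entry in DIR
# whose name begins with S, in sorted order, and report how it would be
# called. Entries beginning with K are ignored here.
run_level_start() {
    for link in "$1"/S*; do
        [ -e "$link" ] || continue
        echo "${link##*/} start"
    done
}

# Build a pretend rc3.d with made-up entries and show the order.
rcdir=$(mktemp -d)
touch "$rcdir/S20network" "$rcdir/S10syslog" "$rcdir/K15oldthing"
run_level_start "$rcdir"
rm -r "$rcdir"
```

<P>
The output lists <TT>S10syslog start</TT> before <TT>S20network start</TT>,
because the shell expands <TT>S*</TT> in sorted order and the two digit numbers
sort the way you would expect.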
<P>
To change what subsystems start up by default, you must set up these links in
the <TT>rcN.d</TT> directory, where N is the default runlevel set in your
<TT>inittab</TT>.
<P>
The last important thing that init does is to start some <TT>getty</TT>s.
These are ``respawned'', which means that if they stop, <TT>init</TT> just
starts them again. Most distributions come with six virtual terminals. You may
want fewer than this to save memory, or more so you can leave lots of
things running and
quickly flick to them as you need them. You may also want to run a
<TT>getty</TT> for a
text terminal or a dial in modem. In this case you will need to edit the
<TT>inittab</TT> file.
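<P>
Each line of <TT>inittab</TT> has four fields separated by colons: an
identifier, the run-levels it applies to, an action, and the command to run.
The sketch below reads a small example file and reports the default run-level
and the respawned commands. The sample entries are illustrative, loosely in the
Red Hat style, and the function names are made up for this example.

```shell
# default_runlevel FILE (hypothetical helper)
# Print the run-level named on the initdefault line of an inittab file.
default_runlevel() {
    awk -F: '$0 !~ /^#/ && $3 == "initdefault" { print $2; exit }' "$1"
}

# respawn_entries FILE (hypothetical helper)
# Print the command of every entry that init restarts whenever it exits.
# A command that itself contains a colon would be cut short by this sketch.
respawn_entries() {
    awk -F: '$0 !~ /^#/ && $3 == "respawn" { print $4 }' "$1"
}

# An illustrative inittab, written to a scratch file.
tab=$(mktemp)
printf '%s\n' \
    '# id:runlevels:action:process' \
    'id:3:initdefault:' \
    'si::sysinit:/etc/rc.d/rc.sysinit' \
    'l3:3:wait:/etc/rc.d/rc 3' \
    '1:2345:respawn:/sbin/agetty 38400 tty1' \
    '2:2345:respawn:/sbin/agetty 38400 tty2' > "$tab"

default_runlevel "$tab"
respawn_entries "$tab"
rm -f "$tab"
```

<P>
For this sample the default run-level is 3 and there are two respawned gettys.
Adding or removing <TT>respawn</TT> lines like these is how you change the
number of login prompts.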
<SECT1>Configuration
<P>
<TT>/etc/inittab</TT> is the top level configuration file for init.
<P>
The <TT>rcN.d</TT> directories, where N = 0, 1, ..., 6 determine what
subsystems are started.
<P>
Somewhere in one of the scripts invoked by init, the <TT>mount -a</TT> command
will be issued. This means mount all the file systems that are supposed to be
mounted. The file <TT>/etc/fstab</TT> defines what is supposed to be mounted.
If you want to change what gets mounted where when your system starts up, this
is the file you will need to edit. There is a man page for <TT>fstab</TT>.
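<P>
As a rough guide to what <TT>mount -a</TT> will pick up, the sketch below lists
the entries of an fstab style file that do not carry the <TT>noauto</TT>
option. It is a simplification: the real <TT>mount</TT> applies further rules
that are not modelled here. The function name <TT>auto_mounts</TT> and the
sample entries are made up for this example.

```shell
# auto_mounts FILE (hypothetical helper)
# Print "device mountpoint" for each entry without the noauto option.
# Comment lines and lines with fewer than four fields are skipped.
auto_mounts() {
    awk '$1 !~ /^#/ && NF >= 4 && index("," $4 ",", ",noauto,") == 0 { print $1, $2 }' "$1"
}

# An illustrative fstab: device, mount point, type, options, dump, pass.
tab=$(mktemp)
printf '%s\n' \
    '# device  mountpoint  type  options  dump  pass' \
    '/dev/hda2  /        ext2   defaults  1 1' \
    '/dev/fd0   /floppy  msdos  noauto    0 0' \
    'none       /proc    proc   defaults  0 0' > "$tab"

auto_mounts "$tab"
rm -f "$tab"
```

<P>
For the sample file this prints the root and <TT>/proc</TT> entries and leaves
out the floppy, which would only be mounted when you ask for it.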
<SECT1>Exercises
<P>
Find the <TT>rcN.d</TT> directory for the default run-level of your system and do a <TT>ls -l</TT> to see what the files are links to.
<P>
Change the number of gettys that run on your system.
<P>
Remove any subsystems that you don't need from your default run-level.
<P>
See how little you can get away with starting.
<P>
Set up a floppy disk with lilo, a kernel and a statically linked "hello world" program called <TT>/sbin/init</TT> and watch it boot up and say hello.
<P>
Watch carefully as your system starts up, and take notes about what it tells you is happening. Or print a section of your system log <TT>/var/log/messages</TT> from start up time. Then starting at <TT>inittab</TT>, walk through all the scripts and see what code does what. You can also put extra start up messages in, such as
<VERB>
echo "Hello, I am rc.sysinit"
</VERB>
This is a good exercise in learning Bash shell scripting too; some of the scripts are quite complicated. Have a good Bash reference handy.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM>see <REF ID="downloads" NAME="downloads"> for source code download URLs
<ITEM>There are man pages for the <TT>inittab</TT> and <TT>fstab</TT> files.
Type (eg) <TT>man inittab</TT> into a shell to see it.
<ITEM>The Linux System Administrator's Guide has a good
<URL URL="http://mirror.aarnet.edu.au/linux/LDP/LDP/"
NAME="section"> on init.
</ITEMIZE>
<SECT>The Filesystem
<P>
In this section, I will be using the word ``filesystem'' in two different ways.
There are filesystems on disk partitions and other devices,
and there is the filesystem as it is
presented to you by a running Linux system. In Linux, you ``mount'' a disk
filesystem onto the system's filesystem.
<P>
In the previous section I mentioned that init scripts check and mount the
filesystems. The commands that do this are <TT>fsck</TT> and <TT>mount</TT>
respectively.
<P>
A hard disk is just a big space that you can write ones and zeros on. A
filesystem imposes some structure on this, and makes it look like files within
directories within directories... Each file is represented by an inode, which
says whose file it is, when it was created and where to find its contents.
Directories are also represented by inodes, but these say where to find the
inodes of the files that are in the directory. If the system wants to read
<TT>/home/greg/bigboobs.jpeg</TT>, it first finds the inode for the root
directory <TT>/</TT> in the ``superblock'', then finds the inode for the
directory <TT>home</TT> in the contents of <TT>/</TT>, then finds the inode for
the directory <TT>greg</TT> in the contents of <TT>/home</TT>,
then the inode for <TT>bigboobs.jpeg</TT> which
will tell it which disk blocks to read.
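<P>
You can watch a similar walk from the shell, because <TT>ls -di</TT> prints the
inode number of a directory itself. The sketch below does this for each
component of a path. The function name <TT>walk_path</TT> is made up for this
example, the numbers you see will differ from system to system, and if the path
crosses a mount point the numbers come from different filesystems.

```shell
# walk_path PATH (hypothetical helper)
# Print the inode number and name of / and then of each component of an
# absolute PATH in turn, roughly the order in which they are looked up.
walk_path() {
    path=$1
    current=/
    ls -di "$current"
    old_ifs=$IFS
    IFS=/
    set -f                      # no filename expansion while splitting
    for part in $path; do
        [ -n "$part" ] || continue
        current=${current%/}/$part
        ls -di "$current"
    done
    set +f
    IFS=$old_ifs
}

walk_path /usr/bin
```

<P>
The last line printed is the inode for the thing you asked about. For a regular
file, that is the inode that says where its data blocks are.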
<P>
If we add some data to the end of a file, it could happen that the data is
written before the inode is updated to say that the new blocks belong to the
file, or vice versa. If the power cuts out at this point, the filesystem will
be broken. It is this kind of thing that <TT>fsck</TT> attempts to detect and
repair.
<P>
The mount command takes a filesystem on a device, and adds it to the hierarchy
that you see when you use your system. Usually, the kernel mounts its root file
system read-only. The mount command is used to remount it read-write after
<TT>fsck</TT> has checked that it is ok.
<P>
Linux supports other kinds of filesystem too: msdos, vfat, minix and so on. The
details of the specific kind of filesystem are abstracted away by the virtual
file system (VFS). I won't go into any detail on this though. There is a
discussion of it in ``The Linux Kernel''
(see section <REF ID="Kernel" NAME="The Linux Kernel"> for a url).
<SECT1>Configuration
<P>
There are parameters to the command <TT>mke2fs</TT> which creates ext2
filesystems. These control the size of blocks, the number of inodes and so on.
Check the <TT>mke2fs</TT> man page for details.
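<P>
One of the things <TT>mke2fs</TT> writes is the superblock, which starts 1024
bytes into the filesystem and carries the 16 bit magic number 0xEF53 at offset
56 within it. The sketch below tests a file or device for that number. The
function name <TT>has_ext2_magic</TT> is made up for this example, and the test
is only a hint: later members of the ext family use the same magic number, so a
match does not prove the filesystem is plain ext2.

```shell
# has_ext2_magic FILE (hypothetical helper)
# Succeeds if FILE has the ext2 magic number 0xEF53 at byte 1080
# (superblock offset 1024 plus field offset 56), stored low byte first.
has_ext2_magic() {
    magic=$(dd if="$1" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "$magic" = "53ef" ]
}
```

<P>
You could point it at a filesystem image you have made in an ordinary file, or
(as root) at a partition device.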
<P>
What gets mounted where on your filesystem is controlled by the <TT>/etc/fstab</TT>
file. It also has a man page.
<SECT1>Exercises
<P>
Make a very small filesystem, and view it with a hex viewer. Identify inodes,
superblocks and file contents.
<P>
I believe there are tools that give you a graphical view of a filesystem.
Find one, try it out, and email me the url and a review!
<P>
Check out the ext2 filesystem code in the Kernel.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM>Chapter 9 of the LDP book ``The Linux Kernel'' is an excellent description
of filesystems. You can find it at the Australian LDP
<URL URL="http://mirror.aarnet.edu.au/linux/LDP/LDP/"
NAME="mirror">
<ITEM>The <TT>mount</TT> command is part of the util-linux package, there is a link
to it in <REF ID="downloads" NAME="downloads">.
<ITEM>man pages for <TT>mount</TT>, <TT>fstab</TT>, <TT>fsck</TT> and <TT>mke2fs</TT>
<ITEM>EXT2 File System Utilities
<URL URL="http://web.mit.edu/tytso/www/linux/e2fsprogs.html"
NAME="ext2fsprogs"> home page
<URL URL="ftp://mirror.aarnet.edu.au/pub/linux/metalab/system/filesystems/ext2/"
NAME="ext2fsprogs"> Australian mirror. There is also an Ext2fs-overview
document here, although it is out of date, and not as readable as chapter 9
of ``The Linux Kernel''.
<ITEM> <LABEL ID="FHS">
<URL URL="ftp://tsx-11.mit.edu/pub/linux/docs/linux-standards/fsstnd/"
NAME="Unix File System Standard">
Another <URL URL="http://www.pathname.com/fhs/"
NAME="link"> to the Unix File System Standard.
This describes what should go where
in a Unix file system, and why. It also has minimum requirements for
the contents of <TT>/bin</TT>, <TT>/sbin</TT> and so on. This is a
good reference if your goal is to make a minimal yet complete system.
</ITEMIZE>
<SECT>Kernel Daemons
<P>
Unfortunately, this section contains more conjectures and questions than facts.
Perhaps you can help?
<P>
If you issue the <TT>ps aux</TT> command, you will see something like the following:
<P>
<VERB>
USER       PID %CPU %MEM  SIZE   RSS TTY STAT START   TIME COMMAND
root         1  0.1  8.0  1284   536  ?  S    07:37   0:04 init [2]
root         2  0.0  0.0     0     0  ?  SW   07:37   0:00 (kflushd)
root         3  0.0  0.0     0     0  ?  SW   07:37   0:00 (kupdate)
root         4  0.0  0.0     0     0  ?  SW   07:37   0:00 (kpiod)
root         5  0.0  0.0     0     0  ?  SW   07:37   0:00 (kswapd)
root        52  0.0 10.7  1552   716  ?  S    07:38   0:01 syslogd -m 0
root        54  0.0  7.1  1276   480  ?  S    07:38   0:00 klogd
root        56  0.3 17.3  2232  1156   1 S    07:38   0:13 -bash
root        57  0.0  7.1  1272   480   2 S    07:38   0:01 /sbin/agetty 38400 tt
root        64  0.1  7.2  1272   484  S1 S    08:16   0:01 /sbin/agetty -L ttyS1
root        70  0.0 10.6  1472   708   1 R   Sep 11   0:01 ps aux
</VERB>
<P>
This is a list of the processes running on the system. Note that <TT>init</TT>
is process number one. Processes 2, 3, 4 and 5 are kflushd, kupdate, kpiod and
kswapd. There is something strange here though: notice that in both the virtual
storage size (SIZE) and the resident set size (RSS) columns, these processes
have zeroes. How can a process use no memory? These processes are really part
of the kernel. The kernel does not show up on process lists at all, and you can
only work out what memory it is using by subtracting the memory available from
the amount on your system. The brackets around the command name could signify
that these are kernel processes(?)
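<P>
To pick such processes out of a longer listing, you can filter on the two
memory columns. The sketch below assumes the column layout of the listing
above, with the sizes in the fifth and sixth columns. The function name
<TT>zero_memory_processes</TT> is made up for this example.

```shell
# zero_memory_processes (hypothetical helper)
# Read "ps aux" style output on standard input and print the PID and the
# last word of the command for every process whose fifth and sixth
# columns (SIZE and RSS in the listing above) are both zero.
zero_memory_processes() {
    awk 'NR > 1 && $5 == 0 && $6 == 0 { print $2, $NF }'
}

# A few lines copied from the listing above.
printf '%s\n' \
    'USER PID %CPU %MEM SIZE RSS TTY STAT START TIME COMMAND' \
    'root 1 0.1 8.0 1284 536 ? S 07:37 0:04 init [2]' \
    'root 2 0.0 0.0 0 0 ? SW 07:37 0:00 (kflushd)' \
    'root 5 0.0 0.0 0 0 ? SW 07:37 0:00 (kswapd)' \
    'root 54 0.0 7.1 1276 480 ? S 07:38 0:00 klogd' | zero_memory_processes
```

<P>
For the sample this prints only the kflushd and kswapd lines. On a live system
you would run <TT>ps aux | zero_memory_processes</TT>.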
<P>
<TT>kswapd</TT> moves parts of programs that are not currently being used
from real storage (ie RAM) to the swap space (ie hard disk). <TT>kflushd</TT>
writes data from buffers to disk. This allows things to run faster. What
programs write can be kept in memory, in a buffer, then written to disk in
larger more efficient chunks. I don't know what <TT>kupdate</TT> and
<TT>kpiod</TT> are for.
<P>
This is where my knowledge ends. What do these last two daemons do? Why do
kernel daemons get explicit process numbers rather than just being anonymous
bits of kernel code? Does init actually start them, or are they already running
when init arrives on the scene?
<P>
I put a script to mount <TT>/proc</TT> and do a <TT>ps aux</TT> in <TT>/sbin/init</TT>. Process 1 was the script itself, and processes 2, 3, 4 and 5 were the kernel daemons just as under the real init. The kernel must put these processes there, because my script certainly didn't!
<P>
The following ramblings were contributed by David Leadbeater:
<P>
These processes seem to take care of disk reads and writes. They seem to be
started by the kernel, but after it runs the init process. It seems that being
run as kernel processes rather than separate processes protects them from
being killed (kill -9 doesn't stop them). I am not sure why they are run as
separate threads (it seems to be something to do with disk access).
<p><em>kflushd and kupdate</em>
These two processes are started to flush dirty (changed) buffers back to disk.
kflushd is run when the buffers are full and kupdate runs periodically (5
seconds?) to sync the disk and the buffers in memory.
<p><em>kpiod and kswapd</em>
These deal with paging out pages (sections) of memory into the swap file so
main memory never gets exhausted. They are similar to kflushd and kupdate in
that one (kpiod) is run when needed and the other (kswapd) is run periodically
(at 1 second intervals).
<p><em>Other Kernel Daemons</em>
On a default install of RH6 kupdate is missing but update is running as a user
space daemon, so it seems it needs to be run! Another daemon, mdrecoveryd, is also there; this seems to deal with software RAID. Looking at the kernel source, it seems that some SCSI drivers also start separate processes.
<p>
I am still unsure of the meaning of the brackets but it seems that they appear
when the RSS of a process is 0 meaning it isn't using any memory?
<p>
(end of ramble, thanks David)
<P>
<SECT1>Configuration
<P>
I don't know of any configuration for these kernel daemons.
<SECT1>Exercises
<P>
Find out what these processes are for, how they work, and write a new ``Kernel Daemons'' section for this document and send it to me!
<SECT1>More Information
<P>
The Linux Documentation Project's ``The Linux Kernel''
(see section <REF ID="Kernel" NAME="The Linux Kernel"> for a url),
and the kernel source code are all I can think of.
<SECT>System Logger
<P>
Init starts the <TT>syslogd</TT> and <TT>klogd</TT> daemons. They write
messages to logs. The kernel's messages are handled by <TT>klogd</TT>, while
<TT>syslogd</TT> handles log messages from other processes. The main log is
<TT>/var/log/messages</TT>. This is a good place to look if something is going
wrong with your system. Often there will be a valuable clue in there.
<SECT1>Configuration
<P>
The file <TT>/etc/syslog.conf</TT> tells the loggers what messages to put where. Messages are identified by which service they come from, and what priority level they are. This configuration file consists of lines that say messages from service x with priority y go to z, where z is a file, tty, printer, remote host or whatever.
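<P>
Each rule in <TT>syslog.conf</TT> is a selector followed by an action. The
selector names the service (usually called the facility) and the priority,
joined by a dot. The sketch below lists the rules in a small example file. The
sample rules are illustrative only, and the function name
<TT>syslog_rules</TT> is made up for this example.

```shell
# syslog_rules FILE (hypothetical helper)
# Print "selector -> action" for every rule in a syslog.conf style file,
# skipping comments and blank lines.
syslog_rules() {
    awk '$1 !~ /^#/ && NF >= 2 { print $1, "->", $2 }' "$1"
}

# An illustrative syslog.conf, written to a scratch file.
conf=$(mktemp)
printf '%s\n' \
    '# kernel messages go to the console' \
    'kern.*              /dev/console' \
    '*.info;mail.none    /var/log/messages' \
    '*.emerg             *' > "$conf"

syslog_rules "$conf"
rm -f "$conf"
```

<P>
Read the second sample rule as: everything at priority info or above, except
mail, goes to <TT>/var/log/messages</TT>.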
<P>
NOTE: Syslog requires the <TT>/etc/services</TT> file to be present. The services file allocates ports. I am not sure whether syslog needs a port allocated so that it can do remote logging, or whether even local logging is done through a port,
or whether it just uses <TT>/etc/services</TT> to convert the service names
you type in <TT>/etc/syslog.conf</TT> into port numbers.
<SECT1>Exercises
<P>
Have a look at your system log. Find a message you don't understand, and find out what it means.
<P>
Send all your log messages to a tty. (Set it back to normal once you are done.)
<SECT1>More Information
<P>Australian sysklogd <URL URL="http://mirror.aarnet.edu.au/pub/linux/metalab/system/daemons/"
NAME="Mirror">
<SECT>Getty and Login
<P>
Getty is the program that enables you to log in through a serial device such as a virtual terminal, a text terminal, or a modem. It displays the login prompt. Once you enter your username, getty hands this over to <TT>login</TT> which asks for a password, checks it out and gives you a shell.
<P>
There are many gettys available. Some distributions, including Red Hat, use
a very small one called <TT>mingetty</TT> that only works with virtual terminals.
<P>
The <TT>login</TT> program is part of the util-linux package, which also
contains a getty called <TT>agetty</TT>, which works fine. This package also
contains <TT>mkswap</TT>, <TT>fdisk</TT>,
<TT>passwd</TT>, <TT>kill</TT>, <TT>setterm</TT>, <TT>mount</TT>,
<TT>swapon</TT>, <TT>rdev</TT>, <TT>renice</TT>,
<TT>more</TT> (the program) and more (ie more programs).
<SECT1>Configuration
<P>
The message that comes on the top of your screen with your login prompt comes
from <TT>/etc/issue</TT>. Gettys are usually started in <TT>/etc/inittab</TT>.
Login checks user details in <TT>/etc/passwd</TT>, and if you have password
shadowing, <TT>/etc/shadow</TT>.
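<P>
To give an idea of what the getty entries in <TT>/etc/inittab</TT> look like,
here is a fragment in the style of the sample inittab that ships with SysVinit.
The baud rate and the choice of runlevels are assumptions, and your
distribution may well start a different getty (Red Hat would start
<TT>mingetty</TT>, for instance).

```text
# id:runlevels:action:process
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
```

The <TT>respawn</TT> action tells <TT>init</TT> to start a fresh getty whenever
the old one exits, which is why a new login prompt appears after you log out.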
<SECT1>Exercises
<P>
Create a <TT>/etc/passwd</TT> by hand. Passwords can be set to null, and
changed with the program <TT>passwd</TT> once you log on. See the man page for
this file. Use <TT>man 5 passwd</TT> to get the man page for the file rather
than the man page for the program.
<SECT>Bash
<P>
If you give <TT>login</TT> a valid username and password combination, it will
check in <TT>/etc/passwd</TT> to see which shell to give you. In most cases on
a Linux system this will be <TT>bash</TT>. It is <TT>bash</TT>'s job to read
your commands and see that they are acted on. It is simultaneously a user
interface, and a programming language interpreter.
<P>
As a user interface it reads your commands, and executes them itself if they
are ``internal'' commands like <TT>cd</TT>, or finds and executes a program if
they are ``external'' commands like <TT>cp</TT> or <TT>startx</TT>. It also
does groovy stuff like keeping a command history, and completing filenames.
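<P>
If you want to check for yourself which kind of command you are dealing with,
the shell's own <TT>type</TT> command will tell you. This is only a small
sketch. The exact wording of the output, and the directory where <TT>cp</TT>
lives, will vary from system to system.

```shell
# "type" reports how the shell would interpret a command name.
type cd    # an internal command: the shell runs it itself
type cp    # an external command: the shell reports the file it found

# Keep the answers in variables so they can be inspected later.
cd_kind=$(type cd)
cp_kind=$(type cp)
```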
<P>
We have already seen <TT>bash</TT> in action as a programming language
interpreter. The scripts that <TT>init</TT> runs to start the system up are
usually shell scripts, and are executed by <TT>bash</TT>. Having a proper
programming language, along with the usual system utilities available at the
command line makes a very powerful combination, if you know what you are doing.
For example (smug mode on) I needed to apply a whole stack of ``patches'' to a
directory of source code the other day. I was able to do this with the
following single command:
<VERB>
for f in /home/greg/sh-utils-1.16*.patch; do patch -p0 < $f; done;
</VERB>
<P>
This looks at all the files in my home directory whose names start with
<TT>sh-utils-1.16</TT> and end with <TT>.patch</TT>. It then takes each of
these in turn, and sets the variable <TT>f</TT> to it and executes the commands
between <TT>do</TT> and <TT>done</TT>. In this case there were 11 patch files,
but there could just as easily have been 3000.
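<P>
If you would like to play with this kind of loop without touching any real
source code, here is a sketch you can run safely. The scratch directory and
the <TT>demo-1.0</TT> file names are made up for illustration, and
<TT>echo</TT> stands in for <TT>patch</TT>.

```shell
# Make a scratch directory holding three empty stand-in "patch" files.
scratch=$(mktemp -d)
touch "$scratch/demo-1.0-a.patch" "$scratch/demo-1.0-b.patch" "$scratch/demo-1.0-c.patch"

count=0
for f in "$scratch"/demo-1.0*.patch; do
    # The real command would be: patch -p0 < "$f"
    echo "would apply $(basename "$f")"
    count=$((count + 1))
done
echo "$count patch files found"

# Tidy up the scratch directory.
rm -r "$scratch"
```

The shell expands the wildcard into a list of matching file names before the
loop starts, so the body runs once per file however many there are.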
<SECT1>Configuration
<P>
The file <TT>/etc/profile</TT> controls the system-wide behaviour of bash. What
you put in here will affect everybody who uses bash on your system. It does
things like adding directories to the <TT>PATH</TT> and setting your
<TT>MAIL</TT> directory variable.
<P>
The default behaviour of the keyboard often leaves a lot to be desired. It is
actually readline that handles this. Readline is a separate package that
handles command line interfaces, providing the command history and filename
completion, as well as some advanced line editing features. It is compiled into
bash. By default, readline is configured using the file <TT>.inputrc</TT> in
your home directory. The bash variable <TT>INPUTRC</TT> can be used to override this for
bash. For example in Red Hat 6, <TT>INPUTRC</TT> is set to
<TT>/etc/inputrc</TT> in <TT>/etc/profile</TT>. This means that backspace,
delete, home and end keys work nicely for everyone.
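<P>
For the curious, key bindings in an inputrc file look something like the
fragment below. This is a sketch rather than a copy of any particular
distribution's file. The escape sequences are the ones the Linux console
usually sends for these keys, and other terminals may send different ones.

```text
# key sequence: readline function
"\e[1~": beginning-of-line
"\e[4~": end-of-line
"\e[3~": delete-char
```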
<P>
Once bash has read the system-wide configuration file, it looks for your
personal configuration file. It checks in your home directory for
<TT>.bash_profile</TT>, <TT>.bash_login</TT> and <TT>.profile</TT>. It runs the
first one of these it finds. If you want to change the way bash behaves for
you, without changing the way it works for others, do it here. For example,
many applications use environment variables to control how they work. I have
the variable <TT>EDITOR</TT> set to <TT>vi</TT> so that I can use vi in
Midnight Commander (an excellent console based file manager) instead of its
editor.
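<P>
As an illustration, a personal <TT>.bash_profile</TT> along these lines would
do the job. The contents are hypothetical. In particular the <TT>bin</TT>
directory under your home directory is just an example of something you might
add to the <TT>PATH</TT>.

```shell
# Hypothetical contents for ~/.bash_profile.

# Editor that programs such as Midnight Commander should start.
EDITOR=vi
export EDITOR

# Add a personal bin directory to the end of the search path.
PATH="$PATH:$HOME/bin"
export PATH
```

The <TT>export</TT> matters. Without it the variable stays private to the
shell and the programs you start from it never see it.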
<SECT1>Exercises
<P>
The basics of bash are easy to learn. But don't stop there: there is an
incredible depth to it. Get into the habit of looking for better ways to do
things.
<P>
Read shell scripts, look up stuff you don't understand.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM>source code download see <REF ID="downloads" NAME="downloads">
<ITEM>There is a ``Bash Reference Manual'' with this, which is comprehensive, but heavy going.
<ITEM>There is an O'Reilly book on Bash, not sure if it's good.
<ITEM>I don't know of any good free up to date bash tutorials. If you do, please
email me a url.
</ITEMIZE>
<SECT>Commands
<P>
You do most things in bash by issuing commands like <TT>cp</TT>. Most of these commands are
small programs, though some, like <TT>cd</TT> are built into the shell.
<P>
The commands come in packages, most of them from the Free Software Foundation (or GNU).
Rather than list the packages here, I'll direct you to the
<URL URL="http://www.linuxfromscratch.org" NAME="Linux From Scratch HOWTO">.
It has a full and up to date list of the packages that go into a Linux system
as well as instructions on how to build them.
<SECT>Building A Minimal Linux System From Source
<LABEL ID="building">
<P>
So far I have focussed on what the packages do. Here I will offer what clues
I can about making a minimal Linux system from source. This is a toy system
we are making here. If you want to build a real system to be used for real
work, see the
<URL URL="http://www.linuxfromscratch.org" NAME="Linux From Scratch HOWTO">.
<P>
It is possible to get a bash
prompt without installing everything I mention here. What I describe is
a base system, without nasty kludges, that can be built on easily.
<SECT1>What You Will Need
<P>
We will install a Linux distribution like Red Hat in one partition,
and use that to build a new Linux system in another partition.
I will call the system we are building the ``target'' and the
system we are using to build it with, the ``source'' (not to be
confused with <EM>source code</EM> which we will also be using.)
<P>
So you are going to need a machine with two spare partitions on it.
If you can, use a machine with nothing important on it.
You could use an existing Linux installation as the source system,
but I wouldn't recommend that. If you leave a parameter out of one
of the commands we will issue, you could accidentally install stuff to this
system. This could lead to incompatibilities and strife.
<P>
Older PC hardware, mostly 486s and earlier, has an annoying limitation
in its bios. It cannot read from a hard disk past the first 512M.
This is not too much of a problem for Linux, because once it is up, it
does its own disk io, bypassing the bios.
But for Linux to get loaded by these old machines,
the kernel has to reside somewhere below 512M. If you have one of these machines
you will need to have a separate partition completely below the 512M
mark, to mount as <TT>/boot</TT> for any partitions that are over that
512M mark.
<P>
Last time I did this, I used Red Hat 6.1 as a source system. I installed
the base system plus
<ITEMIZE>
<ITEM>cpp
<ITEM>egcs
<ITEM>egcs-c++
<ITEM>patch
<ITEM>make
<ITEM>dev86
<ITEM>ncurses-devel
<ITEM>glibc-devel
<ITEM>kernel-headers
</ITEMIZE>
I also had X-window and Mozilla so I could read documentation easily,
but that's not really necessary. By the time I had finished working,
it had used about 350M of disk space. (Seems a bit high, I wonder why?)
<P>
The finished target system took 650M, but that includes all the source code and
intermediate build files. If space is tight, you should do a <TT>make clean</TT>
after each package is built. Still, this mind boggling bloat is a bit of a worry.
<P>
Finally, you are going to need the source code for the system we are going to
build. These are the ``packages'' that I have discussed in this document. These
can be obtained from a source cd, or from the internet. I'll give URL's for
the USA sites and for Australian mirrors.
<P>
<LABEL ID="downloads">
<ITEMIZE>
<ITEM>MAKEDEV
<URL URL="ftp://tsx-11.mit.edu/pub/linux/sources/sbin" NAME="USA">
Another
<URL URL="ftp://sunsite.unc.edu/pub/Linux/system/admin" NAME="USA">
site
<ITEM>Lilo
<URL URL="ftp://lrcftp.epfl.ch/pub/linux/local/lilo/" NAME="USA">,
<URL URL="ftp://mirror.aarnet.edu.au/pub/linux/metalab/system/boot/lilo/"
NAME="Australia">.
<ITEM>Linux Kernel
Use one of the mirrors listed at
<URL URL="http://www.kernel.org" NAME="home page">
rather than
<URL URL="ftp://ftp.kernel.org/pub/linux/kernel" NAME="USA">
because they are always overloaded.
<URL URL="ftp://kernel.mirror.aarnet.edu.au/pub/linux/kernel/"
NAME="Australia">
<ITEM>GNU libc
itself, and the linuxthreads addon are at
<URL URL="ftp://ftp.gnu.org/pub/gnu/glibc" NAME="USA">
<URL URL="ftp://mirror.aarnet.edu.au/pub/gnu/glibc" NAME="Australia">
<ITEM>GNU libc addons
You will also need the linuxthreads and libcrypt addons.
If libcrypt is not there it is because of some US export laws.
You can get it at
<URL URL="ftp://ftp.gwdg.de/pub/linux/glibc" NAME="libcrypt">
The linuxthreads addon is in the same places as libc itself
<ITEM>GNU ncurses
<URL URL="ftp://ftp.gnu.org/gnu/ncurses" NAME="USA">
<URL URL="ftp://mirror.aarnet.edu.au/pub/gnu/ncurses" NAME="Australia">
<ITEM>SysVinit
<URL URL="ftp://sunsite.unc.edu/pub/Linux/system/daemons/init"
NAME="USA">
<URL URL="ftp://mirror.aarnet.edu.au/pub/linux/metalab/system/daemons/init"
NAME="Australia">
<ITEM>GNU Bash
<URL URL="ftp://ftp.gnu.org/gnu/bash" NAME="USA">
<URL URL="ftp://mirror.aarnet.edu.au/pub/gnu/bash" NAME="Australia">
<ITEM>GNU sh-utils
<URL URL="ftp://ftp.gnu.org/gnu/sh-utils" NAME="USA">
<URL URL="ftp://mirror.aarnet.edu.au/pub/gnu/sh-utils"
NAME="Australia">
<ITEM>util-linux
<URL URL="ftp://ftp.win.tue.nl/pub/linux/utils/util-linux/"
NAME="Somewhere else">
<URL URL="ftp://mirror.aarnet.edu.au/pub/linux/metalab/system/misc"
NAME="Australia"> This package contains <TT>agetty</TT> and
<TT>login</TT>.
</ITEMIZE>
<P>
To sum up then, you will need:
<ITEMIZE>
<ITEM>A machine with two spare partitions of about 400M and 700M respectively
though you could probably get away with less
<ITEM>A Linux distribution (eg. a Red Hat cd) and a way of installing it
(eg. a cdrom drive)
<ITEM>The source code tarballs listed above
</ITEMIZE>
<P>
I'm assuming that you can install the source system yourself, without any
help from me. From here on, I'll assume that it's done.
<P>
The first milestone in this little project is getting the kernel to boot up
and panic because it can't find an <TT>init</TT>. This means we are going to have
to install a kernel, and install lilo. To install lilo nicely though, we
will need the device files in the target <TT>/dev</TT> directory. Lilo
needs them to do the low level disk access necessary to write the boot
sector. MAKEDEV is the script that creates these device files.
(You can just copy them from the source system of course, but that's cheating!)
But first of all, we need a filesystem to put all of this into.
<SECT1>The Filesystem
<P>
Our new system is going to live in a file system. So first, we have to make
that file system using <TT>mke2fs</TT>. Then mount it somewhere. I'd suggest
<TT>/mnt/target</TT>. In what follows, I'll assume that this is where it is.
You could save yourself a bit of time by putting an
entry in <TT>/etc/fstab</TT> so that it mounts there automatically when the
source system comes up.
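<P>
Such an <TT>/etc/fstab</TT> entry on the source system might look like the line
below. I am assuming here that the target partition is <TT>/dev/hda2</TT>, so
substitute whatever device you actually ran <TT>mke2fs</TT> on.

```text
# device     mount point   type  options   dump  pass
/dev/hda2    /mnt/target   ext2  defaults  1     2
```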
<P>
When we boot up the target system, the stuff that's now in <TT>/mnt/target</TT>
will be in <TT>/</TT>.
<P>
We need a directory structure on target. Have a look at the
Filesystem Hierarchy Standard (see section <REF ID="FHS" NAME="Filesystem">)
to work out what this should be, or just <TT>cd</TT>
to where the target is mounted and blindly do
<VERB>
mkdir bin boot dev etc home lib mnt root sbin tmp usr var
cd var; mkdir lock log run spool
cd ../usr; mkdir bin include lib local sbin share src
cd share/; mkdir man; cd man
mkdir man1 man2 man3 man4 man5 man6 man7 man8 man9
</VERB>
Since the FHS and most packages disagree about where man pages should go,
we need a symlink
<VERB>
cd ../..; ln -s share/man man
</VERB>
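<P>
If you would rather not depend on which directory you happen to be in, the
same tree can be built with full path names. The following is only a sketch.
It uses a scratch directory as a stand-in for the mounted target, so to use it
for real you would set <TT>target</TT> to <TT>/mnt/target</TT> instead.

```shell
# A scratch directory stands in for the mounted target (e.g. /mnt/target).
target=$(mktemp -d)

for d in bin boot dev etc home lib mnt root sbin tmp usr var; do
    mkdir -p "$target/$d"
done
for d in lock log run spool; do
    mkdir -p "$target/var/$d"
done
for d in bin include lib local sbin share src; do
    mkdir -p "$target/usr/$d"
done
for n in 1 2 3 4 5 6 7 8 9; do
    mkdir -p "$target/usr/share/man/man$n"
done

# Packages that expect /usr/man find the pages through this link.
ln -s share/man "$target/usr/man"
```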
<SECT1>MAKEDEV
<P>
We will put the source code in the target <TT>/usr/src</TT> directory.
So for example, if your target file system is mounted on <TT>/mnt/target</TT>
and your tarballs are in <TT>/root</TT>, you would do
<VERB>
cd /mnt/target/usr/src
tar -xzvf /root/MAKEDEV-2.5.tar.gz
</VERB>
<P>
Don't be completely lame and copy the tarball to the place where you are going
to extract it ;->
<P>
Normally when you install software, you are installing it onto the system
that is running. We don't want to do that though, we want to install it
as though <TT>/mnt/target</TT> is the root filesystem. Different packages
have different ways of letting you do this. For MAKEDEV you do
<VERB>
ROOT=/mnt/target make install
</VERB>
<P>
You need to look out for these options in the README and INSTALL files
or by doing a <TT>./configure --help</TT>.
<P>
Have a look in MAKEDEV's <TT>Makefile</TT> to see what it does with the
<TT>ROOT</TT> variable that we set in that command. Then have a look
in the man page by doing <TT>man ./MAKEDEV.man</TT> to see how it
works. You'll find that the way to make our device files is to
<TT>cd /mnt/target/dev</TT> and do <TT>./MAKEDEV generic</TT>.
Do an <TT>ls</TT> to see all the wonderful device files it has made
for you.
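<P>
In case the <TT>ROOT=/mnt/target make install</TT> form is new to you: putting
an assignment in front of a command sets the variable for that one command
only, and <TT>make</TT> turns the environment variables it is given into make
variables of the same name. The sketch below shows the shell side of this,
with <TT>sh -c</TT> standing in for <TT>make install</TT>.

```shell
# Make sure ROOT is not already set in this shell.
unset ROOT

# Set ROOT only in the environment of this one command.
# "sh -c" stands in for "make install" here.
seen=$(ROOT=/mnt/target sh -c 'echo "$ROOT"')
echo "the command saw ROOT=$seen"

# The assignment did not stick in the calling shell.
echo "afterwards ROOT is '${ROOT:-}'"
```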
<SECT1>Kernel
<P>
Next we make a kernel. I presume you've done this before, so I'll be brief.
It is easier to install lilo if the kernel
it is meant to boot is already there. Go back to the target <TT>usr/src</TT>
directory, and unpack the linux kernel source there. Enter the linux
source tree (<TT>cd linux</TT>) and configure the kernel
using your favourite method, for example <TT>make menuconfig</TT>.
You can make life slightly easier for yourself by configuring
a kernel without modules. If you configure any modules, then you
will have to edit the <TT>Makefile</TT>, find <TT>INSTALL_MOD_PATH</TT>
and set it to <TT>/mnt/target</TT>.
<P>
Now you can <TT>make dep</TT>, <TT>make bzImage</TT>, and if you configured
modules: <TT>make modules</TT>, <TT>make modules_install</TT>. Copy
the kernel <TT>arch/i386/boot/bzImage</TT> and the system map <TT>System.map</TT>
to the target boot directory <TT>/mnt/target/boot</TT>, and we are ready
to install lilo.
<SECT1>Lilo
<P>
Lilo comes with a neat script called <TT>QuickInst</TT>. Unpack the lilo
source into the target source directory, run this script with the command
<TT>ROOT=/mnt/target ./QuickInst</TT>. It will ask you questions about
how you want lilo installed.
<P>
Remember, since we have set <TT>ROOT</TT>
to the target partition, you tell it file names relative to that. So
when it asks what kernel you want to boot by default, answer
<TT>/boot/bzImage</TT> <EM>not</EM> <TT>/mnt/target/boot/bzImage</TT>.
I found a little bug in the script, so it said
<VERB>
./QuickInst: /boot/bzImage: no such file
</VERB>
But if you just ignore it, it's ok.
<P>
Where should we get <TT>QuickInst</TT> to put the boot sector?
When we reboot we want to have the choice of booting into the source system
or the target system, or any other systems that are on this box.
And we want the instance of lilo that we are building now to load
the kernel of our new system. How are we going to achieve both of these
things? Let's digress a little and look at how lilo boots DOS on a
dual boot Linux system. The <TT>lilo.conf</TT> file on such a system
probably looks something like this:
<P>
<VERB>
prompt
timeout = 50
default = linux
image = /boot/bzImage
label = linux
root = /dev/hda1
read-only
other = /dev/hda2
label = dos
</VERB>
<P>
If the machine is set up this way, then the master boot record gets read and
loaded by the bios, and it loads the lilo bootloader, which gives a prompt.
If you type in <TT>dos</TT> at the prompt, lilo loads the boot sector from
hda2, and it loads DOS.
<P>
What we are going to do is just the same, except that the boot sector in
hda2 is going to be another lilo boot sector - the one that <TT>QuickInst</TT>
is going to install. So the lilo from the Linux distribution will load the
lilo that we have built, and that will load the kernel that we have built.
You will see two lilo prompts when you reboot.
<P>
To cut a long story short, when <TT>QuickInst</TT> asks you where to put the
boot sector, tell it the device where your target filesystem is,
eg. <TT>/dev/hda2</TT>.
<P>
Now modify the <TT>lilo.conf</TT> on your source system, so it has
a line like
<VERB>
other = /dev/hda2
label = target
</VERB>
run lilo, and we should be able to do our first boot into the target system.
<SECT1>Glibc
<P>
Next we want to install <TT>init</TT>, but like almost every program that
runs under Linux, <TT>init</TT> uses library functions provided by the
GNU C library, glibc. So we will install that first.
<P>
Glibc is a very large and complicated package. It took 90 hours to build
on my old 386sx/16 with 8M RAM. But it only took 33 minutes on my Celeron
433 with 64M. I think memory is the main issue here. If you only have 8M
of RAM (or, shudder, less!) be prepared for a long build.
<P>
The glibc install documentation recommends building in a separate directory.
This enables you to start again easily, by just blowing that directory away.
You might also want to do that to save yourself about 265M of disk space!
<P>
Unpack the <TT>glibc-2.1.3.tar.gz</TT> (or whatever version) tarball into
<TT>/mnt/target/usr/src</TT>
as usual. Now, we need to unpack the ``add-ons'' into glibc's directory. So
<TT>cd glibc-2.1.3</TT>, and then unpack the <TT>glibc-crypt-2.1.3.tar.gz</TT>
and <TT>glibc-linuxthreads-2.1.3.tar.gz</TT> tarballs there.
<P>
Now we can create the build directory, configure, make and install glibc.
These are the commands I used, but read the documentation yourself and
make sure you do what is best for your circumstances.
Before you do though, you might want to do a <TT>df</TT> command to see
how much free space you have. You can do another after you've built and
installed glibc, to see what a space-hog it is.
<P><VERB>
cd ..
mkdir glibc-build
cd glibc-build
../glibc-2.1.3/configure --enable-add-ons --prefix=/usr
make
make install_root=/mnt/target install
</VERB>
<P>
Notice that we have yet another way of telling a package where to install.
<SECT1>SysVinit
<P>
Making and installing the SysVinit binaries is pretty straightforward.
I'll just be lazy and give you the commands, assuming that you have
unpacked and entered the SysVinit source code directory:
<P><VERB>
cd src
make
ROOT=/mnt/target make install
</VERB>
<P>
There are also a lot of scripts associated with <TT>init</TT>.
There are example scripts with the SysVinit package, which work fine.
But you have to install them manually. They are set up in a hierarchy
under <TT>debian/etc</TT> in the SysVinit source code tree. You can
just copy them straight across into the target <TT>etc</TT> directory,
with something like <TT>cd ../debian/etc; cp -r * /mnt/target/etc</TT>.
Obviously you will want to have a look before you copy them across!
<P>
Everything is in place now for the target kernel to load up <TT>init</TT>
when we reboot. The problem this time should be that the scripts won't
run, because <TT>bash</TT> isn't there to interpret them. Also, <TT>init</TT>
will try to run <TT>getty</TT>s, but there is no <TT>getty</TT> for it to run.
Reboot now and make sure there is nothing else wrong.
<SECT1>Ncurses
<P>
The next thing we need is Bash, but bash needs ncurses, so we'll install
it first. Ncurses replaces termcap as the way of handling text screens,
but it can also provide backwards compatibility by supporting the termcap
calls. In the interests of having a clean simple modern system,
I think it's best to
disable the old termcap method. You might strike trouble later on if
you are compiling an older application that uses termcap.
But at least you will know what is using what. If you need to you can recompile
ncurses with termcap support.
<P>
The commands I used are
<P><VERB>
./configure --prefix=/usr --with-install-prefix=/mnt/target --with-shared --disable-termcap
make
make install
</VERB>
<SECT1>Bash
<P>
It took me quite a lot of reading and thinking and trial and error
to get Bash to install itself where I thought it should go. The
configuration options I used are
<P><VERB>
./configure --prefix=/mnt/target/usr/local --exec-prefix=/mnt/target --with-curses
</VERB>
<P>
Once you have made and installed Bash, you need to make a symlink like this
<TT>cd /mnt/target/bin; ln -s bash sh</TT>. This is because scripts usually
have a first line like this
<P><VERB>
#!/bin/sh
</VERB>
<P>
If you don't have the symlink, your scripts won't be able to run, because
they will be looking for <TT>/bin/sh</TT> not <TT>/bin/bash</TT>.
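<P>
If you want to see what that symlink looks like before doing it for real, here
is a harmless sketch. It works in a scratch directory, and an empty file stands
in for the real <TT>bash</TT> binary.

```shell
# A scratch directory stands in for /mnt/target in this sketch.
demo=$(mktemp -d)
mkdir "$demo/bin"
# Empty stand-in for the real bash binary.
touch "$demo/bin/bash"

# Same commands as for the real target: link sh to bash.
cd "$demo/bin"
ln -s bash sh
ls -l

link_target=$(readlink sh)
echo "sh points to $link_target"
```

Because the link is relative (<TT>bash</TT>, not
<TT>/mnt/target/bin/bash</TT>), it still points at the right file once the
target partition is mounted as <TT>/</TT>.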
<P>
You could reboot again at this point if you like. You should notice that
the scripts actually run this time, though you still can't login, because
there are no <TT>getty</TT> or <TT>login</TT> programs.
<SECT1>Util-linux (getty and login)
<P>
The util-linux package contains <TT>agetty</TT> and <TT>login</TT>. We need
both of these to be able to log in and get a bash prompt. After it is
installed, make a symlink from <TT>agetty</TT> to <TT>getty</TT> in the
target <TT>/sbin</TT> directory. <TT>getty</TT> is one of the programs
that is supposed to be there on all Unix-like systems, so the link
is a better idea than hacking <TT>inittab</TT> to run <TT>agetty</TT>.
<P>
I have one remaining problem with the compilation of util-linux. The
package also contains the program <TT>more</TT>, and I have not been
able to persuade the <TT>make</TT> process to have <TT>more</TT>
link against the ncurses 5 library on the target system rather than
the ncurses 4 on the source system. I'll be having a closer look at
that.
<P>
You will also need a <TT>/etc/passwd</TT> file on the target system.
This is where the <TT>login</TT> program will check to find out if
you are allowed in. Since this is only a toy system at this stage,
we can do outrageous things like setting up only the root user,
and not requiring any password!! Just put this in the target
<TT>/etc/passwd</TT>
<P><VERB>
root::0:0:root:/root:/bin/bash
</VERB>
<P>
The fields are separated by colons, and from left to right they are
user id, password (encrypted), user number, group number, user's name,
home directory and default shell.
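<P>
To make the field layout concrete, here is a small sketch that pulls that
<TT>passwd</TT> line apart in the shell. The variable names are my own choice
for illustration.

```shell
line='root::0:0:root:/root:/bin/bash'

# Split the line on colons into its seven fields.
IFS=: read -r user password uid gid name home shell <<EOF
$line
EOF

echo "login name:     $user"
echo "user number:    $uid"
echo "group number:   $gid"
echo "user's name:    $name"
echo "home directory: $home"
echo "default shell:  $shell"

if [ -z "$password" ]; then
    echo "the password field is empty, so no password is asked for"
fi
```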
<SECT1>Sh-utils
<P>
The last package we need is GNU sh-utils. The only program we need from
here at this stage is <TT>stty</TT>, which is used in <TT>/etc/init.d/rc</TT>
which is used to change runlevels, and to enter the initial runlevel.
I actually have, and used, a package that contains only <TT>stty</TT>,
but I can't remember where it came from. It's a better idea to use the
GNU package, because there is other stuff in there that you will need
if you add to the system to make it useable.
<P>
Well that's it. You should now have a system that will boot up and prompt
you for a login. Type in ``root'', and you should get a shell. You won't
be able to do much with it. There isn't even an <TT>ls</TT> command here
for you to see your handiwork. Press tab twice so you can see the
available commands. This was about the most satisfying thing I found to do
with it.
<SECT1>Towards Useability
<P>
It might look like we have made a pretty useless system here. But really,
there isn't that far to go before it can do some work. One of the first
things you would have to do is have the root filesystem mount read-write.
There is a script from the SysVinit package, in <TT>/etc/init.d/mountall.sh</TT>
which does this, and
issues a <TT>mount -a</TT> so that everything gets mounted the way you
specify in <TT>/etc/fstab</TT>. Put a symlink called something like
<TT>S05mountall</TT> to it in the target's <TT>etc/rc2.d</TT>.
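<P>
Making that link might go something like the sketch below. It works in a
scratch directory that stands in for <TT>/mnt/target</TT>, with an empty file
in place of the real <TT>mountall.sh</TT>, so it is safe to try anywhere.

```shell
# A scratch directory stands in for /mnt/target in this sketch.
target=$(mktemp -d)
mkdir -p "$target/etc/init.d" "$target/etc/rc2.d"
# Empty stand-in for the script copied from the SysVinit sources.
touch "$target/etc/init.d/mountall.sh"

# A relative link still resolves once the target becomes /.
cd "$target/etc/rc2.d"
ln -s ../init.d/mountall.sh S05mountall
ls -l

link_target=$(readlink S05mountall)
```

The <TT>S</TT> means the script is run with the argument <TT>start</TT> when
the runlevel is entered, and the number decides the order in which the links
are run.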
<P>
You may find that this script will use commands that you haven't
installed yet. If so, find the package that contains the commands
and install it. See section <REF ID="finding" NAME="Random Tips"> for
clues on how to find packages.
<P>
Look at the other scripts in <TT>/etc/init.d</TT>. Most of them will
need to be included in any serious system. Add them in one at a time,
make sure everything is running smoothly before adding more.
<P>
Check the Filesystem Hierarchy Standard (see section <REF ID="FHS" NAME="Filesystem">).
It has lists of the commands that should be in <TT>/bin</TT> and
<TT>/sbin</TT>. Make sure that you have all these commands installed.
Even better, find the Posix documentation that specifies this stuff.
<P>
From there, it's really just a matter of throwing in more and more packages
until everything you want is there. The sooner you can put the build tools
such as <TT>gcc</TT> and <TT>make</TT> in the better. Once that is done,
you can use the target system to build itself, which is much less complicated.
<SECT1>Random Tips
<LABEL ID="finding">
<P>
If you have a command called <TT>thingy</TT> on a Linux system with RPM, and
want a clue about where to get the source from, you can use the command:
<VERB>
rpm -qif `which thingy`
</VERB>
And if you have a Red Hat source CD, you can install the source code using
<VERB>
rpm -i /mnt/cdrom/SRPMS/what.it.just.said-1.2.src.rpm
</VERB>
<P>
This will put the tarball, and any Red Hat patches into
<TT>/usr/src/redhat/SOURCES</TT>.
<SECT1>More Information
<P>
<ITEMIZE>
<ITEM> There is a mini-howto on building software from source, the
<URL URL="http://www.linuxdoc.org/HOWTO/Software-Building.html"
NAME="Software Building mini-HOWTO">.
<ITEM> There is also a HOWTO on building a Linux system from scratch.
It focuses much more on getting the system built so it can be used,
rather than just doing it as a learning exercise.
<URL URL="http://www.linuxfromscratch.org"
NAME="The Linux From Scratch HOWTO">
</ITEMIZE>
<SECT>Conclusion
<P>
One of the best things about Linux, in my humble opinion, is that you can get
inside it and really find out how it all works. I hope that you enjoy this as
much as I do. And I hope that this little note has helped you do it.
<SECT>Administrivia
<SECT1>Copyright
<P>
This document is copyright (c) 1999, 2000 Greg O'Keefe. You are welcome
to use, copy, distribute or modify it, without charge, under the terms of the
<URL URL="http://www.gnu.org/copyleft/gpl.html"
NAME="GNU General Public Licence">.
Please acknowledge me if you use all or part of this in another document.
<SECT1>Homepage
<P>
The latest version of this document lives at
<URL URL="http://learning.taslug.org.au/power2bash"
NAME="From Powerup To Bash Prompt">
<SECT1>Feedback
<P>
I would like to hear any comments, criticisms and suggestions for improvement
that you have. Please send them to me
<URL URL="mailto:gcokeefe@postoffice.utas.edu.au" NAME="Greg O'Keefe">
<SECT1>Acknowledgements
<LABEL ID="acknowledge">
<P>
Product names are trademarks of the respective holders, and are hereby
considered properly acknowledged.
<P>
There are some people I want to say thanks to, for helping to make this happen.
<P>
<DESCRIP>
<TAG>Everyone on the learning@TasLUG mailing list</TAG>
Thanks for reading all my mails and asking interesting questions.
You can join this list by sending a message to
<URL URL="mailto:majordomo@taslug.org.au" NAME="majordomo"> with
<VERB>
subscribe learning
</VERB>
in the message body.
<TAG>Michael Emery</TAG>
For reminding me about Unios.
<TAG>Tim Little</TAG>
For some good clues about <TT>/etc/passwd</TT>
<TAG>sPaKr on #linux in efnet</TAG>
Who sussed out that syslogd needs <TT>/etc/services</TT>,
and introduced me to the phrase ``rolling your own'' to
describe building a system from source code.
<TAG>Alex Aitkin</TAG>
For bringing Vico and his ``verum ipsum factum''
(understanding arises through making) to my attention.
<TAG>Dennis Scott</TAG>
For correcting my hexadecimal arithmetic.
<TAG>jdd</TAG>
For pointing out some typos.
<TAG>David Leadbeater</TAG>
For contributing some ``ramblings'' about the kernel daemons.
</DESCRIP>
<SECT1>Change History
<SECT2>0.6 -> 0.7
<P>
<ITEMIZE>
<ITEM>more emphasis on explanation, less on how to build a system,
building info gathered together in a separate section
and the system built is trimmed down,
direct readers to Gerard Beekmans' ``Linux From Scratch'' doc
for serious building
<ITEM>added some ramblings contributed by David Leadbeater
<ITEM>fixed a couple of url's, added link to unios download at
learning.taslug.org.au/resources
<ITEM>tested and fixed url's
<ITEM>generally rewrite, tidy up
</ITEMIZE>
<SECT2>0.5 -> 0.6
<P>
<ITEMIZE>
<ITEM>added change history
<ITEM>added some todos
</ITEMIZE>
<SECT1>TODO
<P>
<ITEMIZE>
<ITEM>explain kernel modules, depmod, modprobe, insmod and all that
(I'll have to find out first!)
<ITEM>mention the /proc filesystem, potential for exercises here
<ITEM>convert to docbook sgml
<ITEM>add more exercises, perhaps a whole section on larger exercises,
like creating a minimal system file by file from a distro
install.
</ITEMIZE>
</ARTICLE>