LDP/LDP/guide/docbook/sag/ch02.sgml

539 lines
25 KiB
Plaintext

<chapter id="overview">
<title>Overview of a Linux System</title>
<blockquote><para><quote>God saw everything that he
had made, and saw that it was very good. </quote> -- Bible
King James Version. Genesis 1:31</para></blockquote>
<para>This chapter gives an overview of a Linux system. First,
the major services provided by the operating system are described.
Then, the programs that implement these services are described
with a considerable lack of detail. The purpose of this chapter
is to give an understanding of the system as a whole, so that
each part is described in detail elsewhere.</para>
<sect1 id="various-parts">
<title>Various parts of an operating system</title>
<para>UNIX and 'UNIX-like' operating systems (such as Linux) consist
of a <glossterm>kernel</glossterm> and some
<glossterm>system programs</glossterm>. There are also some
<glossterm>application programs</glossterm> for doing work.
The kernel <indexterm id="ch02-kern-over"><primary>kernel</primary>
<secondary>overview</secondary></indexterm>is the heart of the operating
system. In fact, it is often mistakenly considered to be the
operating system itself, but it is not. An operating system provides
provides many more services than a plain kernel.</para>
<para>It keeps track of files on the disk, starts programs and runs
them concurrently, assigns memory and other resources to various
processes, receives packets from and sends packets to the network,
and so on. The kernel does very little by itself, but it provides
tools with which all services can be built. It also prevents anyone
from accessing the hardware directly, forcing everyone to use the
tools it provides.
This way the kernel provides some protection for users from each
other. The tools provided by the kernel are used via
<glossterm>system calls<glossterm>. See manual page section 2 for more
information on these. </para>
<para>The system programs use the tools provided by the kernel to
implement the various services required from an operating system.
System programs, and all other programs, run `on top of the
kernel', in what is called the <glossterm>user mode</glossterm>.
The difference between system and application programs is
one of intent: applications are intended for getting useful
things done (or for playing, if it happens to be a game),
whereas system programs are needed to get the system working.
A word processor is an application; <command>mount</command>
is a system program. The difference is often somewhat blurry,
however, and is important only to compulsive categorizers.</para>
<para>An operating system can also contain compilers and their
corresponding libraries (GCC and the C library in particular under
Linux), although not all programming languages need be part of
the operating system. Documentation, and sometimes even games,
can also be part of it. Traditionally, the operating system has
been defined by the contents of the installation tape or disks;
with Linux it is not as clear since it is spread all over the
FTP sites of the world.</para>
</sect1>
<sect1 id="kernel-parts">
<title>Important parts of the kernel</title>
<para>The Linux kernel <indexterm id="ch02-kern-over2"><primary>kernel
</primary><secondary>overview</secondary></indexterm> consists of several important
parts: process
management, memory management, hardware device drivers, filesystem
drivers, network management, and various other bits and pieces.
<xref linkend="kerneloverview">
shows some of them.</para>
<figure id="kerneloverview" float="1">
<title>Some of the more important parts of the Linux kernel</title>
<graphic fileref="overview-kernel.png">
</figure>
<para>Probably the most important parts of the kernel (nothing else
works without them) are memory management and
process management. Memory management <indexterm id="kern-mem">
<primary>kernel</primary><secondary>memory management</secondary>
</indexterm> takes care of assigning
memory areas and swap space areas to processes, parts of the
kernel, and for the buffer cache. Process management
<indexterm id="kern-proc-mgt"><primary>kernel</primary><secondary>
process management</secondary></indexterm> creates
processes, and implements multitasking by switching the
active process on the processor.</para>
<para>At the lowest level, the kernel contains a hardware device
driver <indexterm id="hardware-driver"><primary>kernel</primary>
<secondary>driver</secondary></indexterm>for each kind of hardware
it supports. Since the world is
full of different kinds of hardware, the number of hardware device
drivers is large. There are often many otherwise similar pieces
of hardware that differ in how they are controlled by software.
The similarities make it possible to have general classes of
drivers that support similar operations; each member of the class
has the same interface to the rest of the kernel but differs in
what it needs to do to implement them. For example, all disk
drivers look alike to the rest of the kernel, i.e., they all
have operations like `initialize the drive', `read sector N',
and `write sector N'.</para>
<para>Some software services provided by the kernel itself have
similar properties, and can therefore be abstracted into classes.
For example, the various network protocols have been abstracted
into one programming interface, the BSD socket library. Another
example is the <glossterm>virtual filesystem</glossterm>
<indexterm id="VFS"><primary>kernel</primary><secondary>virtual
filesystem (VFS)</secondary></indexterm> (VFS)
layer that abstracts the filesystem operations away from their
implementation. Each filesystem type provides an implementation
of each filesystem operation. When some entity tries to use
a filesystem, the request goes via the VFS, which routes the
request to the proper filesystem driver.</para>
<para>A more in-depth discussion of kernel internals can be found
at <ulink url="http://www.tldp.org/LDP/lki/index.html">
http://www.tldp.org/LDP/lki/index.html</ulink>. This document was
written for the 2.4 kernel. When I find one for the 2.6 kernel, I
will list it here.</para>
</sect1>
<sect1 id="major-services">
<title>Major services in a UNIX system</title>
<para>This section describes some of the more important UNIX
services, but without much detail. They are described more
thoroughly in later chapters.</para>
<sect2 id="init">
<title><command>init</command></title>
<para>The single most important service in a UNIX system is
provided by <command>init</command><indexterm id="ch02-init">
<primary>init</primary></indexterm>. <command>init</command>
is started as the first process of every UNIX system, as the last
thing the kernel does when it boots. When <command>init</command>
starts, it continues the boot process by doing various startup
chores (checking and mounting filesystems, starting daemons,
etc).</para>
<para>The exact list of things that <command>init</command>
does depends on which flavor it is; there are several to choose
from. <command>init</command> usually provides the concept of
<glossterm>single user mode</glossterm><indexterm id="ch02-single-user">
<primary>runlevels</primary><secondary>1 - single user
</secondary>, in which no one can
log in and root uses a shell at the console; the usual mode is
called <glossterm>multiuser mode</glossterm><indexterm id="multi-user">
<primary>runlevels</primary><secondary>3 - multi-user</secondary>
</indexterm>. Some flavors
generalize this as <glossterm>run levels</glossterm>; single
and multiuser modes are considered to be two run levels, and
there can be additional ones as well, for example, to run X on
the console.</para>
<para>Linux allows for up to 10
<glossterm>runlevels</glossterm><indexterm id="rlevels">
<primary>runlevels</primary></indexterm>, 0-9, but usually only some of
these are defined by default. Runlevel 0 <indexterm id="rlevel0">
<primary>runlevels</primary><secondary>0 - shutdown</secondary>
</indexterm>is defined as ``system
halt''. Runlevel 1 <indexterm id="rlevel1">
<primary>runlevels</primary><secondary>1 - single-user</secondary>
</indexterm>is defined as ``single user mode''. Runlevel 3
<indexterm id="rlevel3"><primary>runlevels</primary>
<secondary>3 - multi-user</secondary></indexterm> is defined as
"multi user" because it is the runlevel that the system boot into
under normal day to day conditions. Runlevel 5 <indexterm id="rlevel5">
<primary>runlevels</primary><secondary>5 - multi-user with GUI
</secondary> is typically the same as 3 except that a GUI
<indexterm id="ch02-xwindows"><primary>GUI</primary></indexterm>
gets started also.
Runlevel 6 <indexterm id="rlevel6"><primary>runlevels</primary>
<secondary>6 - reboot</secondary></indexterm>is defined as ``system
reboot''. Other runlevels are
dependent on how your particular distribution has defined them,
and they vary significantly between distributions. Looking at
the contents of <filename>/etc/inittab</filename>
<indexterm id="ch02-inittab"><primary>inittab</primary></indexterm>
usually will <indexterm id="ch02-rlevel-inittab"><primary>runlevels
</primary><secondary>inittab</secondary></indexterm>
give some hint what the predefined runlevels are and what they
have been defined as.</para>
<para>In normal operation, <command>init</command>
<indexterm id="ch02-init2"><primary>commands</primary>
<secondary>init</secondary></indexterm> makes
sure <command>getty</command><indexterm id="ch02-getty">
<primary>commands</primary><secondary>getty</secondary></indexterm>
is working (to allow users to log in)
and to adopt orphan processes (processes whose parent has died; in
UNIX <emphasis>all</emphasis> processes <emphasis>must</emphasis>
be in a single tree, so orphans must be adopted).</para>
<para>When the system is shut down, it is <command>init</command>
that is in charge of killing all other processes, unmounting all
filesystems and stopping the processor, along with anything else
it has been configured to do.</para>
</sect2>
<sect2 id="terminal-logins">
<title>Logins from terminals</title>
<para>Logins from terminals (via serial lines) and the console
(when not running X) are provided by the <command>getty</command>
<indexterm id="ch02-getty2"><primary>getty</primary></indexterm>
program. <command>init</command><indexterm id="ch02-init3">
<primary>init</primary></indexterm> starts a separate instance of
<command>getty</command> for each terminal upon which logins are to
be allowed. <command>getty</command> reads the username and runs
the <command>login</command><indexterm id="ch02-login">
<primary>login</primary></indexterm> program, which reads the password.
If the username and password are correct, <command>login</command> runs
the shell. When the shell terminates, i.e., the user logs out, or
when <command>login</command> terminated because the username and
password didn't match, <command>init</command> notices this and
starts a new instance of <command>getty</command>. The kernel has no
notion of logins, this is all handled by the
<glossterm>system programs</glossterm>.</para>
</sect2>
<sect2 id="syslog">
<title>Syslog</title>
<para>The kernel and many <glossterm>system programs</glossterm>
produce error, warning, and other messages. It is often important
that these messages can be viewed later, even much later, so they
should be written to a file. The program doing this is
<command>syslog</command> <indexterm id="ch02-syslog"><primary>syslog
</primary></indexterm>. It can be configured to sort the
messages to different files according to writer or degree of
importance. For example, kernel messages are often directed to a
separate file from the others, since kernel messages are often more
important and need to be read
regularly to spot problems.</para>
<para><xref linkend="system-logs"> will provide more on
this.</para>
</sect2>
<sect2 id="cron">
<title>Periodic command execution: <command>cron</command> and
<command>at</command></title>
<para>Both users and system administrators often need
to run commands periodically. For example, the system administrator
might want to run a command to clean the directories with temporary
files (<filename>/tmp</filename> and <filename>/var/tmp</filename>)
from old files, to keep the disks from filling up, since not all
programs clean up after
themselves correctly.</para>
<para>The <command>cron</command> <indexterm id="ch02-cron">
<primary>cron</primary></indexterm> service is set up to do this.
Each user can have a <filename>crontab</filename>
<indexterm id="ch02-crontab"><primary>cron</primary><secondary>crontab
</secondary></indexterm> file, where she
lists the commands she wishes to execute and the times they should
be executed. The <command>cron</command> daemon takes care of
starting the commands when specified.</para>
<para>The <command>at</command> <indexterm id="ch02-at"><primary>at
</primary></indexterm> service is similar to
<command>cron</command>, but it is once only: the command is
executed at the given time, but it is not repeated.</para>
<para>We will go more into this later. See the manual pages
cron(1), crontab(1), crontab(5), at(1) and atd(8) for more in
depth information.</para>
<para><xref linkend="task-automation"> will cover this.</para>
</sect2>
<sect2 id="gui">
<title>Graphical user interface</title>
<para><indexterm id="ch02-gui2"><primary>GUI</primary></indexterm>
UNIX and Linux don't incorporate the user interface
into the kernel; instead, they let it be implemented by user level
programs. This applies for both text mode and graphical
environments.</para>
<para>This arrangement makes the system more flexible, but has
the disadvantage that it is simple to implement a different user
interface for each program, making the system harder to
learn.</para>
<para>The graphical environment primarily used with Linux
is called the X Window System <indexterm id="ch02-xwin"><primary>GUI
</primary><secondary>X Windows</secondary></indexterm> (X for short).
X also does not implement a user interface; it only implements a
window system, i.e., tools with which a graphical user interface can
be implemented. Some popular window managers are: fvwm <indexterm id="fvwm">
<primary>GUI</primary><secondary>fvwm</secondary></indexterm>, icewm
<indexterm id="icewm"><primary>GUI</primary><secondary>icewm</secondary>
</indexterm>, blackbox <indexterm id="blackbox"><primary>GUI</primary>
<secondary>blackbox</secondary></indexterm>, and windowmaker
<indexterm id="windowmaker"><primary>GUI</primary><secondary>windowmaker
</secondary></indexterm>. There are also two popular desktop managers,
KDE <indexterm id="ch02-kde"><primary>KDE</primary></indexterm> and Gnome.
<indexterm id="ch02-gnome"><primary>GNOME</primary></indexterm></para>
</sect2>
<sect2 id="networking">
<title>Networking</title>
<para>Networking <indexterm id="ch02-net"><primary>networking</primary>
</indexterm>is the act of connecting two or more computers
so that they can communicate with each other. The actual methods
of connecting and communicating are slightly complicated, but
the end result is very useful.</para>
<para>UNIX operating systems have many networking features.
Most basic services (filesystems, printing, backups, etc) can
be done over the network. This can make system administration
easier, since it allows centralized administration, while
still reaping in the benefits of microcomputing and distributed
computing, such as lower costs and better fault tolerance.</para>
<para>However, this book merely glances at networking; see the
<citetitle>Linux Network Administrators' Guide</citetitle>
<ulink url="http://www.tldp.org/LDP/nag2/index.html">
http://www.tldp.org/LDP/nag2/index.html</ulink>
<indexterm id="ch02-nag"><primary>networking</primary><secondary>Network
Admin Guide (NAG)</secondary></indexterm> for
more information, including a basic description of how networks
operate.</para>
</sect2>
<sect2 id="network-logins">
<title>Network logins</title>
<para>Network logins<indexterm id="logging-in">
<primary>logging in</primary></indexterm> work a little differently
than normal logins. For each person logging in via the network
there is a separate virtual network connection,
and there can be any number of these depending on the available
bandwidth. It is therefore not possible to run a separate
<command>getty</command><indexterm id="ch02-getty3"><primary>getty
</primary></indexterm> for each possible virtual connection.
There are also several different ways to log in via a network,
<command>telnet</command><indexterm id="ch02-telnet"><primary>telnet
</primary></indexterm> and <command>ssh</command>
<indexterm id="ch02-ssh"><primary>ssh</primary></indexterm> being
the major ones in TCP/IP networks.</para>
<para>These days many Linux system administrators consider
<command>telnet</command> and <command>rlogin</command> to be
insecure and prefer <command>ssh</command>, the ``secure shell'',
which encrypts traffic going over the network, thereby making it far
less likely that the malicious can ``sniff'' your connection and gain
sensitive data like usernames and passwords. It is highly recommended
you use <command>ssh</command> rather than <command>telnet</command>
or <command>rlogin</command>.</para>
<para>Network logins<indexterm id="logging-in2">
<primary>Logging in</primary></indexterm> have, instead of a herd
of <command>getty</command>s<indexterm id="ch02-getty4">
<primary>getty</primary></indexterm>, a single daemon per way
of logging in
(<command>telnet</command><indexterm id="ch02-telnet2">
<primary>telnet</primary></indexterm> and <command>ssh</command>
<indexterm id="ch02-ssh2"><primary>ssh</primary></indexterm> have
separate daemons) that listens for all incoming login attempts.
When it notices one, it starts a new instance of itself to
handle that single attempt; the original instance continues to
listen for other attempts. The new instance works similarly
to <command>getty</command>.</para>
</sect2>
<sect2 id="nfs">
<title>Network file systems</title>
<para>One of the more useful things that can be done with
networking services is sharing files via a <glossterm>network
file system</glossterm>. Depending on your network this could
be done over the Network File System (NFS)<indexterm id="ch02-nfs">
<primary>Network File System (NFS)</primary></indexterm>, or over
the Common Internet File System (CIFS)<indexterm id="ch02-cifs">
<primary>Common Internet File System (CIFS)</primary></indexterm>.
NFS is typically a 'UNIX' based service. In Linux, NFS is supported
by the kernel<indexterm id="ch02-kern-nfs"><primary>kernel</primary>
<secondary>NFS</secondary></indexterm>. CIFS however is not. In Linux,
CIFS is supported by Samba<indexterm id="ch02-samba"><primary>Samba
</primary></indexterm> <ulink url="http://www.samba.org">
http://www.samba.org</ulink>.
<para>With a network file system any file operations done by
a program on one machine are sent over the network to another
computer. This fools the program to think that all the files
on the other computer are actually on the computer the program
is running on. This makes information sharing extremely simple,
since it requires no modifications to programs.</para>
<para>This will be covered in more detail in
<xref linkend="net-attached">.</para>
</sect2>
<sect2 id="mail">
<title>Mail</title>
<para>Electronic mail<indexterm id="ch02-email">
<primary>email</primary></indexterm> is the most popularly used method for
communicating via computer. An electronic letter is stored in a
file using a special format, and special mail programs are used
to send and read the letters.</para>
<para>Each user has an <glossterm>incoming mailbox</glossterm>
(a file in the special format), where all new mail is stored.
When someone sends mail, the mail program locates the receiver's
mailbox and appends the letter to the mailbox file. If the
receiver's mailbox is in another machine, the letter is sent to
the other machine, which delivers it to the mailbox as it best
sees fit.</para>
<para>The mail system consists of many programs. The
delivery of mail to local or remote mailboxes is done by one
program (the <glossterm>mail transfer agent</glossterm> (MTA)
<indexterm id="ch02-email2"><primary>mail transfer agent (MTA)
</primary></indexterm>, e.g., <command>sendmail</command>
<indexterm id="ch02-email3"><primary>mail transfer agent (MTA)
</primary><secondary>sendmail</secondary></indexterm>
or <command>postfix</command><indexterm id="ch02-email4">
<primary>mail transfer agent (MTA)</primary>
<secondary>postfix</secondary></indexterm>
), while the programs users use are many and varied
(<glossterm>mail user agent</glossterm> (MUA)
<indexterm id="ch02-email5"><primary>mail user agent</primary>
</indexterm>, e.g., <command>pine</command>
<indexterm id="ch02-email6"><primary>mail user agent</primary>
<secondary>pine</secondary></indexterm>, or
<command>evolution</command>
<indexterm id="ch02-email7"><primary>mail user agent</primary>
<secondary>evolution</secondary></indexterm>.
The mailboxes are usually stored
in <filename>/var/spool/mail</filename> until the user's MUA
retrieves them.</para>
<para>For more information on setting up and running mail services
you can read the Mail Administrator HOWTO at
<ulink url="http://www.tldp.org/HOWTO/Mail-Administrator-HOWTO.html">
http://www.tldp.org/HOWTO/Mail-Administrator-HOWTO.html</ulink>, or
visit the sendmail or postfix's website.
<ulink url="http://www.sendmail.org/">http://www.sendmail.org/</ulink>,
or <ulink url="http://www.postfix.org/">http://www.postfix.org/
</ulink>.</para>
</sect2>
<sect2 id="printing">
<title>Printing</title>
<para>Only one person can use a printer<indexterm id="ch02-print">
<primary>printing</primary></indexterm> at one time, but it is
uneconomical not to share printers between users. The printer is
therefore managed by software that implements a <glossterm>print
queue</glossterm><indexterm id="ch02-print2"><primary>printing
</primary><secondary>queue</secondary></indexterm>: all print
jobs are put into a queue and
whenever the printer is done with one job, the next one is sent
to it automatically. This relieves the users from organizing
the print queue and fighting over control of the printer.
Instead, they form a new queue <emphasis>at</emphasis>
the printer, waiting for their printouts, since no one ever seems to
be able to get the queue software to know exactly when anyone's printout
is really finished. This is a great boost to intra-office social
relations.</para>
<para>The print queue software also <glossterm>spools</glossterm>
<indexterm id="ch02-print3"><primary>printing</primary>
<secondary>spools</secondary></indexterm> the printouts on
disk, i.e., the text is kept in a file while
the job is in the queue. This allows an application program
to spit out the print jobs quickly to the print queue software;
the application does not have to wait until the job is actually
printed to continue. This is really convenient, since it
allows one to print out one version, and not have to wait for
it to be printed before one can make a completely revised new
version.</para>
<para>You can refer to the Printing-HOWTO located at
<ulink url="http://www.tldp.org/HOWTO/Printing-HOWTO/index.html">
http://www.tldp.org/HOWTO/Printing-HOWTO/index.html</ulink>
for more help in setting up printers.</para>
</sect2>
<sect2 id="fs-layout">
<title>The filesystem layout</title>
<para>The filesystem<indexterm id="ch02-filesystem">
<primary>filesystem</primary></indexterm> is divided into many parts;
usually along the lines of a root filesystem with
<filename>/bin</filename>
<indexterm id="ch02-filesystem2"><primary>filesystem</primary>
<secondary>/bin</secondary></indexterm>,
<filename>/lib</filename>
<indexterm id="ch02-filesystem3"><primary>filesystem</primary>
<secondary>/lib</secondary></indexterm>,
<filename>/etc</filename>
<indexterm id="ch02-filesystem4"><primary>filesystem</primary>
<secondary>/etc</secondary></indexterm>,
<filename>/dev</filename>
<indexterm id="ch02-filesystem5"><primary>filesystem</primary>
<secondary>/dev</secondary></indexterm>, and a few others;
a <filename>/usr</filename>
<indexterm id="ch02-filesystem6"><primary>filesystem</primary>
<secondary>/usr</secondary></indexterm> filesystem with
programs and unchanging data;
<filename>/var</filename>
<indexterm id="ch02-filesystem7"><primary>filesystem</primary>
<secondary>/var</secondary></indexterm> filesystem with changing
data (such as log files); and a
<filename>/home</filename>
<indexterm id="ch02-filesystem8"><primary>filesystem</primary>
<secondary>/home</secondary></indexterm> for everyone's personal
files. Depending on the hardware configuration and the decisions
of the system administrator, the division can be different;
it can even be all in one filesystem.</para>
<para><xref linkend="dir-tree-overview"> describes the filesystem
layout in some little detail; the Filesystem Hierarchy Standard
<indexterm id="ch02-fhs"><primary>Filesystem Hierarchy Standard (FHS)
</primary></indexterm>. covers
it in somewhat more detail. This can be found on the web at:
<ulink url="http://www.pathname.com/fhs/">
http://www.pathname.com/fhs/</ulink></para>
</sect2>
</sect1>
</chapter>