
<chapter id="overview">
<title>Overview of a Linux System</title>

	<blockquote><para><quote>God saw everything that he
	had made, and saw that it was very good. </quote> --  Bible
	King James Version.  Genesis 1:31</para></blockquote>

	<para>This chapter gives an overview of a Linux system.  First,
	the major services provided by the operating system are described.
	Then, the programs that implement these services are described
	with a considerable lack of detail.  The purpose of this chapter
	is to give an understanding of the system as a whole; each part
	is described in detail elsewhere.</para>

<sect1 id="various-parts">
<title>Various parts of an operating system</title>

	<para>UNIX and  'UNIX-like' operating systems (such as Linux) consist
	of a <glossterm>kernel</glossterm> and some
	<glossterm>system programs</glossterm>.  There are also some
	<glossterm>application programs</glossterm> for doing work.
	The kernel <indexterm id="ch02-kern-over"><primary>kernel</primary>
	<secondary>overview</secondary></indexterm>is the heart of the operating
	system.  In fact, it is often mistakenly considered to be the
	operating system itself, but it is not.  An operating system
	provides many more services than a plain kernel.</para>

	<para>The kernel keeps track of files on the disk, starts programs and runs
	them concurrently, assigns memory and other resources to various
	processes, receives packets from and sends packets to the network,
	and so on.  The kernel does very little by itself, but it provides
	tools with which all services can be built.  It also prevents anyone
	from accessing the hardware directly, forcing everyone to use the
	tools it provides.
	This way the kernel provides some protection for users from each
	other.  The tools provided by the kernel are used via
	<glossterm>system calls</glossterm>.  See manual page section 2 for more
	information on these.  </para>
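	<para>As a rough illustration of what using the kernel's tools looks
	like from a program, the sketch below uses the thin wrappers in
	Python's <literal>os</literal> module.  The function name
	<function>roundtrip</function> and the scratch file name are
	made up for this example; the manual page references in the comments
	name the system call each wrapper is normally built on, although the
	C library may choose a close relative (such as openat(2)).</para>

<programlisting>
```python
import os
import tempfile


def roundtrip(message: bytes) -> bytes:
    """Write message to a scratch file and read it back again."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")   # hypothetical file name

    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)   # open(2)
    try:
        os.write(fd, message)                             # write(2)
    finally:
        os.close(fd)                                      # close(2)

    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(message))                  # read(2)
    finally:
        os.close(fd)

    os.unlink(path)                                       # unlink(2)
    os.rmdir(directory)                                   # rmdir(2)
    return data
```
</programlisting>

	<para>The program never touches the disk hardware itself; every
	step is a request to the kernel.</para>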

	<para>The system programs use the tools provided by the kernel to
	implement the various services required from an operating system.
	System programs, and all other programs, run `on top of the
	kernel', in what is called the <glossterm>user mode</glossterm>.
	The difference between system and application programs is
	one of intent: applications are intended for getting useful
	things done (or for playing, if it happens to be a game),
	whereas system programs are needed to get the system working.
	A word processor is an application; <command>mount</command>
	is a system program.  The difference is often somewhat blurry,
	however, and is important only to compulsive categorizers.</para>

	<para>An operating system can also contain compilers and their
	corresponding libraries (GCC and the C library in particular under
	Linux), although not all programming languages need be part of
	the operating system.  Documentation, and sometimes even games,
	can also be part of it.  Traditionally, the operating system has
	been defined by the contents of the installation tape or disks;
	with Linux it is not as clear since it is spread all over the
	FTP sites of the world.</para>

</sect1>

<sect1 id="kernel-parts">
<title>Important parts of the kernel</title>

	<para>The Linux kernel <indexterm id="ch02-kern-over2"><primary>kernel
	</primary><secondary>overview</secondary></indexterm> consists of several important
	parts: process
	management, memory management, hardware device drivers, filesystem
	drivers, network management, and various other bits and pieces.
	<xref linkend="kerneloverview">
	shows some of them.</para>

		<figure id="kerneloverview" float="1">
		<title>Some of the more important parts of the Linux kernel</title>
		<graphic fileref="overview-kernel.png">
		</figure>

	<para>Probably the most important parts of the kernel (nothing else
	works without them) are memory management and
	process management.  Memory management <indexterm id="kern-mem">
	<primary>kernel</primary><secondary>memory management</secondary>
	</indexterm> takes care of assigning
	memory areas and swap space areas to processes, parts of the
	kernel, and for the buffer cache.  Process management
	<indexterm id="kern-proc-mgt"><primary>kernel</primary><secondary>
	process management</secondary></indexterm> creates
	processes, and implements multitasking by switching the
	active process on the processor.</para>

	<para>At the lowest level, the kernel contains a hardware device
	driver <indexterm id="hardware-driver"><primary>kernel</primary>
	<secondary>driver</secondary></indexterm>for each kind of hardware
	it supports.  Since the world is
	full of different kinds of hardware, the number of hardware device
	drivers is large.  There are often many otherwise similar pieces
	of hardware that differ in how they are controlled by software.
	The similarities make it possible to have general classes of
	drivers that support similar operations; each member of the class
	has the same interface to the rest of the kernel but differs in
	what it needs to do to implement them.	For example, all disk
	drivers look alike to the rest of the kernel, i.e., they all
	have operations like `initialize the drive', `read sector N',
	and `write sector N'.</para>
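	<para>The idea of a driver class can be sketched in a few lines of
	Python.  This is a toy model, not kernel code: the class names
	<literal>DiskDriver</literal>, <literal>RamDisk</literal> and
	<literal>ImageFileDisk</literal>, and the 512-byte sector size, are
	assumptions chosen for the example.  Both drivers offer the same
	three operations but implement them differently.</para>

<programlisting>
```python
from abc import ABC, abstractmethod

SECTOR_SIZE = 512  # bytes per sector in this toy model


class DiskDriver(ABC):
    """The interface every disk driver presents to the rest of the system."""

    @abstractmethod
    def initialize(self) -> None: ...

    @abstractmethod
    def read_sector(self, n: int) -> bytes: ...

    @abstractmethod
    def write_sector(self, n: int, data: bytes) -> None: ...


class RamDisk(DiskDriver):
    """Keeps each sector as a separate entry in a dictionary."""

    def initialize(self) -> None:
        self.sectors = {}

    def read_sector(self, n: int) -> bytes:
        return self.sectors.get(n, bytes(SECTOR_SIZE))

    def write_sector(self, n: int, data: bytes) -> None:
        self.sectors[n] = data.ljust(SECTOR_SIZE, b"\0")[:SECTOR_SIZE]


class ImageFileDisk(DiskDriver):
    """Keeps all sectors in one flat byte array, like a disk image."""

    def __init__(self, sector_count: int) -> None:
        self.sector_count = sector_count

    def initialize(self) -> None:
        self.image = bytearray(self.sector_count * SECTOR_SIZE)

    def read_sector(self, n: int) -> bytes:
        start = n * SECTOR_SIZE
        return bytes(self.image[start:start + SECTOR_SIZE])

    def write_sector(self, n: int, data: bytes) -> None:
        start = n * SECTOR_SIZE
        padded = data.ljust(SECTOR_SIZE, b"\0")[:SECTOR_SIZE]
        self.image[start:start + SECTOR_SIZE] = padded


def copy_sector(source: DiskDriver, target: DiskDriver, n: int) -> None:
    """Generic code: it works with any driver because it uses only the interface."""
    target.write_sector(n, source.read_sector(n))
```
</programlisting>

	<para><function>copy_sector</function> plays the part of the rest
	of the kernel: it neither knows nor cares which kind of disk it is
	talking to.</para>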

	<para>Some software services provided by the kernel itself have
	similar properties, and can therefore be abstracted into classes.
	For example, the various network protocols have been abstracted
	into one programming interface, the BSD socket library.  Another
	example is the <glossterm>virtual filesystem</glossterm>
	<indexterm id="VFS"><primary>kernel</primary><secondary>virtual
	filesystem (VFS)</secondary></indexterm> (VFS)
	layer that abstracts the filesystem operations away from their
	implementation.  Each filesystem type provides an implementation
	of each filesystem operation.  When some entity tries to use
	a filesystem, the request goes via the VFS, which routes the
	request to the proper filesystem driver.</para>
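	<para>The routing step can be pictured with a small sketch.  It is
	only a model of the idea, under the assumption that the right driver
	is the one mounted at the longest matching directory prefix; the
	names <literal>Vfs</literal> and <literal>ToyFilesystem</literal>
	are invented for the example and do not correspond to kernel
	data structures.</para>

<programlisting>
```python
class ToyFilesystem:
    """Stand-in for a filesystem driver; file contents live in a dict."""

    def __init__(self, kind, files):
        self.kind = kind
        self.files = files

    def read(self, relative_path):
        return self.files[relative_path]


class Vfs:
    """Routes each request to the filesystem mounted closest to the path."""

    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point, filesystem):
        self.mounts[mount_point] = filesystem

    def resolve(self, path):
        best = None
        for mount_point in self.mounts:
            prefix = mount_point.rstrip("/") + "/"
            if path == mount_point or path.startswith(prefix):
                # Prefer the longest (most specific) mount point.
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise FileNotFoundError(path)
        return self.mounts[best], path[len(best):].lstrip("/")

    def read(self, path):
        filesystem, relative_path = self.resolve(path)
        return filesystem.read(relative_path)
```
</programlisting>

	<para>A caller simply asks for <literal>vfs.read("/home/alice/notes")</literal>
	and never needs to know which driver served the request.</para>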

	<para>A more in-depth discussion of kernel internals can be found
	at <ulink url="http://www.tldp.org/LDP/lki/index.html">
	http://www.tldp.org/LDP/lki/index.html</ulink>.  This document was
	written for the 2.4 kernel.  When I find one for the 2.6 kernel, I
	will list it here.</para>

</sect1>

<sect1 id="major-services">
<title>Major services in a UNIX system</title>

	<para>This section describes some of the more important UNIX
	services, but without much detail.  They are described more
	thoroughly in later chapters.</para>

<sect2 id="init">
<title><command>init</command></title>

	<para>The single most important service in a UNIX system is
	provided by <command>init</command><indexterm id="ch02-init">
	<primary>init</primary></indexterm>.  <command>init</command>
	is started as the first process of every UNIX system, as the last
	thing the kernel does when it boots.  When <command>init</command>
	starts, it continues the boot process by doing various startup
	chores (checking and mounting filesystems, starting daemons,
	etc).</para>

	<para>The exact list of things that <command>init</command>
	does depends on which flavor it is; there are several to choose
	from.  <command>init</command> usually provides the concept of
	<glossterm>single user mode</glossterm><indexterm id="ch02-single-user">
	<primary>runlevels</primary><secondary>1 - single user
	</secondary></indexterm>, in which no one can
	log in and root uses a shell at the console; the usual mode is
	called <glossterm>multiuser mode</glossterm><indexterm id="multi-user">
        <primary>runlevels</primary><secondary>3 - multi-user</secondary>
	</indexterm>.  Some flavors
	generalize this as <glossterm>run levels</glossterm>; single
	and multiuser modes are considered to be two run levels, and
	there can be additional ones as well, for example, to run X on
	the console.</para>

	<para>Linux allows for up to 10
	<glossterm>runlevels</glossterm><indexterm id="rlevels">
        <primary>runlevels</primary></indexterm>, 0-9, but usually only some of
	these are defined by default.  Runlevel 0 <indexterm id="rlevel0">
        <primary>runlevels</primary><secondary>0 - shutdown</secondary>
	</indexterm>is defined as ``system
	halt''.  Runlevel 1 <indexterm id="rlevel1">
        <primary>runlevels</primary><secondary>1 - single-user</secondary>
	</indexterm>is defined as ``single user mode''.  Runlevel 3
	<indexterm id="rlevel3"><primary>runlevels</primary>
	<secondary>3 - multi-user</secondary></indexterm> is defined as
	``multi user'' because it is the runlevel that the system boots into
	under normal day-to-day conditions.  Runlevel 5 <indexterm id="rlevel5">
	<primary>runlevels</primary><secondary>5 - multi-user with GUI
	</secondary></indexterm> is typically the same as 3 except that a GUI
	<indexterm id="ch02-xwindows"><primary>GUI</primary></indexterm>
	gets started also.
	Runlevel 6 <indexterm id="rlevel6"><primary>runlevels</primary>
	<secondary>6 - reboot</secondary></indexterm>is defined as ``system
	reboot''.  Other runlevels are
	dependent on how your particular distribution has defined them,
	and they vary significantly between distributions.  Looking at
	the contents of <filename>/etc/inittab</filename>
	<indexterm id="ch02-inittab"><primary>inittab</primary></indexterm>
	usually will <indexterm id="ch02-rlevel-inittab"><primary>runlevels
	</primary><secondary>inittab</secondary></indexterm>
	give some hint what the predefined runlevels are and what they
	have been defined as.</para>
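	<para>For System V style <command>init</command>, each line of
	<filename>/etc/inittab</filename> has four colon-separated fields:
	an identifier, the runlevels it applies to, an action, and the
	process to run.  The following sketch reads such lines and picks
	out the default runlevel.  The sample entries and script paths are
	illustrative assumptions only; real files differ between
	distributions, and other flavors of <command>init</command> may not
	use this file at all.</para>

<programlisting>
```python
from typing import NamedTuple


class InittabEntry(NamedTuple):
    ident: str
    runlevels: str
    action: str
    process: str


def parse_inittab(text):
    """Split each non-comment line into its four colon-separated fields."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ident, runlevels, action, process = line.split(":", 3)
        entries.append(InittabEntry(ident, runlevels, action, process))
    return entries


def default_runlevel(entries):
    """Return the runlevel named by the initdefault entry, if there is one."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


# Illustrative entries only; the paths are assumptions for this example.
SAMPLE = """\
# id:runlevels:action:process
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
1:2345:respawn:/sbin/getty 38400 tty1
"""
```
</programlisting>

	<para>With the sample above,
	<literal>default_runlevel(parse_inittab(SAMPLE))</literal> yields
	the string <literal>3</literal>.</para>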

	<para>In normal operation, <command>init</command>
	<indexterm id="ch02-init2"><primary>commands</primary>
	<secondary>init</secondary></indexterm> makes
	sure <command>getty</command><indexterm id="ch02-getty">
	<primary>commands</primary><secondary>getty</secondary></indexterm>
	is working (to allow users to log in)
	and adopts orphan processes (processes whose parent has died; in
	UNIX <emphasis>all</emphasis> processes <emphasis>must</emphasis>
	be in a single tree, so orphans must be adopted).</para>
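	<para>The adoption rule can be modelled with a toy process table.
	This is a simulation written for this section, not a view of the
	real kernel: the class <literal>ProcessTable</literal> and its
	methods are hypothetical, and process 1 stands for
	<command>init</command>.</para>

<programlisting>
```python
INIT_PID = 1


class ProcessTable:
    """Toy process table: maps each PID to the PID of its parent."""

    def __init__(self):
        self.parent_of = {INIT_PID: 0}   # init itself has no real parent
        self.next_pid = 2

    def spawn(self, parent_pid):
        """Create a new process as a child of parent_pid."""
        pid = self.next_pid
        self.next_pid += 1
        self.parent_of[pid] = parent_pid
        return pid

    def exit(self, pid):
        """Remove a process and hand its children over to init."""
        if pid == INIT_PID:
            raise ValueError("init must not exit")
        del self.parent_of[pid]
        for child, parent in list(self.parent_of.items()):
            if parent == pid:
                self.parent_of[child] = INIT_PID   # the orphan is adopted
```
</programlisting>

	<para>If a shell started by <command>init</command> exits while one
	of its background jobs is still running, the job's parent becomes
	process 1, so the tree stays in one piece.</para>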

	<para>When the system is shut down, it is <command>init</command>
	that is in charge of killing all other processes, unmounting all
	filesystems and stopping the processor, along with anything else
	it has been configured to do.</para>

</sect2>

<sect2 id="terminal-logins">
<title>Logins from terminals</title>

	<para>Logins from terminals (via serial lines) and the console
	(when not running X) are provided by the <command>getty</command>
	<indexterm id="ch02-getty2"><primary>getty</primary></indexterm>
	program.  <command>init</command><indexterm id="ch02-init3">
	<primary>init</primary></indexterm> starts a separate instance of
	<command>getty</command> for each terminal upon which logins are to
	be allowed.  <command>getty</command> reads the username and runs
	the <command>login</command><indexterm id="ch02-login">
	<primary>login</primary></indexterm> program, which reads the password.
	If the username and password are correct, <command>login</command> runs
	the shell. When the shell terminates, i.e., the user logs out, or
	when <command>login</command> terminates because the username and
	password didn't match, <command>init</command> notices this and
	starts a new instance of <command>getty</command>. The kernel has no
	notion of logins; this is all handled by the
	<glossterm>system programs</glossterm>.</para>

</sect2>

<sect2 id="syslog">
<title>Syslog</title>

	<para>The kernel and many <glossterm>system programs</glossterm>
	produce error, warning, and other messages.  It is often important
	that these messages can be viewed later, even much later, so they
	should be written to a file.  The program doing this is
	<command>syslog</command> <indexterm id="ch02-syslog"><primary>syslog
	</primary></indexterm>.  It can be configured to sort the
	messages to different files according to writer or degree of
	importance.  For example, kernel messages are often directed to a
	separate file from the others, since kernel messages are often more
	important and need to be read
	regularly to spot problems.</para>
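	<para>The sorting can be sketched as a small rule table.  The
	sketch assumes the traditional convention that a rule naming a
	priority matches messages of that priority or anything more severe;
	the rules and log file names in <literal>RULES</literal> are made
	up for the example and are not taken from any particular
	configuration.</para>

<programlisting>
```python
# Priorities from most severe (index 0) to least severe (index 7).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

# (writer, least severe priority accepted, destination file)
# Hypothetical rules; "*" stands for any writer.
RULES = [
    ("kern", "debug", "/var/log/kern.log"),
    ("mail", "info", "/var/log/mail.log"),
    ("*", "err", "/var/log/errors.log"),
]


def destinations(facility, severity):
    """Return every file a message from this writer should be copied to."""
    level = SEVERITIES.index(severity)
    matches = []
    for rule_facility, threshold, path in RULES:
        facility_ok = rule_facility in ("*", facility)
        # A lower index means a more severe message.
        severity_ok = SEVERITIES.index(threshold) >= level
        if facility_ok and severity_ok:
            matches.append(path)
    return matches
```
</programlisting>

	<para>Here every kernel message goes to its own file, while
	messages from any writer that are errors or worse are also collected
	in one place.</para>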

	<para><xref linkend="system-logs"> will provide more on
	 this.</para>

</sect2>

<sect2 id="cron">
<title>Periodic command execution: <command>cron</command> and
<command>at</command></title>

	<para>Both users and system administrators often need
	to run commands periodically.  For example, the system administrator
	might want to run a command that removes old files from the
	directories for temporary files (<filename>/tmp</filename> and
	<filename>/var/tmp</filename>), to keep the disks from filling up,
	since not all programs clean up after themselves correctly.</para>

	<para>The <command>cron</command> <indexterm id="ch02-cron">
	<primary>cron</primary></indexterm> service is set up to do this.
	Each user can have a <filename>crontab</filename>
	<indexterm id="ch02-crontab"><primary>cron</primary><secondary>crontab
	</secondary></indexterm> file, where she
	lists the commands she wishes to execute and the times they should
	be executed.  The <command>cron</command> daemon takes care of
	starting the commands when specified.</para>
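	<para>A <filename>crontab</filename> line starts with five time
	fields (minute, hour, day of month, month, day of week) followed by
	the command.  The sketch below decides whether such a line is due at
	a given moment.  It is a simplified model, not the
	<command>cron</command> daemon's own code: it handles only numbers,
	lists, ranges, <literal>*</literal> and steps such as
	<literal>*/15</literal>, not month or weekday names, and the command
	in the usage example is hypothetical.</para>

<programlisting>
```python
from datetime import datetime


def expand(field, low, high):
    """Turn one crontab time field into the set of values it allows."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def is_due(line, when):
    """Report whether the crontab line should run at the datetime 'when'."""
    minute, hour, day, month, weekday, _command = line.split(None, 5)
    if when.minute not in expand(minute, 0, 59):
        return False
    if when.hour not in expand(hour, 0, 23):
        return False
    if when.month not in expand(month, 1, 12):
        return False
    day_ok = when.day in expand(day, 1, 31)
    # crontab counts Sunday as 0 (or 7); datetime counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    weekday_ok = cron_weekday in {value % 7 for value in expand(weekday, 0, 7)}
    if day == "*" or weekday == "*":
        return day_ok and weekday_ok
    # When both day fields are restricted, a match on either one is enough.
    return day_ok or weekday_ok
```
</programlisting>

	<para>For instance, the line
	<literal>30 4 * * 1 /usr/local/bin/clean-tmp</literal> would be due
	at 04:30 on Mondays and at no other time.</para>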

	<para>The <command>at</command> <indexterm id="ch02-at"><primary>at
	</primary></indexterm> service is similar to
	<command>cron</command>, but it is once only: the command is
	executed at the given time, but it is not repeated.</para>

	<para>See the manual pages cron(8), crontab(1), crontab(5), at(1)
	and atd(8) for more in-depth information.
	<xref linkend="task-automation"> will cover this in more detail.</para>

</sect2>

<sect2 id="gui">
<title>Graphical user interface</title>

	<para><indexterm id="ch02-gui2"><primary>GUI</primary></indexterm>
	UNIX and Linux don't incorporate the user interface
	into the kernel; instead, they let it be implemented by user level
	programs.  This applies for both text mode and graphical
	environments.</para>

	<para>This arrangement makes the system more flexible, but has
	the disadvantage that it is simple to implement a different user
	interface for each program, making the system harder to
	learn.</para>

	<para>The graphical environment primarily used with Linux
	is called the X Window System <indexterm id="ch02-xwin"><primary>GUI
	</primary><secondary>X Windows</secondary></indexterm> (X for short).
	X also does not implement a user interface; it only implements a
	window system, i.e., tools with which a graphical user interface can
	be implemented.  Some popular window managers are: fvwm <indexterm id="fvwm">
	<primary>GUI</primary><secondary>fvwm</secondary></indexterm>, icewm
	<indexterm id="icewm"><primary>GUI</primary><secondary>icewm</secondary>
        </indexterm>, blackbox <indexterm id="blackbox"><primary>GUI</primary>
	<secondary>blackbox</secondary></indexterm>, and windowmaker
	<indexterm id="windowmaker"><primary>GUI</primary><secondary>windowmaker
	</secondary></indexterm>. There are also two popular desktop environments,
	KDE <indexterm id="ch02-kde"><primary>KDE</primary></indexterm> and GNOME
	<indexterm id="ch02-gnome"><primary>GNOME</primary></indexterm>.</para>

</sect2>

<sect2 id="networking">
<title>Networking</title>

	<para>Networking <indexterm id="ch02-net"><primary>networking</primary>
	</indexterm>is the act of connecting two or more computers
	so that they can communicate with each other.  The actual methods
	of connecting and communicating are slightly complicated, but
	the end result is very useful.</para>

	<para>UNIX operating systems have many networking features.
	Most basic services (filesystems, printing, backups, etc) can
	be done over the network.  This can make system administration
	easier, since it allows centralized administration, while
	still reaping the benefits of microcomputing and distributed
	computing, such as lower costs and better fault tolerance.</para>

	<para>However, this book merely glances at networking; see the
	<citetitle>Linux Network Administrators' Guide</citetitle>
	<ulink url="http://www.tldp.org/LDP/nag2/index.html">
	http://www.tldp.org/LDP/nag2/index.html</ulink>
	<indexterm id="ch02-nag"><primary>networking</primary><secondary>Network
	Admin Guide (NAG)</secondary></indexterm> for
	more information, including a basic description of how networks
	operate.</para>

</sect2>

<sect2 id="network-logins">
<title>Network logins</title>

	<para>Network logins<indexterm id="logging-in">
	<primary>logging in</primary></indexterm> work a little differently
	than normal logins.  For each person logging in via the network
	there is a separate virtual network connection,
	and there can be any number of these depending on the available
	bandwidth.  It is therefore not possible to run a separate
	<command>getty</command><indexterm id="ch02-getty3"><primary>getty
	</primary></indexterm> for each possible virtual connection.
	There are also several different ways to log in via a network,
	<command>telnet</command><indexterm id="ch02-telnet"><primary>telnet
	</primary></indexterm> and <command>ssh</command>
	<indexterm id="ch02-ssh"><primary>ssh</primary></indexterm> being
	the major ones in TCP/IP networks.</para>

	<para>These days many Linux system administrators consider
	<command>telnet</command> and <command>rlogin</command> to be
	insecure and prefer <command>ssh</command>, the ``secure shell'',
	which encrypts traffic going over the network, thereby making it far
	less likely that the malicious can ``sniff'' your connection and gain
	sensitive data like usernames and passwords.  It is highly recommended
	you use <command>ssh</command> rather than <command>telnet</command>
	or <command>rlogin</command>.</para>

	<para>Network logins<indexterm id="logging-in2">
	<primary>Logging in</primary></indexterm> have, instead of a herd
	of <command>getty</command>s<indexterm id="ch02-getty4">
	<primary>getty</primary></indexterm>, a single daemon per way
	of logging in
	(<command>telnet</command><indexterm id="ch02-telnet2">
	<primary>telnet</primary></indexterm> and <command>ssh</command>
	<indexterm id="ch02-ssh2"><primary>ssh</primary></indexterm> have
	separate daemons) that listens for all incoming login attempts.
	When it notices one, it starts a new instance of itself to
	handle that single attempt; the original instance continues to
	listen for other attempts.  The new instance works similarly
	to <command>getty</command>.</para>
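	<para>The listen-and-fork pattern can be sketched as follows.  This
	is only an illustration of the process structure, not a login
	service: there is no authentication, and the function names and the
	one-line greeting exchange are assumptions made for the
	example.</para>

<programlisting>
```python
import os
import socket


def handle_login_attempt(conn):
    """Runs in the new instance: talk to exactly one client, then finish."""
    with conn:
        name = conn.recv(64).decode().strip()
        conn.sendall(f"hello {name}\n".encode())


def accept_and_fork(listener):
    """Accept one connection and hand it to a freshly forked child."""
    conn, _peer = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: it no longer needs the listening socket.
        listener.close()
        try:
            handle_login_attempt(conn)
        finally:
            os._exit(0)
    # Parent: the child owns the connection now; go back to listening.
    conn.close()
    return pid
```
</programlisting>

	<para>A real daemon would call <function>accept_and_fork</function>
	in an endless loop and collect finished children with
	<function>os.waitpid</function>.</para>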

</sect2>

<sect2 id="nfs">
<title>Network file systems</title>
	<para>One of the more useful things that can be done with
	networking services is sharing files via a <glossterm>network
	file system</glossterm>.  Depending on your network this could
	be done over the Network File System (NFS)<indexterm id="ch02-nfs">
	<primary>Network File System (NFS)</primary></indexterm>, or over
	the Common Internet File System (CIFS)<indexterm id="ch02-cifs">
	<primary>Common Internet File System (CIFS)</primary></indexterm>.
	NFS is typically a UNIX-based service.  In Linux, NFS is supported
	by the kernel<indexterm id="ch02-kern-nfs"><primary>kernel</primary>
	<secondary>NFS</secondary></indexterm>.  CIFS, however, is not.  In Linux,
	CIFS is supported by Samba<indexterm id="ch02-samba"><primary>Samba
	</primary></indexterm> <ulink url="http://www.samba.org">
	http://www.samba.org</ulink>.</para>

	<para>With a network file system any file operations done by
	a program on one machine are sent over the network to another
	computer.  This fools the program into thinking that all the files
	on the other computer are actually on the computer the program
	is running on.	This makes information sharing extremely simple,
	since it requires no modifications to programs.</para>

	<para>This will be covered in more detail in
	<xref linkend="net-attached">.</para>
</sect2>

<sect2 id="mail">
<title>Mail</title>
	<para>Electronic mail<indexterm id="ch02-email">
	<primary>email</primary></indexterm> is the most popularly used method for
	communicating via computer.  An electronic letter is stored in a
	file using a special format, and special mail programs are used
	to send and read the letters.</para>

	<para>Each user has an <glossterm>incoming mailbox</glossterm>
	(a file in the special format), where all new mail is stored.
	When someone sends mail, the mail program locates the receiver's
	mailbox and appends the letter to the mailbox file.  If the
	receiver's mailbox is in another machine, the letter is sent to
	the other machine, which delivers it to the mailbox as it best
	sees fit.</para>
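	<para>Appending a letter to a local mailbox can be sketched with
	Python's standard <literal>mailbox</literal> module.  The sketch
	assumes the mailbox uses the common mbox format, which the text
	above does not specify; the helper names
	<function>deliver</function> and <function>subjects</function> and
	the addresses in the usage note are invented for the
	example.</para>

<programlisting>
```python
import mailbox
from email.message import EmailMessage


def deliver(mailbox_path, sender, recipient, subject, body):
    """Append one letter to the mbox file at mailbox_path."""
    letter = EmailMessage()
    letter["From"] = sender
    letter["To"] = recipient
    letter["Subject"] = subject
    letter.set_content(body)

    box = mailbox.mbox(mailbox_path)   # the file is created if missing
    box.lock()                         # keep other writers out meanwhile
    try:
        box.add(letter)
        box.flush()
    finally:
        box.unlock()
        box.close()


def subjects(mailbox_path):
    """List the subject of every letter in the mailbox, oldest first."""
    box = mailbox.mbox(mailbox_path)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()
```
</programlisting>

	<para>Calling <function>deliver</function> twice on the same file
	leaves two letters in it, in the order they arrived.</para>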

	<para>The mail system consists of many programs.  The
	delivery of mail to local or remote mailboxes is done by one
	program (the <glossterm>mail transfer agent</glossterm> (MTA)
	<indexterm id="ch02-email2"><primary>mail transfer agent (MTA)
	</primary></indexterm>, e.g., <command>sendmail</command>
	<indexterm id="ch02-email3"><primary>mail transfer agent (MTA)
	</primary><secondary>sendmail</secondary></indexterm>
	or <command>postfix</command><indexterm id="ch02-email4">
	<primary>mail transfer agent (MTA)</primary>
	<secondary>postfix</secondary></indexterm>
	), while the programs users use are many and varied
	(<glossterm>mail user agent</glossterm> (MUA)
	<indexterm id="ch02-email5"><primary>mail user agent</primary>
	</indexterm>, e.g., <command>pine</command>
	<indexterm id="ch02-email6"><primary>mail user agent</primary>
	<secondary>pine</secondary></indexterm>, or
	<command>evolution</command>
	<indexterm id="ch02-email7"><primary>mail user agent</primary>
	<secondary>evolution</secondary></indexterm>).
	The mailboxes are usually stored
	in <filename>/var/spool/mail</filename> until the user's MUA
	retrieves them.</para>

	<para>For more information on setting up and running mail services
	you can read the Mail Administrator HOWTO at
	<ulink url="http://www.tldp.org/HOWTO/Mail-Administrator-HOWTO.html">
	http://www.tldp.org/HOWTO/Mail-Administrator-HOWTO.html</ulink>, or
	visit the sendmail or postfix websites at
	<ulink url="http://www.sendmail.org/">http://www.sendmail.org/</ulink>
	and <ulink url="http://www.postfix.org/">http://www.postfix.org/
	</ulink>.</para>

</sect2>

<sect2 id="printing">
<title>Printing</title>
	<para>Only one person can use a printer<indexterm id="ch02-print">
	<primary>printing</primary></indexterm> at one time, but it is
	uneconomical not to share printers between users.  The printer is
	therefore managed by software that implements a <glossterm>print
	queue</glossterm><indexterm id="ch02-print2"><primary>printing
	</primary><secondary>queue</secondary></indexterm>: all print
	jobs are put into a queue and
	whenever the printer is done with one job, the next one is sent
	to it automatically.  This relieves the users from organizing
	the print queue and fighting over control of the printer.
	Instead, they form a new queue <emphasis>at</emphasis>
	the printer, waiting for their printouts, since no one ever seems to
	be able to get the queue software to know exactly when anyone's printout
	is really finished.  This is a great boost to intra-office social
	relations.</para>

	<para>The print queue software also <glossterm>spools</glossterm>
	<indexterm id="ch02-print3"><primary>printing</primary>
	<secondary>spools</secondary></indexterm> the printouts on
	disk, i.e., the text is kept in a file while
	the job is in the queue.  This allows an application program
	to spit out the print jobs quickly to the print queue software;
	the application does not have to wait until the job is actually
	printed to continue.  This is really convenient, since it
	allows one to print out one version, and not have to wait for
	it to be printed before one can make a completely revised new
	version.</para>
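	<para>Queueing and spooling together can be modelled in a few
	lines.  This is a toy model rather than a real print system: the
	class <literal>PrintQueue</literal>, its method names and the spool
	file naming scheme are assumptions, and the printer is represented
	by any function that accepts the text of a job.</para>

<programlisting>
```python
import os
from collections import deque


class PrintQueue:
    """First-in, first-out print queue that spools each job to a file."""

    def __init__(self, spool_dir):
        self.spool_dir = spool_dir
        self.jobs = deque()       # spool file paths, oldest first
        self.next_job = 1

    def submit(self, owner, text):
        """Spool the text to disk and return at once with a job number."""
        job_id = self.next_job
        self.next_job += 1
        path = os.path.join(self.spool_dir, f"job{job_id:04d}-{owner}")
        with open(path, "w") as spool_file:
            spool_file.write(text)
        self.jobs.append(path)
        return job_id

    def print_next(self, printer):
        """Send the oldest spooled job to the printer, then delete its file."""
        if not self.jobs:
            return False
        path = self.jobs.popleft()
        with open(path) as spool_file:
            printer(spool_file.read())
        os.unlink(path)
        return True
```
</programlisting>

	<para><function>submit</function> returns as soon as the text is on
	disk, so the application can carry on; the jobs reach the printer
	later, one at a time and in order of arrival.</para>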

	<para>You can refer to the Printing-HOWTO located at
	<ulink url="http://www.tldp.org/HOWTO/Printing-HOWTO/index.html">
	http://www.tldp.org/HOWTO/Printing-HOWTO/index.html</ulink>
	for more help in setting up printers.</para>

</sect2>

<sect2 id="fs-layout">
<title>The filesystem layout</title>
	<para>The filesystem<indexterm id="ch02-filesystem">
	<primary>filesystem</primary></indexterm> is divided into many parts;
	usually along the lines of a root filesystem with
	<filename>/bin</filename>
	<indexterm id="ch02-filesystem2"><primary>filesystem</primary>
	<secondary>/bin</secondary></indexterm>,
	<filename>/lib</filename>
	<indexterm id="ch02-filesystem3"><primary>filesystem</primary>
	<secondary>/lib</secondary></indexterm>,
	<filename>/etc</filename>
	<indexterm id="ch02-filesystem4"><primary>filesystem</primary>
	<secondary>/etc</secondary></indexterm>,
	<filename>/dev</filename>
	<indexterm id="ch02-filesystem5"><primary>filesystem</primary>
	<secondary>/dev</secondary></indexterm>, and a few others;
	a <filename>/usr</filename>
	<indexterm id="ch02-filesystem6"><primary>filesystem</primary>
	<secondary>/usr</secondary></indexterm> filesystem with
        programs and unchanging data;
	<filename>/var</filename>
	<indexterm id="ch02-filesystem7"><primary>filesystem</primary>
	<secondary>/var</secondary></indexterm> filesystem with changing
	data (such as log files); and a
	<filename>/home</filename>
	<indexterm id="ch02-filesystem8"><primary>filesystem</primary>
        <secondary>/home</secondary></indexterm> for everyone's personal
	files.	Depending on the hardware configuration and the decisions
	of the system administrator, the division can be different;
	it can even be all in one filesystem.</para>
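	<para>One way to see how a particular machine has been divided is
	to compare the device number that the kernel reports for each
	directory; directories reporting the same number normally live on
	the same filesystem.  The helper name
	<function>group_by_filesystem</function> is made up for this sketch,
	and directories that do not exist are simply skipped.</para>

<programlisting>
```python
import os


def group_by_filesystem(paths):
    """Map each device number to the directories that report it."""
    groups = {}
    for path in paths:
        try:
            device = os.stat(path).st_dev
        except OSError:
            continue   # the directory does not exist on this system
        groups.setdefault(device, []).append(path)
    return groups
```
</programlisting>

	<para>Calling it with <literal>["/", "/usr", "/var", "/home"]</literal>
	gives a single group when everything is on one filesystem, and
	several groups when the tree has been split up.</para>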

	<para><xref linkend="dir-tree-overview"> describes the filesystem
	layout in a little detail; the Filesystem Hierarchy Standard
	<indexterm id="ch02-fhs"><primary>Filesystem Hierarchy Standard (FHS)
        </primary></indexterm> covers
	it in somewhat more detail.  This can be found on the web at:
	<ulink url="http://www.pathname.com/fhs/">
	http://www.pathname.com/fhs/</ulink>.</para>
</sect2>
</sect1>
</chapter>