Overview of a Linux System
God saw everything that he had made, and saw that it was very good. -- Bible King James Version. Genesis 1:31
This chapter gives an overview of a Linux system. First, the major services provided by the operating system are described. Then, the programs that implement these services are described with a considerable lack of detail. The purpose of this chapter is to give an understanding of the system as a whole; each part is described in detail elsewhere.

Various parts of an operating system

UNIX and "UNIX-like" operating systems (such as Linux) consist of a kernel and some system programs. There are also some application programs for doing work.

The kernel is the heart of the operating system. In fact, it is often mistakenly considered to be the operating system itself, but it is not: an operating system provides many more services than a plain kernel. The kernel keeps track of files on the disk, starts programs and runs them concurrently, assigns memory and other resources to various processes, receives packets from and sends packets to the network, and so on. The kernel does very little by itself, but it provides tools with which all services can be built. It also prevents anyone from accessing the hardware directly, forcing everyone to use the tools it provides. This way the kernel provides some protection for users from each other. The tools provided by the kernel are used via system calls. See manual page section 2 for more information on these.

The system programs use the tools provided by the kernel to implement the various services required from an operating system. System programs, and all other programs, run "on top of the kernel", in what is called user mode. The difference between system and application programs is one of intent: applications are intended for getting useful things done (or for playing, if it happens to be a game), whereas system programs are needed to get the system working. A word processor is an application; mount is a system program. The difference is often somewhat blurry, however, and is important only to compulsive categorizers.

An operating system can also contain compilers and their corresponding libraries (GCC and the C library in particular under Linux), although not all programming languages need be part of the operating system. Documentation, and sometimes even games, can also be part of it. Traditionally, the operating system has been defined by the contents of the installation tape or disks; with Linux it is not as clear, since it is spread all over the FTP sites of the world.

Important parts of the kernel

The Linux kernel consists of several important parts: process management, memory management, hardware device drivers, filesystem drivers, network management, and various other bits and pieces. The figure below shows some of them.
Some of the more important parts of the Linux kernel
Probably the most important parts of the kernel (nothing else works without them) are memory management and process management. Memory management takes care of assigning memory areas and swap space areas to processes, parts of the kernel, and the buffer cache. Process management creates processes, and implements multitasking by switching the active process on the processor.

At the lowest level, the kernel contains a hardware device driver for each kind of hardware it supports. Since the world is full of different kinds of hardware, the number of hardware device drivers is large. There are often many otherwise similar pieces of hardware that differ in how they are controlled by software. The similarities make it possible to have general classes of drivers that support similar operations; each member of the class has the same interface to the rest of the kernel but differs in what it needs to do to implement that interface. For example, all disk drivers look alike to the rest of the kernel, i.e., they all have operations like "initialize the drive", "read sector N", and "write sector N".

Some software services provided by the kernel itself have similar properties, and can therefore be abstracted into classes. For example, the various network protocols have been abstracted into one programming interface, the BSD socket library. Another example is the virtual filesystem (VFS) layer that abstracts the filesystem operations away from their implementation. Each filesystem type provides an implementation of each filesystem operation. When some entity tries to use a filesystem, the request goes via the VFS, which routes the request to the proper filesystem driver.

A more in-depth discussion of kernel internals can be found at http://www.tldp.org/LDP/lki/index.html. This document was written for the 2.4 kernel. When I find one for the 2.6 kernel, I will list it here.
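
The idea of a driver class with one shared interface can be illustrated outside the kernel. The following is only a sketch in Python, not kernel code: the names DiskDriver, RamDisk, SparseDisk, copy_sector and the 512-byte SECTOR_SIZE are all invented for this example. It shows two "drivers" that store data in different ways but offer the same three operations, so the calling code does not need to know which one it is talking to.

```python
from abc import ABC, abstractmethod

SECTOR_SIZE = 512  # assumed sector size for this sketch


class DiskDriver(ABC):
    """Hypothetical interface that every disk driver in this sketch offers."""

    @abstractmethod
    def initialize(self) -> None: ...

    @abstractmethod
    def read_sector(self, n: int) -> bytes: ...

    @abstractmethod
    def write_sector(self, n: int, data: bytes) -> None: ...


def _pad(data: bytes) -> bytes:
    """Pad or truncate data to exactly one sector."""
    return data.ljust(SECTOR_SIZE, b"\0")[:SECTOR_SIZE]


class RamDisk(DiskDriver):
    """Keeps all sectors in one contiguous in-memory buffer."""

    def __init__(self, sectors: int) -> None:
        self.sectors = sectors
        self.buf = bytearray()

    def initialize(self) -> None:
        self.buf = bytearray(self.sectors * SECTOR_SIZE)

    def _offset(self, n: int) -> int:
        if not 0 <= n < self.sectors:
            raise IndexError(f"sector {n} out of range")
        return n * SECTOR_SIZE

    def read_sector(self, n: int) -> bytes:
        start = self._offset(n)
        return bytes(self.buf[start:start + SECTOR_SIZE])

    def write_sector(self, n: int, data: bytes) -> None:
        start = self._offset(n)
        self.buf[start:start + SECTOR_SIZE] = _pad(data)


class SparseDisk(DiskDriver):
    """Stores only the sectors that have actually been written."""

    def __init__(self) -> None:
        self.written = {}

    def initialize(self) -> None:
        self.written = {}

    def read_sector(self, n: int) -> bytes:
        # Sectors never written read back as zeros.
        return self.written.get(n, bytes(SECTOR_SIZE))

    def write_sector(self, n: int, data: bytes) -> None:
        self.written[n] = _pad(data)


def copy_sector(src: DiskDriver, dst: DiskDriver, n: int) -> None:
    """Works with any pair of drivers, because it uses only the interface."""
    dst.write_sector(n, src.read_sector(n))
```

The copy_sector function plays the role of "the rest of the kernel": it is written once against the interface and never changes when a new kind of driver is added.
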
Major services in a UNIX system

This section describes some of the more important UNIX services, but without much detail. They are described more thoroughly in later chapters.

init

The single most important service in a UNIX system is provided by init. init is started as the first process of every UNIX system, as the last thing the kernel does when it boots. When init starts, it continues the boot process by doing various startup chores (checking and mounting filesystems, starting daemons, etc.).

The exact list of things that init does depends on which flavor it is; there are several to choose from. init usually provides the concept of single user mode, in which no one can log in and root uses a shell at the console; the usual mode is called multiuser mode. Some flavors generalize this as run levels; single and multiuser modes are considered to be two run levels, and there can be additional ones as well, for example, to run X on the console.

Linux allows for up to 10 runlevels, 0-9, but usually only some of these are defined by default. Runlevel 0 is defined as "system halt". Runlevel 1 is defined as "single user mode". Runlevel 3 is defined as "multi user" because it is the runlevel that the system boots into under normal day-to-day conditions. Runlevel 5 is typically the same as 3 except that a GUI is also started. Runlevel 6 is defined as "system reboot". Other runlevels are dependent on how your particular distribution has defined them, and they vary significantly between distributions. Looking at the contents of /etc/inittab will usually give some hint as to what the predefined runlevels are and how they have been defined.
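
As a rough illustration of reading runlevel hints out of /etc/inittab, the sketch below parses the colon-separated id:runlevels:action:process format used by SysV-style init. It assumes such an init is in use; the sample text is made up for this example and does not come from any particular distribution, and the function names are this sketch's own.

```python
# Illustrative sample only; real files differ between distributions.
SAMPLE_INITTAB = """\
# default runlevel
id:3:initdefault:
l0:0:wait:/etc/rc.d/rc 0
l3:3:wait:/etc/rc.d/rc 3
l6:6:wait:/etc/rc.d/rc 6
1:2345:respawn:/sbin/getty 38400 tty1
"""


def parse_inittab(text):
    """Return a list of dicts, one per non-comment inittab entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The process field may itself contain colons, so split at most 3 times.
        parts = line.split(":", 3)
        if len(parts) != 4:
            continue  # skip lines that do not follow the four-field format
        ident, runlevels, action, process = parts
        entries.append(
            {"id": ident, "runlevels": runlevels, "action": action, "process": process}
        )
    return entries


def default_runlevel(entries):
    """Return the runlevel named by the initdefault entry, or None."""
    for entry in entries:
        if entry["action"] == "initdefault":
            return entry["runlevels"]
    return None


def runlevels_in_use(entries):
    """Return a sorted list of every runlevel mentioned by any entry."""
    levels = set()
    for entry in entries:
        levels.update(entry["runlevels"])
    return sorted(levels)


if __name__ == "__main__":
    entries = parse_inittab(SAMPLE_INITTAB)
    print("default runlevel:", default_runlevel(entries))
    print("runlevels mentioned:", ", ".join(runlevels_in_use(entries)))
```

To try it against a real system, read the file's text with open("/etc/inittab").read() and pass it to parse_inittab; on systems whose init does not use this file, there will be nothing to read.
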
In normal operation, init makes sure getty is working (to allow users to log in) and adopts orphan processes (processes whose parent has died; in UNIX all processes must be in a single tree, so orphans must be adopted). When the system is shut down, it is init that is in charge of killing all other processes, unmounting all filesystems and stopping the processor, along with anything else it has been configured to do.

Logins from terminals

Logins from terminals (via serial lines) and the console (when not running X) are provided by the getty program. init starts a separate instance of getty for each terminal upon which logins are to be allowed. getty reads the username and runs the login program, which reads the password. If the username and password are correct, login runs the shell. When the shell terminates, i.e., the user logs out, or when login terminates because the username and password didn't match, init notices this and starts a new instance of getty. The kernel has no notion of logins; this is all handled by the system programs.

Syslog

The kernel and many system programs produce error, warning, and other messages. It is often important that these messages can be viewed later, even much later, so they should be written to a file. The program doing this is syslog. It can be configured to sort the messages to different files according to writer or degree of importance. For example, kernel messages are often directed to a separate file from the others, since kernel messages are often more important and need to be read regularly to spot problems. A later chapter provides more on this.

Periodic command execution: cron and at

Both users and system administrators often need to run commands periodically. For example, the system administrator might want to run a command to clean old files out of the directories with temporary files (/tmp and /var/tmp), to keep the disks from filling up, since not all programs clean up after themselves correctly.

The cron service is set up to do this. Each user can have a crontab file, where she lists the commands she wishes to execute and the times they should be executed. The cron daemon takes care of starting the commands when specified.

The at service is similar to cron, but it is once only: the command is executed at the given time, but it is not repeated.

We will go more into this later. See the manual pages cron(8), crontab(1), crontab(5), at(1) and atd(8) for more in-depth information.

Graphical user interface

UNIX and Linux don't incorporate the user interface into the kernel; instead, they let it be implemented by user level programs. This applies to both text mode and graphical environments.

This arrangement makes the system more flexible, but has the disadvantage that it is simple to implement a different user interface for each program, making the system harder to learn.

The graphical environment primarily used with Linux is called the X Window System (X for short). X also does not implement a user interface; it only implements a window system, i.e., tools with which a graphical user interface can be implemented. Some popular window managers are fvwm, icewm, blackbox, and windowmaker. There are also two popular desktop managers, KDE and GNOME.

Networking

Networking is the act of connecting two or more computers so that they can communicate with each other. The actual methods of connecting and communicating are slightly complicated, but the end result is very useful. UNIX operating systems have many networking features.
Most basic services (filesystems, printing, backups, etc.) can be done over the network. This can make system administration easier, since it allows centralized administration, while still reaping the benefits of microcomputing and distributed computing, such as lower costs and better fault tolerance.

However, this book merely glances at networking; see the Linux Network Administrators' Guide (NAG) at http://www.tldp.org/LDP/nag2/index.html for more information, including a basic description of how networks operate.

Network logins

Network logins work a little differently than normal logins. For each person logging in via the network there is a separate virtual network connection, and there can be any number of these depending on the available bandwidth. It is therefore not possible to run a separate getty for each possible virtual connection. There are also several different ways to log in via a network, telnet and ssh being the major ones in TCP/IP networks.

These days many Linux system administrators consider telnet and rlogin to be insecure and prefer ssh, the "secure shell", which encrypts traffic going over the network, thereby making it far less likely that the malicious can "sniff" your connection and gain sensitive data like usernames and passwords. It is highly recommended you use ssh rather than telnet or rlogin.

Network logins have, instead of a herd of gettys, a single daemon per way of logging in (telnet and ssh have separate daemons) that listens for all incoming login attempts. When it notices one, it starts a new instance of itself to handle that single attempt; the original instance continues to listen for other attempts. The new instance works similarly to getty.

Network file systems

One of the more useful things that can be done with networking services is sharing files via a network file system. Depending on your network this could be done over the Network File System (NFS), or over the Common Internet File System (CIFS).

NFS is typically a UNIX-based service, and in Linux it is supported by the kernel. CIFS file sharing is typically provided on Linux by Samba (http://www.samba.org).

With a network file system any file operations done by a program on one machine are sent over the network to another computer. This fools the program into thinking that all the files on the other computer are actually on the computer the program is running on. This makes information sharing extremely simple, since it requires no modifications to programs.

This is covered in more detail elsewhere in this book.

Mail

Electronic mail is the most popularly used method for communicating via computer. An electronic letter is stored in a file using a special format, and special mail programs are used to send and read the letters.

Each user has an incoming mailbox (a file in the special format), where all new mail is stored. When someone sends mail, the mail program locates the receiver's mailbox and appends the letter to the mailbox file. If the receiver's mailbox is on another machine, the letter is sent to the other machine, which delivers it to the mailbox as it best sees fit.

The mail system consists of many programs. The delivery of mail to local or remote mailboxes is done by one program (the mail transfer agent (MTA), e.g., sendmail or postfix), while the programs users use are many and varied (mail user agents (MUA), e.g., pine or evolution). The mailboxes are usually stored in /var/spool/mail until the user's MUA retrieves them.
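
The "append the letter to the mailbox file" step can be sketched with the mailbox module from the Python standard library, which reads and writes the traditional mbox file format. This is only an illustration of local delivery, not how sendmail or postfix are implemented; the function name deliver_local and the addresses are invented, and the demonstration writes to a temporary directory rather than the real /var/spool/mail.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def deliver_local(spool_dir, user, sender, subject, body):
    """Append one letter to the user's mailbox file, creating it if needed."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = user
    msg["Subject"] = subject
    msg.set_content(body)

    # One mbox file per user, named after the user, inside the spool directory.
    box = mailbox.mbox(os.path.join(spool_dir, user))
    box.lock()  # keep other writers out while the file is being changed
    try:
        box.add(msg)
        box.flush()  # write the pending change to disk
    finally:
        box.unlock()
    box.close()


if __name__ == "__main__":
    # A temporary directory stands in for the mail spool in this demo.
    spool = tempfile.mkdtemp()
    deliver_local(spool, "alice", "bob@example.org", "Lunch?", "Noon at the usual place.")
    deliver_local(spool, "alice", "carol@example.org", "Backups", "Tape 3 is full.")
    inbox = mailbox.mbox(os.path.join(spool, "alice"))
    for letter in inbox:
        print(letter["From"], "-", letter["Subject"])
    inbox.close()
```

Reading the file back at the end of the demo is roughly what a mail user agent does when it opens the incoming mailbox.
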
For more information on setting up and running mail services you can read the Mail Administrator HOWTO at http://www.tldp.org/HOWTO/Mail-Administrator-HOWTO.html, or visit the sendmail or postfix websites at http://www.sendmail.org/ and http://www.postfix.org/.

Printing

Only one person can use a printer at one time, but it is uneconomical not to share printers between users. The printer is therefore managed by software that implements a print queue: all print jobs are put into a queue and whenever the printer is done with one job, the next one is sent to it automatically. This relieves the users from organizing the print queue and fighting over control of the printer. Instead, they form a new queue at the printer, waiting for their printouts, since no one ever seems to be able to get the queue software to know exactly when anyone's printout is really finished. This is a great boost to intra-office social relations.

The print queue software also spools the printouts on disk, i.e., the text is kept in a file while the job is in the queue. This allows an application program to spit out the print jobs quickly to the print queue software; the application does not have to wait until the job is actually printed to continue. This is really convenient, since it allows one to print out one version, and not have to wait for it to be printed before one can make a completely revised new version.

You can refer to the Printing-HOWTO located at http://www.tldp.org/HOWTO/Printing-HOWTO/index.html for more help in setting up printers.
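
The queue-and-spool arrangement described above can be sketched in a few lines. This is a toy model, not how any real print system works: the PrintQueue class, its method names and the job file naming are all invented here. Submitting a job only writes a spool file and returns, and a separate step hands the jobs to the printer one at a time in arrival order.

```python
import os
import tempfile
from collections import deque


class PrintQueue:
    """Toy print queue: jobs are spooled to files and printed first come, first served."""

    def __init__(self, spool_dir):
        self.spool_dir = spool_dir
        self.jobs = deque()  # (job_id, owner, path) in arrival order
        self.next_id = 1

    def submit(self, owner, text):
        """Spool the text to disk and return at once; nothing is printed yet."""
        job_id = self.next_id
        self.next_id += 1
        path = os.path.join(self.spool_dir, f"job{job_id:04d}.txt")
        with open(path, "w", encoding="utf-8") as spool_file:
            spool_file.write(text)
        self.jobs.append((job_id, owner, path))
        return job_id

    def print_next(self, printer):
        """Send the oldest job to the printer; return its id, or None if idle."""
        if not self.jobs:
            return None
        job_id, owner, path = self.jobs.popleft()
        with open(path, encoding="utf-8") as spool_file:
            printer(owner, spool_file.read())
        os.remove(path)  # the spool file is no longer needed once printed
        return job_id


if __name__ == "__main__":
    queue = PrintQueue(tempfile.mkdtemp())
    queue.submit("alice", "Draft one")
    queue.submit("bob", "Quarterly report")
    # The "printer" here is just a function that prints to the screen.
    while queue.print_next(lambda owner, text: print(f"[{owner}] {text}")) is not None:
        pass
```

Because submit returns as soon as the spool file is written, the application can carry on while earlier jobs are still waiting for the printer.
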
The filesystem layout

The filesystem is divided into many parts; usually along the lines of a root filesystem with /bin, /lib, /etc, /dev, and a few others; a /usr filesystem with programs and unchanging data; a /var filesystem with changing data (such as log files); and a /home filesystem for everyone's personal files. Depending on the hardware configuration and the decisions of the system administrator, the division can be different; it can even be all in one filesystem.

A later chapter describes the filesystem layout in some little detail; the Filesystem Hierarchy Standard (FHS) covers it in somewhat more detail. It can be found on the web at http://www.pathname.com/fhs/
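
To see how a particular machine has been divided up, one can ask whether each of these directories is the mount point of its own filesystem. The sketch below does this with os.path.ismount from the Python standard library; the function name mount_status and the choice of directories to check are this example's own, and the output will differ from system to system.

```python
import os

# Directories named in the text above; the selection is just this sketch's choice.
LAYOUT_DIRS = ("/", "/usr", "/var", "/home")


def mount_status(paths=LAYOUT_DIRS):
    """Map each path to True (own filesystem), False (part of another), or None (missing)."""
    status = {}
    for path in paths:
        if not os.path.isdir(path):
            status[path] = None
        else:
            status[path] = os.path.ismount(path)
    return status


if __name__ == "__main__":
    for path, mounted in mount_status().items():
        if mounted is None:
            description = "does not exist here"
        elif mounted:
            description = "is a separate filesystem"
        else:
            description = "is part of another filesystem"
        print(f"{path:6} {description}")
```

On a machine where everything lives in one filesystem, only / will be reported as separate.
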