LDP/LDP/guide/docbook/Tuning-Linux/kernel.sgml

628 lines
24 KiB
Plaintext

<chapter id="kernel">
<title>Kernel Tuning</title>
<para>
Unlike most other UNIX or UNIX-like operating systems, the source
code to the kernel is available for easy tuning and upgrading.
With kernels being packaged in distributions, the necessity for most
users to compile their own kernels is pretty low. There are advantages
to compiling your own kernels, though:
</para>
<itemizedlist>
<listitem>
<para>
Compile a version of the kernel specific to your chipset.
</para>
</listitem>
<listitem>
<para>
Add in a new driver or security patch
</para>
</listitem>
<listitem>
<para>
Compile a kernel with modules compiled into the kernel
</para>
</listitem>
</itemizedlist>
<para>
This chapter covers each of these situations, and also tuning of running
kernels using applications like <application>sysconf</application> and
<application>powertweak</application>.
</para>
<section id="kernelversions">
<title>Kernel versions</title>
<para>
There are a few things to consider when deciding what kernel revision
you should be using in a production environment.
</para>
<para>
Officially, Linux has two major threads of development. The stable
releases have an even number as its minor number. The Linux 2.2 and 2.4
kernels are examples of stable releases. These are considered by Linus to
be stable enough for production use. Few known bugs or issues exist with
these kernels.
</para>
<para>
The other release is the development releases. These have experimental
drivers or changes to them, and can cause crashes if used in production
environments. However, these releases will be more likely to support
newer hardware.
</para>
<para>
Most Linux distributions will often make a compromise between stability
and performance by creating their own version of the kernel and source
code. These kernels start with a base of a stable kernel, then add in
some newer drivers. The resulting kernel is then tested to make sure it
is stable, then released.
</para>
<para>
As a generalization, stable kernels are good for production use, while
development kernels may have better performance. For your use, you should
monitor the changes in the two sets of kernels and see if there are any
performance increases that will make your system faster, then test that
against a known stable kernel. You can then make a determination of
stability of the system versus performance.
</para>
</section> <!-- kernelversions -->
<section id="kernelcompile">
<title>Compiling the Linux Kernel</title>
<para>
There is a specific order to getting your kernel compiled, and there
is nothing preventing you from having multiple kernels and modules
on the same system. This allows you to boot test kernels while keeping
a known stable kernel in case you have trouble. Boot loaders like LILO
will allow you to select a kernel at bootup.
</para>
<para>
The easiest place to pick up the latest kernels is <ulink
url="http://www.kernel.org">http://www.kernel.org</ulink>. This is
the official site for Linux kernels. These kernels are almost always
in tar-bzip2 format, instead of being in RPM or DEB formats. Check with
your distribution provider for packaged formats of the kernel.
</para>
<para>
Unpack and untar the kernel. For packages compressed with bzip2,
the process is:
</para>
<screen>
# tar -jxvf linux-2.2.19.tar.bz2
</screen>
<para>
This will create all the header files and source code for the
kernel. There are three methods for configuring the kernel to compile,
and each have some prerequisites. All methods will obviously require
the C compiler.
</para>
<section id="kernelcompileconfig">
<title>Using <command>make config</command></title>
<indexterm>
<primary>kernel</primary>
<secondary>compile</secondary>
<tertiary>config</tertiary>
</indexterm>
<para>
The siplest way (and the first) is to use <command>make config</command>
from within <filename>/usr/src/linux</filename>. This is a very
tedious way of compiling, since you must answer each question
and the Linux kernel has a large number of questions now.
</para>
<para>
But an advantage to <command>make config</command> is that it
does not require any other development tools to be installed
in order to use it. On very lightweight systems, using this
method will save drive space and memory, since we are not
running very complicated or large programs.
</para>
<para>
SCREENSHOT OF make config HERE
</para>
</section> <!-- kernelcompileconfig -->
<section id="kernelcompilemenuconfig">
<title>Using <command>make menuconfig</command></title>
<indexterm>
<primary>kernel</primary>
<secondary>compile</secondary>
<tertiary>menuconfig</tertiary>
</indexterm>
<para>
The next level of kernel configuration involves using a character
based menuing system. For systems that have only a text console
or <application>ssh</application> access, this makes an easier way
to configure a kernel. It requires both ncurses and its development
libraries to be installed in order for this to work. The program
that runs in <command>make menuconfig</command> is part of the
kernel source package and is compiled when needed.
</para>
<para>
With an on-screen menu, it is easier to organize and select the
options you want to compile in. Help menus are available as well.
This option is not good for those who want small systems, since it
requires extra libraries.
</para>
<para>
INSERT SCREENSHOT OF make menuconfig
</para>
</section> <!-- kernelcompilemenuconfig -->
<section id="kernelcompilexconfig">
<title>Using <command>make xconfig</command></title>
<indexterm>
<primary>kernel</primary>
<secondary>compile</secondary>
<tertiary>xconfig</tertiary>
</indexterm>
<para>
The most user friendly, but hardest to use, is the <command>make
xconfig</command> option for configuring the kernel. This
requires most of the X libraries to be installed, along with
Tcl and Tk.
</para>
<para>
Since many server systems may not have X or its associated libraries,
Tcl, or Tk installed, you may want to step back to one of the other
methods of configuring the kernel. You would also need to make sure
that you can export X to your desktop if you are configuring
a remote system.
</para>
<figure id="kernelmakexconfig">
<title><command>make xconfig</command></title>
<mediaobject>
<imageobject>
<imagedata format="JPG" fileref="makexconfig.jpg">
</imageobject>
</mediaobject>
</figure>
</section> <!-- kernelcompilexconfig -->
<section id="kernelcompileoptions">
<title>Options for configuring the kernel</title>
<para>
Now that you have chosen a way to configure the kernel, how
should you go about your configuration? What options should you
pick for best performance, lowest memory consumption, and
best security options?
</para>
<para>
Many of the answers to these questions depends on your particular
situation. Selecting the right processor family for your CPI
is one of the best things to do. Each processor family has its
own way of alinging memory and some CPU-specific options and
optimizations.
</para>
<para>
If your hardware setup is going to be stable, you can
compile in drivers for various kinds of hardware for
slightly better performance, and to prevent trojan
modules from being loaded. See <xref linkend="kernelcompileoptions">
for more on this.
</para>
<para>
Selecting DMA access for IDE devices under <quote>Block devices</quote>
will automatically turn on DMA access, reducing CPU load. You can see
<xref linkend="disktuningideos"> for more information.
</para>
<para>
If your box will be configured as a router without firewalling,
you can enable <quote>IP: optimize as router not host</quote> under
<quote>Networking options</quote>. This reduces the latency of
packets within the kernel by skipping a few unnecessary steps.
But these steps are required if you will be doing firewalling
or acting as a server.
</para>
<para>
Bonding device support under <quote>Network device support</quote>
will enable bonding multiple network devices together to create
one large pipe to other bonded Linux boxes or some Cisco routers.
</para>
<para>
If you plan on having a large number of users logging in and you would
need a large number of pseudo terminals (PTYs), you can enter
a number in <quote>Maximum number of Unix98 PTYs in use (0-2048)</quote>
under <quote>Character devices</quote>.
</para>
<para>
For systems that will not be onsite and require high availability,
you can enable <quote>watchdog timer support</quote>. Watchdogs
can be implemented either in software or hardware. Software watchdogs
open and close files to make sure that everything is working. If the
driver cannot open the file, it tries to trigger a reboot of the system,
since this indicates that there is some error with the system.
Watchdog timers are usually implemented ISA cards, but some
single board computers will have the necessary chips built in. Hardware
watchdogs are a bit better as it can monitor extra features such
as temperature and fan speed and can trigger reboots on those events as
well. Hardware watchdogs are a bit more proactive as they continually
talk to the kernel driver. If the kernel driver does not respond,
the watchdog triggers a reset.
</para>
</section> <!-- kernelcompileoptions -->
<section id="kernelcompilemodules">
<title>Kernel Modules and You</title>
<para>
While this book is not specifically covering security, it is
fair to bring it up every now and then. For those who want
hard core, as-uncrackable-as-possible systems, you may want
to consider removing module support from your system.
</para>
<para>
The reason for this is that if a cracker were to gain access
to your system, one of the more advanced methods of covering their
tracks is to insert a kernel module that hides much of their activity.
</para>
<para>
Another tactic is to replace an existing module with a trojan horse.
Once the module is loaded, its payload deploys in kernel space instead
of userspace, giving the rogue code more access to the system.
</para>
<para>
Another reason to remove modules from your system is to create a smaller
size kernel. With modules compiled into the kernel, they get
compressed along with the rest of the kernel using bzip2. Normally
module files are uncompressed, but bzip2 can compress sizes by
more than half. This can save crucial space if it is required.
</para>
<para>
For most users, however, you will want to use modules, as it gives a
very versitile way of installing new drivers by just loading
the module.
</para>
</section> <!-- kernelcompilemodules -->
<section id="kernelcompilebuild">
<title>Building The Kernel</title>
<para>
With a configured kernel, you can now go through the build process.
This is documented in a number of places, but it is included
here for completeness.
</para>
<para>
The first thing you should take a look at if you want to store the
kernel configuration is to get
<filename>/usr/src/linux/.config</filename>, as it has the ouput
from the configuration options selected previously. You can
move the .config file to a new system, then build the kernel
and skip the configuration part.
</para>
<para>
The <command>make dep</command> command sets up the dependencies needed
so the kernel can be built. This is a crucial step, as the kernel
will not compile without it.
</para>
<para>
Most users will want to use <application>bzip2</application> for
building the kernel, as it makes a much smaller kernel than
<application>gzip</application> or uncompressed. You can build a
<application>bzip2</application> kernel by using <command>make
bzImage</command>. The compressed image is built and put into <filename
class="directory">/usr/src/linux/arch/i386/</filename>.
You may want to just use <command>make bzlilo</command> to put the
kernel into <filename class="directory">/</filename> and run
<command>lilo</command> to let you boot the new kernel.
Some users may not want their kernel in <filename
class="directory">/</filename>, since many distributions create a small
<filename class="directory">/boot</filename> partition that contains
the kernel and enough to boot the kernel. In this event, you will
want to copy the kernel by and into <filename
class="directory">/boot</filename>.
</para>
<para>
If you will be using modules, you will also need to run <command>make
modules</command> and <command>make modules_install</command>. These
two commands will build and install the modules for your kernel.
</para>
</section> <!-- kernelcompilebuild -->
</section> <!-- kernelcompile -->
<section id="kernelrunning">
<title>Tuning a Runing Linux System</title>
<indexterm>
<primary>/proc</primary>
</indexterm>
<para>
When Linux first got started, changing options to the kernel often
required recompiling. Getting information out of Linux required
applications be recompiled to look at the right memory location
to get things like user and process lists. After the release
of the 1.0 kernel, the <filename class="directory">/proc</filename>
filesystem arrived as a way to access information in a kernel-independent
way. Because of this, a command like <command>ps</command> will
keep operating even if the kernel rev changes. The <filename
class="directory">/proc</filename> filesystem also allows you
to change the kernel itself wile running, allowing you to add
in features like TCP/IP forwarding, manage RAID cards, and so on.
</para>
<section id="kernelrunningproc">
<title>The <filename class="directory">/proc</filename> filesystem</title>
<para>
This portion is not strictly about tuning a system, but getting
information out of your urnning system. In the event of serious
memory or hardware issues, you can sometimes get information
out of Linux.
</para>
<para>
The <filename class="directory">/proc</filename> heirarchy may change
depending on the kernel revision, but is generally arranged
into different classes.
</para>
<variablelist>
<varlistentry>
<term>
Processes
</term>
<listitem>
<para>
Contains information about each of the running processes,
organized by PID. Each process has its own directory
of information, including status, list of file descriptors,
command line, environment variables for the application,
and working directory when the application was started.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
bus
</term>
<listitem>
<para>
Contains information about the system busses. At this point,
it works with the USB, PCI, and PCMCIA busses. More, like
IEEE-1394 will list devices and controllers under this directory.
The information about each device is less human readable than
older directoried like <filename class="directory">/proc/pci
</filename>, but human readability is not the point of
<filename class="directory">/proc</filename>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
cpuinfo
</term>
<listitem>
<para>
If you want to find out information about the CPU(s) installed in
your system, you can take a look at the cpuinfo file. For each
processor, you will get things like CPU speed, vendor,
model type (Pentium II, Pentium III, etc.), and the
ever-popular bogomips rating.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
filesystems
</term>
<listitem>
<para>
This file has a list of the filesystems that the kernel currently
knows about. We say currently, since modules can be loaded or
removed to add or remove additional filesystem types. If
you are having trouble mounting a filesystem, this is an easy
way to make sure the filesystem type is supported by the kernel.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
ide
</term>
<listitem>
<para>
If your system has
<indexterm>
<primary>IDE</primary>
</indexterm>
IDE devices, this directory contains information about the IDE
devices that are in your system. It is organized by IDE
channels that are available, and each device has informaiton about
the model name and number, serial number, capacity, and any
other information specific to that IDE type. Tuning of IDE
drives should be done using <command>hdparm</command>, and you can
see <xref linkend="disktuningideos"> for usage.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
meminfo
</term>
<listitem>
<para>
This file has the raw memory information that you would normally
see as output of <command>free</command>. The first
part of the ouput contains the memory listing in bytes, followed
by most of the same information in kilobytes.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
modules
</term>
<listitem>
<para>
You could easily mistake the contents of the modules file
for the output of <command>lsmod</command>. There is a listing
of the modules, size of the module, and dependencies for
the module.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
partitions
</term>
<listitem>
<para>
This file has a list of all the disk partitions that are known
to Linux, including those not mounted or listed in <filename>
/etc/fstab</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
pci
</term>
<listitem>
<para>
This file may go away in the long term, in favor of <filename
class="directory">/proc/bus/pci</filename>, but <filename>
/proc/pci</filename> is more human readable, listing
the device type, vendor, and device name. But this
will only happen if the Linux kernel knows of the device.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
scsi
</term>
<listitem>
<para>
<indexterm>
<primary>SCSI</primary>
</indexterm>
SCSI-based systems can look at and tune their parameters here.
Its setup if much the same as for IDE, listing chains, followed by
devices on each chain.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
swaps
</term>
<listitem>
<para>
Lists all mounted swap partitions (or files) mounted.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
version
</term>
<listitem>
<para>
The version file contains version information of the kernel you
are using. It's very helpful in that it has the user and host
that the kernel was compiled on, GCC version used to compile the
kernel, and date and time this kernel was compiled. This
makes it much more helpful than <command>uname</command>, since
you can quickly check a series of running machines and verify
they all report the same kernel revision.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Not listed above is <filename class="directory">/proc/sys</filename>.
This directory is where most of the runtime changes to the kernel
take place. Access to this used to be done manually, using
<command>cat</command> and <command>echo</command> to view or change
settings of a running kernel. If you wanted runtime settings to be
changed each time the machine booted, you had to add commands to
the <filename>rc.local</filename> file for each setting.
</para>
<para>
An easier way to manage these settings is via the use of
<command>sysctl</command>.
<indexterm>
<primary>sysctl</primary>
</indexterm>
</para>
<cmdsynopsis>
<command>sysctl</command>
<arg>-n</arg>
<arg><replaceable>variable</replaceable></arg>
<arg>-w <replaceable>variable</replaceable>=<replaceable>value</replaceable></arg>
<arg>-a</arg>
<arg>-p <replaceable>filename</replaceable></arg>
</cmdsynopsis>
<para>
If you use an argument of <replaceable>variable</replaceable>, you will
get the current value of
<filename>/proc/sys/<replaceable>variable</replaceable></filename>. You
can set it by adding an equal sign and <replaceable>value</replaceable>.
You can get a list of what is available along with their current values
by using -a.
</para>
<para>
As part of the Linux boot sequence on more modern distributions,
<command>sysctl</command> is called with -p, which reads
settings from <filename>/etc/sysctl.conf</filename>. This file can be
edited by a text file, and contains a list of lines of the form
<replaceable>variable</replaceable>=<replaceable>value</replaceable>.
</para>
<screen>
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See sysctl.conf (5) for information.
#
#kernel.domainname = example.com
#net/ipv4/icmp_echo_ignore_broadcasts=1
net.ipv4.conf.all.hidden = 1
net.ipv4.conf.lo.hidden=1
</screen>
<para>
As you can see, <replaceable>variable</replaceable> can separate out
directories using either a slash (/), or a dot (.). Using the dot makes
it look more like SNMP variables.
</para>
<para>
Since everything under <filename class="directory">/proc</filename>
are regular files, security for viewing or setting are handled by
regular file permissions. In some cases, only root can make changes,
in others, only root can view or set.
</para>
</section> <!-- kernelrunningproc -->
<section id="kernelrunningpowertweak">
<title>Using <command>powertweak</command></title>
<indexterm>
<primary>powertweak</primary>
</indexterm>
<para>
Another interface to change options of a running kernel is through the
use of <command>powertweak</command>. Powertweak also runs on startup
of linux, using a daemon called <application>powertweakd</application>
to issue changes to the kernel. Configuration files are stored in the
kernel in XML format.
</para>
<para>
An advantage of using <command>powertweak</command> over
<command>sysctl</command> is that powertweak has a graphical
and text interface. There is a pretty extensive help system that
can give you information about each setting.
</para>
<para>
For those that just want to have optimized settings for a particular
use, there are pre-made settings for laptops, web servers, routers,
and Oracle usage. As an example, the laptop setting changes the kernel
disk flush so the hard drive can spin down. These are just generic
settings, as each different machine has different PCI devices
that can also be set.
</para>
<para>
You can get the latest version of <application>powertweak</application>
from <ulink
url="http://www.powertweak.org/">http://www.powertweak.org/</ulink>
</para>
<figure id="kernelpowertweak"
<title>Powertweak</title>
<mediaobject>
<imageobject>
<imagedata format="JPG" fileref="powertweak.jpg">
</imageobject>
</mediaobject>
</figure>
</section> <!-- kernelrunningpowertweak -->
</section> <!-- kernelrunning -->
</chapter> <!-- kernel -->