mirror of https://github.com/tLDP/LDP
219 lines
9.8 KiB
Plaintext
219 lines
9.8 KiB
Plaintext
<chapter id="fundamentals">
|
|
<title>System Performance Fundamentals</title>
|
|
<para>
|
|
There are a few fundamental concepts important to understanding
|
|
the performance characteristics of a system. Most of these are
|
|
very obvious, they are just not often thought of when dealing
|
|
with tuning a system.
|
|
</para>
|
|
<section id="fundamentalscomponents">
|
|
<title>Viewing a server as a system of components</title>
|
|
<para>
|
|
The first, and most important, concept is looking at a server as a
|
|
system of components. While you may get tremendous performance
|
|
gains by moving from 7200 RPM to 10,000 RPM drives, you may not. To
|
|
determine what needs to be done, and what benefit improving a
|
|
particular component will be, you must first lay out the system
|
|
and evaluate each component.
|
|
</para>
|
|
<para>
|
|
In a typical server sytem the major components are:
|
|
</para>
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Processor</term>
|
|
<listitem>
|
|
<para>
|
|
The processor, or processors, in a system are quite obviously
|
|
the little black chip, or card edge socket, or big card/chip
|
|
and heatsink combination on the motherboard. But the
|
|
processor subsytem referred to here includes not only the
|
|
processor, but also the L1 and L2 processor caches, the
|
|
system bus, and various motherboard components.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Physical Disk</term>
|
|
<listitem>
|
|
<para>
|
|
The physical disk subsytem includes the disk drive spindles
|
|
themselves, as well as the interface between drive and
|
|
controller/adapter card, the adapter card or cards, and
|
|
the bus interface between controller/adapter card and the
|
|
motherboard.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Network</term>
|
|
<listitem>
|
|
<para>
|
|
The definition of the network subsystem varies depending on
|
|
who you ask. All definitions include the network card and
|
|
its bus interface, and the physical media it connects to,
|
|
such as Ethernet. If you are looking at improving the
|
|
performance of a server this is all that concerns
|
|
you. However, in a larger analysis of performance this
|
|
subsystem may include and number of network links between your
|
|
server and clients.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Memory</term>
|
|
<listitem>
|
|
<para>
|
|
The memory subsystem includes two major types of memory, real
|
|
and virtual. Real memory is the chip, chips, or cards in the
|
|
computer that provide physical memory. Virtual memory includes
|
|
this memory as well as the virtual memory stored on disk,
|
|
often referred to as a paging file or swap partition.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Operating System</term>
|
|
<listitem>
|
|
<para>
|
|
The single most complex component of any system is the
|
|
operating system. While there are more opportunities for
|
|
tuning here, there are also more opportunities to make a
|
|
simple mistake that will cost you thousands in additional
|
|
hardware to handle the load.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</section>
|
|
<section id="fundamentalsbottlenecks">
|
|
<title>Bottlenecks and Process Times</title>
|
|
<para>
|
|
The next concept to understand is the total time through a system
|
|
for one request. Let's say you want a hamburger from McDonald's.
|
|
You walk in the door and there are four people in line in front
|
|
of you (just like normal). The person taking and filling orders
|
|
takes care of one person in 2 minutes. So the person being waited
|
|
on now will be done in 2 minutes, when the next person in line will
|
|
be waited on, which takes 2 minutes. In all it will be 8 minutes
|
|
before you give the friendly counter person your order, and 10
|
|
minutes total before you get your food.
|
|
</para>
|
|
<para>
|
|
If we cut the time it takes for the counter person to wait on
|
|
someone from 2 minutes to 1, everything moves faster. Each person
|
|
takes 1 minute, so you have your order in 5. But since each person
|
|
is in line a shorter period of time, the probability that there
|
|
will be 4 people in line in front of you decreases as well. So
|
|
improving processing time of a resource that has a queue waiting
|
|
for it provides a potentially exponential improvement in total time
|
|
through the system. This is why we address bottlenecked resources
|
|
first.
|
|
</para>
|
|
<para>
|
|
The reason we say <quote>potentially exponential</quote> improvement
|
|
is that the
|
|
actual improvement depends on a number of factors, which are
|
|
discussed further. Basically, the process time
|
|
is most likely not constant, but depends on many factors, including
|
|
availability of other system resources on which it could be
|
|
blocked waiting for a response. In addition, the arrival rate of
|
|
requests is probably not constant, but instead spread on some type
|
|
of distribution. So if you happen to arrive during a burst of
|
|
customers, like at lunch hour, when the kitchen is backed up getting
|
|
hamburgers out for the drive through, you could still see a 2 minute
|
|
processing time and 4 people in line in front of you. Actually
|
|
determining the probabilities of these events is an exercise left to
|
|
the reader, as is hitting yourself in the head with a brick.
|
|
The idea here is just to introduce you to the concepts so you
|
|
understand why your system performs as it does.
|
|
</para>
|
|
<para>
|
|
Another concept in high performance computing is parallel computing,
|
|
used for applications like Beowulf.
|
|
If your application is written correctly, this is like walking into
|
|
McDonalds and seeing 4 people in line, but there are 4 lines
|
|
open. So you can walk right up to a counter
|
|
and get your burger in about two minutes. The down side of this is
|
|
that the manager of the store has to pay for more people working in
|
|
the store, or you have to buy more machines.
|
|
There are also potential bottlenecks, such as only one
|
|
person in the back making fries, or the person in line in front of
|
|
you is still trying to decide what they want.
|
|
</para>
|
|
</section> <!-- fundamentalsbottlenecks -->
|
|
<section id="fundamentalsubsystems">
|
|
<title>Interrelation between subsystems</title>
|
|
<para>
|
|
The last concept to cover here is the interrelation between
|
|
subsystems on the total systems performance. If a system is
|
|
short on memory, it will page excessively. This will put
|
|
additional load on the drive subsystem. While tuning the drive
|
|
subsystem will definitely help, the better answer is to add more
|
|
memory.
|
|
</para>
|
|
<para>
|
|
As another example, your system may be performing just fine except
|
|
for the network connection, which is 10Base-T. When you upgrade
|
|
the network card to 100Base-T, performance improves only marginally.
|
|
Why? It could very well be that the network was the bottleneck, but
|
|
removing that bottleneck immediately revealed a bottleneck on the disk
|
|
system.
|
|
</para>
|
|
<para>
|
|
This makes it very important that you accurately measure where
|
|
the bottleneck is. It can not only improve the time it takes
|
|
you to speed up the system, but also save money, since you
|
|
only need to upgrade one subsystem instead of many.
|
|
</para>
|
|
<para>
|
|
The best way to find out the interrelations depends on the application
|
|
you are using. As we go through various subsystems and applications
|
|
in this book, you will have to keep this in mind. For example, the
|
|
Squid web cache has heavy memory and disk requirements. Using less
|
|
memory will increase the disk usage, and decrease performance. By
|
|
looking at memory usage, you can see that memory is being swapped to
|
|
the hard drive. Increasing the available memory so that no more swapping
|
|
of squid will be your best performance bet.
|
|
</para>
|
|
</section> <!-- fundamentalssubsystems -->
|
|
<section id="fundamentalshowmuch">
|
|
<title>How Much Machine is Required?</title>
|
|
<para>
|
|
A frequent question when deciding to buy new hardware is "How much
|
|
hardware should be purchased for the task?". A good rule of thumb is "as
|
|
much as you can afford", but that may not be the whole answer. You should
|
|
consider the following when defining specifications.
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
If you cannot afford a fully loaded system, can it be expanded after
|
|
purchase? If you application would work best with 2GB of RAM, but you
|
|
can only afford 1GB, can the motherboard be expanded to add the extra
|
|
GB later on? Purchasing a motherboard that can only accept 1GB means
|
|
you will need to buy a new motherboard (at least) later on, increasing
|
|
cost.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
How well does the design scale? If you plan on buying racks of
|
|
machines, can you add new racks later and integrate them easily with
|
|
the existing system? Things like switches that can accept new blades
|
|
to add functionality may save space and money in the long run, while
|
|
reducing issues like cabling.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
How easy is it to install, upgrade, or repair? While using desktop
|
|
systems in a rack is possible and will save money, replacing or
|
|
repairing the systems is harder to do. A good question to ask
|
|
yourself is "What is my time worth?".
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</section> <!-- fundamentalshowmuch -->
|
|
</chapter> <!-- fundamentals -->
|