LDP/LDP/guide/docbook/Tuning-Linux/fundamentals.sgml

<chapter id="fundamentals">
  <title>System Performance Fundamentals</title>
  <para>
    There are a few fundamental concepts important to understanding
    the performance characteristics of a system.  Most of these are
    very obvious, they are just not often thought of when dealing
    with tuning a system.
  </para>
  <section id="fundamentalscomponents">
    <title>Viewing a server as a system of components</title>
    <para>
     The first, and most important, concept is looking at a server as a
     system of components.  While you may get tremendous performance
     gains by moving from 7200 RPM to 10,000 RPM drives, you may not.  To
     determine what needs to be done, and what benefit improving a
     particular component will be, you must first lay out the system
     and evaluate each component.
   </para>
   <para>
     In a typical server sytem the major components are:
   </para>
    <variablelist>
      <varlistentry>
        <term>Processor</term>
	<listitem>
	  <para>
            The processor, or processors, in a system are quite obviously
            the little black chip, or card edge socket, or big card/chip
            and heatsink combination on the motherboard.  But the
            processor subsytem referred to here includes not only the
            processor, but also the L1 and L2 processor caches, the
            system bus, and various motherboard components.
          </para>
	</listitem>
      </varlistentry>
      <varlistentry>
        <term>Physical Disk</term>
	<listitem>
	  <para>
            The physical disk subsytem includes the disk drive spindles
            themselves, as well as the interface between drive and
            controller/adapter card, the adapter card or cards, and
            the bus interface between controller/adapter card and the
            motherboard.
	  </para>
	</listitem>
      </varlistentry>
      <varlistentry>
	<term>Network</term>
	<listitem>
	  <para>
	    The definition of the network subsystem varies depending on
            who you ask.  All definitions include the network card and
	    its bus interface, and the physical media it connects to,
	    such as Ethernet.  If you are looking at improving the
	    performance of a server this is all that concerns
            you.  However, in a larger analysis of performance this
	    subsystem may include and number of network links between your
	    server and clients.
	  </para>
	</listitem>
      </varlistentry>
      <varlistentry>
	<term>Memory</term>
	<listitem>
	  <para>
	    The memory subsystem includes two major types of memory, real
            and virtual.  Real memory is the chip, chips, or cards in the
	    computer that provide physical memory.  Virtual memory includes
	    this memory as well as the virtual memory stored on disk,
            often referred to as a paging file or swap partition.
	  </para>
	</listitem>
      </varlistentry>
      <varlistentry>
        <term>Operating System</term>
        <listitem>
	  <para>
	    The single most complex component of any system is the
	    operating system.  While there are more opportunities for
	    tuning here, there are also more opportunities to make a
	    simple mistake that will cost you thousands in additional
	    hardware to handle the load.
	  </para>
	</listitem>
      </varlistentry>
    </variablelist>
  </section>
  <section id="fundamentalsbottlenecks">
    <title>Bottlenecks and Process Times</title>
    <para>
      The next concept to understand is the total time through a system
      for one request.  Let's say you want a hamburger from McDonald's.
      You walk in the door and there are four people in line in front
      of you (just like normal).  The person taking and filling orders
      takes care of one person in 2 minutes.  So the person being waited
      on now will be done in 2 minutes, when the next person in line will
      be waited on, which takes 2 minutes.  In all it will be 8 minutes
      before you give the friendly counter person your order, and 10
      minutes total before you get your food.
    </para>
    <para>
      If we cut the time it takes for the counter person to wait on
      someone from 2 minutes to 1, everything moves faster.  Each person
      takes 1 minute, so you have your order in 5.  But since each person
      is in line a shorter period of time, the probability that there
      will be 4 people in line in front of you decreases as well.  So
      improving processing time of a resource that has a queue waiting
      for it provides a potentially exponential improvement in total time
      through the system.  This is why we address bottlenecked resources
      first.
    </para>
    <para>
      The reason we say <quote>potentially exponential</quote> improvement
      is that the
      actual improvement depends on a number of factors, which are
      discussed further.  Basically, the process time
      is most likely not constant, but depends on many factors, including
      availability of other system resources on which it could be
      blocked waiting for a response.  In addition, the arrival rate of
      requests is probably not constant, but instead spread on some type
      of distribution.  So if you happen to arrive during a burst of
      customers, like at lunch hour, when the kitchen is backed up getting
      hamburgers out for the drive through, you could still see a 2 minute
      processing time and 4 people in line in front of you.  Actually
      determining the probabilities of these events is an exercise left to
      the reader, as is hitting yourself in the head with a brick.
      The idea here is just to introduce you to the concepts so you
      understand why your system performs as it does.
    </para>
    <para>
      Another concept in high performance computing is parallel computing,
      used for applications like Beowulf.
      If your application is written correctly, this is like walking into
      McDonalds and seeing 4 people in line, but there are 4 lines
      open.  So you can walk right up to a counter
      and get your burger in about two minutes.  The down side of this is
      that the manager of the store has to pay for more people working in
      the store, or you have to buy more machines.
      There are also potential bottlenecks, such as only one
      person in the back making fries, or the person in line in front of
      you is still trying to decide what they want.
    </para>
  </section> <!-- fundamentalsbottlenecks -->
  <section id="fundamentalsubsystems">
    <title>Interrelation between subsystems</title>
    <para>
      The last concept to cover here is the interrelation between
      subsystems on the total systems performance.  If a system is
      short on memory, it will page excessively.  This will put
      additional load on the drive subsystem.  While tuning the drive
      subsystem will definitely help, the better answer is to add more
      memory.
    </para>
    <para>
      As another example, your system may be performing just fine except
      for the network connection, which is 10Base-T.  When you upgrade
      the network card to 100Base-T, performance improves only marginally.
      Why?  It could very well be that the network was the bottleneck, but
      removing that bottleneck immediately revealed a bottleneck on the disk
      system.
    </para>
    <para>
      This makes it very important that you accurately measure where
      the bottleneck is.  It can not only improve the time it takes
      you to speed up the system, but also save money, since you
      only need to upgrade one subsystem instead of many.
    </para>
    <para>
      The best way to find out the interrelations depends on the application
      you are using.  As we go through various subsystems and applications
      in this book, you will have to keep this in mind.  For example, the
      Squid web cache has heavy memory and disk requirements.  Using less
      memory will increase the disk usage, and decrease performance.  By
      looking at memory usage, you can see that memory is being swapped to
      the hard drive.  Increasing the available memory so that no more swapping
      of squid will be your best performance bet.
    </para>
  </section> <!-- fundamentalssubsystems -->
  <section id="fundamentalshowmuch">
    <title>How Much Machine is Required?</title>
    <para>
      A frequent question when deciding to buy new hardware is "How much
      hardware should be purchased for the task?".  A good rule of thumb is "as
      much as you can afford", but that may not be the whole answer.  You should
      consider the following when defining specifications.
    </para>
    <itemizedlist>
      <listitem>
        <para>
	  If you cannot afford a fully loaded system, can it be expanded after
	  purchase?  If you application would work best with 2GB of RAM, but you
	  can only afford 1GB, can the motherboard be expanded to add the extra
	  GB later on?  Purchasing a motherboard that can only accept 1GB means
	  you will need to buy a new motherboard (at least) later on, increasing
	  cost.
	</para>
      </listitem>
      <listitem>
        <para>
	  How well does the design scale?  If you plan on buying racks of
	  machines, can you add new racks later and integrate them easily with
	  the existing system?  Things like switches that can accept new blades
	  to add functionality may save space and money in the long run, while
	  reducing issues like cabling.
	</para>
      </listitem>
      <listitem>
        <para>
	  How easy is it to install, upgrade, or repair?  While using desktop
	  systems in a rack is possible and will save money, replacing or
	  repairing the systems is harder to do.  A good question to ask
	  yourself is "What is my time worth?".
	</para>
      </listitem>
    </itemizedlist>
  </section>  <!-- fundamentalshowmuch -->
</chapter> <!-- fundamentals -->