extra modifications

2001-05-28 15:04:34 +00:00 · 2001-05-28 15:04:34 +00:00 · 62a6f8c6aa
parent f0512f3894
commit 62a6f8c6aa
5 changed files with 282 additions and 29 deletions
--- a/LDP/guide/docbook/Tuning-Linux/apps.sgml
+++ b/LDP/guide/docbook/Tuning-Linux/apps.sgml
@ -1,8 +1,12 @@
 <chapter id="apps">
  <title>Application Tuning</title>
  <para>
-    Here I would think we would discuss the various requirements of
-    different applications.  See Squid and Sendmail for examples.
+    While one can tune hardware and the OS and get great measurements from them,
+    most applications for Linux have their own rules for improving performance.
+    Tuning a hard drive for an application that uses a lot of memory will not
+    improve the speed of that application, whereas investing in more memory will
+    create a vast improvement in speed.  Let us examine some of the applications
+    for Linux and how you can tune a box specifically for each one.
  </para>

  <section id="appssquid">
@ -80,6 +84,11 @@
    <section id="appsmailsendmail">
      <title>Sendmail</title>
      <para>
+        <indexterm>
+	  <primary>
+	    sendmail
+	  </primary>
+	</indexterm>
        A sendmail system will probably be the most likely to be
 	network bottlenecked, but that bottleneck will also
 	most likely be outside the system itself.  There are no
@ -106,6 +115,16 @@
    <section id="appsmaildelivery">
      <title>Mail Delivery (POP and IMAP)</title>
        <para>
+	  <indexterm>
+	    <primary>
+	      Post Office Protocol (POP)
+	    </primary>
+	  </indexterm>
+	  <indexterm>
+	    <primary>
+	      IMAP
+	    </primary>
+	  </indexterm>
 	  Memory and processor requirements will depend on the number
 	  of users and whether you are providing POP or IMAP services,
 	  or only SMTP.  POP and IMAP require logins, and each login
@ -117,7 +136,7 @@
 	  footprint.  In POP, e-mail is typically downloaded from the
 	  server to the client.  The POP server merely handles authentication
 	  and moving the data to the client.  When a mail message is sent
-	  to the client, it is removed from the server.
+	  to the client, it is typically removed from the server.
 	</para>
 	<para>
 	  Using IMAP puts much more stress on the system.  IMAP will copy
@ -145,7 +164,7 @@
 	  effect on disk or memory usage.
 	</para>
 	<para>
-	  While many shops offer local mail delivery, it is not advisable
+	  It is not advisable
 	  to put a shell account on the same machine as one running IMAP
 	  or POP, due to the CPU and memory used by these protocols.
 	  Nor is it advisable to use NFS to export a mailbox to a shell
@ -168,7 +187,148 @@

  <section id="appsapache">
    <title>Apache web server</title>
-      <para></para>
+    <para>
+      Apache works best when it is delivering static content.  But if you're
+      only delivering static content, you should consider using
+      <application>Tux</application>, the Linux kernel HTTP server.
+    </para>
+    <para>
+      There are a number of quick tips you can use to increase the
+      throughput and responsiveness of Apache.
+    </para>
+    <para>
+      One such tip is to turn off DNS resolving in Apache.  When Apache receives
+      a request from a remote machine, Apache will request a reverse
+      DNS lookup on that IP address.  While the request winds its way through
+      the DNS hierarchy, Apache is stuck waiting for the answer before it can
+      send the page back to the requesting client.  With this feature turned
+      off, Apache will merely log the IP address without a name in its log file,
+      and then deliver the page.  A problem with this is that if the machine is
+      cracked through the web server, you will have to do your own reverse DNS
+      lookup to find out what country and domain the attacking machine is from.
+      A way to have the best of both worlds is to run a caching-only name server
+      on the local network or on the web server box itself.  This allows Apache
+      to quickly get the DNS information, and store a hostname and domain
+      instead of just an IP address.
+    </para>
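As a rough illustration of the trade-off above: with lookups disabled (in Apache this is controlled by the HostnameLookups directive), the access log holds only IP addresses, and names can be resolved afterwards, off the request path. The sketch below assumes the common log format, where the client address is the first field. The sample log, its addresses, and the function name unique_clients are made up for the example.

```shell
#!/bin/sh
# Sketch: list the unique client addresses in an access log so that reverse
# DNS lookups can be done later, only for the addresses of interest.
# Assumption: common log format, client address in the first field.

unique_clients() {
    # $1: path to an access log
    awk '{ print $1 }' "$1" | sort -u
}

# Build a tiny sample log (documentation addresses, hypothetical requests).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - [28/May/2001:15:04:34 +0000] "GET /index.html HTTP/1.0" 200 1043
192.0.2.77 - - [28/May/2001:15:04:35 +0000] "GET /logo.png HTTP/1.0" 200 5120
192.0.2.10 - - [28/May/2001:15:04:36 +0000] "GET /about.html HTTP/1.0" 200 2210
EOF

# Print each client address once.
unique_clients "$sample_log"
```

Each address printed this way could then be looked up by hand, for example with the host command, without making every page request wait on DNS.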
+    <para>
+      Cutting the size of graphic images is also a way to improve performance.
+      Using compressed formats such as JPEG or PNG will cut the size of the image
+      file without reducing the quality of the image by much.  This allows more
+      image files to go out per second.
+    </para>
+    <para>
+      Instead of reinventing the wheel, see if Apache has a module for the
+      feature you need.  Modules for authentication against SMB, NIS, and LDAP
+      servers all exist for Apache, and you will not have to add the
+      functionality to your server-side scripts.
+    </para>
+    <para>
+      Make sure your database is tuned properly.  This has been covered in
+      <xref linkend="appsdatabase">, but a poorly tuned database can slow
+      everything down.  If you expect heavy usage, do not put the database and
+      web server on the same machine.  One machine running the database and
+      another running the web server, connected by a high-speed link (such as
+      gigabit Ethernet), will allow both machines to operate quickly.
+    </para>
+    <para>
+      Since so many web sites are using server-based applications and CGI, it
+      makes sense to take a look at the server interpreters being used.  If you
+      use a language like Java that is known to be pretty slow, you can expect
+      lower performance as compared to using applications compiled in C.
+    </para>
+    <para>
+      But Java has its own advantages over C.  The biggest is that Java is
+      designed to be much more portable than C is.  Since Java is an object
+      oriented language, its code reuse is much better, allowing developers to
+      create applications quickly.
+    </para>
+    <para>
+      There are three other major server languages used by Apache.  These are
+      Perl, Python, and PHP.  All three are included in most Linux
+      distributions, and if they're not on your system, each is easy to install
+      and get working.
+    </para>
+    <section id="appsapacheperl">
+      <title>Perl</title>
+      <para>
+        Perl is pretty much the granddaddy of CGI and server programs.  It has
+	excellent string manipulation abilities, plenty of libraries to let you
+	create web applications quickly, and has drivers for just about anything
+	you want to do.  It is known as the programmer's Swiss Army knife, and
+	for good reason.
+      </para>
+      <para>
+        The downside to perl is that it is an interpreted language, meaning each
+	time a perl application is run, the perl interpreter has to be loaded,
+	the application has to be loaded into memory, and the interpreter has to
+	compile the application.  This can create a heavy load on a system.
+      </para>
+      <para>
+        <indexterm>
+	  <primary>
+	    mod_perl
+	  </primary>
+	</indexterm>
+        The best way around this is to use the mod_perl library with
+	<application>apache</application>.  This library first loads the perl
+	interpreter so it is always in memory and ready to run.  It also caches
+	frequently-used perl CGI scripts in memory already in a precompiled
+	state.  When a perl application starts up, much of the slow work
+	(loading and compiling) has already been done.
+        The cost to the sysadmin and programmer is that mod_perl takes up a lot
+	of memory, and its memory use will increase as more perl scripts are
+	added to a web server.
+      </para>
+      <note>
+        <para>
+	  Adding more memory and CPU speed to a machine running mod_perl will
+	  improve its performance.
+	</para>
+      </note>
+    </section> <!-- appsapacheperl -->
+    <section id="appsapachepython">
+      <title>Python</title>
+      <para>
+        Python was one of the first of the interpreted languages to implement
+	object oriented programming, allowing programmers to reuse and share
+	code.  It also learned a bit from perl and its interpreted nature by
+	compiling python code into a pre-compiled format.  This format is not
+	portable like Java, but does much of the grunt work of compiling a
+	python application.  The python interpreter then spends less time
+	compiling the application each time it's run.
+      </para>
+      <para>
+        Programmers may cringe a bit when they start writing in python.  Syntax
+	is much stricter than in C or Perl, using indentation and newlines to delineate
+	parts of code.  And, like perl, python requires the interpreter to load
+	into memory and run.
+      </para>
+      <para>
+        There is also a mod_python for apache that operates similarly to mod_perl.
+	More memory and CPU speed will be a great improvement.
+      </para>
+    </section> <!-- appsapachepython -->
+    <section id="appsapachephp">
+      <title>PHP</title>
+      <para>
+        PHP is a relative newcomer to the languages listed here.  But PHP was
+	designed for server side applications.  The interpreter is a module in
+	<application>apache</application>, and PHP code can exist within an HTML
+	page, allowing authors to combine the web page and application into one
+	file.
+      </para>
+      <para>
+        PHP has two downsides to it from a programmer's perspective.  First, PHP
+	is still relatively new.  It is not quite as seasoned as Perl or Python,
+	each of which has over ten years' worth of development behind it.
+	Second, PHP does not have modules that can be easily loaded and used
+	like Perl and Python have.  If your implementation of PHP does not have
+	the graphics libraries installed, you will have to recompile PHP and add
+	them in.  Neither of these should prevent you from giving a serious look
+	at using PHP for your server side application.
+      </para>
+    </section> <!-- appsapachephp -->
  </section> <!-- appsapache -->

  <section id="appssamba">
--- a/LDP/guide/docbook/Tuning-Linux/disk.sgml
+++ b/LDP/guide/docbook/Tuning-Linux/disk.sgml
@ -36,6 +36,12 @@
  <section id="diskoverview">
    <title>Overview of Disk Technologies</title>
    <para>
+      <indexterm>
+        <primary>IDE</primary>
+      </indexterm>
+      <indexterm>
+        <primary>SCSI</primary>
+      </indexterm>
      There are, as you may know, two major hard disk technologies
      that work under Linux: Integrated Drive Electronics (IDE), and Small
      Computer System Interface (SCSI).  At the lowest level, IDE and
@ -219,6 +225,9 @@
          drive.
        </para>
        <para>
+	  <indexterm>
+	    <primary>DMA</primary>
+	  </indexterm>
          Now that the drive can be accessed, there are two methods
          for that drive to get its data from the drive to the CPU:
          PIO, and DMA.  PIO requires the CPU to shuttle data from 
@ -249,6 +258,9 @@
          information).
        </para>
        <para>
+	  <indexterm>
+	    <primary>hdparm</primary>
+	  </indexterm>
          Once Linux is started, all IDE tuning occurs using
          <command>hdparm</command>.  Using <command>hdparm</command>
          followed by a drive (<command>hdparm /dev/hda</command>) will
@ -402,6 +414,10 @@ geometry     = 1559/240/63, sectors = 23572080, start = 0
 	<section id="disktuningscsiraid">
 	  <title>SCSI RAID</title>
 	  <para>
+	    <indexterm>
+	      <primary>SCSI</primary>
+	      <secondary>RAID</secondary>
+	    </indexterm>
 	    Depending on the RAID level you select, you can optimize
 	    a drive array ranging from high performance to high availability.
 	    In addition, many hardware RAID cards support standby drives,
@ -479,6 +495,16 @@ geometry     = 1559/240/63, sectors = 23572080, start = 0
 	    each with the ability to hold up to 16 devices per card,
 	    not counting the card itself.
 	  </para>
+	  <para>
+	    One way of tuning the SCSI bus is to make sure it is properly
+	    terminated.  Without proper termination, the SCSI bus may ratchet
+	    its speed down, or fail altogether.  Termination should occur at
+	    both ends of a physical SCSI chain, but most SCSI chipsets include
+	    internal termination for their end.  Purchase the correct
+	    termination for your cable and put it at the far end of the cable.
+	    This will make sure there are no signal reflections anywhere in the
+	    cable that can cause interference.
+	  </para>
 	</section> <!-- disktuningscsiraid -->
      </section> <!-- disktuningscsi -->
    </section> <!-- disktuningoptions -->
@ -497,8 +523,8 @@ geometry     = 1559/240/63, sectors = 23572080, start = 0
 	from a regular UNIX operating system.  Over time,
 	second extended filesystem (ext2fs) arrived and was the
 	standard filesystem for Linux for many years.  Now, the number
-	of native filesystems for Linux include ext2fs, reiserfs,
-	GFS, and Coda.
+	of native filesystems for Linux include ext3, reiserfs,
+	GFS, Coda, XFS, JFS, and Intermezzo.
      </para>
      <para>
        Such features that are available for Linux include inodes,
@ -508,29 +534,45 @@ geometry     = 1559/240/63, sectors = 23572080, start = 0
 	file data, entries in directories, and directories themselves.
 	The directory of inodes is kept in the superblock, and these
 	superblocks are duplicated many times through the filesystem
-	so if one superblock gets corrupted, another can fill in the
-	missing data.
+	so if one superblock gets corrupted, another can be used to
+	recover the missing data.
      </para>
      <para>
        In the event the OS shuts down without unmounting its filesystems,
 	a filesystem check must be run in order to make sure the
 	inodes point to the data it should be pointing at.  A journalling
-	filesystem ensures that all writes to a filesystem are
+	filesystem (ext3, reiserfs, JFS, XFS) ensures that all writes
+	to a filesystem are
 	finished before reporting success to the OS.
      </para>
-
      <para>
-        The data files can be of multiple forms.  The most common
-	form that users will experience are regular files, which
-	holds some form of data.  There are three other main types of
-	files that are in a typical UNIX system: character, block, and
-	socket.  Character and block files typically refer to entries
-	in /dev, and are accessed either one byte at a time (character)
-	or in blocks of characters (block).  These two types of files
-	provide an interface to the kernel that normal applications can use.
-	Sockets are files that let applications on the same system send
-	data back and forth.
+        Each type of filesystem has its advantages and disadvantages.  A
+	filesystem like ext2 or ext3 is better tuned to large files, so access
+	for reads and writes happens very quickly.  But if there are a large
+	number of small files in a directory, its performance starts to suffer.
+	A filesystem like reiserfs is better tuned to smaller files, but
+	increases overhead for larger files.
      </para>
+      <para>
+        For applications that are writing or reading the hard drive, larger
+	block sizes allow Linux to write larger blocks of data to the hard drive
+	in one operation.  For example, a block size of 64k will try to write to
+	the hard drive in 64k chunks.  Depending on the hard drive and interface,
+	a larger block size can result in better performance.  If the block size
+	is not set properly, it can result in poorer performance.  If the optimal
+	block size is 64k, but it is set to 32k, it would take two operations to
+	write the block to the hard drive.  If it is set to 96k, then the
+	OS will wait for a timeout period or for the rest of the block to fill
+	up before it writes the data to disk, increasing the latency of writing
+	data to the disk.
+      </para>
+      <para>
+        Block sizes are usually reserved for operations where raw data is being
+	written to or read from the hard drive.  But applications like
+	<application>dd</application> can use varying block sizes when
+	writing data to the drive, allowing you to test various block sizes.
+      </para>
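The block size arithmetic above can be tried out with dd. What follows is only a minimal sketch: it writes the same 4 MiB of zeros to a temporary file with two block sizes and reports how many write operations each one needs. The helper names blocks_needed and write_test, the temporary file, and the 4 MiB total are assumptions made for the example, and no timing figures are implied.

```shell
#!/bin/sh
# Sketch: compare how many dd write operations two block sizes need
# for the same amount of data.

blocks_needed() {
    # $1: bytes to write, $2: block size in bytes; rounds up
    echo $(( ($1 + $2 - 1) / $2 ))
}

write_test() {
    # $1: block size in bytes, $2: number of blocks, $3: output file
    dd if=/dev/zero of="$3" bs="$1" count="$2" 2>/dev/null
}

out=$(mktemp)
total=$((4 * 1024 * 1024))   # 4 MiB of test data (arbitrary choice)

for bs in 32768 65536; do
    count=$(blocks_needed "$total" "$bs")
    echo "bs=$bs writes=$count"
    write_test "$bs" "$count" "$out"
done
```

To compare speed rather than operation counts, each dd run could be timed on the target drive with a much larger total. Results will depend on the drive, the interface, and how much of the write is absorbed by the page cache.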
+
    </section> <!-- diskfilesystems -->
  </section> <!-- disktuning -->
 </chapter> <!-- disk -->
--- a/LDP/guide/docbook/Tuning-Linux/fundamentals.sgml
+++ b/LDP/guide/docbook/Tuning-Linux/fundamentals.sgml
@ -136,7 +136,8 @@
      open.  So you can walk right up to a counter
      and get your burger in about two minutes.  The down side of this is
      that the manager of the store has to pay for more people working in
-      the store.  There are also potential bottlenecks, such as only one
+      the store, or you have to buy more machines.
+      There are also potential bottlenecks, such as only one
      person in the back making fries, or the person in line in front of
      you is still trying to decide what they want.
    </para>
@ -176,4 +177,42 @@
      of squid will be your best performance bet.
    </para>
  </section> <!-- fundamentalssubsystems -->
+  <section id="fundamentalshowmuch">
+    <title>How Much Machine is Required?</title>
+    <para>
+      A frequent question when deciding to buy new hardware is "How much
+      hardware should be purchased for the task?".  A good rule of thumb is "as
+      much as you can afford", but that may not be the whole answer.  You should
+      consider the following when defining specifications.
+    </para>
+    <itemizedlist>
+      <listitem>
+        <para>
+	  If you cannot afford a fully loaded system, can it be expanded after
+	  purchase?  If your application would work best with 2GB of RAM, but you
+	  can only afford 1GB, can the motherboard be expanded to add the extra
+	  GB later on?  Purchasing a motherboard that can only accept 1GB means
+	  you will need to buy a new motherboard (at least) later on, increasing
+	  cost.
+	</para>
+      </listitem>
+      <listitem>
+        <para>
+	  How well does the design scale?  If you plan on buying racks of
+	  machines, can you add new racks later and integrate them easily with
+	  the existing system?  Things like switches that can accept new blades
+	  to add functionality may save space and money in the long run, while
+	  reducing issues like cabling.
+	</para>
+      </listitem>
+      <listitem>
+        <para>
+	  How easy is it to install, upgrade, or repair?  While using desktop
+	  systems in a rack is possible and will save money, replacing or
+	  repairing the systems is harder to do.  A good question to ask
+	  yourself is "What is my time worth?".
+	</para>
+      </listitem>
+    </itemizedlist>
+  </section>  <!-- fundamentalshowmuch -->
 </chapter> <!-- fundamentals -->
--- a/LDP/guide/docbook/Tuning-Linux/kernel.sgml
+++ b/LDP/guide/docbook/Tuning-Linux/kernel.sgml
@ -26,7 +26,8 @@
  </itemizedlist>
  <para>
    This chapter covers each of these situations, and also tuning of running
-    kernels using <application>sysconf</application>.
+    kernels using applications like <application>sysconf</application> and
+    <application>powertweak</application>.
  </para>

  <section id="kernelversions">
--- a/LDP/guide/docbook/Tuning-Linux/network.sgml
+++ b/LDP/guide/docbook/Tuning-Linux/network.sgml
@ -301,6 +301,14 @@ restarting autonegotiation...
      force when the gear you are talking to is not managed, and use
      autonegotiate if the gear is managed.
    </para>
+    <note>
+      <para>
+        This program only works with chipsets and drivers that support MII.  The
+	Intel eepro100 cards implement this, but others may not.  If your driver
+	does not support MII, you may need to force the setting at boot time
+	when the driver is loaded.
+      </para>
+    </note>
  </section> <!-- networkethernet -->
  <section id="networktcpip">
    <title>Tuning TCP/IP performance</title>
@ -343,7 +351,7 @@ restarting autonegotiation...
      When using Domain Name Servers (<acronym>DNS</acronym>), you may
      run into cases where DNS resolution is a performance bottleneck.  We
      will get into this more in <xref linkend="appsapache">, but
-      some applications recommend for best performance to log and use only the
+      some applications recommend, for best performance, that you log the
       raw TCP/IP addresses that come in and not try to resolve them to
      a name.  For security reasons, you may want to change this so you can
      quickly find out what machine is trying to break into your web server.
@ -488,7 +496,7 @@ restarting autonegotiation...
      tuning for performance is not as great importance as tuning for power
      usage.  Using 802.11b is often very power-consuming and can quickly
      drain the batteries.  Some cards (such as the Lucent Orinoco card)
-      have the ability to only turn its antenna on and off at regular
+      have the ability to turn their antenna on and off at regular
      intervals.  Instead of the antenna being on all the time, it turns
      on a few times a second.  With the transmitter being turned off
      now more than half the time, the battery usage is decreased.  There is
@ -557,12 +565,14 @@ wvlan0    IEEE 802.11-DS  ESSID:"default"  Nickname:"HERMES I"
      <listitem>
        <para>
 	  Check the infrastructure and building materials.  Thick wood or metal
-	  walls will cause a lot of interference.
+	  walls will cause a lot of interference.  Line of sight to the base
+	  station is best.
 	</para>
      </listitem>
      <listitem>
        <para>
-	  Some base stations support external antennas.  They will greatly
+	  Some base stations and wireless cards support external antennas.
+	  They will greatly
 	  improve the range and quality of the link.
 	</para>
      </listitem>
@ -576,7 +586,8 @@ wvlan0    IEEE 802.11-DS  ESSID:"default"  Nickname:"HERMES I"
        <para>
 	  Turn off other devices that use 2.4Ghz.  Some phones and other
 	  wireless gadgets use the same frequency, and if not built properly,
-	  will cause interference.
+	  will cause the wireless Ethernet cards to continually scan through
+	  frequencies for the correct one, dropping performance.
 	</para>
      </listitem>
    </itemizedlist>
@ -747,7 +758,7 @@ kBytes/s&lt;/TD&gt;&lt;/TR&gt;
 </screen>
      <para>
        All the configuration information has been pulled from
-	<application>snmpd</application>.  You can pipe the output
+	<application>snmpd</application>.  You can redirect the output
 	of <command>cfgmaker</command> into
 	<filename>/etc/mrtg.cfg</filename>.
      </para>