mirror of https://github.com/tLDP/LDP
171 lines
8.7 KiB
Plaintext
171 lines
8.7 KiB
Plaintext
<chapter id="measure">
|
|
<title>Measuring your performance</title>
|
|
<para>
|
|
In order to see if your system is any faster, you need to measure
|
|
the performance of the system before and after you try your tuning
|
|
changes. There's a variety of applications for testing your hard drive,
|
|
CPU, memory, and overall system performance. This allows you to also test a
|
|
proposed configuration versus the existing configuration.
|
|
</para>
|
|
<para>
|
|
In all cases of testing, you should have a relatively quiet system, meaning
|
|
that there is a minimum of applications running. For true testing, you
|
|
should reboot the machine between each test, and run each test at least
|
|
5 times, then take the average. Rebooting clears any in-memory caches that
|
|
can affect the tuning numbers, and also makes sure there is a larger amount
|
|
of system RAM free.
|
|
</para>
|
|
<section id="measuredrive">
|
|
<title>Hard Drive</title>
|
|
<para>
|
|
The easiest way to find raw hard drive performance is using
|
|
<command>hdparm</command>, which we describe in <xref
|
|
linkend="disktuningideos">. But this is raw hard drive performance, not
|
|
taking into account things like overhead from the filesystem or write
|
|
performance.
|
|
</para>
|
|
<para>
|
|
The other application you can use to test how the filesystem works, or for
|
|
devices that do not work with <command>hdparm</command> is to use
|
|
<command>dd</command>. The <command>dd</command> command is used to write
|
|
or read data and perform some conversion along the way. The nice part of
|
|
the command for us is that it can create a file of any size containing
|
|
ASCII NULL (0), which allows us to test consistently.
|
|
Since the data we want to write is being generated in the CPU and memory
|
|
which is always faster than the hard drive, this gives a good look at the
|
|
disk bottleneck.
|
|
</para>
|
|
<para>
|
|
This is used in conjunction with <command>time</command> which gives
|
|
the amount of CPU, system, and user (real) time used to run a particular
|
|
command. You can then divide out the size of the file created by the
|
|
number of user seconds to run the command to get a Mbps rating.
|
|
</para>
|
|
|
|
</section> <!-- measuredrive -->
|
|
|
|
<section id="measurecpu">
|
|
<title>CPU and System</title>
|
|
<para>
|
|
Since CPUs have a variety of functions crammed into a small space, it is
|
|
hard to test how fast a particular CPU is. A standard number for Linux is
|
|
called <quote>BogoMIPS</quote>, which Linus needed for some timing
|
|
routines in the kernel. BogoMIPS calculate how fast a CPU can do nothing,
|
|
and so will vary depending on the kind of chip used. Because of how
|
|
BogoMIPS are calculated, they should not be used for any form of
|
|
performance measurement.
|
|
</para>
|
|
<para>
|
|
There are a variety of CPU functions that can be measured, including
|
|
Million Instructions Per Second (MIPS), Floating Point Operations Per
|
|
Second (FLOPS), and memory to CPU speed (in MBps). Each of these will
|
|
vary greatly depending on the choice of CPU. MIPS are almost worthless,
|
|
since some chips can have one instruction that runs multiple other
|
|
instructions. The MMX instruction set for Intel-based hardware is a good
|
|
example for this, as MMX is set to perform matrix operations from just a
|
|
few instructions, whereas without MMX, it would take many more operations
|
|
to do the same calulations. When measuring chip speed, does an MMX
|
|
instruction count as one or multiple instructions? Since the speed of
|
|
running one MMX instruction is slightly faster than running the multiple
|
|
instructions that make it up, it will make an MMX-based chip appear
|
|
slower.
|
|
</para>
|
|
<para>
|
|
Many online reviewers use games as a measurement of the CPU. The idea is
|
|
that 3D games provide a good stress test of the CPU, memory interface,
|
|
system bus, and video card. The result is the number of Frames Per Second
|
|
(FPS). More FPS means a faster overall system. Since most servers do not
|
|
have a 3D card in them, this may not be a good choice of measurement.
|
|
Workstations may get a benefit from this kind of testing, however.
|
|
</para>
|
|
<para>
|
|
One other measurement form is that of other third-party applications.
|
|
Applications like SETI@Home, distributed.net can give generic speed
|
|
ratings that are good to compare against other machines running the same
|
|
software. Other CPU-intensive applications like MP3 encoders can also be
|
|
a good guide of how fast a CPU is. The down side to these are that the
|
|
results have to be compared to the results of other machines running the
|
|
exact same software. If the software is not available for that
|
|
OS/Hardware combination, the test is worthless.
|
|
</para>
|
|
<para>
|
|
So what does this leave us with? Unfortunately, not a lot. The best bet
|
|
appears to be MIPS and FLOPS, since it does test instructions per second,
|
|
and as
|
|
long as the chipset remains the same (Intel based, SPARC, Alpha, etc.),
|
|
the
|
|
measurements can be compared pretty easily and the comparisons will be
|
|
close enough.
|
|
</para>
|
|
</section> <!-- measurecpu -->
|
|
|
|
<section id="measurenetwork">
|
|
<title>Network</title>
|
|
<para>
|
|
Measuring the network speed is not quite easy, since there are bottlenecks
|
|
outside the network interface and your machine. Cable problems, poor
|
|
choice of switching gear, and congestion on the line or the remote machine
|
|
you want to test against can all reduce the throughput of your network
|
|
interface. In addition, protocols like SSH or HTTP have additional
|
|
processing that may need to be done that occupies the CPU and reduces
|
|
throughput.
|
|
</para>
|
|
<para>
|
|
If the machine you are testing is going to be a server, you can create a
|
|
small group of client machines on a private network. These machines
|
|
should have the same type of network card and OS revision. This will
|
|
create a stable baseline of testing.
|
|
</para>
|
|
<para>
|
|
Depending on the networking application you will be using, there may be
|
|
applications that already exist to automate this testing for you.
|
|
Programs like <application>webbench</application> can coordinate the
|
|
clients talking to the server, and be able to read the performance of the
|
|
server in terms of pages per second.
|
|
</para>
|
|
<para>
|
|
</para>
|
|
</section> <!-- measurenetwork -->
|
|
|
|
<section id="measurevideo">
|
|
<title>Video card</title>
|
|
<para>
|
|
If you want to tune a workstation, or create a killer Quake III box, you
|
|
will want to pay attention to the video subsystem and see how well it
|
|
performs. For 2D applications, you can test the system using
|
|
<command>x11perf</command>. This program will perform a variety of tests
|
|
using the X server and drivers. Since it is designed for performance
|
|
testing, it is designed to run each test five times, then take the
|
|
average. The machine should have no other users or activity going on, and
|
|
you should disable the screen saver. You can disable the screensaver
|
|
either with the command <command>xset s off</command> or by killing the
|
|
process called <quote>xscreensaver</quote>. You may need to run both
|
|
commands in order to turn off the screen saver.
|
|
</para>
|
|
<para>
|
|
Testing with <command>x11perf</command> will take several hours depending
|
|
on the speed of your CPU and graphics card. Once complete, you will have
|
|
a log file that you can then compare against other machines that have also
|
|
done testing, or against a baseline test you ran before tuning to see if
|
|
the X server speed has been improved. To do this comparison, you can use
|
|
<command>x11perfcomp</command> to compare two or more tests. Higher
|
|
numbers are better, as the resulting numbers are in terms of objects per
|
|
second.
|
|
</para>
|
|
<para>
|
|
You can test out 3D performance using applications like Quake III, that
|
|
run the application through a set world and events, most of which will
|
|
stress the system. Mark down the resolution, bit depth, and frames per
|
|
second reported from Quake III, and you now have a baseline to work from.
|
|
</para>
|
|
<para>
|
|
A caveat to using Quake or 3D applications is that this is testing more
|
|
than the video card. Other subsystems, like the CPU, video drivers, GLX
|
|
(3D) drivers, and memory are also tested. If you want to use this method
|
|
for comparing speed, you should also be sure the other subsystems are
|
|
tuned as well.
|
|
</para>
|
|
</section> <!-- measurevideo -->
|
|
|
|
</chapter> <!-- measure -->
|