mirror of https://github.com/tLDP/LDP
measuring performance
parent
62a6f8c6aa
commit
48ef4e6ac6
@ -0,0 +1,170 @
<chapter id="measure">
<title>Measuring your performance</title>
<para>
To see whether your system is any faster, you need to measure its
performance both before and after you apply your tuning changes. A
variety of applications exist for testing your hard drive, CPU, memory,
and overall system performance. The same tools also let you test a
proposed configuration against the existing one.
</para>
<para>
Whatever you test, the system should be relatively quiet, meaning that
a minimum of applications are running. For the most reliable numbers,
reboot the machine before each test, run each test at least 5 times, and
take the average. Rebooting clears any in-memory caches that can skew
the results, and also ensures that more system RAM is free.
</para>
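<para>
The repeat-and-average routine described above can be sketched in a few
lines of Python. This is only an illustration: the names
time_once, average_runtime, and sample_workload are hypothetical, and
the workload is a stand-in for whatever benchmark you actually run.
</para>

```python
import statistics
import time


def time_once(task):
    """Return the wall-clock seconds taken by one call to task()."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def average_runtime(task, runs=5):
    """Run task() `runs` times and return (mean, standard deviation)."""
    samples = [time_once(task) for _ in range(runs)]
    return statistics.mean(samples), statistics.stdev(samples)


def sample_workload():
    # Hypothetical stand-in for the benchmark you actually care about.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    mean, spread = average_runtime(sample_workload, runs=5)
    print(f"mean {mean:.4f} s, standard deviation {spread:.4f} s over 5 runs")
```

<para>
Note that this sketch does not reboot between runs, so caches stay warm
from one run to the next. A large standard deviation relative to the
mean is a hint that something else was running during the test.
</para>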

<section id="measuredrive">
<title>Hard Drive</title>
<para>
The easiest way to find raw hard drive performance is to use
<command>hdparm</command>, which we describe in <xref
linkend="disktuningideos">. But this is only raw hard drive performance;
it does not take into account things like filesystem overhead or write
performance.
</para>
<para>
To test how the filesystem performs, or to test devices that do not
work with <command>hdparm</command>, you can use
<command>dd</command>. The <command>dd</command> command reads and
writes data, optionally converting it along the way. The nice part
for us is that it can create a file of any size filled with
ASCII NUL (zero) bytes, for example by reading from
<filename>/dev/zero</filename>, which makes the test repeatable.
Since the data to be written is generated by the CPU and memory,
which are always faster than the hard drive, this gives a good look at
the disk bottleneck.
</para>
<para>
Use it in conjunction with <command>time</command>, which reports the
real (elapsed), user, and system time used to run a particular
command. You can then divide the size of the file created by the
number of real seconds the command took to get a rating in megabytes
per second (MBps).
</para>
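<para>
The same size-divided-by-elapsed-time calculation can be sketched in
Python. This is a rough stand-in for timing <command>dd</command>, not
a replacement for it: the function name write_throughput, the file name,
and the default sizes are all assumptions made for illustration.
</para>

```python
import os
import tempfile
import time


def write_throughput(path, size_mb=64, block_kb=1024):
    """Write about size_mb megabytes of NUL bytes to path; return MB/s."""
    block = bytes(block_kb * 1024)          # a zero-filled buffer
    blocks = (size_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as handle:
        for _ in range(blocks):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())           # wait until the data reaches the device
    elapsed = time.perf_counter() - start
    written_mb = blocks * len(block) / (1024 * 1024)
    return written_mb / elapsed             # size divided by real seconds


if __name__ == "__main__":
    # The temporary directory decides which filesystem is measured.
    with tempfile.TemporaryDirectory() as scratch:
        rate = write_throughput(os.path.join(scratch, "zeros.bin"))
        print(f"{rate:.1f} MB/s")
```

<para>
The call to os.fsync matters: without it, the timer may stop while the
data is still sitting in memory, and the result would describe RAM
rather than the disk. Point the path at the filesystem you want to
test, and use a file large enough that caching effects are small.
</para>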

</section> <!-- measuredrive -->

<section id="measurecpu">
<title>CPU and System</title>
<para>
Since CPUs have a variety of functions crammed into a small space, it is
hard to test how fast a particular CPU is. A standard number for Linux is
called <quote>BogoMIPS</quote>, which Linus needed for some timing
routines in the kernel. BogoMIPS measure how fast a CPU can do nothing,
and so will vary depending on the kind of chip used. Because of how
BogoMIPS are calculated, they should not be used for any form of
performance measurement.
</para>
<para>
A variety of CPU characteristics can be measured, including
Millions of Instructions Per Second (MIPS), Floating Point Operations Per
Second (FLOPS), and memory-to-CPU speed (in MBps). Each of these will
vary greatly depending on the choice of CPU. MIPS are almost worthless,
since on some chips one instruction does the work of several others.
The MMX instruction set for Intel-based hardware is a good
example: a single MMX instruction operates on several packed values at
once, whereas without MMX it would take many more instructions
to do the same calculations. When measuring chip speed, does an MMX
instruction count as one instruction or as several? If it counts as one,
then because one MMX instruction runs only slightly faster than the
multiple instructions it replaces, an MMX-based chip will appear
slower in MIPS terms even though it finishes the work sooner.
</para>
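<para>
A short worked example shows how the instruction-counting problem
distorts MIPS. Every number below is made up for illustration (the
workload size, the timings, and the eight-values-per-instruction
figure are assumptions, not measurements of any real chip).
</para>

```python
def mips(instructions, seconds):
    """Millions of instructions executed per second."""
    return instructions / seconds / 1_000_000


# Hypothetical workload: add 8 million pairs of small integers.
ELEMENTS = 8_000_000

# Assumed scalar chip: one instruction per addition, 0.080 s in total.
scalar_instructions = ELEMENTS
scalar_seconds = 0.080

# Assumed packed chip: one instruction handles 8 additions, 0.020 s in total.
packed_instructions = ELEMENTS // 8
packed_seconds = 0.020

scalar_mips = mips(scalar_instructions, scalar_seconds)
packed_mips = mips(packed_instructions, packed_seconds)
speedup = scalar_seconds / packed_seconds

if __name__ == "__main__":
    print(f"scalar: {scalar_mips:.0f} MIPS")
    print(f"packed: {packed_mips:.0f} MIPS, yet {speedup:.0f}x faster on the task")
```

<para>
Under these assumptions the packed chip reports half the MIPS of the
scalar chip while finishing the job in a quarter of the time, which is
why time-to-completion for a fixed task is the more honest number.
</para>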
<para>
Many online reviewers use games to measure CPU performance. The idea is
that 3D games provide a good stress test of the CPU, memory interface,
system bus, and video card. The result is the number of Frames Per Second
(FPS). More FPS means a faster overall system. Since most servers do not
have a 3D card in them, this may not be a good choice of measurement.
Workstations may benefit from this kind of testing, however.
</para>
<para>
Another form of measurement uses third-party applications.
Applications like SETI@Home and distributed.net can give generic speed
ratings that are good for comparing against other machines running the
same software. Other CPU-intensive applications like MP3 encoders can
also be a good guide to how fast a CPU is. The downside is that the
results can only be compared to those of other machines running the
exact same software. If the software is not available for a given
OS/hardware combination, the test is worthless.
</para>
<para>
So what does this leave us with? Unfortunately, not a lot. The best bet
appears to be MIPS and FLOPS, since they do measure operations per
second, and as long as the architecture remains the same (Intel based,
SPARC, Alpha, etc.), the measurements can be compared pretty easily and
the comparisons will be close enough.
</para>

</section> <!-- measurecpu -->

<section id="measurenetwork">
<title>Network</title>
<para>
Measuring network speed is not easy, since there are bottlenecks
outside your machine and its network interface. Cable problems, a poor
choice of switching gear, and congestion on the line or on the remote
machine you are testing against can all reduce the throughput of your
network interface. In addition, protocols like SSH or HTTP require
additional processing that occupies the CPU and reduces
throughput.
</para>
<para>
If the machine you are testing is going to be a server, you can create a
small group of client machines on a private network. These machines
should have the same type of network card and OS revision. This
creates a stable baseline for testing.
</para>
<para>
Depending on the networking application you will be using, there may be
applications that already exist to automate this testing for you.
Programs like <application>webbench</application> can coordinate the
clients talking to the server and report the performance of the
server in terms of pages per second.
</para>
</section> <!-- measurenetwork -->

<section id="measurevideo">
<title>Video card</title>
<para>
If you want to tune a workstation, or create a killer Quake III box, you
will want to pay attention to the video subsystem and see how well it
performs. For 2D applications, you can test the system using
<command>x11perf</command>. This program performs a variety of tests
using the X server and drivers. Since it is built for performance
testing, it runs each test five times, then takes the
average. The machine should have no other users or activity going on, and
you should disable the screen saver. You can disable the screen saver
either with the command <command>xset s off</command> or by killing the
process called <quote>xscreensaver</quote>. You may need to do both
in order to turn off the screen saver.
</para>
<para>
Testing with <command>x11perf</command> can take several hours, depending
on the speed of your CPU and graphics card. Once it completes, you will
have a log file that you can compare against logs from other machines,
or against a baseline test you ran before tuning, to see whether
the X server speed has improved. To do this comparison, you can use
<command>x11perfcomp</command> to compare two or more tests. Higher
numbers are better, as the resulting numbers are in terms of objects per
second.
</para>
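<para>
Once you have rates from a baseline run and a tuned run, the comparison
itself is simple arithmetic. The sketch below assumes you have already
collected the rates into two dictionaries by hand; it does not parse
<command>x11perf</command> log files, and the test names and numbers
are hypothetical placeholders.
</para>

```python
def percent_change(before, after):
    """Map each test present in both runs to its percentage change in rate."""
    changes = {}
    for name in sorted(set(before).intersection(after)):
        changes[name] = (after[name] - before[name]) / before[name] * 100.0
    return changes


if __name__ == "__main__":
    # Hypothetical rates in objects per second; not real x11perf output.
    baseline = {"test-a": 1000.0, "test-b": 250.0}
    tuned = {"test-a": 1100.0, "test-b": 240.0}
    for name, change in percent_change(baseline, tuned).items():
        print(f"{name}: {change:+.1f}%")
```

<para>
Because the rates are in objects per second, a positive change means the
tuned configuration is faster for that test. Tests that appear in only
one of the two runs are skipped rather than guessed at.
</para>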
<para>
You can test 3D performance using applications like Quake III, which
can run through a set world and sequence of events, most of which will
stress the system. Mark down the resolution, bit depth, and frames per
second reported by Quake III, and you now have a baseline to work from.
</para>
<para>
A caveat to using Quake or other 3D applications is that they test more
than the video card. Other subsystems, like the CPU, video drivers, GLX
(3D) drivers, and memory are also tested. If you want to use this method
for comparing speed, you should also be sure the other subsystems are
tuned as well.
</para>

</section> <!-- measurevideo -->

</chapter> <!-- measure -->