<!--startcut ==============================================-->
|
|
<!-- *** BEGIN HTML header *** -->
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
|
<HTML><HEAD>
|
|
<title>Is Your Memory Not What It Used To Be? LG #81</title>
|
|
</HEAD>
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
|
|
ALINK="#FF0000">
|
|
<!-- *** END HTML header *** -->
|
|
|
|
<CENTER>
|
|
<A HREF="http://www.linuxgazette.com/">
|
|
<IMG ALT="LINUX GAZETTE" SRC="../gx/lglogo.png"
|
|
WIDTH="600" HEIGHT="124" border="0"></A>
|
|
<BR>
|
|
|
|
<!-- *** BEGIN navbar *** -->
|
|
<IMG ALT="" SRC="../gx/navbar/left.jpg" WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="bottom"><A HREF="durodola.html"><IMG ALT="[ Prev ]" SRC="../gx/navbar/prev.jpg" WIDTH="16" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="index.html"><IMG ALT="[ Table of Contents ]" SRC="../gx/navbar/toc.jpg" WIDTH="220" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../index.html"><IMG ALT="[ Front Page ]" SRC="../gx/navbar/frontpage.jpg" WIDTH="137" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue81/kurup.html"><IMG ALT="[ Talkback ]" SRC="../gx/navbar/talkback.jpg" WIDTH="121" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../lg_faq.html"><IMG ALT="[ FAQ ]" SRC="./../gx/navbar/faq.jpg"WIDTH="62" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="padala.html"><IMG ALT="[ Next ]" SRC="../gx/navbar/next.jpg" WIDTH="15" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><IMG ALT="" SRC="../gx/navbar/right.jpg" WIDTH="15" HEIGHT="45" ALIGN="bottom">
|
|
<!-- *** END navbar *** -->
|
|
<P>
|
|
</CENTER>
|
|
|
|
<!--endcut ============================================================-->
|
|
|
|
<H4 ALIGN="center">
|
|
"Linux Gazette...<I>making Linux just a little more fun!</I>"
|
|
</H4>
|
|
|
|
<P> <HR> <P>
|
|
<!--===================================================================-->
|
|
|
|
<center>
|
|
<H1><font color="maroon">Is Your Memory Not What It Used To Be?</font></H1>
|
|
<H4>By <a href="http://www.geocities.com/madhumkurup/mailme.html">Madhu M Kurup</a></H4>
|
|
</center>
|
|
<P> <HR> <P>
|
|
|
|
<!-- END header -->
|
|
|
|
|
|
|
|
|
|
<h2>Intent</h2>
|
|
The intent of this article is to provide an understanding of memory
|
|
leak detection and profiling tools currently available. It also aims at
|
|
providing you with enough information to be able to make a choice between
|
|
the different tools for your needs.<br>
|
|
|
|
<h2>Leaks and Corruption</h2>
|
|
We are talking software here, not plumbing. And yes, any fairly large,
non-trivial program is bound to have problems with memory, leaks among them.<br>
|
|
|
|
<h3>Where do problems occur?</h3>
|
|
First, leaks and such memory problems do not occur in some languages.
|
|
These languages believe that memory management is <i>so important</i> that
|
|
it should never be handled by the users of that language. It is better
|
|
handled by the <i>language designers</i>. Examples of such languages are
|
|
Perl, Java and so on.<br>
|
|
However, in some other languages (notably C and
|
|
C++) the language designers have felt that memory management is <i>so important</i>
|
|
that it can only be taken care of by the <i>users</i> of the language.
|
|
A leak is said to occur when you dynamically allocate memory and then forget
to return it. In addition to leaks, other memory problems such as <a
href="http://www.tuxedo.org/%7Eesr/jargon/html/entry/buffer-overflow.html">buffer
overflows</a> and <a
href="http://www.tuxedo.org/%7Eesr/jargon/html/entry/dangling-pointer.html">dangling
pointers</a> also occur when programmers manage memory themselves.
These problems arise when there is a mismatch between what
the program (and by extension the programmer) believes the state of memory
to be and what it really is.<br>
|
|
|
|
<h3>What are the problems?</h3>
|
|
In order for programs to be able to deal with data whose size is
not known at compile time, the program may need to request memory from
the runtime environment (operating system). However, having obtained a
chunk of memory, the program may fail to return it
to the environment after use. An even more severe condition results
when the address of the block that was obtained is lost, which means that
it is no longer possible to identify that allocated memory. Other
problems include trying to access memory after it has been returned (dangling
pointers). Another common problem is trying to access more memory than was
originally requested (buffer overflow).<br>
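As a minimal sketch of the first two problems, the C fragment below contrasts a function that loses the address of its block with one that returns the block after use. The names duplicate, leaky_length and clean_length are invented for this illustration and do not come from any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src, or NULL if the allocation fails.
   The caller owns the result and must pass it to free() exactly once. */
char *duplicate(const char *src)
{
    size_t n;
    char *copy;

    assert(src != NULL);
    n = strlen(src) + 1;          /* +1 for the terminating '\0' */
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, src, n);     /* copying more than n bytes would overflow */
    return copy;
}

/* Leak: the only pointer to the block is dropped without free(). */
size_t leaky_length(const char *src)
{
    char *copy = duplicate(src);

    if (copy == NULL)
        return 0;
    return strlen(copy);          /* copy goes out of scope, the block is lost */
}

/* Fixed: the block is returned before its address is lost. */
size_t clean_length(const char *src)
{
    char *copy = duplicate(src);
    size_t len;

    if (copy == NULL)
        return 0;
    len = strlen(copy);
    free(copy);
    copy = NULL;                  /* do not keep a dangling pointer around */
    return len;
}
```

Both functions return the same value; only the second leaves the heap as it found it, which is why a leak like this tends to go unnoticed until the program has run for a long time.<br>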
|
|
|
|
<h3>Why should these problems bother me?</h3>
|
|
Leaks may not be a problem for short-lived programs that finish their
work quickly. Unfortunately, many programs are designed to run
without termination for long periods. A good example is the Apache
webserver that is currently providing you this web page. In such a situation,
a malfunctioning leaky program could keep requesting memory from the system
and never return it. Eventually the system would run out
of memory and every program running on that machine would suffer. This is obviously
not a good thing. Besides making a program require more memory,
leaks can also make it sluggish: the speed at which the program
is context-switched in and out can decrease as the memory load increases.
While not as severe as causing the machine to crash, an excessive memory
load could cause the machine to thrash, swapping data back and forth.<br>
|
|
Dangling pointers can result in subtle corruption
and bugs that are obscure and extremely hard to solve. Buffer overflows
are probably the most dangerous of the three forms of memory problems.
They are behind many of the security exploits that you read about [<a
href="#Secure_Programming_">SEC</a>]. In addition to the problems
described above, the same memory chunk may be returned
to the system multiple times. This obviously indicates a programming
error. A programmer may also wish to see how memory requests are made
over the lifetime of a program in order to find
and fix bugs.<br>
|
|
|
|
<h3>Combating these problems</h3>
|
|
There are some run-time mechanisms to combat memory problems. Leaks can
be worked around by periodically stopping and restarting the offending program
[<a href="#OOM_killer">OOM</a>]. Dangling pointers can be
made repeatable by zeroing out all memory returned to the operating
system. Buffer overflows have a variety of solutions, some of which are
described in more detail <a
href="http://www.geocities.com/madhumkurup/papers/Buffer.ps">here</a>. <br>
Typically, the overhead of combating these problems at
run time or late in the development cycle is so high that finding and
fixing them at the program level is often the better solution.<br>
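The zeroing idea can be sketched in C as a small wrapper around free(). This is only an illustration under stated assumptions: scrub and scrub_free are hypothetical names, and the caller must pass the block size because the standard free() interface does not expose it. Reading a block after it has been released is still an error; the wrapper only makes the stale contents predictable on allocators that leave the bytes untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrite size bytes with zeros. The volatile pointer discourages the
   compiler from dropping the stores as "dead" writes before a free(). */
void scrub(void *block, size_t size)
{
    volatile unsigned char *p = block;
    size_t i;

    assert(block != NULL || size == 0);
    for (i = 0; i < size; i++)
        p[i] = 0;
}

/* Zero the block, then hand it back to the allocator. */
void scrub_free(void *block, size_t size)
{
    if (block == NULL)
        return;                   /* mirrors free(NULL), which does nothing */
    scrub(block, size);
    free(block);
}
```

A call such as scrub_free(buf, buf_size) replaces free(buf) wherever the size is still known at the point of release.<br>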
|
|
|
|
<h2>Open Source </h2>
|
|
|
|
<h3>GCC-based alternatives</h3>
|
|
The <a
|
|
href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/">gcc</a> toolset
|
|
now includes a garbage collector which facilitates the easy detection
|
|
and elimination of many memory problems. Note that while this can be used
|
|
to detect leaks, the primary reason for creating this was to implement
|
|
a good garbage collector[<a href="#Garbage_Collectors">GC</a>]. This work
|
|
is currently being led by Hans-J. Boehm at HP.
|
|
<h4>Technology</h4>
|
|
The technology used here is the <a
href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcdescr.html">Boehm-Demers-Weiser</a>
technique for keeping track of allocated memory. Allocation of memory
is done using the algorithm's version of the standard memory allocation
functions. The program is then compiled with these functions and when executed,
the algorithm can analyze the behavior of the program. This algorithm is
fairly well known and well understood. It should not cause any problems
or interfere with programs. It can be made thread safe and can even
scale onto a multiprocessor system.<br>
|
|
|
|
<h4>Performance</h4>
|
|
Performance is good, with the reduction in speed in line with expectations.
The code is extremely portable and is also available directly with gcc.
The version shipped with gcc is slightly older, but can be upgraded.<br>
There is no interface; it is difficult to use
and requires much effort to be useful. Existing systems may not
have this compiler configuration and may require some additional work to
get it going. In addition, in order for the calls to be trapped, all memory
calls (such as <i>malloc()</i> and <i>free()</i> ) have to be replaced with
equivalents provided by the garbage collector. One could use a macro, but
that is still not very flexible. Also, this approach implicitly requires
source code for all pieces that require memory profiling, with the ability
to shift from the real functions to those provided.<br>
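The macro approach can be sketched as follows. This is not the garbage collector's own interface: tracked_malloc, tracked_free and tracked_live_blocks are hypothetical stand-ins written for this illustration, which simply count outstanding blocks and log the call site. Redefining standard library names as macros is common practice but is not sanctioned by the C standard, so treat it as a debugging aid only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static long live_blocks = 0;      /* allocations not yet released */

/* Allocate through the real malloc() and record where the call came from. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);

    if (p != NULL) {
        live_blocks++;
        fprintf(stderr, "alloc %p (%lu bytes) at %s:%d\n",
                p, (unsigned long)size, file, line);
    }
    return p;
}

/* Release through the real free() and update the count. */
void tracked_free(void *p, const char *file, int line)
{
    if (p == NULL)
        return;
    assert(live_blocks > 0);      /* crude check for a free with nothing live */
    live_blocks--;
    fprintf(stderr, "free  %p at %s:%d\n", p, file, line);
    free(p);
}

/* Number of blocks allocated but not yet freed; nonzero at exit hints at a leak. */
long tracked_live_blocks(void)
{
    return live_blocks;
}

/* From here on, every malloc()/free() in this source file is redirected. */
#define malloc(size) tracked_malloc((size), __FILE__, __LINE__)
#define free(ptr)    tracked_free((ptr), __FILE__, __LINE__)
```

The inflexibility mentioned above shows up immediately: only code compiled with these macros is tracked, so allocations made inside prebuilt libraries remain invisible.<br>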
|
|
|
|
<h4>Verdict</h4>
|
|
If you need a solution across multiple platforms (architectures,
|
|
operating systems) where you have control over all relevant source, this
|
|
could be it.<br>
|
|
|
|
<h3>Memprof</h3>
|
|
<a href="http://people.redhat.com/otaylor/memprof/">Memprof</a> is
an attractive, easy-to-use package, created by Owen Taylor of Red Hat. This
tool is a nice clean GNOME front-end to the Boehm-Demers-Weiser garbage
collector.<br>
|
|
|
|
<h4>Technology</h4>
|
|
At the heart of the profiling, memprof is no different from the toolset
described above. However, it implements this functionality by trapping
all memory requests from the program and redirecting them at runtime to the
garbage collector. While not as functional as the gcc alternative on threads
and multiprocessors, the program can be asked to follow forks as they happen.<br>
|
|
|
|
<h4>Performance</h4>
|
|
The performance of this tool is pretty good. The GUI is well designed,
responsive and informative. This tool works directly with executables,
and it works without any changes needed to the source. This tool also graphically
displays the memory profile as the program executes, which helps in understanding
the memory requirements of the program during its lifetime.<br>
|
|
This tool is currently available only for the x86
and PPC architectures on Linux. If you need help on other platforms, you
will need to look elsewhere. This tool is not a plain GTK application; it needs
the full-blown GNOME environment. This may not be feasible everywhere.
Finally, development on this tool appears to be static (version 0.4.1
for a while). While it is possible that it does what it is required to do
well, it does not seem that this tool will do anything more than just leak
detection.<br>
|
|
|
|
<h4>Verdict</h4>
|
|
If you like GUI tools and don't mind GNOME and Linux, this is a
|
|
tool for you.<br>
|
|
|
|
<h3>Valgrind</h3>
|
|
<a href="http://developer.kde.org/%7Esewardj/">Valgrind</a> is a
|
|
program that attempts to solve a whole slew of memory problems, leaks
|
|
being just one of them. This tool is the product of Julian Seward (of
|
|
<a href="http://sources.redhat.com/bzip2/index.html">bzip2</a> and <a
|
|
href="http://www.cacheprof.org">cacheprof</a> fame). It terms itself "an open source
|
|
memory debugger for x86 linux" and it certainly fits that bill. In addition,
|
|
it can profile the usage of the CPU cache, something that is fairly unusual.
|
|
|
|
<h4>Technology</h4>
|
|
The technology used in this program is fairly complex and <a
|
|
href="http://developer.kde.org/%7Esewardj/docs/techdocs.html">well documented</a>.
|
|
Each byte of memory allocated by the program is tracked by nine status
|
|
bits, which are then used for housekeeping purposes to identify what is
|
|
going on. At the cost of tremendously increasing the memory load of an
|
|
executing program, this tool enables a much greater set of checks. As all
the reads and writes are intercepted, cache profiling of the CPU's various
cache levels can also be done.<br>
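The idea of per-byte status bits can be illustrated with a toy model in C. This is a sketch written for this article's description, not Valgrind's actual implementation: the fixed 64-byte arena, the toy_ function names and the policy of counting an error on every bad load are all assumptions made for the example. Each byte gets one "addressable" flag plus eight "defined" bits, nine status bits in all.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 64

static unsigned char arena[ARENA_SIZE];         /* the program's memory */
static unsigned char addressable[ARENA_SIZE];   /* 1 if the byte is in a live block */
static unsigned char defined_bits[ARENA_SIZE];  /* one bit per data bit, 1 = written */
static unsigned long error_count = 0;

/* Mark a range as allocated: addressable, but with undefined contents. */
void toy_alloc(size_t start, size_t len)
{
    size_t i;

    assert(start <= ARENA_SIZE && len <= ARENA_SIZE - start);
    for (i = start; i < start + len; i++) {
        addressable[i] = 1;
        defined_bits[i] = 0x00;
    }
}

/* Mark a range as released: any later access is an error. */
void toy_free(size_t start, size_t len)
{
    size_t i;

    assert(start <= ARENA_SIZE && len <= ARENA_SIZE - start);
    for (i = start; i < start + len; i++) {
        addressable[i] = 0;
        defined_bits[i] = 0x00;
    }
}

/* Checked write: only allowed inside a live block. */
void toy_store(size_t addr, unsigned char value)
{
    if (addr >= ARENA_SIZE || !addressable[addr]) {
        error_count++;            /* overflow or write through a dangling pointer */
        return;
    }
    arena[addr] = value;
    defined_bits[addr] = 0xFF;    /* all eight bits now hold real data */
}

/* Checked read: flags unaddressable bytes and bytes never written. */
unsigned char toy_load(size_t addr)
{
    if (addr >= ARENA_SIZE || !addressable[addr]) {
        error_count++;            /* overflow or read through a dangling pointer */
        return 0;
    }
    if (defined_bits[addr] != 0xFF)
        error_count++;            /* read of uninitialised data */
    return arena[addr];
}

unsigned long toy_errors(void)
{
    return error_count;
}
```

Even in this toy, the bookkeeping doubles the memory used per byte and adds a check to every access, which gives a feel for where the extra memory load and the slowdown come from.<br>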
|
|
|
|
<h4>Performance</h4>
|
|
The tool was the slowest of the three detailed here, for obvious
reasons. However, in return for the reduction in speed, this tool provides a wealth
of information, probably the most detailed of the three. In addition
to the usual suspects, this tool can identify a variety of other memory
and even some POSIX pthread issues. Cache information is probably overkill
for most applications, but it is an interesting way to look at the performance
of an application. The biggest plus for Valgrind is that it is under rapid
development with a pro-active developer and an active community. In fact
the web page of Valgrind proclaims the following from the author: <i>"If
you have problems with Valgrind, don't suffer in silence. Mail me."</i><br>
|
|
The tool, however, is very x86 specific: portability
is limited to x86 Linux. The interface is purely command-line
driven and, while usable, sometimes the tool gives you too much information
for it to be useful. This tool also works directly with binaries, so while
recompiles are not required, it takes diligence to go through the
output of this tool to find what you are looking for. You can suppress memory
profiling for various system libraries by creating suppression files, but
writing these files is not easy. In addition, threading support is not complete,
although this tool has been used on Mozilla, OpenOffice and other such
large threaded programs. If this tool had a GUI front end, it would
win hands down.<br>
|
|
|
|
<h4>Verdict</h4>
|
|
If you are on x86, know your code well and do not mind a command-line interface,
this program will take you to another level.<br>
|
|
|
|
<h3>Other Open Source tools</h3>
|
|
Before I get sent to the stake for not having mentioned your favorite
|
|
memory tool, I must confess that few compare in completeness to these three
|
|
in terms of the data that they provide. A more comprehensive list
|
|
of leak detection tools is available <a
|
|
href="http://www.sslug.dk/emailarkiv/bog/2001_08/msg00030.html">here</a>.
|
|
<br>
|
|
|
|
<h2>Commercial</h2>
|
|
These tools are mentioned here only for completeness.
|
|
<h3>Purify</h3>
|
|
The <a href="http://www.rational.com/products/pqc/pplus_ux.jsp">big
daddy</a> of memory tools does <i>not work</i> on Linux, so you can stop
asking that question.
|
|
<h3>Geodesic</h3>
|
|
A latecomer to this arena, <a
href="http://www.geodesic.com/solutions/solutions_linux.html">Geodesic</a>
is best known in the Linux community for its <a
href="http://www.geodesic.com/solutions/products_gc_demo.html">Mozilla</a>
demo, in which the company uses its tools to help find memory problems in the
Mozilla codebase. How much use this has been to the Mozilla team is yet
to be quantified, but the open-source friendliness can't hurt. It works
on Solaris and Linux with a fully functional trial, and on Windows as well.<br>
|
|
|
|
<h3>Insure++</h3>
|
|
A C++ specific tool, but still fairly well known, Parasoft's <a
href="http://www.parasoft.com/jsp/products/home.jsp?product=Insure">Insure++</a>
is a fairly complete memory profiling / leak detection tool. In addition,
it can find some C++ specific errors as well, so that can't hurt. This tool
works with a variety of compilers and operating systems; a free trial version
is available too.
|
|
<h2>Miscellaneous Notes:</h2>
|
|
|
|
<h3><a name="Secure_Programming_"></a>Secure Programming </h3>
|
|
Secure programming involves many components, but probably the most significant
|
|
is the careful use of memory. More details are available <a
|
|
href="http://www.theorygroup.com/Theory/FAQ/Secure-Programs-HOWTO-1.html">here</a>.<br>
|
|
|
|
<h3><a name="OOM_killer"></a>OOM killer</h3>
|
|
Some of the newer Linux kernels employ an algorithm known as the
|
|
Out Of Memory (OOM) killer. This code is invoked when the kernel completely
|
|
runs out of memory, at which point active programs / processes are chosen
|
|
to be executed (as in killed, end_of_the_road, happy hunting grounds, etc).
|
|
More details are available <a
|
|
href="http://linux-mm.org/docs/oom-killer.shtml">here</a>.<br>
|
|
|
|
<h3><a name="Garbage_Collectors"></a>Garbage Collectors </h3>
|
|
One of the other reasons why garbage collection is not always a preferred
solution is that it is really tough to implement. Simple schemes such as
reference counting have severe problems
with self-referential structures (i.e. structures that link to themselves)
as aptly described <a
href="http://www.tuxedo.org/%7Eesr/jargon/html/Some-AI-Koans.html">here</a>.<br>
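One scheme in which self-referential structures cause trouble is reference counting, and the difficulty can be sketched in a few lines of C. The node type and the node_new, node_link and node_release helpers are hypothetical names invented for this illustration. Two nodes that point at each other keep each other's count above zero, so neither is ever freed unless the cycle is broken by hand.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int refs;                     /* number of pointers to this node */
    struct node *next;            /* counted reference to another node */
} node;

static int freed_nodes = 0;       /* how many nodes have really been freed */

/* Create a node owned by the caller (count starts at 1). */
node *node_new(void)
{
    node *n = calloc(1, sizeof *n);

    if (n != NULL)
        n->refs = 1;
    return n;
}

/* Drop one reference; free the node when nobody points at it any more. */
void node_release(node *n)
{
    if (n == NULL)
        return;
    assert(n->refs > 0);
    if (--n->refs == 0) {
        node_release(n->next);    /* the dying node gives up its own reference */
        free(n);
        freed_nodes++;
    }
}

/* Make from->next point at to (which may be NULL), adjusting both counts. */
void node_link(node *from, node *to)
{
    node *old = from->next;

    if (to != NULL)
        to->refs++;
    from->next = to;
    node_release(old);
}
```

After node_link(c, d) and node_link(d, c) both counts are 2, so releasing the two original handles leaves each at 1 and the pair is never reclaimed. Calling node_link(d, NULL) first breaks the cycle and lets the usual releases free both nodes.<br>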
|
|
|
|
|
|
|
|
|
|
<!-- *** BEGIN bio *** -->
|
|
<SPACER TYPE="vertical" SIZE="30">
|
|
<P>
|
|
<H4><IMG ALIGN=BOTTOM ALT="" SRC="../gx/note.gif">Madhu M Kurup</H4>
|
|
<EM>I'm a CS engineer from Bangalore, India and formerly of the
|
|
<a href="http://www.linux-bangalore.org">ILUG Bangalore</a>. I've
|
|
been working and playing with Linux for a while and while programming is
|
|
my first love, Linux comes a close second. I work at the Data Mining group
|
|
at <a href="http://www.yahoo.com">Yahoo!</a> Inc and work on algorithms,
|
|
scalability and APIs there. I moonlight on the Linux messenger client and
|
|
dabble in various software projects when (if ever) I can find any free time.
|
|
|
|
<P> And yes, if you want to know, I use C++, vi, mutt, Windowmaker
|
|
and Mandrake; let the flame wars begin :) </EM>
|
|
|
|
|
|
<!-- *** END bio *** -->
|
|
|
|
<!-- *** BEGIN copyright *** -->
|
|
<P> <hr> <!-- P -->
|
|
<H5 ALIGN=center>
|
|
|
|
Copyright © 2002, Madhu M Kurup.<BR>
|
|
Copying license <A HREF="../copying.html">http://www.linuxgazette.com/copying.html</A><BR>
|
|
Published in Issue 81 of <i>Linux Gazette</i>, August 2002</H5>
|
|
<!-- *** END copyright *** -->
|
|
|
|
<!--startcut ==========================================================-->
|
|
<HR><P>
|
|
|
|
</BODY></HTML>
|
|
<!--endcut ============================================================-->
|