367 lines
17 KiB
HTML
367 lines
17 KiB
HTML
<!--startcut ==============================================-->
|
|
<!-- *** BEGIN HTML header *** -->
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
|
<HTML><HEAD>
|
|
<title>Process Tracing Using Ptrace LG #81</title>
|
|
</HEAD>
|
|
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
|
|
ALINK="#FF0000">
|
|
<!-- *** END HTML header *** -->
|
|
|
|
<CENTER>
|
|
<A HREF="http://www.linuxgazette.com/">
|
|
<IMG ALT="LINUX GAZETTE" SRC="../gx/lglogo.png"
|
|
WIDTH="600" HEIGHT="124" border="0"></A>
|
|
<BR>
|
|
|
|
<!-- *** BEGIN navbar *** -->
|
|
<IMG ALT="" SRC="../gx/navbar/left.jpg" WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="bottom"><A HREF="ramankutty.html"><IMG ALT="[ Prev ]" SRC="../gx/navbar/prev.jpg" WIDTH="16" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="index.html"><IMG ALT="[ Table of Contents ]" SRC="../gx/navbar/toc.jpg" WIDTH="220" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../index.html"><IMG ALT="[ Front Page ]" SRC="../gx/navbar/frontpage.jpg" WIDTH="137" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue81/sandeep.html"><IMG ALT="[ Talkback ]" SRC="../gx/navbar/talkback.jpg" WIDTH="121" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../lg_faq.html"><IMG ALT="[ FAQ ]" SRC="./../gx/navbar/faq.jpg"WIDTH="62" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="sevenich.html"><IMG ALT="[ Next ]" SRC="../gx/navbar/next.jpg" WIDTH="15" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><IMG ALT="" SRC="../gx/navbar/right.jpg" WIDTH="15" HEIGHT="45" ALIGN="bottom">
|
|
<!-- *** END navbar *** -->
|
|
<P>
|
|
</CENTER>
|
|
|
|
<!--endcut ============================================================-->
|
|
|
|
<H4 ALIGN="center">
|
|
"Linux Gazette...<I>making Linux just a little more fun!</I>"
|
|
</H4>
|
|
|
|
<P> <HR> <P>
|
|
<!--===================================================================-->
|
|
|
|
<center>
|
|
<H1><font color="maroon">Process Tracing Using Ptrace</font></H1>
|
|
<H4>By <a href="mailto:sk_nellayi@rediffmail.com">Sandeep S</a></H4>
|
|
</center>
|
|
<P> <HR> <P>
|
|
|
|
<!-- END header -->
|
|
|
|
|
|
|
|
|
|
<EM><B>The ptrace system call is crucial to the working of debugger programs like gdb - yet its behaviour is not very well documented - unless you believe that the best documentation is kernel source itself! I shall attempt to demonstrate how ptrace can be used to implement some of the functionality available in tools like gdb.</B></EM>
|
|
|
|
<H2><A NAME="s1">1. Introduction</A></H2>
|
|
|
|
<P><B>ptrace()</B> is a system call that enables one process to control the execution
|
|
of another. It also enables a process to change the core image of
|
|
another process. The traced process behaves normally until a signal is caught.
|
|
When that occurs the process enters stopped state and informs the
|
|
tracing process by a <B>wait()</B> call. Then tracing process decides how the
|
|
traced process should respond. The only exception is SIGKILL which surely kills
|
|
the process.
|
|
<P>The traced process may also enter the stopped state in response to some
|
|
specific events during its course of execution. This happens only if the
|
|
tracing process has set any event flags in the context of the traced process.
|
|
The tracing process can even <B>kill</B> the traced one by setting the exit code
|
|
of the traced process. After tracing, the tracer process may kill the traced
|
|
one or leave to continue with its execution.
|
|
<P>
|
|
<P><B>Note:</B> Ptrace() is highly dependent on the architecture of the
|
|
underlying hardware. Applications using ptrace are not easily portable across
|
|
different architectures and implementations.
|
|
<P>
|
|
<H2><A NAME="s2">2. More Details </A></H2>
|
|
|
|
<P>The prototype of ptrace() is as follows.
|
|
<BLOCKQUOTE><CODE>
|
|
<HR>
|
|
<PRE>
|
|
#include <sys/ptrace.h>
|
|
long int ptrace(enum __ptrace_request request, pid_t pid,
|
|
void * addr, void * data)
|
|
</PRE>
|
|
<HR>
|
|
</CODE></BLOCKQUOTE>
|
|
<P>
|
|
<P>Of the four arguments, the value of <B>request</B> decides what to be
|
|
done. <B>Pid</B> is the ID of the process to be traced. <B>Addr</B> is the
|
|
offset in the user space of the traced process to where the <B>data</B> is
|
|
written when instructed to do so. It is the offset in user space of the traced
|
|
process from where a word is read and returned as the result of the call.
|
|
<P>
|
|
<P>The parent can fork a child process and trace it by calling ptrace with
|
|
<B>request</B> as PTRACE_TRACEME. Parent can also trace an existing process using
|
|
PTRACE_ATTACH. The different values of <B>request</B> are discussed below.
|
|
<P>
|
|
<H2>2.1 How does ptrace() work.</H2>
|
|
|
|
<P>Whenever ptrace is called, what it first does is to lock the kernel. Just
|
|
before returning it unlocks the kernel. Let's see its working in between
|
|
this for different values of <B>request</B>.
|
|
<P>
|
|
<H3><B>PTRACE_TRACEME</B>:</H3>
|
|
|
|
<P>This is called when the child is to be traced by the
|
|
parent. As said above, any signals (except SIGKILL), either delivered from
|
|
outside or from the <B>exec</B> calls made by the process, causes it to stop and
|
|
lets the parent decide how to proceed. Inside ptrace(), the only thing that is checked is
|
|
whether the ptrace flag of the current process is set. If not, permission
|
|
is granted and the flag is set. All the parameters other than <B>request</B> are
|
|
ignored.
|
|
<P>
|
|
<H3><B>PTRACE_ATTACH</B>:</H3>
|
|
|
|
<P>Here a process wants to control another. One thing to
|
|
remember is that nobody is allowed to trace/control the <B>init</B> process.
|
|
A process is not allowed to control itself. The current process (caller) becomes
|
|
the parent of the process with process ID <B>pid</B>. But a <B>getpid()</B> by the
|
|
child (the one being traced) returns the process ID of the real parent.
|
|
<P>
|
|
<P>What goes behind the scenes is that when a call is made, the usual permission
|
|
checks are made along with whether the process is init or current or it is
|
|
already traced. If there is no problem, permission is given and the flag
|
|
is set. Now the links of the child process are rearranged; e.g., the child is
|
|
removed from the task queue and its parent process field is changed (the
|
|
original parent remains the same). It is put to the queue again in such a
|
|
position that <B>init</B> comes next to it. Finally a SIGSTOP signal is delivered
|
|
to it. Here <B>addr</B> and <B>data</B> are ignored.
|
|
<P>
|
|
<H3><B>PTRACE_DETACH</B>:</H3>
|
|
|
|
<P>Stop tracing a process. The tracer may decide whether the child should continue
|
|
to live. This undoes all the effects made by PTRACE_ATTACH/PTRACE_TRACEME. The
|
|
parent sends the exit code for the child in <B>data</B>. Ptrace flag of the child
|
|
is reset. Then the child is moved to its original position in the task queue.
|
|
The pid of real parent is written to the parent field. The single-step bit
|
|
which might have been set is reset. Finally the child is woken up as nothing
|
|
had happened to it; <B>addr</B> is ignored.
|
|
<P>
|
|
<P>
|
|
<H3><B>PTRACE_PEEKTEXT, PTRACE_PEEKDATA, PTRACE_PEEKUSER</B>:</H3>
|
|
|
|
<P>These options read data from child's memory and user space. PTRACE_PEEKTEXT
|
|
and PTRACE_PEEKDATA read data from memory and both these options have the same
|
|
effect. PTRACE_PEEKUSER reads from the user space of child. A word is read and
|
|
placed into a temporary data structure, and with the help of put_user()
|
|
(which copies a string from the kernel's memory segment to the
|
|
process' memory segment) the required data is written to <B>data</B> and returns
|
|
0 on success.
|
|
<P>
|
|
<P>In the case of PTRACE_PEEKTEXT/PTRACE_PEEKDATA, <B>addr</B> is the address of
|
|
the location to be read from child's memory. In PTRACE_PEEKUSER <B>addr</B> is
|
|
the offset of the word in child's user space; <B>data</B> is ignored.
|
|
<P>
|
|
<H3><B>PTRACE_POKETEXT, PTRACE_POKEDATA, PTRACE_POKEUSER</B>:</H3>
|
|
|
|
<P>These options are analogous to the three explained above. The difference is
|
|
that these are used to write the <B>data</B> to the memory/user space of the
|
|
process being traced. In PTRACE_POKETEXT and PTRACE_POKEDATA a word from
|
|
location <B>data</B> is copied to the child's memory location <B>addr</B>.
|
|
<P>
|
|
<P>In PTRACE_POKEUSER we are trying to modify some locations in the
|
|
<CODE>task_struct</CODE> of the process. As the integrity of the kernel has to be
|
|
maintained, we need to be very careful. After a lot of security checks made
|
|
by ptrace, only certain portions of the task_struct is allowed to change. Here
|
|
<B>addr</B> is the offset in child's user area.
|
|
<P>
|
|
<H3><B>PTRACE_SYSCALL, PTRACE_CONT</B>:</H3>
|
|
|
|
<P>Both these wakes up the stopped process. PTRACE_SYSCALL makes the child to
|
|
stop after the next system call. PTRACE_CONT just allows the child to continue.
|
|
In both, the exit code of the child process is set by the ptrace() where the
|
|
exit code is contained in <B>data</B>. All this happens only if the
|
|
signal/exit code is a valid one. Ptrace() resets the single step bit of the
|
|
child, sets/resets the syscall trace bit, and wakes up the process; <B>addr</B> is ignored.
|
|
<P>
|
|
<H3><B>PTRACE_SINGLESTEP</B>;</H3>
|
|
|
|
<P>Does the same as PTRACE_SYSCALL except that the child is stopped after
|
|
every instruction. The single step bit of the child is set. As above <B>data</B>
|
|
contains the exit code for the child; <B>addr</B> is ignored.
|
|
<P>
|
|
<H3><B>PTRACE_KILL</B>:</H3>
|
|
|
|
<P>When the child is to be terminated, PTRACE_KILL may be used. How the murder
|
|
occurs is as follows. Ptrace() checks whether the child is already dead or not.
|
|
If alive, the exit code of the child is set to <B>sigkill</B>. The single step bit
|
|
of the child is reset. Now the child is woken up and when it starts to work it
|
|
gets killed as per the exit code.
|
|
<P>
|
|
<H2>2.2 More machine-dependent calls</H2>
|
|
|
|
<P>The values of <B>request</B> discussed above were independent on the
|
|
architecture and implementation of the system. The values discussed below are
|
|
those that allow the tracing process to get/set (i.e., to read/write) the
|
|
registers of child process. These register fetching/setting options are more
|
|
directly dependent on the architecture of the system. The set of registers
|
|
include general purpose registers, floating point registers and extended
|
|
floating point registers. These more machine-dependent options are discussed
|
|
below. When these options are given, a direct interaction between the
|
|
registers/segments of the system is required.
|
|
|
|
<P> <H3><B>PTRACE_GETREGS,
|
|
PTRACE_GETFPREGS, PTRACE_GETFPXREGS</B>:</H3>
|
|
|
|
<P>These values give the value of general purpose, floating point, extended
|
|
floating point registers of the child process. The registers are
|
|
read to the location <B>data</B> in the parent. The usual checks for access on the
|
|
registers are made. Then the register values are copied to the location
|
|
specified by <B>data</B> with the help of getreg() and __put_user() functions;
|
|
<B>addr</B> is ignored.
|
|
<P>
|
|
<H3><B>PTRACE_SETREGS, PTRACE_SETFPREGS, PTRACE_SETFPXREGS</B>:</H3>
|
|
|
|
<P>These are values of <B>request</B> that allow the tracing process to set the
|
|
general purpose, floating point, extended floating point registers of the child
|
|
respectively. There are some restrictions in the case of setting the registers.
|
|
Some are not allowed to be changed. The data to be copied to the registers will
|
|
be taken from the location <B>data</B> of the parent. Here also <B>addr</B> is
|
|
ignored.
|
|
<P>
|
|
<H2>2.3 Return values of ptrace()</H2>
|
|
|
|
<P>A successful ptrace() returns zero. Errors make it return -1 and set
|
|
<B>errno</B>. Since the return value of a successful PEEKDATA/PEEKTEXT may
|
|
be -1, it is better to check the <B>errno</B>. The errors are
|
|
<P>
|
|
<P>EPERM : The requested process couldn't be traced. Permission denied.
|
|
<P>
|
|
<P>ESRCH : The requested process doesn't exist or is being traced.
|
|
<P>
|
|
<P>EIO : The request was invalid or read/write was made from/to invalid area
|
|
of memory.
|
|
<P>
|
|
<P>EFAULT: Read/write was made from/to memory which was not really mapped.
|
|
<P>
|
|
<P>It is really hard to distinguish between the reasons of EIO and EFAULT.
|
|
These are returned for almost identical errors.
|
|
<P>
|
|
<H2><A NAME="s3">3. A small example.</A></H2>
|
|
|
|
<P>
|
|
If you found the parameter description to be a bit dry, don't
|
|
despair. I shall not attempt anything of that sort again. I will
|
|
try to write simple programs which illustrate many of the
|
|
points discussed above.
|
|
<P>
|
|
<P>Here is the first one. The parent process counts
|
|
the number of instructions executed by the test program run by the child.
|
|
<P>Here the test program is listing the entries of the current directory.
|
|
<P>
|
|
<BLOCKQUOTE><CODE>
|
|
<HR>
|
|
<PRE>
|
|
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
#include <signal.h>
|
|
#include <syscall.h>
|
|
#include <sys/ptrace.h>
|
|
#include <sys/types.h>
|
|
#include <sys/wait.h>
|
|
#include <unistd.h>
|
|
#include <errno.h>
|
|
|
|
|
|
int main(void)
|
|
{
|
|
long long counter = 0; /* machine instruction counter */
|
|
int wait_val; /* child's return value */
|
|
int pid; /* child's process id */
|
|
|
|
puts("Please wait");
|
|
|
|
switch (pid = fork()) {
|
|
case -1:
|
|
perror("fork");
|
|
break;
|
|
case 0: /* child process starts */
|
|
ptrace(PTRACE_TRACEME, 0, 0, 0);
|
|
/*
|
|
* must be called in order to allow the
|
|
* control over the child process
|
|
*/
|
|
execl("/bin/ls", "ls", NULL);
|
|
/*
|
|
* executes the program and causes
|
|
* the child to stop and send a signal
|
|
* to the parent, the parent can now
|
|
* switch to PTRACE_SINGLESTEP
|
|
*/
|
|
break;
|
|
/* child process ends */
|
|
default:/* parent process starts */
|
|
wait(&wait_val);
|
|
/*
|
|
* parent waits for child to stop at next
|
|
* instruction (execl())
|
|
*/
|
|
while (wait_val == 1407 ) {
|
|
counter++;
|
|
if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0) != 0)
|
|
perror("ptrace");
|
|
/*
|
|
* switch to singlestep tracing and
|
|
* release child
|
|
* if unable call error.
|
|
*/
|
|
wait(&wait_val);
|
|
/* wait for next instruction to complete */
|
|
}
|
|
/*
|
|
* continue to stop, wait and release until
|
|
* the child is finished; wait_val != 1407
|
|
* Low=0177L and High=05 (SIGTRAP)
|
|
*/
|
|
}
|
|
printf("Number of machine instructions : %lld\n", counter);
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
<HR>
|
|
</CODE></BLOCKQUOTE>
|
|
<P>
|
|
<P>open your favourite editor and write the program. Then run it by typing
|
|
<P><CODE>cc file.c</CODE>
|
|
<P><CODE>a.out</CODE>
|
|
<P>You can see the number of instructions needed for listing of your current
|
|
directory. <CODE>cd</CODE> to some other directory and run the program from there
|
|
and see whether there is any difference. (note that it may take some time
|
|
for the output to appear, if you are using a slow machine).
|
|
<P>
|
|
<H2><A NAME="s4">4. Conclusion</A></H2>
|
|
|
|
<P>
|
|
<P>Ptrace() is heavily used for debugging. It is also used for system call
|
|
tracing. The debugger forks and the
|
|
child process created is traced by the parent. The program which is to
|
|
be debugged is exec'd by the child (in the above program it was
|
|
"ls") and after each instruction the parent can examine the register
|
|
values of the program being run. I shall demonstrate programs which
|
|
exploit ptrace's versatility in the next part of this series. Good
|
|
bye till then.
|
|
|
|
|
|
|
|
|
|
<!-- *** BEGIN bio *** -->
|
|
<SPACER TYPE="vertical" SIZE="30">
|
|
<P>
|
|
<H4><IMG ALIGN=BOTTOM ALT="" SRC="../gx/note.gif">Sandeep S</H4>
|
|
<EM>I am a final year student of Government Engineering College in Thrissur,
|
|
Kerala, India. My areas of interests include FreeBSD, Networking and also
|
|
Theoretical Computer Science.</EM>
|
|
|
|
<!-- *** END bio *** -->
|
|
|
|
<!-- *** BEGIN copyright *** -->
|
|
<P> <hr> <!-- P -->
|
|
<H5 ALIGN=center>
|
|
|
|
Copyright © 2002, Sandeep S.<BR>
|
|
Copying license <A HREF="../copying.html">http://www.linuxgazette.com/copying.html</A><BR>
|
|
Published in Issue 81 of <i>Linux Gazette</i>, August 2002</H5>
|
|
<!-- *** END copyright *** -->
|
|
|
|
<!--startcut ==========================================================-->
|
|
<HR><P>
|
|
<CENTER>
|
|
<!-- *** BEGIN navbar *** -->
|
|
<IMG ALT="" SRC="../gx/navbar/left.jpg" WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="bottom"><A HREF="ramankutty.html"><IMG ALT="[ Prev ]" SRC="../gx/navbar/prev.jpg" WIDTH="16" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="index.html"><IMG ALT="[ Table of Contents ]" SRC="../gx/navbar/toc.jpg" WIDTH="220" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../index.html"><IMG ALT="[ Front Page ]" SRC="../gx/navbar/frontpage.jpg" WIDTH="137" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue81/sandeep.html"><IMG ALT="[ Talkback ]" SRC="../gx/navbar/talkback.jpg" WIDTH="121" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../lg_faq.html"><IMG ALT="[ FAQ ]" SRC="./../gx/navbar/faq.jpg"WIDTH="62" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="sevenich.html"><IMG ALT="[ Next ]" SRC="../gx/navbar/next.jpg" WIDTH="15" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><IMG ALT="" SRC="../gx/navbar/right.jpg" WIDTH="15" HEIGHT="45" ALIGN="bottom">
|
|
<!-- *** END navbar *** -->
|
|
</CENTER>
|
|
</BODY></HTML>
|
|
<!--endcut ============================================================-->
|