<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
||
|
<HTML
|
||
|
><HEAD
|
||
|
><TITLE
|
||
|
>Technical Details</TITLE
|
||
|
><META
|
||
|
NAME="GENERATOR"
|
||
|
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
|
||
|
REL="HOME"
|
||
|
TITLE="Linux Loadable Kernel Module HOWTO"
|
||
|
HREF="index.html"><LINK
|
||
|
REL="PREVIOUS"
|
||
|
TITLE="Persistent Data"
|
||
|
HREF="x615.html"><LINK
|
||
|
REL="NEXT"
|
||
|
TITLE="Writing Your Own Loadable Kernel Module"
|
||
|
HREF="x839.html"></HEAD
|
||
|
><BODY
|
||
|
CLASS="SECT1"
|
||
|
BGCOLOR="#FFFFFF"
|
||
|
TEXT="#000000"
|
||
|
LINK="#0000FF"
|
||
|
VLINK="#840084"
|
||
|
ALINK="#0000FF"
|
||
|
><DIV
|
||
|
CLASS="NAVHEADER"
|
||
|
><TABLE
|
||
|
SUMMARY="Header navigation table"
|
||
|
WIDTH="100%"
|
||
|
BORDER="0"
|
||
|
CELLPADDING="0"
|
||
|
CELLSPACING="0"
|
||
|
><TR
|
||
|
><TH
|
||
|
COLSPAN="3"
|
||
|
ALIGN="center"
|
||
|
>Linux Loadable Kernel Module HOWTO</TH
|
||
|
></TR
|
||
|
><TR
|
||
|
><TD
|
||
|
WIDTH="10%"
|
||
|
ALIGN="left"
|
||
|
VALIGN="bottom"
|
||
|
><A
|
||
|
HREF="x615.html"
|
||
|
ACCESSKEY="P"
|
||
|
>Prev</A
|
||
|
></TD
|
||
|
><TD
|
||
|
WIDTH="80%"
|
||
|
ALIGN="center"
|
||
|
VALIGN="bottom"
|
||
|
></TD
|
||
|
><TD
|
||
|
WIDTH="10%"
|
||
|
ALIGN="right"
|
||
|
VALIGN="bottom"
|
||
|
><A
|
||
|
HREF="x839.html"
|
||
|
ACCESSKEY="N"
|
||
|
>Next</A
|
||
|
></TD
|
||
|
></TR
|
||
|
></TABLE
|
||
|
><HR
|
||
|
ALIGN="LEFT"
|
||
|
WIDTH="100%"></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT1"
|
||
|
><H1
|
||
|
CLASS="SECT1"
|
||
|
><A
|
||
|
NAME="AEN627"
|
||
|
></A
|
||
|
>10. Technical Details</H1
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="HOWTHEYWORK"
|
||
|
></A
|
||
|
>10.1. How They Work</H2
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> makes an <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
>
|
||
|
system call to load the LKM into kernel memory. Loading it is the
|
||
|
easy part, though. How does the kernel know to use it? The answer is
|
||
|
that the <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
> system call invokes the
|
||
|
LKM's initialization routine right after it loads the LKM.
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> passes to <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
>
|
||
|
the address of the subroutine in the LKM named
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
> as its initialization routine.</P
|
||
|
><P
|
||
|
>(This is confusing -- every LKM has a subroutine named
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
>, and the base kernel has a system
|
||
|
call by that same name, which is accessible via a subroutine in the
|
||
|
standard C library also named <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
>).</P
|
||
|
><P
|
||
|
>The LKM author set up <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
> to call a
|
||
|
kernel function that registers the subroutines that the LKM contains.
|
||
|
For example, a character device driver's
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
> subroutine might call the kernel's
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>register_chrdev</TT
|
||
|
> subroutine, passing the major
number of the device it intends to drive and a table of entry points
that includes the address of its own "open" routine.
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>register_chrdev</TT
|
||
|
> records in base kernel tables
|
||
|
that when the kernel wants to open that particular device, it should
|
||
|
call the open routine in our LKM.</P
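><P
>The bookkeeping described above can be modelled in user space. The following is a minimal sketch, not kernel code: the names <TT CLASS="FUNCTION">toy_register_chrdev</TT>, <TT CLASS="FUNCTION">toy_open_device</TT> and <TT CLASS="FUNCTION">mylkm_open</TT> and the table size are all hypothetical, and the real kernel keeps a whole set of entry points per device rather than a single open routine.</P
>

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MAJOR 256            /* hypothetical table size */

typedef int (*toy_open_fn)(int minor);

/* Model of the "base kernel table": one open routine per major number. */
static toy_open_fn toy_chrdevs[TOY_MAX_MAJOR];

/* What a driver's init_module would call: record the open routine. */
int toy_register_chrdev(unsigned int major, toy_open_fn open_routine)
{
    assert(open_routine != NULL);
    if (major >= TOY_MAX_MAJOR || toy_chrdevs[major] != NULL)
        return -1;                   /* out of range or already taken */
    toy_chrdevs[major] = open_routine;
    return 0;
}

/* What the "kernel" does when someone opens a device with that major. */
int toy_open_device(unsigned int major, int minor)
{
    if (major >= TOY_MAX_MAJOR || toy_chrdevs[major] == NULL)
        return -1;                   /* no driver registered */
    return toy_chrdevs[major](minor);
}

/* The driver's own open routine, living in the (pretend) LKM. */
int mylkm_open(int minor)
{
    return 100 + minor;              /* arbitrary value so the call is visible */
}
```

<P
>In this model, <TT CLASS="LITERAL">toy_open_device(42, 0)</TT> fails with -1 until <TT CLASS="LITERAL">toy_register_chrdev(42, mylkm_open)</TT> has been called; after that, <TT CLASS="LITERAL">toy_open_device(42, 3)</TT> reaches <TT CLASS="FUNCTION">mylkm_open</TT> and returns 103.</P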
|
||
|
><P
|
||
|
>But the astute reader will now ask how the LKM's
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>init_module</TT
|
||
|
> subroutine knew the address of the
|
||
|
base kernel's <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>register_chrdev</TT
|
||
|
> subroutine. This
|
||
|
is not a system call, but an ordinary subroutine bound into the base
|
||
|
kernel. Calling it means branching to its address. So how does our
|
||
|
LKM, which was not compiled anywhere near the base kernel, know that
|
||
|
address? The answer to this is the relocation that happens at
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> time.</P
|
||
|
><P
|
||
|
>How that relocation happens is fundamentally different between Linux
|
||
|
2.4 and Linux 2.6. In 2.6, <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> pretty much just
|
||
|
passes the verbatim contents of the LKM file (.ko file) to the kernel
|
||
|
and the kernel does the relocation. In 2.4, <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> does
|
||
|
the relocation and passes a fully linked module image, ready to stuff
|
||
|
into kernel memory, to the kernel. <EM
|
||
|
>The description below
|
||
|
covers the 2.4 case.</EM
|
||
|
></P
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> functions as a relocating linker/loader.
|
||
|
The LKM object file contains an external reference to the symbol
|
||
|
<TT
|
||
|
CLASS="VARNAME"
|
||
|
>register_chrdev</TT
|
||
|
>. <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> does a
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>query_module</TT
|
||
|
> system call to find out the
|
||
|
addresses of various symbols that the existing kernel exports.
|
||
|
<TT
|
||
|
CLASS="VARNAME"
|
||
|
>register_chrdev</TT
|
||
|
> is among these.
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>query_module</TT
|
||
|
> returns the address for which
|
||
|
<TT
|
||
|
CLASS="VARNAME"
|
||
|
>register_chrdev</TT
|
||
|
> stands and
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> patches that into the LKM where the LKM
|
||
|
refers to <TT
|
||
|
CLASS="VARNAME"
|
||
|
>register_chrdev</TT
|
||
|
>.</P
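><P
>The patching step can be sketched in user space. This is an illustrative model only, assuming a module image made of word-sized slots; the structure and function names are invented for the example, and in a real 2.4 system the name/address pairs would come from <TT CLASS="FUNCTION">query_module</TT> rather than from a hand-written table.</P
>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One exported symbol, as a name/address pair. */
struct toy_export {
    const char *name;
    unsigned long address;
};

/* One unresolved reference inside the module image. */
struct toy_reloc {
    const char *name;      /* symbol the module refers to */
    size_t slot;           /* index of the word in the image to patch */
};

/* Look a name up in the exported-symbol table; 0 means "not found". */
unsigned long toy_lookup(const struct toy_export *tab, size_t n,
                         const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].address;
    return 0;
}

/* Patch every reference; return the number of symbols left unresolved. */
size_t toy_relocate(unsigned long *image,
                    const struct toy_reloc *rel, size_t nrel,
                    const struct toy_export *tab, size_t ntab)
{
    size_t unresolved = 0;

    for (size_t i = 0; i < nrel; i++) {
        unsigned long addr = toy_lookup(tab, ntab, rel[i].name);
        if (addr == 0)
            unresolved++;              /* a real loader would refuse to load */
        else
            image[rel[i].slot] = addr; /* write the address into the image */
    }
    return unresolved;
}
```

<P
>Given a table that lists <TT CLASS="VARNAME">register_chrdev</TT> at some made-up address, a call to <TT CLASS="FUNCTION">toy_relocate</TT> writes that address into the slot that refers to it and reports any names it could not find.</P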
|
||
|
><P
|
||
|
>If you want to see the kind of information <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
>
|
||
|
can get from a <TT
|
||
|
CLASS="FUNCTION"
|
||
|
>query_module</TT
|
||
|
> system call, look at
|
||
|
the contents of <TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
>.</P
|
||
|
><P
|
||
|
>Note that some LKMs call subroutines in other LKMs. They can do this
|
||
|
because of the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>__ksymtab</TT
|
||
|
> and
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>.kstrtab</TT
|
||
|
> sections in the LKM object files. These
|
||
|
sections together list the external symbols within the LKM object file
|
||
|
that are supposed to be accessible by other LKMs inserted in the
|
||
|
future. <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> looks at
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>__ksymtab</TT
|
||
|
> and <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.kstrtab</TT
|
||
|
> and tells
|
||
|
the kernel to add those symbols to its exported kernel symbols table.</P
|
||
|
><P
|
||
|
>To see this for yourself, insert the LKM <TT
|
||
|
CLASS="FILENAME"
|
||
|
>msdos.o</TT
|
||
|
>
|
||
|
and then notice in <TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
> the symbol
|
||
|
<TT
|
||
|
CLASS="VARNAME"
|
||
|
>fat_add_cluster</TT
|
||
|
> (which is the name of a subroutine
|
||
|
in the <TT
|
||
|
CLASS="FILENAME"
|
||
|
>fat.o</TT
|
||
|
> LKM). Any subsequently inserted LKM
|
||
|
can branch to <TT
|
||
|
CLASS="VARNAME"
|
||
|
>fat_add_cluster</TT
|
||
|
>, and in fact
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>msdos.o</TT
|
||
|
> does just that.</P
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="MODINFO"
|
||
|
></A
|
||
|
>10.2. The .modinfo Section</H2
|
||
|
><P
|
||
|
>An ELF object file consists of various named sections. Some of them
|
||
|
are basic parts of an object file; for example, the
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>.text</TT
|
||
|
> section contains executable code that a
|
||
|
loader loads. But you can make up any section you want and have it
|
||
|
used by special programs. For the purposes of Linux LKMs, there is
|
||
|
the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section. An LKM doesn't have to have
|
||
|
a section named <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> to work, but the macros
|
||
|
you're supposed to use to code an LKM cause one to be generated, so
|
||
|
LKMs generally have one.</P
|
||
|
><P
|
||
|
>To see the sections of an object file, including the
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section if it exists, use the
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>objdump</B
|
||
|
> program. For example:</P
|
||
|
><P
|
||
|
>To see all the sections in the object file for the msdos LKM:
|
||
|
<TABLE
|
||
|
BORDER="1"
|
||
|
BGCOLOR="#E0E0E0"
|
||
|
WIDTH="100%"
|
||
|
><TR
|
||
|
><TD
|
||
|
><FONT
|
||
|
COLOR="#000000"
|
||
|
><PRE
|
||
|
CLASS="SCREEN"
|
||
|
>objdump msdos.o --section-headers</PRE
|
||
|
></FONT
|
||
|
></TD
|
||
|
></TR
|
||
|
></TABLE
|
||
|
>
|
||
|
To see the contents of the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section:
|
||
|
<TABLE
|
||
|
BORDER="1"
|
||
|
BGCOLOR="#E0E0E0"
|
||
|
WIDTH="100%"
|
||
|
><TR
|
||
|
><TD
|
||
|
><FONT
|
||
|
COLOR="#000000"
|
||
|
><PRE
|
||
|
CLASS="SCREEN"
|
||
|
>objdump msdos.o --full-contents --section=.modinfo</PRE
|
||
|
></FONT
|
||
|
></TD
|
||
|
></TR
|
||
|
></TABLE
|
||
|
></P
|
||
|
><P
|
||
|
>You can use the <B
|
||
|
CLASS="COMMAND"
|
||
|
>modinfo</B
|
||
|
> program to interpret the
|
||
|
contents of the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section.</P
|
||
|
><P
|
||
|
>So what is in the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section and who uses it?
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> uses the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section
|
||
|
for the following:
|
||
|
<P
|
||
|
></P
|
||
|
><UL
|
||
|
><LI
|
||
|
><P
|
||
|
>It contains the kernel release number for which the module was built.
|
||
|
That is, the release of the kernel source tree whose header files were used in
|
||
|
compiling the module.</P
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> uses that information as explained in
|
||
|
<A
|
||
|
HREF="basekerncompat.html"
|
||
|
>Section 6</A
|
||
|
>.</P
|
||
|
></LI
|
||
|
><LI
|
||
|
><P
|
||
|
>It describes the form of the LKM's parameters.
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> uses this information to format the
|
||
|
parameters you supply on the <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> command line
|
||
|
into data structure initial values, which <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
>
|
||
|
inserts into the LKM as it loads it.</P
|
||
|
></LI
|
||
|
></UL
|
||
|
></P
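><P
>As a rough illustration of how a tool might pull one item out of the section, here is a minimal sketch. It assumes the section contents are a run of NUL-terminated <TT CLASS="LITERAL">key=value</TT> strings; the function name is hypothetical, and the keys shown in the usage note are examples rather than a complete list.</P
>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Search a buffer of back-to-back NUL-terminated "key=value" strings
 * (the assumed layout of .modinfo) and return a pointer to the value
 * for `key`, or NULL if the key is absent.
 */
const char *toy_modinfo_get(const char *buf, size_t len, const char *key)
{
    size_t keylen;
    size_t pos = 0;

    assert(buf != NULL && key != NULL);
    keylen = strlen(key);

    while (pos < len) {
        const char *entry = buf + pos;
        const char *end = memchr(entry, '\0', len - pos);
        size_t entrylen;

        if (end == NULL)
            break;                        /* unterminated tail: stop */
        entrylen = (size_t)(end - entry);
        if (entrylen > keylen && entry[keylen] == '=' &&
            strncmp(entry, key, keylen) == 0)
            return entry + keylen + 1;    /* text after the '=' */
        pos += entrylen + 1;              /* skip the string and its NUL */
    }
    return NULL;
}
```

<P
>With a buffer holding <TT CLASS="LITERAL">kernel_version=2.4.5</TT> followed by <TT CLASS="LITERAL">parm_irq=i</TT>, looking up <TT CLASS="LITERAL">kernel_version</TT> yields <TT CLASS="LITERAL">2.4.5</TT>, and looking up a key that is not present yields NULL.</P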
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="AEN712"
|
||
|
></A
|
||
|
>10.3. The __ksymtab And .kstrtab Sections</H2
|
||
|
><P
|
||
|
>Two other sections you often find in an LKM object file are named
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>__ksymtab</TT
|
||
|
> and <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.kstrtab</TT
|
||
|
>.
|
||
|
Together, they list symbols in the LKM that should be accessible
|
||
|
(exported) to other parts of the kernel. A symbol is just a text name
|
||
|
for an address in the LKM. LKM A's object file can refer to an
|
||
|
address in LKM B by name (say, <TT
|
||
|
CLASS="LITERAL"
|
||
|
>getBinfo</TT
|
||
|
>). When
|
||
|
you insert LKM A, after having inserted LKM B,
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> can insert into LKM A the actual address
|
||
|
within LKM B where the data/subroutine named
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>getBinfo</TT
|
||
|
> is loaded.</P
|
||
|
><P
|
||
|
>See <A
|
||
|
HREF="x627.html#HOWTHEYWORK"
|
||
|
>Section 10.1</A
|
||
|
> for more mind-numbing details of symbol binding.</P
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="AEN722"
|
||
|
></A
|
||
|
>10.4. Ksymoops Symbols</H2
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> adds a bunch of exported symbols to the LKM
|
||
|
as it loads it. These symbols are all intended to help
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
> do its job. <B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
>
|
||
|
is a program that interprets an "oops" display. An
|
||
|
"oops" display is stuff that the Linux kernel displays when
|
||
|
it detects an internal kernel error (and consequently terminates a
|
||
|
process). This information contains a bunch of addresses in the
|
||
|
kernel, in hexadecimal.</P
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
> looks at the hexadecimal addresses, looks
|
||
|
them up in the kernel symbol table (which you see in
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
> (Linux 2.4) or
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/kallsyms</TT
|
||
|
> (Linux 2.6)), and translates the
|
||
|
addresses in the oops message to symbolic addresses, which you might
|
||
|
be able to look up in an assembler listing.</P
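><P
>The translation from a raw address to a symbolic one amounts to finding the nearest symbol at or below the address. The following is a minimal sketch of that lookup, not the actual <B CLASS="COMMAND">ksymoops</B> code; the structure, the function name, and any addresses passed to it are assumptions made for illustration.</P
>

```c
#include <assert.h>
#include <stddef.h>

struct toy_sym {
    unsigned long address;
    const char *name;
};

/*
 * Given a table sorted by ascending address, find the last symbol whose
 * address is <= addr and report the offset from it.  Returns NULL if addr
 * lies below the first symbol.
 */
const struct toy_sym *toy_symbolize(const struct toy_sym *tab, size_t n,
                                    unsigned long addr, unsigned long *offset)
{
    const struct toy_sym *best = NULL;

    assert(offset != NULL);
    for (size_t i = 0; i < n && tab[i].address <= addr; i++)
        best = &tab[i];
    if (best != NULL)
        *offset = addr - best->address;
    return best;
}
```

<P
>If the table placed <TT CLASS="VARNAME">fat_add_cluster</TT> at the invented address 0xc8000000, an oops address of 0xc8000123 would be reported as <TT CLASS="LITERAL">fat_add_cluster+0x123</TT>, which is the form you would then look up in an assembler listing.</P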
|
||
|
><P
|
||
|
>So let's say you have an LKM crash on you. The oops message contains
|
||
|
the address of the instruction that choked, and what you want
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
> to tell you is 1) in what LKM is that
|
||
|
instruction, and 2) where is the instruction relative to an assembler
|
||
|
listing of that LKM? Similar questions arise for the data addresses
|
||
|
in the oops message.</P
|
||
|
><P
|
||
|
>To answer those questions, <B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
> must be able to
|
||
|
get the loadpoints and lengths of the various sections of the LKM from
|
||
|
the kernel symbol table.</P
|
||
|
><P
|
||
|
>Well, in Linux 2.4, <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> knows those addresses,
|
||
|
so it just creates symbols for them and includes them in the symbols
|
||
|
it loads with the LKM.</P
|
||
|
><P
|
||
|
>In particular, those symbols are named (and you can see this for
|
||
|
yourself by looking at <TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
>):</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>__insmod_</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>name</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_S</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>sectionname</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_L</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>length</TT
|
||
|
></P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>name</TT
|
||
|
> is the LKM name (as you would see in
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/modules</TT
|
||
|
>).</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>sectionname</TT
|
||
|
> is the section name, e.g. <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.text</TT
|
||
|
>
|
||
|
(don't forget the leading period).</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>length</TT
|
||
|
> is the length of the section, in decimal.</P
|
||
|
><P
|
||
|
>The value of the symbol is, of course, the address of the section.</P
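><P
>A program that reads the symbol table has to split such a name back into its parts. Here is a minimal sketch of one way to do that. The function name is hypothetical, and the sketch assumes the LKM name itself contains no <TT CLASS="LITERAL">_S</TT> sequence, which the naming scheme does not guarantee.</P
>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "__insmod_<name>_S<sectionname>_L<length>" into its parts.
 * Returns 0 on success, -1 if the symbol does not have that shape.
 * Assumption: the first "_S" after the prefix and the last "_L" are
 * the delimiters.
 */
int toy_parse_section_symbol(const char *sym,
                             char *name, size_t namesz,
                             char *section, size_t sectionsz,
                             unsigned long *length)
{
    static const char prefix[] = "__insmod_";
    const char *s_mark, *l_mark, *start;
    char *endp;
    size_t n;

    assert(sym && name && section && length);
    if (strncmp(sym, prefix, sizeof prefix - 1) != 0)
        return -1;
    start = sym + sizeof prefix - 1;

    s_mark = strstr(start, "_S");
    if (s_mark == NULL)
        return -1;
    l_mark = NULL;
    for (const char *p = strstr(s_mark + 2, "_L"); p; p = strstr(p + 2, "_L"))
        l_mark = p;                           /* remember the last "_L" */
    if (l_mark == NULL)
        return -1;

    n = (size_t)(s_mark - start);             /* length of the LKM name */
    if (n == 0 || n >= namesz)
        return -1;
    memcpy(name, start, n);
    name[n] = '\0';

    n = (size_t)(l_mark - (s_mark + 2));      /* length of the section name */
    if (n == 0 || n >= sectionsz)
        return -1;
    memcpy(section, s_mark + 2, n);
    section[n] = '\0';

    *length = strtoul(l_mark + 2, &endp, 10); /* the length is decimal */
    if (endp == l_mark + 2 || *endp != '\0')
        return -1;
    return 0;
}
```

<P
>For a made-up symbol such as <TT CLASS="LITERAL">__insmod_fat_S.text_L11536</TT>, this yields the name <TT CLASS="LITERAL">fat</TT>, the section <TT CLASS="LITERAL">.text</TT>, and the length 11536.</P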
|
||
|
><P
|
||
|
>Insmod also adds a pretty useful symbol that tells from what file
|
||
|
the LKM was loaded. That symbol's name is</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>__insmod_</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>name</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_O</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>filespec</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_M</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>mtime</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_V<TT
|
||
|
CLASS="VARNAME"
|
||
|
>version</TT
|
||
|
></TT
|
||
|
></P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>name</TT
|
||
|
> is the LKM name, as above.</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>filespec</TT
|
||
|
> is the file specification that was used to
|
||
|
identify the file containing the LKM when it was loaded. Note that it
|
||
|
isn't necessarily still under that name, and there are multiple
|
||
|
file specifications that might have been used to refer to the same
|
||
|
file. For example, <TT
|
||
|
CLASS="FILENAME"
|
||
|
>../dir1/mylkm.o</TT
|
||
|
> and
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>/lib/dir1/mylkm.o</TT
|
||
|
>.</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>mtime</TT
|
||
|
> is the modification time of that file, in
|
||
|
the standard Unix representation (seconds since the start of 1970), in hexadecimal.</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>version</TT
|
||
|
> tells the kernel version level for which
|
||
|
the LKM was built (same as in the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>.modinfo</TT
|
||
|
> section).
|
||
|
It is the value of the macro <TT
|
||
|
CLASS="LITERAL"
|
||
|
>LINUX_VERSION_CODE</TT
|
||
|
>
|
||
|
in Linux's <TT
|
||
|
CLASS="FILENAME"
|
||
|
>linux/version.h</TT
|
||
|
> file. For example,
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>132101</TT
|
||
|
>.</P
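><P
>The example value can be checked by hand. Assuming the packing used by the <TT CLASS="LITERAL">KERNEL_VERSION</TT> macro in <TT CLASS="FILENAME">linux/version.h</TT> (major number shifted left 16 bits, minor shifted left 8), 132101 corresponds to release 2.4.5. The names prefixed with <TT CLASS="LITERAL">TOY_</TT> or <TT CLASS="LITERAL">toy_</TT> below are hypothetical.</P
>

```c
#include <assert.h>

/* Same arithmetic as the kernel's KERNEL_VERSION(a, b, c) macro. */
#define TOY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

struct toy_release {
    unsigned int major, minor, patch;
};

/* Unpack a version code into its three components. */
struct toy_release toy_decode_version(unsigned long code)
{
    struct toy_release r;

    assert(code <= 0xffffffUL);      /* sketch assumes major fits in 8 bits */
    r.major = (unsigned int)(code >> 16);
    r.minor = (unsigned int)((code >> 8) & 0xff);
    r.patch = (unsigned int)(code & 0xff);
    return r;
}
```

<P
>Here <TT CLASS="LITERAL">TOY_KERNEL_VERSION(2, 4, 5)</TT> evaluates to 131072 + 1024 + 5 = 132101, and decoding 132101 gives back 2, 4 and 5.</P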
|
||
|
><P
|
||
|
>The value of this symbol is meaningless.</P
|
||
|
><P
|
||
|
>In Linux 2.6, it works differently. (I haven't figured out how yet).</P
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="AEN782"
|
||
|
></A
|
||
|
>10.5. Other Symbols</H2
|
||
|
><P
|
||
|
><B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
>
|
||
|
adds another symbol, similar to the <B
|
||
|
CLASS="COMMAND"
|
||
|
>ksymoops</B
|
||
|
>
|
||
|
symbols. This one tells where the persistent data lives in the
|
||
|
LKM, which <B
|
||
|
CLASS="COMMAND"
|
||
|
>rmmod</B
|
||
|
> needs to know in order to save
|
||
|
the persistent data.</P
|
||
|
><P
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>__insmod_</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>name</TT
|
||
|
><TT
|
||
|
CLASS="LITERAL"
|
||
|
>_P</TT
|
||
|
><TT
|
||
|
CLASS="VARNAME"
|
||
|
>length</TT
|
||
|
></P
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="AEN793"
|
||
|
></A
|
||
|
>10.6. Debugging Symbols</H2
|
||
|
><P
|
||
|
>There is another kind of symbol that relates to an LKM: kallsyms
|
||
|
symbols. These are not exported symbols; they do not show up in
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
>. They refer to addresses in the
|
||
|
kernel that are nobody's business except the module they are in, and
|
||
|
are not meant to be referenced by anything except a debugger. Kdb,
|
||
|
the kernel debugger that comes with the Linux kernel, is one user of
|
||
|
kallsyms symbols.</P
|
||
|
><P
|
||
|
>The kallsyms facility works for both the base kernel and LKMs. For the
|
||
|
base kernel, you select it when you build the base kernel, with the
|
||
|
CONFIG_KALLSYMS configuration option. When you do that, the kernel
|
||
|
contains a kallsyms symbol for all the symbols in the base kernel
|
||
|
object files. You know your base kernel is participating in the kallsyms
|
||
|
facility if you see the symbol <TT
|
||
|
CLASS="LITERAL"
|
||
|
>__start___kallsyms</TT
|
||
|
>
|
||
|
in <TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
>.</P
|
||
|
><P
|
||
|
>For an LKM, you decide at load time whether it will contain kallsyms
|
||
|
symbols. You include the kallsyms definitions in the data you pass to
|
||
|
the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>init_module</TT
|
||
|
> system call to load the LKM.
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> does this if either 1) you specify the
|
||
|
<TT
|
||
|
CLASS="LITERAL"
|
||
|
>--kallsyms</TT
|
||
|
> option, or 2) <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
>
|
||
|
determines, by looking at <TT
|
||
|
CLASS="FILENAME"
|
||
|
>/proc/ksyms</TT
|
||
|
>, that the
|
||
|
base kernel is participating in the kallsyms facility. The kallsyms that
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> defines are all the symbols in the LKM
|
||
|
object file. To wit, those are the symbols you see when you run
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>nm</B
|
||
|
> on the LKM object file.</P
|
||
|
><P
|
||
|
>Each loaded LKM that is participating in kallsyms has its own kallsyms
|
||
|
symbol table. When the base kernel is participating in the kallsyms
|
||
|
facility, the individual LKM kallsyms symbol tables are linked into a
|
||
|
master symbol table so that a debugger can look up a symbol anywhere
|
||
|
in the kernel. When the base kernel is not participating in kallsyms,
|
||
|
a debugger must look explicitly at a particular LKM to find symbols
|
||
|
for that LKM. Kdb, for one, cannot do this. So the basic rule is: If
|
||
|
you're going to do any kernel debugging, use CONFIG_KALLSYMS.</P
|
||
|
><P
|
||
|
>Note that the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>__kallsyms</TT
|
||
|
> section has nothing to do
|
||
|
with LKMs. That's a section in the base kernel object module. The
|
||
|
base kernel doesn't have the luxury of something as high-level and
|
||
|
sophisticated as <B
|
||
|
CLASS="COMMAND"
|
||
|
>insmod</B
|
||
|
> to load it, so it needs
|
||
|
that extra object file section to facilitate its participation in
|
||
|
kallsyms.</P
|
||
|
><P
|
||
|
>Similarly, the program <B
|
||
|
CLASS="COMMAND"
|
||
|
>kallsyms</B
|
||
|
> has nothing to do with
|
||
|
LKMs. It is what creates the <TT
|
||
|
CLASS="LITERAL"
|
||
|
>__kallsyms</TT
|
||
|
> section.</P
|
||
|
><P
|
||
|
>There is another kind of debugging symbol -- the kind that
|
||
|
<B
|
||
|
CLASS="COMMAND"
|
||
|
>gcc</B
|
||
|
> creates with its <TT
|
||
|
CLASS="LITERAL"
|
||
|
>-g</TT
|
||
|
> option.
|
||
|
These are unrelated to the kallsyms facility. They do not get loaded
|
||
|
into kernel memory. Kdb does not use them. But Kgdb (which gets
|
||
|
information both from kernel memory and source and object files) does.</P
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="MEMALLOC"
|
||
|
></A
|
||
|
>10.7. Memory Allocation For Loading</H2
|
||
|
><P
|
||
|
>This section is about how Linux allocates memory in which to load an LKM.
|
||
|
It is not about how an LKM dynamically allocates memory, which is the same
|
||
|
as for any other part of the kernel.</P
|
||
|
><P
|
||
|
>The memory where an LKM resides is a little different from that where
|
||
|
the base kernel resides. The base kernel is always loaded into one
|
||
|
big contiguous area of real memory, whose real addresses are equal to
|
||
|
its virtual addresses. That's possible because the base kernel is the
|
||
|
first thing ever to get loaded (besides the loader) -- it has a wide
|
||
|
open empty space in which to load. And since the Linux kernel is not
|
||
|
pageable, it stays in its homestead forever.</P
|
||
|
><P
|
||
|
>By the time you load an LKM, real memory is all fragmented -- you
|
||
|
can't simply add the LKM to the end of the base kernel. But the LKM
|
||
|
needs to be in contiguous virtual memory, so Linux uses
|
||
|
<TT
|
||
|
CLASS="FUNCTION"
|
||
|
>vmalloc</TT
|
||
|
> to allocate a contiguous area of virtual
|
||
|
memory (in the kernel address space), which is probably not contiguous
|
||
|
in real memory. But <EM
|
||
|
>the memory is still not
|
||
|
pageable</EM
|
||
|
>. The LKM gets loaded into real page frames from
|
||
|
the start, and stays in those real page frames until it gets unloaded.</P
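><P
>The difference between contiguous virtual memory and scattered real memory can be pictured with a toy page table. This is a sketch under stated assumptions, not how the kernel implements <TT CLASS="FUNCTION">vmalloc</TT>: the 4096-byte page size, the structure, and the function name are all chosen for illustration.</P
>

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL          /* assumed page size */

/* A toy page table: entry i gives the real frame backing virtual page i. */
struct toy_mapping {
    unsigned long virt_base;          /* start of the contiguous virtual area */
    const unsigned long *frames;      /* real frame numbers, in any order */
    size_t npages;
};

/* Translate a virtual address inside the area to a real address. */
unsigned long toy_translate(const struct toy_mapping *m, unsigned long vaddr)
{
    unsigned long off;

    assert(m != NULL && vaddr >= m->virt_base);
    off = vaddr - m->virt_base;
    assert(off / TOY_PAGE_SIZE < m->npages);
    return m->frames[off / TOY_PAGE_SIZE] * TOY_PAGE_SIZE
           + off % TOY_PAGE_SIZE;
}
```

<P
>With three pages backed by the unrelated frames 0x1a3, 0x07f and 0x2c0, consecutive virtual addresses in the LKM land in real memory that is not consecutive at all, yet each virtual page always maps to the same frame for as long as the LKM stays loaded.</P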
|
||
|
><P
|
||
|
>This design is based on the principle that it's much easier to get a
|
||
|
large contiguous area of virtual memory than to get a large contiguous
|
||
|
area of real memory because there is orders of magnitude more virtual
|
||
|
address space than real memory. In the early days of virtual memory,
|
||
|
that was certainly true -- a 32-bit machine had a 4 GiB virtual address
|
||
|
space and rarely more than 64 MiB of real memory. Now, however, it's
|
||
|
quite often just the reverse, with virtual address space being the
|
||
|
constrained resource. So this LKM loading strategy may not make sense
|
||
|
on your system.</P
|
||
|
><P
|
||
|
>Some CPUs can take advantage of the properties of the base kernel to
|
||
|
effect faster access to base kernel memory. For example, on one
|
||
|
machine, the entire base kernel is covered by one page table entry
|
||
|
and consequently one entry in the translation lookaside buffer (TLB).
|
||
|
Naturally, that TLB entry is virtually always present. For LKMs on
|
||
|
this machine, there is a page table entry for each memory page into
|
||
|
which the LKM is loaded. Much more often, the entry for a page is not
|
||
|
in the TLB when the CPU goes to access it, which means a slower access.</P
|
||
|
><P
|
||
|
>This effect is probably trivial.</P
|
||
|
><P
|
||
|
>It is also said that PowerPC Linux does something with its address
|
||
|
translation so that switching from accessing base kernel memory
|
||
|
to accessing LKM memory is costly. I don't know anything solid about
|
||
|
that.</P
|
||
|
><P
|
||
|
>The base kernel contains within its prized contiguous domain a large
|
||
|
expanse of reusable memory -- the kmalloc pool. In some versions of
|
||
|
Linux, the module loader tries first to get contiguous memory from
|
||
|
that pool into which to load an LKM and only if a large enough space
|
||
|
is not available, falls back to the vmalloc space. Andi Kleen submitted code
|
||
|
to do that in Linux 2.5 in October 2002. He claims the difference is
|
||
|
in the several per cent range.</P
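><P
>The try-one-pool-then-the-other strategy can be sketched as follows. This is a user-space model with invented names and a deliberately crude notion of free space; it is not the code that was submitted, only an illustration of the fallback order described above.</P
>

```c
#include <assert.h>
#include <stddef.h>

enum toy_source { TOY_FROM_POOL, TOY_FROM_VMALLOC, TOY_FAILED };

struct toy_memory {
    size_t pool_largest_free;     /* largest contiguous run left in the pool */
    size_t vmalloc_free;          /* virtual address space left for vmalloc */
};

/* Try the contiguous pool first; fall back to the vmalloc area. */
enum toy_source toy_alloc_module(struct toy_memory *mem, size_t size)
{
    assert(mem != NULL && size > 0);
    if (size <= mem->pool_largest_free) {
        mem->pool_largest_free -= size;
        return TOY_FROM_POOL;
    }
    if (size <= mem->vmalloc_free) {
        mem->vmalloc_free -= size;
        return TOY_FROM_VMALLOC;
    }
    return TOY_FAILED;
}
```

<P
>A small LKM that fits in the pool's largest free run is placed there and so shares the base kernel's address translation; a larger one falls through to the vmalloc area, and a request that fits in neither fails.</P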
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="SECT2"
|
||
|
><H2
|
||
|
CLASS="SECT2"
|
||
|
><A
|
||
|
NAME="AEN830"
|
||
|
></A
|
||
|
>10.8. Linux Internals</H2
|
||
|
><P
|
||
|
>If you're interested in the internal workings of the Linux kernel with
|
||
|
respect to LKMs, this section can get you started. You should not need
|
||
|
to know any of this in order to develop, build, and use LKMs.</P
|
||
|
><P
|
||
|
>The code to handle LKMs is in the source file
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>kernel/module.c</TT
|
||
|
> in the Linux source tree.</P
|
||
|
><P
|
||
|
>The kernel module loader (see <A
|
||
|
HREF="x197.html#AUTOLOAD"
|
||
|
>Section 5.4</A
|
||
|
>) lives in
|
||
|
<TT
|
||
|
CLASS="FILENAME"
|
||
|
>kernel/kmod.c</TT
|
||
|
>.</P
|
||
|
><P
|
||
|
>(Ok, that wasn't much of a start, but at least I have a framework here
|
||
|
for adding this information in the future).</P
|
||
|
></DIV
|
||
|
></DIV
|
||
|
><DIV
|
||
|
CLASS="NAVFOOTER"
|
||
|
><HR
|
||
|
ALIGN="LEFT"
|
||
|
WIDTH="100%"><TABLE
|
||
|
SUMMARY="Footer navigation table"
|
||
|
WIDTH="100%"
|
||
|
BORDER="0"
|
||
|
CELLPADDING="0"
|
||
|
CELLSPACING="0"
|
||
|
><TR
|
||
|
><TD
|
||
|
WIDTH="33%"
|
||
|
ALIGN="left"
|
||
|
VALIGN="top"
|
||
|
><A
|
||
|
HREF="x615.html"
|
||
|
ACCESSKEY="P"
|
||
|
>Prev</A
|
||
|
></TD
|
||
|
><TD
|
||
|
WIDTH="34%"
|
||
|
ALIGN="center"
|
||
|
VALIGN="top"
|
||
|
><A
|
||
|
HREF="index.html"
|
||
|
ACCESSKEY="H"
|
||
|
>Home</A
|
||
|
></TD
|
||
|
><TD
|
||
|
WIDTH="33%"
|
||
|
ALIGN="right"
|
||
|
VALIGN="top"
|
||
|
><A
|
||
|
HREF="x839.html"
|
||
|
ACCESSKEY="N"
|
||
|
>Next</A
|
||
|
></TD
|
||
|
></TR
|
||
|
><TR
|
||
|
><TD
|
||
|
WIDTH="33%"
|
||
|
ALIGN="left"
|
||
|
VALIGN="top"
|
||
|
>Persistent Data</TD
|
||
|
><TD
|
||
|
WIDTH="34%"
|
||
|
ALIGN="center"
|
||
|
VALIGN="top"
|
||
|
> </TD
|
||
|
><TD
|
||
|
WIDTH="33%"
|
||
|
ALIGN="right"
|
||
|
VALIGN="top"
|
||
|
>Writing Your Own Loadable Kernel Module</TD
|
||
|
></TR
|
||
|
></TABLE
|
||
|
></DIV
|
||
|
></BODY
|
||
|
></HTML
|
||
|
>
|