<sect1><title>Character Device Drivers</title>

<indexterm><primary>device file</primary><secondary>character</secondary></indexterm>


<sect2><title>The <type>file_operations</type> Structure</title>

<indexterm><primary>file_operations</primary></indexterm>

<para>The <type>file_operations</type> structure is defined in <filename role="headerfile">linux/fs.h</filename>, and
holds pointers to functions defined by the driver that perform various operations on the device. Each field of the
structure corresponds to the address of some function defined by the driver to handle a requested operation.</para>

<para>For example, every character driver needs to define a function that reads from the device. The
<type>file_operations</type> structure holds the address of the module's function that performs that operation. Here is
what the definition looks like for kernel <literal>2.6.5</literal>:</para>

<screen>
struct file_operations {
    struct module *owner;
    loff_t (*llseek) (struct file *, loff_t, int);
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t,
                          loff_t);
    int (*readdir) (struct file *, void *, filldir_t);
    unsigned int (*poll) (struct file *, struct poll_table_struct *);
    int (*ioctl) (struct inode *, struct file *, unsigned int,
                  unsigned long);
    int (*mmap) (struct file *, struct vm_area_struct *);
    int (*open) (struct inode *, struct file *);
    int (*flush) (struct file *);
    int (*release) (struct inode *, struct file *);
    int (*fsync) (struct file *, struct dentry *, int datasync);
    int (*aio_fsync) (struct kiocb *, int datasync);
    int (*fasync) (int, struct file *, int);
    int (*lock) (struct file *, int, struct file_lock *);
    ssize_t (*readv) (struct file *, const struct iovec *, unsigned long,
                      loff_t *);
    ssize_t (*writev) (struct file *, const struct iovec *, unsigned long,
                       loff_t *);
    ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
                         void __user *);
    ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
                         loff_t *, int);
    unsigned long (*get_unmapped_area) (struct file *, unsigned long,
                                        unsigned long, unsigned long,
                                        unsigned long);
};
</screen>

<para>Some operations are not implemented by a driver. For example, a driver that handles a video card won't need to read
from a directory structure. The corresponding entries in the <type>file_operations</type> structure should be set to
<varname>NULL</varname>.</para>

<para>There is a gcc extension that makes assigning to this structure more convenient. You'll see it in modern drivers,
and it may catch you by surprise. This is what the new way of assigning to the structure looks like:</para>

<screen>
struct file_operations fops = {
    read: device_read,
    write: device_write,
    open: device_open,
    release: device_release
};
</screen>

<para>However, there's also a C99 way of assigning to elements of a structure, and this is definitely preferred over using
the GNU extension. The version of gcc I'm currently using, <literal>2.95</literal>, supports the new C99 syntax. You
should use this syntax in case someone wants to port your driver. It will help with compatibility:</para>

<screen>
struct file_operations fops = {
    .read = device_read,
    .write = device_write,
    .open = device_open,
    .release = device_release
};
</screen>

<para>The meaning is clear, and you should be aware that any member of the structure which you don't explicitly assign
will be initialized to <varname>NULL</varname> by gcc.</para>

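<para>If you'd like to convince yourself of this, here is a small user-space sketch rather than real driver code. The
structure <type>demo_ops</type> and every <function>demo_</function> name are made up for illustration and only mimic the
shape of <type>file_operations</type>. It shows that members left out of a C99 designated initializer come out as
<varname>NULL</varname>, and one plausible way for a caller to refuse an operation whose slot is empty:</para>

<programlisting><![CDATA[
```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical user-space stand-in for the kernel's struct file_operations;
 * only the shape matters here, not the real member list. */
struct demo_ops {
    int (*open)(void);
    int (*release)(void);
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

int demo_open(void)
{
    return 0;
}

/* Pretend device read: hands back a single byte if there is room for it. */
long demo_read(char *buf, size_t len)
{
    if (len == 0)
        return 0;
    buf[0] = 'x';
    return 1;
}

/* Only .open and .read are named; .release and .write are implicitly NULL. */
struct demo_ops demo_fops = {
    .open = demo_open,
    .read = demo_read,
};

/* Dispatch through the table, refusing when the driver left the slot empty.
 * Returning -EINVAL is an assumption made for this sketch. */
long demo_do_write(const struct demo_ops *ops, const char *buf, size_t len)
{
    if (ops->write == NULL)
        return -EINVAL;
    return ops->write(buf, len);
}
```
]]></programlisting>

<para>Built as an ordinary C program, <varname>demo_fops.write</varname> and <varname>demo_fops.release</varname> compare
equal to <varname>NULL</varname>, so <function>demo_do_write</function> returns an error without calling anything.</para>
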
<para>A pointer to a <type>struct file_operations</type> is commonly named <varname>fops</varname>.</para>

</sect2>


<sect2><title>The <type>file</type> structure</title>

<indexterm><primary>file</primary></indexterm>
<indexterm><primary>inode</primary></indexterm>

<para>Each device is represented in the kernel by a <type>file</type> structure, which is defined in <filename
role="header">linux/fs.h</filename>. Be aware that a <type>file</type> is a kernel level structure and never appears in a
user space program. It's not the same thing as a <type>FILE</type>, which is defined by glibc and would never appear in a
kernel space function. Also, its name is a bit misleading; it represents an abstract open `file', not a file on a disk,
which is represented by a structure named <type>inode</type>.</para>

<para>A pointer to a <varname>struct file</varname> is commonly named <varname>filp</varname>. You'll also see it
referred to as <varname>struct file file</varname>. Resist the temptation.</para>

<para>Go ahead and look at the definition of <type>file</type>. Most of the entries you see, like
<type>struct dentry</type>, aren't used by device drivers, and you can ignore them. This is because drivers don't
fill <varname>file</varname> directly; they only use structures contained in <varname>file</varname> which are created
elsewhere.</para>

</sect2>


<sect2><title>Registering A Device</title>

<indexterm><primary>register_chrdev</primary></indexterm>
<indexterm><primary>major number</primary><secondary>dynamic allocation</secondary></indexterm>

<para>As discussed earlier, char devices are accessed through device files, usually located in <filename
role="directory">/dev</filename><footnote><para>This is by convention. When writing a driver, it's OK to put the device
file in your current directory. Just make sure you place it in <filename role="directory">/dev</filename> for a
production driver.</para></footnote>. The major number tells you which driver handles which device file. The minor number
is used only by the driver itself to differentiate which device it's operating on, just in case the driver handles more
than one device.</para>

<para>Adding a driver to your system means registering it with the kernel. This is synonymous with assigning it a major
number during the module's initialization. You do this by using the <function>register_chrdev</function> function,
defined by <filename role="headerfile">linux/fs.h</filename>.</para>

<screen>
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
</screen>

<para>where <varname>unsigned int major</varname> is the major number you want to request, <varname>const char
*name</varname> is the name of the device as it'll appear in <filename>/proc/devices</filename> and <varname>struct
file_operations *fops</varname> is a pointer to the <varname>file_operations</varname> table for your driver. A negative
return value means the registration failed. Note that we didn't pass the minor number to
<function>register_chrdev</function>. That's because the kernel doesn't care about the minor number; only our driver uses
it.</para>

<para>Now the question is, how do you get a major number without hijacking one that's already in use? The easiest way
would be to look through <filename>Documentation/devices.txt</filename> and pick an unused one. That's a bad way of doing
things because you'll never be sure if the number you picked will be assigned later. The answer is that you can ask the
kernel to assign you a dynamic major number.</para>

<para>If you pass a major number of 0 to <function>register_chrdev</function>, the return value will be the dynamically
allocated major number. The downside is that you can't make a device file in advance, since you don't know what the major
number will be. There are a few ways to work around this. First, the driver itself can print the newly assigned number and
we can make the device file by hand. Second, the newly registered device will have an entry in
<filename>/proc/devices</filename>, and we can either make the device file by hand or write a shell script to read the
file in and make the device file. Third, we can have our driver make the device file using the
<function>mknod</function> system call after a successful registration and remove it during the call to
<function>cleanup_module</function>.</para>

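<para>The second approach can be sketched in C just as well as in a shell script. The function below is a hypothetical
helper, not part of any kernel or library interface. It assumes the usual layout of <filename>/proc/devices</filename>,
that is, a <literal>Character devices:</literal> heading followed by one major number and name per line, and returns the
major number registered under a given name:</para>

<programlisting><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Scan text in the /proc/devices layout for a character device called
 * `name` and return its major number, or -1 if it is not listed.
 * Hypothetical helper written for this sketch. */
int demo_find_char_major(FILE *fp, const char *name)
{
    char line[128];
    char dev[64];
    int major;
    int in_char_section = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, "Character devices:", 18) == 0) {
            in_char_section = 1;
            continue;
        }
        if (strncmp(line, "Block devices:", 14) == 0)
            break;      /* block majors are numbered separately */
        if (in_char_section &&
            sscanf(line, "%d %63s", &major, dev) == 2 &&
            strcmp(dev, name) == 0)
            return major;
    }
    return -1;
}
```
]]></programlisting>

<para>A small user-space tool could hand it the stream opened on <filename>/proc/devices</filename> and use the result
when creating the device file. A return value of -1 means no character driver is registered under that name.</para>
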
</sect2>


<sect2><title>Unregistering A Device</title>

<indexterm><primary>rmmod</primary><secondary>preventing</secondary></indexterm>

<para>We can't allow the kernel module to be <application>rmmod</application>'ed whenever root feels like it. If the
device file is opened by a process and then we remove the kernel module, using the file would cause a call to the memory
location where the appropriate function (read/write) used to be. If we're lucky, no other code was loaded there, and
we'll get an ugly error message. If we're unlucky, another kernel module was loaded into the same location, which means a
jump into the middle of another function within the kernel. The results of this would be impossible to predict, but they
can't be very positive.</para>

<para>Normally, when you don't want to allow something, you return an error code (a negative number) from the function
which is supposed to do it. With <function>cleanup_module</function> that's impossible because it's a void function.
However, there's a counter which keeps track of how many processes are using your module. You can see what its value is
by looking at the 3rd field of <filename>/proc/modules</filename>. If this number isn't zero, <application>rmmod</application>
will fail. Note that you don't have to check the counter from within <function>cleanup_module</function> because the
check will be performed for you by the system call <function>sys_delete_module</function>, defined in
<filename>kernel/module.c</filename>. You shouldn't use this counter directly, but there are functions defined in <filename
role="headerfile">linux/module.h</filename> which let you increase and decrease this counter:</para>

<itemizedlist>
<listitem><para><varname>try_module_get(THIS_MODULE)</varname>: Increment the use count.</para></listitem>
<listitem><para><varname>module_put(THIS_MODULE)</varname>: Decrement the use count.</para></listitem>
</itemizedlist>

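<para>To see why the two calls must stay balanced, here is a toy user-space model. It is only a sketch:
<type>demo_module</type> and the <function>demo_</function> functions are invented for illustration and merely imitate
the bookkeeping that <function>try_module_get</function> and <function>module_put</function> perform inside the kernel.
Every open bumps the counter, every release drops it, and unloading is only allowed at zero:</para>

<programlisting><![CDATA[
```c
/* User-space model only. In a real module the counter lives inside the
 * kernel and is changed with try_module_get() and module_put(). */
struct demo_module {
    int use_count;
};

/* What an open handler does in this model: pin the module. */
int demo_device_open(struct demo_module *mod)
{
    mod->use_count++;
    return 0;
}

/* What a release handler does: undo exactly one earlier open. */
int demo_device_release(struct demo_module *mod)
{
    mod->use_count--;
    return 0;
}

/* Model of the unload check: refuse while anyone still holds the device. */
int demo_can_unload(const struct demo_module *mod)
{
    return mod->use_count == 0;
}
```
]]></programlisting>

<para>In a driver, the increment typically goes in the open handler and the decrement in the release handler. A code path
that forgets the decrement leaves the count stuck above zero, which is exactly the situation described next.</para>
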
<para>It's important to keep the counter accurate; if you ever do lose track of the correct usage count, you'll never be
able to unload the module; it's now reboot time, boys and girls. This is bound to happen to you sooner or later during a
module's development.</para>

<indexterm><primary>MOD_INC_USE_COUNT</primary></indexterm>
<indexterm><primary>MOD_DEC_USE_COUNT</primary></indexterm>
<indexterm><primary>MOD_IN_USE</primary></indexterm>

</sect2>


<sect2><title>chardev.c</title>

<para>The next code sample creates a char driver named <filename>chardev</filename>. You can <command>cat</command> its
device file (or <function>open</function> the file with a program) and the driver will put the number of times the device
file has been read from into the file. We don't support writing to the file (like <command>echo "hi" >
/dev/hello</command>), but catch these attempts and tell the user that the operation isn't supported. Don't worry if you
don't see what we do with the data we read into the buffer; we don't do much with it. We simply read in the data and
print a message acknowledging that we received it.</para>

<example><title>chardev.c</title><programlisting><inlinegraphic fileref="lkmpg-examples/04-CharacterDeviceFiles/chardev.c" format="linespecific"/></inlinegraphic></programlisting></example>
</sect2>


<sect2><title>Writing Modules for Multiple Kernel Versions</title>

<indexterm><primary>kernel versions</primary></indexterm>
<indexterm><primary>LINUX_VERSION_CODE</primary></indexterm>
<indexterm><primary>KERNEL_VERSION</primary></indexterm>

<para>The system calls, which are the major interface the kernel shows to the processes, generally stay the same across
versions. A new system call may be added, but usually the old ones will behave exactly like they used to. This is
necessary for backward compatibility -- a new kernel version is not supposed to break regular processes. In most cases,
the device files will also remain the same. On the other hand, the internal interfaces within the kernel can and do change
between versions.</para>

<para>The Linux kernel versions are divided between the stable versions (n.&lt;even number&gt;.m) and the development
versions (n.&lt;odd number&gt;.m). The development versions include all the cool new ideas, including those which will
be considered a mistake, or reimplemented, in the next version. As a result, you can't trust the interface to remain the
same in those versions (which is why I don't bother to support them in this book; it's too much work and it would become
dated too quickly). In the stable versions, on the other hand, we can expect the interface to remain the same regardless
of the bug fix version (the m number).</para>

<para>There are differences between different kernel versions, and if you want to support multiple kernel versions, you'll
find yourself having to code conditional compilation directives. The way to do this is to compare the macro
<varname>LINUX_VERSION_CODE</varname> to the macro <varname>KERNEL_VERSION</varname>. In version <varname>a.b.c</varname>
of the kernel, the value of <varname>LINUX_VERSION_CODE</varname> would be
2<superscript>16</superscript>a + 2<superscript>8</superscript>b + c, which is also what
<varname>KERNEL_VERSION(a, b, c)</varname> evaluates to.</para>

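<para>As a worked example, kernel 2.6.5 gives 2 * 65536 + 6 * 256 + 5 = 132613. The sketch below reproduces that arithmetic
in user space. The <varname>DEMO_</varname> macros are stand-ins invented for this example so that it builds without the
kernel headers; in a module you would use <varname>LINUX_VERSION_CODE</varname> and <varname>KERNEL_VERSION</varname>
themselves, and the version being built against is assumed here rather than detected:</para>

<programlisting><![CDATA[
```c
/* Stand-in for KERNEL_VERSION from linux/version.h, redefined under a
 * different name so the arithmetic can be tried outside the kernel tree. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption for the demo: we are building against kernel 2.6.5. A real
 * module would use LINUX_VERSION_CODE here instead. */
#define DEMO_LINUX_VERSION_CODE DEMO_KERNEL_VERSION(2, 6, 5)

/* Returns 1 when the 2.6-or-later branch was compiled in, 0 otherwise. */
int demo_built_for_2_6_or_later(void)
{
#if DEMO_LINUX_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 6, 0)
    return 1;   /* code that relies on 2.6 interfaces would go here */
#else
    return 0;   /* fallback for older kernels would go here */
#endif
}
```
]]></programlisting>

<para>Because the comparison happens in the preprocessor, only one of the two branches ever reaches the compiler, which is
what lets a single source file use interfaces that exist in one kernel series but not in another.</para>
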
<para>
While previous versions of this guide showed in great detail how you can write backward compatible code with such
constructs, we decided to break with this tradition for the better. People interested in doing so
should now use an LKMPG whose version matches their kernel. We decided to version the LKMPG like the kernel,
at least as far as the major and minor numbers are concerned. We use the patchlevel for our own versioning, so
use LKMPG version 2.4.x for kernels 2.4.x, LKMPG version 2.6.x for kernels 2.6.x, and so on.
Also make sure that you always use current, up-to-date versions of both the kernel and the guide.
</para>

</sect2>

</sect1>



<!--
vim:textwidth=128 shiftwidth=3
-->