<sect1><title>Character Device Drivers</title>
<indexterm><primary>device file</primary><secondary>character</secondary></indexterm>
<sect2><title>The <type>file_operations</type> Structure</title>
<indexterm><primary>file_operations</primary></indexterm>
<para>The <type>file_operations</type> structure is defined in <filename role="headerfile">linux/fs.h</filename>, and
holds pointers to functions defined by the driver that perform various operations on the device. Each field of the
structure corresponds to the address of some function defined by the driver to handle a requested operation.</para>
<para> For example, every character driver needs to define a function that reads from the device. The
<type>file_operations</type> structure holds the address of the module's function that performs that operation. Here is
what the definition looks like for kernel <literal>2.6.5</literal>:</para>
<screen>
struct file_operations {
    struct module *owner;
    loff_t(*llseek) (struct file *, loff_t, int);
    ssize_t(*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t(*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
    ssize_t(*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t(*aio_write) (struct kiocb *, const char __user *, size_t,
                         loff_t);
    int (*readdir) (struct file *, void *, filldir_t);
    unsigned int (*poll) (struct file *, struct poll_table_struct *);
    int (*ioctl) (struct inode *, struct file *, unsigned int,
                  unsigned long);
    int (*mmap) (struct file *, struct vm_area_struct *);
    int (*open) (struct inode *, struct file *);
    int (*flush) (struct file *);
    int (*release) (struct inode *, struct file *);
    int (*fsync) (struct file *, struct dentry *, int datasync);
    int (*aio_fsync) (struct kiocb *, int datasync);
    int (*fasync) (int, struct file *, int);
    int (*lock) (struct file *, int, struct file_lock *);
    ssize_t(*readv) (struct file *, const struct iovec *, unsigned long,
                     loff_t *);
    ssize_t(*writev) (struct file *, const struct iovec *, unsigned long,
                      loff_t *);
    ssize_t(*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
                        void __user *);
    ssize_t(*sendpage) (struct file *, struct page *, int, size_t,
                        loff_t *, int);
    unsigned long (*get_unmapped_area) (struct file *, unsigned long,
                                        unsigned long, unsigned long,
                                        unsigned long);
};
</screen>
<para>Some operations are not implemented by a driver. For example, a driver that handles a video card won't need to read
from a directory structure. The corresponding entries in the <type>file_operations</type> structure should be set to
<varname>NULL</varname>.</para>
<para>There is a gcc extension that makes assigning to this structure more convenient. You'll see it in modern drivers,
and it may catch you by surprise. This is what the new way of assigning to the structure looks like:</para>
<screen>
struct file_operations fops = {
    read: device_read,
    write: device_write,
    open: device_open,
    release: device_release
};
</screen>
<para>However, there's also a C99 way of assigning to elements of a structure, and this is definitely preferred over using
the GNU extension. The version of gcc the author used when writing this, <literal>2.95</literal>, supports the new C99 syntax. You
should use this syntax in case someone wants to port your driver. It will help with compatibility:</para>
<screen>
struct file_operations fops = {
    .read = device_read,
    .write = device_write,
    .open = device_open,
    .release = device_release
};
</screen>
<para>The meaning is clear, and you should be aware that any member of the structure which you don't explicitly assign
will be initialized to <varname>NULL</varname> by gcc.</para>
<para>An instance of <type>struct file_operations</type> containing pointers to functions that are used to implement
read, write, open, ... syscalls is commonly named <varname>fops</varname>.</para>
</sect2>
<sect2><title>The <type>file</type> structure</title>
<indexterm><primary>file</primary></indexterm>
<indexterm><primary>inode</primary></indexterm>
<para>Each open device file is represented in the kernel by a <type>file</type> structure, which is defined in <filename
role="header">linux/fs.h</filename>. Be aware that a <type>file</type> is a kernel level structure and never appears in a
user space program. It's not the same thing as a <type>FILE</type>, which is defined by glibc and would never appear in a
kernel space function. Also, its name is a bit misleading; it represents an abstract open `file', not a file on a disk,
which is represented by a structure named <type>inode</type>.</para>
<para>An instance of <varname>struct file</varname> is commonly named <function>filp</function>. You'll also see it
referred to as <varname>struct file file</varname>. Resist the temptation.</para>
<para>Go ahead and look at the definition of <function>file</function>. Most of the entries you see, like
<function>struct dentry</function>, aren't used by device drivers, and you can ignore them. This is because drivers don't
fill <varname>file</varname> directly; they only use structures contained in <varname>file</varname> which are created
elsewhere.</para>
</sect2>
<sect2><title>Registering A Device</title>
<indexterm><primary>register_chrdev</primary></indexterm>
<indexterm><primary>major number</primary><secondary>dynamic allocation</secondary></indexterm>
<para>As discussed earlier, char devices are accessed through device files, usually located in <filename
role="directory">/dev</filename><footnote><para>This is by convention. When writing a driver, it's OK to put the device
file in your current directory. Just make sure you place it in <filename role="directory">/dev</filename> for a
production driver.</para></footnote>. The major number tells you which driver handles which device file. The minor number
is used only by the driver itself to differentiate which device it's operating on, just in case the driver handles more
than one device.</para>
<para>Adding a driver to your system means registering it with the kernel. This is synonymous with assigning it a major
number during the module's initialization. You do this by using the <function>register_chrdev</function> function,
defined by <filename role="headerfile">linux/fs.h</filename>.</para>
<screen>
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
</screen>
<para>where <varname>unsigned int major</varname> is the major number you want to request, <varname>const char
*name</varname> is the name of the device as it'll appear in <filename>/proc/devices</filename> and <varname>struct
file_operations *fops</varname> is a pointer to the <varname>file_operations</varname> table for your driver. A negative
return value means the registration failed. Note that we didn't pass the minor number to
<function>register_chrdev</function>. That's because the kernel doesn't care about the minor number; only our driver uses
it.</para>
<para>Now the question is, how do you get a major number without hijacking one that's already in use? The easiest way
would be to look through <filename>Documentation/devices.txt</filename> and pick an unused one. That's a bad way of doing
things because you'll never be sure if the number you picked will be assigned later. The answer is that you can ask the
kernel to assign you a dynamic major number.</para>
<para>If you pass a major number of 0 to <function>register_chrdev</function>, the return value will be the dynamically
allocated major number. The downside is that you can't make a device file in advance, since you don't know what the major
number will be. There are three ways to deal with this. First, the driver itself can print the newly assigned number and
we can make the device file by hand. Second, the newly registered device will have an entry in
<filename>/proc/devices</filename>, and we can either make the device file by hand or write a shell script to read the
file in and make the device file. Third, we can have our driver make the device file using the
<function>mknod</function> system call after a successful registration and remove it during the call to
<function>cleanup_module</function>.</para>
</sect2>
<sect2><title>Unregistering A Device</title>
<indexterm><primary>rmmod</primary><secondary>preventing</secondary></indexterm>
<para>We can't allow the kernel module to be <application>rmmod</application>'ed whenever root feels like it. If the
device file is opened by a process and then we remove the kernel module, using the file would cause a call to the memory
location where the appropriate function (read/write) used to be. If we're lucky, no other code was loaded there, and
we'll get an ugly error message. If we're unlucky, another kernel module was loaded into the same location, which means a
jump into the middle of another function within the kernel. The results of this would be impossible to predict, but they
can't be very positive.</para>
<para>Normally, when you don't want to allow something, you return an error code (a negative number) from the function
which is supposed to do it. With <function>cleanup_module</function> that's impossible because it's a void function.
However, there's a counter which keeps track of how many processes are using your module. You can see what its value is
by looking at the 3rd field of <filename>/proc/modules</filename>. If this number isn't zero, <function>rmmod</function>
will fail. Note that you don't have to check the counter from within <function>cleanup_module</function> because the
check will be performed for you by the system call <function>sys_delete_module</function>, defined in
<filename>kernel/module.c</filename>. You shouldn't use this counter directly, but there are functions defined in <filename
role="headerfile">linux/module.h</filename> which let you increase and decrease this counter:</para>
<itemizedlist>
<listitem><para><varname>try_module_get(THIS_MODULE)</varname>: Increment the use count.</para></listitem>
<listitem><para><varname>module_put(THIS_MODULE)</varname>: Decrement the use count.</para></listitem>
</itemizedlist>
<para>It's important to keep the counter accurate; if you ever do lose track of the correct usage count, you'll never be
able to unload the module; it's now reboot time, boys and girls. This is bound to happen to you sooner or later during a
module's development.</para>
<indexterm><primary>MOD_INC_USE_COUNT</primary></indexterm>
<indexterm><primary>MOD_DEC_USE_COUNT</primary></indexterm>
<indexterm><primary>MOD_IN_USE</primary></indexterm>
</sect2>
<sect2><title>chardev.c</title>
<para>The next code sample creates a char driver named <filename>chardev</filename>. You can <filename>cat</filename> its
device file (or <filename>open</filename> the file with a program) and the driver will put the number of times the device
file has been read from into the file. We don't support writing to the file (like <command>echo "hi" >
/dev/hello</command>), but catch these attempts and tell the user that the operation isn't supported. Don't worry if you
don't see what we do with the data we read into the buffer; we don't do much with it. We simply read in the data and
print a message acknowledging that we received it.</para>
<example><title>chardev.c</title><programlisting><inlinegraphic fileref="lkmpg-examples/04-CharacterDeviceFiles/chardev.c" format="linespecific"/></inlinegraphic></programlisting></example>
</sect2>
<sect2><title>Writing Modules for Multiple Kernel Versions</title>
<indexterm><primary>kernel versions</primary></indexterm>
<indexterm><primary>LINUX_VERSION_CODE</primary></indexterm>
<indexterm><primary>KERNEL_VERSION</primary></indexterm>
<para>The system calls, which are the major interface the kernel shows to the processes, generally stay the same across
versions. A new system call may be added, but usually the old ones will behave exactly like they used to. This is
necessary for backward compatibility -- a new kernel version is not supposed to break regular processes. In most cases,
the device files will also remain the same. On the other hand, the internal interfaces within the kernel can and do change
between versions.</para>
<para>The Linux kernel versions are divided between the stable versions (n.&lt;even number&gt;.m) and the development
versions (n.&lt;odd number&gt;.m). The development versions include all the cool new ideas, including those which will
be considered a mistake, or reimplemented, in the next version. As a result, you can't trust the interface to remain the
same in those versions (which is why I don't bother to support them in this book; it's too much work and it would become
dated too quickly). In the stable versions, on the other hand, we can expect the interface to remain the same regardless
of the bug fix version (the m number).</para>
<para>There are differences between different kernel versions, and if you want to support multiple kernel versions, you'll
find yourself having to code conditional compilation directives. The way to do this is to compare the macro
<varname>LINUX_VERSION_CODE</varname> to the macro <varname>KERNEL_VERSION</varname>. In version <varname>a.b.c</varname>
of the kernel, the value of <varname>LINUX_VERSION_CODE</varname> is 2<superscript>16</superscript>a + 2<superscript>8</superscript>b + c, which is also what <varname>KERNEL_VERSION(a,b,c)</varname> evaluates to.</para>
<para>
Previous versions of this guide showed in great detail how to write backward compatible code with such constructs.
We decided to break with this tradition for the better. People interested in a particular kernel series
should use an LKMPG whose version matches their kernel. We decided to version the LKMPG like the kernel,
at least as far as major and minor number are concerned. We use the patchlevel for our own versioning, so
use LKMPG version 2.4.x for kernels 2.4.x, LKMPG version 2.6.x for kernels 2.6.x, and so on.
Also make sure that you always use current, up-to-date versions of both kernel and guide.
</para>
<para> Update: What we've said above was true for kernels up to and including 2.6.10. You might already
have noticed that recent kernels look different. In case you haven't, they look like 2.6.x.y now.
The meaning of the first three items basically stays the same, but a subpatchlevel has been added that
indicates security fixes until the next stable patchlevel is out. So people can have a stable
tree with security updates <emphasis>and</emphasis> use the latest kernel as a developer tree.
Search the kernel mailing list archives if you're interested in the full story.
</para>
</sect2>
</sect1>
<!--
vim:textwidth=128 shiftwidth=3
-->