Hello, World (part 1): The Simplest Module

Hello, World (part 1): The Simplest Module When the first caveman programmer chiseled the first program on the walls of the first cave computer, it was a program to paint the string `Hello, world' in Antelope pictures. Roman programming textbooks began with the `Salut, Mundi' program. I don't know what happens to people who break with this tradition, but I think it's safer not to find out. We'll start with a series of hello world programs that demonstrate the different aspects of the basics of writing a kernel module. Here's the simplest module possible. Don't compile it yet; we'll cover module compilation in the next section. source filehello-1.c hello-1.c init_module() cleanup_module() Kernel modules must have at least two functions: a "start" (initialization) function called init_module() which is called when the module is insmoded into the kernel, and an "end" (cleanup) function called cleanup_module() which is called just before it is rmmoded. Actually, things have changed starting with kernel 2.3.13. You can now use whatever name you like for the start and end functions of a module, and you'll learn how to do this in . In fact, the new method is the preferred method. However, many people still use init_module() and cleanup_module() for their start and end functions. Typically, init_module() either registers a handler for something with the kernel, or it replaces one of the kernel functions with its own code (usually code to do something and then call the original function). The cleanup_module() function is supposed to undo whatever init_module() did, so the module can be unloaded safely. Lastly, every kernel module needs to include linux/module.h. We needed to include linux/kernel.h only for the macro expansion for the printk() log level, KERN_ALERT, which you'll learn about in . Introducing <function>printk()</function> printk() DEFAULT_MESSAGE_LOGLEVEL Despite what you might think, printk() was not meant to communicate information to the user, even though we used it for exactly this purpose in hello-1! It happens to be a logging mechanism for the kernel, and is used to log information or give warnings. Therefore, each printk() statement comes with a priority, which is the <1> and KERN_ALERT you see. There are 8 priorities and the kernel has macros for them, so you don't have to use cryptic numbers, and you can view them (and their meanings) in linux/kernel.h. If you don't specify a priority level, the default priority, DEFAULT_MESSAGE_LOGLEVEL, will be used. Take time to read through the priority macros. The header file also describes what each priority means. In practise, don't use number, like <4>. Always use the macro, like KERN_WARNING. If the priority is less than int console_loglevel, the message is printed on your current terminal. If both syslogd and klogd are running, then the message will also get appended to /var/log/messages, whether it got printed to the console or not. We use a high priority, like KERN_ALERT, to make sure the printk() messages get printed to your console rather than just logged to your logfile. When you write real modules, you'll want to use priorities that are meaningful for the situation at hand. Compiling Kernel Modules insmod Kernel modules need to be compiled a bit differently from regular userspace apps. Former kernel versions required us to care much about these settings, which are usually stored in Makefiles. Although hierarchically organized, many redundant settings accumulated in sublevel Makefiles and made them large and rather difficult to maintain. Fortunately, there is a new way of doing these things, called kbuild, and the build process for external loadable modules is now fully integrated into the standard kernel build mechanism. To learn more on how to compile modules which are not part of the official kernel (such as all the examples you'll find in this guide), see file linux/Documentation/kbuild/modules.txt. So, let's look at a simple Makefile for compiling a module named hello-1.c: Makefile for a basic kernel module From a technical point of view just the first line is really necessary, the "all" and "clean" targets were added for pure convenience. Now you can compile the module by issuing the command make . You should obtain an output which resembles the following: hostname:~/lkmpg-examples/02-HelloWorld# make make -C /lib/modules/2.6.11/build M=/root/lkmpg-examples/02-HelloWorld modules make[1]: Entering directory `/usr/src/linux-2.6.11' CC [M] /root/lkmpg-examples/02-HelloWorld/hello-1.o Building modules, stage 2. MODPOST CC /root/lkmpg-examples/02-HelloWorld/hello-1.mod.o LD [M] /root/lkmpg-examples/02-HelloWorld/hello-1.ko make[1]: Leaving directory `/usr/src/linux-2.6.11' hostname:~/lkmpg-examples/02-HelloWorld# Note that kernel 2.6 introduces a new file naming convention: kernel modules now have a .ko extension (in place of the old .o extension) which easily distinguishes them from conventional object files. The reason for this is that they contain an additional .modinfo section that where additional information about the module is kept. We'll soon see what this information is good for. Use modinfo hello-*.ko to see what kind of information it is. hostname:~/lkmpg-examples/02-HelloWorld# modinfo hello-1.ko filename: hello-1.ko vermagic: 2.6.11 preempt PENTIUMII 4KSTACKS gcc-3.3 depends: Nothing spectacular, so far. That changes once we're using modinfo on one of our the later examples, hello-5.ko . hostname:~/lkmpg-examples/02-HelloWorld# modinfo hello-5.ko filename: hello-5.ko license: GPL author: Peter Jay Salzman vermagic: 2.6.11 preempt PENTIUMII 4KSTACKS gcc-3.3 depends: parm: myintArray:An array of integers (array of int) parm: mystring:A character string (charp) parm: mylong:A long integer (long) parm: myint:An integer (int) parm: myshort:A short integer (short) hostname:~/lkmpg-examples/02-HelloWorld# Lot's of useful information to see here. An author string for bugreports, license information, even a short description of the parameters it accepts. Additional details about Makefiles for kernel modules are available in linux/Documentation/kbuild/makefiles.txt. Be sure to read this and the related files before starting to hack Makefiles. It'll probably save you lots of work. Now it is time to insert your freshly-compiled module it into the kernel with insmod ./hello-1.ko (ignore anything you see about tainted kernels; we'll cover that shortly). All modules loaded into the kernel are listed in /proc/modules. Go ahead and cat that file to see that your module is really a part of the kernel. Congratulations, you are now the author of Linux kernel code! When the novelty wears off, remove your module from the kernel by using rmmod hello-1. Take a look at /var/log/messages just to see that it got logged to your system logfile. Here's another exercise for the reader. See that comment above the return statement in init_module()? Change the return value to something negative, recompile and load the module again. What happens? Hello World (part 2) module_init module_exit As of Linux 2.4, you can rename the init and cleanup functions of your modules; they no longer have to be called init_module() and cleanup_module() respectively. This is done with the module_init() and module_exit() macros. These macros are defined in linux/init.h. The only caveat is that your init and cleanup functions must be defined before calling the macros, otherwise you'll get compilation errors. Here's an example of this technique: source filehello-2.c hello-2.c So now we have two real kernel modules under our belt. Adding another module is as simple as this: Makefile for both our modules Now have a look at linux/drivers/char/Makefile for a real world example. As you can see, some things get hardwired into the kernel (obj-y) but where are all those obj-m gone? Those familiar with shell scripts will easily be able to spot them. For those not, the obj-$(CONFIG_FOO) entries you see everywhere expand into obj-y or obj-m, depending on whether the CONFIG_FOO variable has been set to y or m. While we are at it, those were exactly the kind of variables that you have set in the linux/.config file, the last time when you said make menuconfig or something like that. Hello World (part 3): The <literal>__init</literal> and <literal>__exit</literal> Macros __init __initdata __exit __initfunction() This demonstrates a feature of kernel 2.2 and later. Notice the change in the definitions of the init and cleanup functions. The __init macro causes the init function to be discarded and its memory freed once the init function finishes for built-in drivers, but not loadable modules. If you think about when the init function is invoked, this makes perfect sense. There is also an __initdata which works similarly to __init but for init variables rather than functions. The __exit macro causes the omission of the function when the module is built into the kernel, and like __exit, has no effect for loadable modules. Again, if you consider when the cleanup function runs, this makes complete sense; built-in drivers don't need a cleanup function, while loadable modules do. These macros are defined in linux/init.h and serve to free up kernel memory. When you boot your kernel and see something like Freeing unused kernel memory: 236k freed, this is precisely what the kernel is freeing. source filehello-3.c hello-3.c Hello World (part 4): Licensing and Module Documentation If you're running kernel 2.4 or later, you might have noticed something like this when you loaded proprietary modules: # insmod xxxxxx.o Warning: loading xxxxxx.ko will taint the kernel: no license See http://www.tux.org/lkml/#export-tainted for information about tainted modules Module xxxxxx loaded, with warnings MODULE_LICENSE() In kernel 2.4 and later, a mechanism was devised to identify code licensed under the GPL (and friends) so people can be warned that the code is non open-source. This is accomplished by the MODULE_LICENSE() macro which is demonstrated in the next piece of code. By setting the license to GPL, you can keep the warning from being printed. This license mechanism is defined and documented in linux/module.h: /* * The following license idents are currently accepted as indicating free * software modules * * "GPL" [GNU Public License v2 or later] * "GPL v2" [GNU Public License v2] * "GPL and additional rights" [GNU Public License v2 rights and more] * "Dual BSD/GPL" [GNU Public License v2 * or BSD license choice] * "Dual MIT/GPL" [GNU Public License v2 * or MIT license choice] * "Dual MPL/GPL" [GNU Public License v2 * or Mozilla license choice] * * The following other idents are available * * "Proprietary" [Non free products] * * There are dual licensed components, but when running with Linux it is the * GPL that is relevant so this is a non issue. Similarly LGPL linked with GPL * is a GPL combined work. * * This exists for several reasons * 1. So modinfo can show license info for users wanting to vet their setup * is free * 2. So the community can ignore bug reports including proprietary modules * 3. So vendors can do likewise based on their own policies */ MODULE_DESCRIPTION() MODULE_AUTHOR() MODULE_SUPPORTED_DEVICE() Similarly, MODULE_DESCRIPTION() is used to describe what the module does, MODULE_AUTHOR() declares the module's author, and MODULE_SUPPORTED_DEVICE() declares what types of devices the module supports. These macros are all defined in linux/module.h and aren't used by the kernel itself. They're simply for documentation and can be viewed by a tool like objdump. As an exercise to the reader, try and search fo these macros in linux/drivers to see how module authors use these macros to document their modules. I'd recommend to use something like grep -inr MODULE_AUTHOR * in /usr/src/linux-2.6.x/ . People unfamiliar with command line tools will probably like some web base solution, search for sites that offer kernel trees that got indexed with LXR. (or setup it up on your local machine). Users of traditional Unix editors, like emacs or vi will also find tag files useful. They can be generated by make tags or make TAGS in /usr/src/linux-2.6.x/ . Once you've got such a tagfile in your kerneltree you can put the cursor on some function call and use some key combination to directly jump to the definition function. source filehello-4.c hello-4.c Passing Command Line Arguments to a Module Modules can take command line arguments, but not with the argc/argv you might be used to. To allow arguments to be passed to your module, declare the variables that will take the values of the command line arguments as global and then use the module_param() macro, (defined in linux/moduleparam.h) to set the mechanism up. At runtime, insmod will fill the variables with any command line arguments that are given, like ./insmod mymodule.ko myvariable=5. The variable declarations and macros should be placed at the beginning of the module for clarity. The example code should clear up my admittedly lousy explanation. The module_param() macro takes 3 arguments: the name of the variable, its type and permissions for the corresponding file in sysfs. Integer types can be signed as usual or unsigned. If you'd like to use arrays of integers or strings see module_param_array() and module_param_string(). int myint = 3; module_param(myint, int, 0); Arrays are supported too, but things are a bit different now than they were in the 2.4. days. To keep track of the number of parameters you need to pass a pointer to a count variable as third parameter. At your option, you could also ignore the count and pass NULL instead. We show both possibilities here: int myintarray[2]; module_param_array(myintarray, int, NULL, 0); /* not interested in count */ int myshortarray[4]; int count; module_parm_array(myshortarray, short, &count, 0); /* put count into "count" variable */ A good use for this is to have the module variable's default values set, like an port or IO address. If the variables contain the default values, then perform autodetection (explained elsewhere). Otherwise, keep the current value. This will be made clear later on. Lastly, there's a macro function, MODULE_PARM_DESC(), that is used to document arguments that the module can take. It takes two parameters: a variable name and a free form string describing that variable. source filehello-5.c hello-5.c I would recommend playing around with this code: satan# insmod hello-5.ko mystring="bebop" mybyte=255 myintArray=-1 mybyte is an 8 bit integer: 255 myshort is a short integer: 1 myint is an integer: 20 mylong is a long integer: 9999 mystring is a string: bebop myintArray is -1 and 420 satan# rmmod hello-5 Goodbye, world 5 satan# insmod hello-5.ko mystring="supercalifragilisticexpialidocious" \ > mybyte=256 myintArray=-1,-1 mybyte is an 8 bit integer: 0 myshort is a short integer: 1 myint is an integer: 20 mylong is a long integer: 9999 mystring is a string: supercalifragilisticexpialidocious myintArray is -1 and -1 satan# rmmod hello-5 Goodbye, world 5 satan# insmod hello-5.ko mylong=hello hello-5.o: invalid argument syntax for mylong: 'h' Modules Spanning Multiple Files source filesmultiple Sometimes it makes sense to divide a kernel module between several source files. Here's an example of such a kernel module. source filestart.c start.c The next file: source filestop.c stop.c And finally, the makefile: Makefile This is the complete makefile for all the examples we've seen so far. The first five lines are nothing special, but for the last example we'll need two lines. First we invent an object name for our combined module, second we tell make what object files are part of that module. Building modules for a precompiled kernel source filesmultiple Obviously, we strongly suggest you to recompile your kernel, so that you can enable a number of useful debugging features, such as forced module unloading (MODULE_FORCE_UNLOAD): when this option is enabled, you can force the kernel to unload a module even when it believes it is unsafe, via a rmmod -f module command. This option can save you a lot of time and a number of reboots during the development of a module. Nevertheless, there is a number of cases in which you may want to load your module into a precompiled running kernel, such as the ones shipped with common Linux distributions, or a kernel you have compiled in the past. In certain circumstances you could require to compile and insert a module into a running kernel which you are not allowed to recompile, or on a machine that you prefer not to reboot. If you can't think of a case that will force you to use modules for a precompiled kernel you might want to skip this and treat the rest of this chapter as a big footnote. Now, if you just install a kernel source tree, use it to compile your kernel module and you try to insert your module into the kernel, in most cases you would obtain an error as follows: insmod: error inserting 'poet_atkm.ko': -1 Invalid module format Less cryptical information are logged to /var/log/messages: Jun 4 22:07:54 localhost kernel: poet_atkm: version magic '2.6.5-1.358custom 686 REGPARM 4KSTACKS gcc-3.3' should be '2.6.5-1.358 686 REGPARM 4KSTACKS gcc-3.3' In other words, your kernel refuses to accept your module because version strings (more precisely, version magics) do not match. Incidentally, version magics are stored in the module object in the form of a static string, starting with vermagic:. Version data are inserted in your module when it is linked against the init/vermagic.o file. To inspect version magics and other strings stored in a given module, issue the modinfo module.ko command: [root@pcsenonsrv 02-HelloWorld]# modinfo hello-4.ko license: GPL author: Peter Jay Salzman <p@dirac.org> description: A sample driver vermagic: 2.6.5-1.358 686 REGPARM 4KSTACKS gcc-3.3 depends: To overcome this problem we could resort to the --force-vermagic option, but this solution is potentially unsafe, and unquestionably inacceptable in production modules. Consequently, we want to compile our module in an environment which was identical to the one in which our precompiled kernel was built. How to do this, is the subject of the remainder of this chapter. First of all, make sure that a kernel source tree is available, having exactly the same version as your current kernel. Then, find the configuration file which was used to compile your precompiled kernel. Usually, this is available in your current /boot directory, under a name like config-2.6.x. You may just want to copy it to your kernel source tree: cp /boot/config-`uname -r` /usr/src/linux-`uname -r`/.config. Let's focus again on the previous error message: a closer look at the version magic strings suggests that, even with two configuration files which are exactly the same, a slight difference in the version magic could be possible, and it is sufficient to prevent insertion of the module into the kernel. That slight difference, namely the custom string which appears in the module's version magic and not in the kernel's one, is due to a modification with respect to the original, in the makefile that some distribution include. Then, examine your /usr/src/linux/Makefile, and make sure that the specified version information matches exactly the one used for your current kernel. For example, you makefile could start as follows: VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 5 EXTRAVERSION = -1.358custom ... In this case, you need to restore the value of symbol EXTRAVERSION to -1.358. We suggest to keep a backup copy of the makefile used to compile your kernel available in /lib/modules/2.6.5-1.358/build. A simple cp /lib/modules/`uname -r`/build/Makefile /usr/src/linux-`uname -r` should suffice. Additionally, if you already started a kernel build with the previous (wrong) Makefile, you should also rerun make, or directly modify symbol UTS_RELEASE in file /usr/src/linux-2.6.x/include/linux/version.h according to contents of file /lib/modules/2.6.x/build/include/linux/version.h, or overwrite the latter with the first. Now, please run make to update configuration and version headers and objects: [root@pcsenonsrv linux-2.6.x]# make CHK include/linux/version.h UPD include/linux/version.h SYMLINK include/asm -> include/asm-i386 SPLIT include/linux/autoconf.h -> include/config/* HOSTCC scripts/basic/fixdep HOSTCC scripts/basic/split-include HOSTCC scripts/basic/docproc HOSTCC scripts/conmakehash HOSTCC scripts/kallsyms CC scripts/empty.o ... If you do not desire to actually compile the kernel, you can interrupt the build process (CTRL-C) just after the SPLIT line, because at that time, the files you need will be are ready. Now you can turn back to the directory of your module and compile it: It will be built exactly according your current kernel settings, and it will load into it without any errors.