Hello, World (part 1): The Simplest Module

When the first caveman programmer chiseled the first program on the walls of the first cave computer, it was a program to paint the string 'Hello, world' in Antelope pictures. Roman programming textbooks began with the 'Salut, Mundi' program. I don't know what happens to people who break with this tradition, but I think it's safer not to find out. We'll start with a series of hello world programs that demonstrate the basics of writing a kernel module.

Here's the simplest module possible. Don't compile it yet; we'll cover module compilation in the next section.

    /* hello-1.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/module.h>   /* Needed by all modules */
    #include <linux/kernel.h>   /* Needed for KERN_ALERT */

    int init_module(void)
    {
        printk("<1>Hello world 1.\n");

        // A non 0 return means init_module failed; module can't be loaded.
        return 0;
    }

    void cleanup_module(void)
    {
        printk(KERN_ALERT "Goodbye world 1.\n");
    }

    MODULE_LICENSE("GPL");

Kernel modules must have at least two functions: a "start" (initialization) function called init_module(), which is called when the module is insmoded into the kernel, and an "end" (cleanup) function called cleanup_module(), which is called just before it is rmmoded. Actually, things have changed starting with kernel 2.3.13. You can now use whatever names you like for the start and end functions of a module, and you'll learn how to do this in Hello World (part 2). In fact, the new method is the preferred method. However, many people still use init_module() and cleanup_module() for their start and end functions.

Typically, init_module() either registers a handler for something with the kernel, or it replaces one of the kernel functions with its own code (usually code to do something and then call the original function).
The cleanup_module() function is supposed to undo whatever init_module() did, so the module can be unloaded safely.

Lastly, every kernel module needs to include linux/module.h. We needed to include linux/kernel.h only for the macro expansion for the printk() log level, KERN_ALERT, which you'll learn about in Introducing printk().

Introducing printk()

Despite what you might think, printk() was not meant to communicate information to the user, even though we used it for exactly this purpose in hello-1! It happens to be a logging mechanism for the kernel, and is used to log information or give warnings. Therefore, each printk() statement comes with a priority, which is the <1> and KERN_ALERT you see. There are 8 priorities and the kernel has macros for them, so you don't have to use cryptic numbers. You can view the macros, along with a description of what each priority means, in linux/kernel.h. If you don't specify a priority level, the default priority, DEFAULT_MESSAGE_LOGLEVEL, will be used.

Take time to read through the priority macros. In practice, don't use a number, like <4>. Always use the macro, like KERN_WARNING.

If the priority is less than int console_loglevel, the message is printed on your current terminal. If both syslogd and klogd are running, then the message will also get appended to /var/log/messages, whether it got printed to the console or not. We use a high priority, like KERN_ALERT, to make sure the printk() messages get printed to your console rather than just logged to your logfile. When you write real modules, you'll want to use priorities that are meaningful for the situation at hand.

Compiling Kernel Modules

Kernel modules need to be compiled with certain gcc options to make them work. In addition, they also need to be compiled with certain symbols defined.
This is because the kernel header files need to behave differently, depending on whether we're compiling a kernel module or an executable. You can define symbols using gcc's -D option, or with the #define preprocessor command. We'll cover what you need to do in order to compile kernel modules in this section.

-c: A kernel module is not an independent executable, but an object file which will be linked into the kernel during runtime using insmod. As a result, modules should be compiled with the -c flag.

-O2: The kernel makes extensive use of inline functions, so modules must be compiled with the optimization flag turned on. Without optimization, some of the assembler macro calls will be mistaken by the compiler for function calls. This will cause loading the module to fail, since insmod won't find those functions in the kernel.

-W -Wall: A programming mistake can take your system down. You should always turn on compiler warnings, and this applies to all your compiling endeavors, not just module compilation.

-isystem /lib/modules/`uname -r`/build/include: You must use the kernel headers of the kernel you're compiling against. Using the default /usr/include/linux won't work.

-D__KERNEL__: Defining this symbol tells the header files that the code will be run in kernel mode, not as a user process.

-DMODULE: This symbol tells the header files to give the appropriate definitions for a kernel module.

We use gcc's -isystem option instead of -I because it tells gcc to suppress some "unused variable" warnings that -W -Wall causes when you include module.h. By using -isystem under gcc-3.0, the kernel header files are treated specially, and the warnings are suppressed. If you instead use -I (or even -isystem under gcc 2.9x), the "unused variable" warnings will be printed. Just ignore them if they appear.
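The way a symbol defined with -D (or #define) changes what a header provides can be sketched in ordinary user-space C. This is only an illustration of the preprocessor mechanism: DEMO_KERNEL, DEMO_BUILD_MODE and both functions are hypothetical names invented for this sketch, not anything taken from the kernel headers.

```c
#include <string.h>

/*
 * Hypothetical stand-in for a header that changes its definitions
 * depending on a symbol, the way the kernel headers react to
 * __KERNEL__ and MODULE. DEMO_KERNEL is an invented name.
 *
 * Compile with "gcc -DDEMO_KERNEL ..." (or put "#define DEMO_KERNEL"
 * before this point) to select the first branch.
 */
#ifdef DEMO_KERNEL
#define DEMO_BUILD_MODE "kernel"
#else
#define DEMO_BUILD_MODE "user"
#endif

/* Report which branch the preprocessor selected at compile time. */
const char *demo_build_mode(void)
{
    return DEMO_BUILD_MODE;
}

/* Return 1 if the file was compiled with DEMO_KERNEL defined, else 0. */
int demo_is_kernel_build(void)
{
    return strcmp(demo_build_mode(), "kernel") == 0;
}
```

Compiled without any -D option, demo_build_mode() reports the "user" branch; adding -DDEMO_KERNEL to the gcc command line flips it without touching the source, which is the same lever -D__KERNEL__ and -DMODULE pull on the real headers.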
So, let's look at a simple Makefile for compiling a module named hello-1.c:

Makefile for a basic kernel module

As an exercise for the reader, compile hello-1.c and insert it into the kernel with insmod ./hello-1.o (ignore anything you see about tainted kernels; we'll cover that shortly). Neat, eh? All modules loaded into the kernel are listed in /proc/modules. Go ahead and cat that file to see that your module is really a part of the kernel. Congratulations, you are now the author of Linux kernel code! When the novelty wears off, remove your module from the kernel by using rmmod hello-1. Take a look at /var/log/messages just to see that it got logged to your system logfile.

Here's another exercise for the reader. See that comment above the return statement in init_module()? Change the return value to something non-zero, recompile and load the module again. What happens?

Hello World (part 2)

As of Linux 2.4, you can rename the init and cleanup functions of your modules; they no longer have to be called init_module() and cleanup_module() respectively. This is done with the module_init() and module_exit() macros. These macros are defined in linux/init.h. The only caveat is that your init and cleanup functions must be defined before calling the macros, otherwise you'll get compilation errors. Here's an example of this technique:

    /* hello-2.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/module.h>   // Needed by all modules
    #include <linux/kernel.h>   // Needed for KERN_ALERT
    #include <linux/init.h>     // Needed for the macros

    static int hello_2_init(void)
    {
        printk(KERN_ALERT "Hello, world 2\n");
        return 0;
    }

    static void hello_2_exit(void)
    {
        printk(KERN_ALERT "Goodbye, world 2\n");
    }

    module_init(hello_2_init);
    module_exit(hello_2_exit);

    MODULE_LICENSE("GPL");

So now we have two real kernel modules under our belt. With productivity as high as ours, we should have a high powered Makefile.
Here's a more advanced Makefile which will compile both our modules at the same time. It's optimized for brevity and scalability. If you don't understand it, I urge you to read the makefile info pages or the GNU Make Manual.

Makefile for both our modules

As an exercise for the reader, if we had another module in the same directory, say hello-3.c, how would you modify this Makefile to automatically compile that module?

Hello World (part 3): The __init and __exit Macros

This demonstrates a feature of kernel 2.2 and later. Notice the change in the definitions of the init and cleanup functions. The __init macro causes the init function to be discarded and its memory freed once the init function finishes for built-in drivers, but not loadable modules. If you think about when the init function is invoked, this makes perfect sense.

There is also an __initdata which works similarly to __init but for init variables rather than functions.

The __exit macro causes the omission of the function when the module is built into the kernel, and like __init, has no effect for loadable modules. Again, if you consider when the cleanup function runs, this makes complete sense; built-in drivers don't need a cleanup function, while loadable modules do.

These macros are defined in linux/init.h and serve to free up kernel memory. When you boot your kernel and see something like Freeing unused kernel memory: 236k freed, this is precisely what the kernel is freeing.
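As a rough illustration of the idea behind __init, the following user-space sketch tags a function so that the compiler places it in a section of its own when a build-time symbol is set, and leaves it untagged otherwise. DEMO_INIT, DEMO_BUILTIN, the section name and both functions are hypothetical names made up for this sketch; the real definitions live in linux/init.h and may well differ in detail.

```c
/*
 * DEMO_INIT is a hypothetical macro modeled on the idea behind __init.
 * When DEMO_BUILTIN is defined (and the compiler is GCC-compatible on
 * an ELF target) it asks for the function to be placed in a separate
 * section. Otherwise it expands to nothing, which mirrors "no effect
 * for loadable modules". Every name here is invented.
 */
#if defined(DEMO_BUILTIN) && defined(__GNUC__) && defined(__ELF__)
#define DEMO_INIT __attribute__((section(".demo.init.text")))
#else
#define DEMO_INIT
#endif

/* Set-up code that is only needed once, at start-up. */
static int DEMO_INIT demo_setup(void)
{
    return 0;   /* 0 means success, as with a module's init function */
}

/* Call the set-up function once and pass its result through. */
int demo_run_init(void)
{
    return demo_setup();
}
```

The sketch covers only the placement step. Grouping such functions in one section is what makes it possible for a built-in driver's start-up code to be released in one piece after it has run; nothing in this user-space example actually frees memory.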
    /* hello-3.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/module.h>   /* Needed by all modules */
    #include <linux/kernel.h>   /* Needed for KERN_ALERT */
    #include <linux/init.h>     /* Needed for the macros */

    static int hello3_data __initdata = 3;

    static int __init hello_3_init(void)
    {
        printk(KERN_ALERT "Hello, world %d\n", hello3_data);
        return 0;
    }

    static void __exit hello_3_exit(void)
    {
        printk(KERN_ALERT "Goodbye, world 3\n");
    }

    module_init(hello_3_init);
    module_exit(hello_3_exit);

    MODULE_LICENSE("GPL");

By the way, you may see the directive "__initfunction()" in drivers written for Linux 2.2 kernels. This macro served the same purpose as __init, but is now very deprecated in favor of __init. I only mention it because you might see it in modern kernels. As of 2.4.18, there are 38 references to __initfunction(), and as of 2.4.20, there are 37 references. However, don't use it in your own code.

Hello World (part 4): Licensing and Module Documentation

If you're running kernel 2.4 or later, you might have noticed something like this when you loaded the previous example modules:

    # insmod hello-3.o
    Warning: loading hello-3.o will taint the kernel: no license
      See http://www.tux.org/lkml/#export-tainted for information about tainted modules
    Hello, world 3
    Module hello-3 loaded, with warnings

In kernel 2.4 and later, a mechanism was devised to identify code licensed under the GPL (and friends) so people can be warned that the code is non open-source. This is accomplished by the MODULE_LICENSE() macro which is demonstrated in the next piece of code. By setting the license to GPL, you can keep the warning from being printed. This license mechanism is defined and documented in linux/module.h.
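The decision behind the taint warning can be pictured as a simple string comparison. The user-space sketch below is an assumption-laden illustration, not the loader's actual code: the function names are invented, and the list of accepted license strings is an assumption that should be checked against the comments in linux/module.h for your kernel.

```c
#include <assert.h>
#include <string.h>

/*
 * License strings treated as GPL-compatible in this sketch.
 * ASSUMPTION: this list is illustrative; consult linux/module.h in
 * your kernel tree for the strings it actually documents.
 */
static const char *const demo_gpl_idents[] = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MPL/GPL",
};

/* Return 1 if license exactly matches one of the strings above, else 0. */
int demo_license_is_gpl_compatible(const char *license)
{
    size_t i;

    assert(license != NULL);
    for (i = 0; i < sizeof(demo_gpl_idents) / sizeof(demo_gpl_idents[0]); i++) {
        if (strcmp(license, demo_gpl_idents[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Return 1 if loading would produce a taint warning in this model:
 * either no license was declared (NULL) or it is not in the list.
 */
int demo_would_taint(const char *license)
{
    return license == NULL || !demo_license_is_gpl_compatible(license);
}
```

In this model a module with no MODULE_LICENSE() corresponds to passing NULL, which matches the "no license" warning shown in the session above, while MODULE_LICENSE("GPL") corresponds to a string that passes the check.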
Similarly, MODULE_DESCRIPTION() is used to describe what the module does, MODULE_AUTHOR() declares the module's author, and MODULE_SUPPORTED_DEVICE() declares what types of devices the module supports.

These macros are all defined in linux/module.h and aren't used by the kernel itself. They're simply for documentation and can be viewed by a tool like objdump. As an exercise for the reader, try grepping through linux/drivers to see how module authors use these macros to document their modules.

    /* hello-4.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>

    #define DRIVER_AUTHOR "Peter Jay Salzman"
    #define DRIVER_DESC   "A sample driver"

    static int init_hello_4(void)
    {
        printk(KERN_ALERT "Hello, world 4\n");
        return 0;
    }

    static void cleanup_hello_4(void)
    {
        printk(KERN_ALERT "Goodbye, world 4\n");
    }

    module_init(init_hello_4);
    module_exit(cleanup_hello_4);

    /* You can use strings, like this: */
    MODULE_LICENSE("GPL");            // Get rid of taint message by declaring code as GPL.

    /* Or with defines, like this: */
    MODULE_AUTHOR(DRIVER_AUTHOR);     // Who wrote this module?
    MODULE_DESCRIPTION(DRIVER_DESC);  // What does this module do?

    /* This module uses /dev/testdevice. The MODULE_SUPPORTED_DEVICE macro might be used in
     * the future to help automatic configuration of modules, but is currently unused other
     * than for documentation purposes.
     */
    MODULE_SUPPORTED_DEVICE("testdevice");

Passing Command Line Arguments to a Module

Modules can take command line arguments, but not with the argc/argv you might be used to.

To allow arguments to be passed to your module, declare the variables that will take the values of the command line arguments as global and then use the MODULE_PARM() macro (defined in linux/module.h) to set the mechanism up.
At runtime, insmod will fill the variables with any command line arguments that are given, like insmod mymodule.o myvariable=5. The variable declarations and macros should be placed at the beginning of the module for clarity. The example code should clear up my admittedly lousy explanation.

The MODULE_PARM() macro takes 2 arguments: the name of the variable and its type. The supported variable types are "b": single byte, "h": short int, "i": integer, "l": long int and "s": string, and the integer types can be signed as usual or unsigned. Strings should be declared as "char *" and insmod will allocate memory for them. You should always try to give the variables an initial default value. This is kernel code, and you should program defensively. For example:

    int myint = 3;
    char *mystr;

    MODULE_PARM(myint, "i");
    MODULE_PARM(mystr, "s");

Arrays are supported too. An integer value preceding the type in MODULE_PARM will indicate an array of some maximum length. Two numbers separated by a '-' will give the minimum and maximum number of values. For example, an array of shorts with at least 2 and no more than 4 values could be declared as:

    short myshortArray[4];
    MODULE_PARM(myshortArray, "2-4h");

A good use for this is to have the module variables' default values set, like a port or I/O address. If the variables contain the default values, then perform autodetection (explained elsewhere). Otherwise, keep the current value. This will be made clear later on.

Lastly, there's a macro function, MODULE_PARM_DESC(), that is used to document arguments that the module can take. It takes two parameters: a variable name and a free form string describing that variable.

    /* hello-5.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Peter Jay Salzman");

    // These global variables can be set with command line arguments when you insmod
    // the module in.
    static u8 mybyte = 'A';
    static unsigned short myshort = 1;
    static int myint = 20;
    static long mylong = 9999;
    static char *mystring = "blah";
    static int myintArray[2] = { 0, 420 };

    /* Now we're actually setting the mechanism up -- making the variables command
     * line arguments rather than just a bunch of global variables.
     */
    MODULE_PARM(mybyte, "b");
    MODULE_PARM(myshort, "h");
    MODULE_PARM(myint, "i");
    MODULE_PARM(mylong, "l");
    MODULE_PARM(mystring, "s");
    MODULE_PARM(myintArray, "1-2i");

    MODULE_PARM_DESC(mybyte, "This byte really does nothing at all.");
    MODULE_PARM_DESC(myshort, "This short is *extremely* important.");
    // You get the picture. Always use a MODULE_PARM_DESC() for each MODULE_PARM().

    static int __init hello_5_init(void)
    {
        printk(KERN_ALERT "mybyte is an 8 bit integer: %i\n", mybyte);
        printk(KERN_ALERT "myshort is a short integer: %hi\n", myshort);
        printk(KERN_ALERT "myint is an integer: %i\n", myint);
        printk(KERN_ALERT "mylong is a long integer: %li\n", mylong);
        printk(KERN_ALERT "mystring is a string: %s\n", mystring);
        printk(KERN_ALERT "myintArray is %i and %i\n", myintArray[0], myintArray[1]);
        return 0;
    }

    static void __exit hello_5_exit(void)
    {
        printk(KERN_ALERT "Goodbye, world 5\n");
    }

    module_init(hello_5_init);
    module_exit(hello_5_exit);

I would recommend playing around with this code:

    satan# insmod hello-5.o mystring="bebop" mybyte=255 myintArray=-1
    mybyte is an 8 bit integer: 255
    myshort is a short integer: 1
    myint is an integer: 20
    mylong is a long integer: 9999
    mystring is a string: bebop
    myintArray is -1 and 420

    satan# rmmod hello-5
    Goodbye, world 5

    satan# insmod hello-5.o mystring="supercalifragilisticexpialidocious" \
    > mybyte=256 myintArray=-1,-1
    mybyte is an 8 bit integer: 0
    myshort is a short integer: 1
    myint is an integer: 20
    mylong is a long integer: 9999
    mystring is a string: supercalifragilisticexpialidocious
    myintArray is -1 and -1

    satan# rmmod hello-5
    Goodbye, world 5

    satan# insmod hello-5.o mylong=hello
    hello-5.o: invalid argument syntax for mylong: 'h'

Modules Spanning Multiple Files

Sometimes it makes sense to divide a kernel module between several source files. In this case, you need to:

1. In all the source files but one, add the line #define __NO_VERSION__ before including module.h. This is important because module.h normally includes the definition of kernel_version, a global variable with the kernel version the module is compiled for. If you need version.h, you need to include it yourself, because module.h won't do it for you with __NO_VERSION__.

2. Compile all the source files as usual.

3. Combine all the object files into a single one. Under x86, use ld -m elf_i386 -r -o <module name.o> <1st src file.o> <2nd src file.o>.

Here's an example of such a kernel module.

    /* start.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #include <linux/kernel.h>   /* We're doing kernel work */
    #include <linux/module.h>   /* Specifically, a module */

    int init_module(void)
    {
        printk("Hello, world - this is the kernel speaking\n");
        return 0;
    }

    MODULE_LICENSE("GPL");

The next file:

    /* stop.c */
    /* Kernel Programming */
    #define MODULE
    #define LINUX
    #define __KERNEL__

    #if defined(CONFIG_MODVERSIONS) && ! defined(MODVERSIONS)
    #include <linux/modversions.h>   /* Will be explained later */
    #define MODVERSIONS
    #endif

    #define __NO_VERSION__      /* It's not THE file of the kernel module */

    #include <linux/kernel.h>   /* We're doing kernel work */
    #include <linux/module.h>   /* Specifically, a module */
    #include <linux/version.h>  /* Not included by module.h because of __NO_VERSION__ */

    void cleanup_module(void)
    {
        printk("<1>Short is the life of a kernel module\n");
    }

And finally, the makefile:

Makefile for a multi-filed module