<sect1><title>The /proc File System</title>

<indexterm><primary><filename role=directory>/proc</filename> filesystem</primary></indexterm>
<indexterm><primary>filesystem</primary><secondary><filename role=directory>/proc</filename></secondary></indexterm>

<para>
In Linux, there is an additional mechanism for the kernel and kernel modules to send information to processes: the
<filename role="directory">/proc</filename> file system. Originally designed to allow easy access to information about
processes (hence the name), it is now used by every part of the kernel that has something interesting to report, such as
<filename>/proc/modules</filename>, which provides the list of loaded modules, and <filename>/proc/meminfo</filename>,
which reports memory usage statistics.
</para>
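
<para>
From user space, these files behave like ordinary text files. As a minimal sketch (the helper name
<function>first_line</function> is made up for this illustration and is not part of any library), the following C function
reads the first line of a file such as <filename>/proc/meminfo</filename> using only the standard I/O library:
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the first line of the file at `path` into `line` (at most `size`
 * bytes, always NUL-terminated, trailing newline removed).
 * Returns the length of the stored line, or -1 if the file cannot be read.
 */
static int first_line(const char *path, char *line, size_t size)
{
    FILE *fp;

    assert(path != NULL && line != NULL && size > 0);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    if (fgets(line, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    line[strcspn(line, "\n")] = '\0';   /* drop the newline, if any */
    return (int)strlen(line);
}
```
]]></programlisting>

<para>
On a typical Linux system, calling <literal>first_line("/proc/meminfo", buf, sizeof(buf))</literal> should leave a line
starting with <literal>MemTotal:</literal> in <varname>buf</varname>. The function returns -1 if the file cannot be opened.
</para>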

<indexterm><primary><filename>/proc/modules</filename></primary></indexterm>
<indexterm><primary><filename>/proc/meminfo</filename></primary></indexterm>

<para>Using the proc file system is very similar to writing a device driver: a structure is created
with all the information needed for the <filename role="directory">/proc</filename> file, including pointers to any handler
functions (in our case there is only one, the one called when somebody attempts to read from the <filename
role="directory">/proc</filename> file). Then, <function>init_module</function> registers the structure with the kernel and
<function>cleanup_module</function> unregisters it.</para>

<para>We use <function>proc_register_dynamic</function><footnote><para>This applies to version 2.0; in version 2.2 this is done
automatically if we set the inode to zero.</para></footnote> because we don't want to determine in advance the inode number
used for our file, but to let the kernel determine it to prevent clashes. Normal file systems are located on a
disk, rather than just in memory (which is where <filename role="directory">/proc</filename> is), and in that case the inode
number is a pointer to a disk location where the file's index-node (inode for short) is located. The inode contains
information about the file, for example the file's permissions, together with a pointer to the disk location or locations
where the file's data can be found.</para>

<indexterm><primary><function>proc_register_dynamic</function></primary></indexterm>
<indexterm><primary><function>proc_register</function></primary></indexterm>
<indexterm><primary>inode</primary></indexterm>

<para>
Because we don't get called when the file is opened or closed, there's nowhere for us to put
<varname>try_module_get</varname> and <varname>module_put</varname> in this module, and if
the file is opened and then the module is removed, there's no way to avoid the consequences.
</para>

<para>
Here is a simple example showing how to use a /proc file. This is the HelloWorld for the /proc filesystem.
There are three parts: create the file <filename>/proc/helloworld</filename> in the function <function>init_module</function>,
return a value (and a buffer) when the file <filename>/proc/helloworld</filename> is read in the callback
function <function>procfs_read</function>, and delete the file <filename>/proc/helloworld</filename>
in the function <function>cleanup_module</function>.
</para>

<para>
The file <filename>/proc/helloworld</filename> is created when the module is loaded, with the function
<function>create_proc_entry</function>. The return value is a <varname>struct proc_dir_entry *</varname>, and it
is used to configure the file <filename>/proc/helloworld</filename> (for example, the owner of this file).
A null return value means that the creation has failed.
</para>

<para>
Every time the file <filename>/proc/helloworld</filename> is read, the function
<function>procfs_read</function> is called.
Two parameters of this function are very important: the buffer (the first parameter) and the offset (the third one).
The content of the buffer will be returned to the application which read it (for example the cat command). The offset
is the current position in the file. If the return value of the function isn't zero, then this function is
called again. So be careful with this function: if it never returns zero, the read function is called endlessly.
</para>
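
<para>
The calling convention described above can be modelled in ordinary user space code. The sketch below is only an
illustration: <function>model_read</function> and <function>model_drain</function> are hypothetical names, and the
signature is simplified compared with the real kernel callback. It shows why the callback must eventually return zero:
the caller keeps asking for more data until it does.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * User space model of a /proc read callback (not the kernel prototype).
 * Fills `buffer` and returns the number of bytes produced; returning 0
 * tells the caller that there is nothing more to read.
 */
static int model_read(char *buffer, size_t size, long offset)
{
    static const char msg[] = "HelloWorld!\n";
    size_t len = sizeof(msg) - 1;

    assert(buffer != NULL);

    if (offset > 0 || size < len)
        return 0;               /* already delivered, or no room for it */

    memcpy(buffer, msg, len);
    return (int)len;
}

/*
 * Model of the caller: keep invoking the callback, advancing the offset,
 * until it returns 0. Returns the total number of bytes collected.
 */
static size_t model_drain(char *out, size_t size)
{
    char chunk[64];
    size_t total = 0;
    long offset = 0;
    int n;

    while ((n = model_read(chunk, sizeof(chunk), offset)) > 0) {
        if (total + (size_t)n > size)
            break;              /* the caller's buffer is full */
        memcpy(out + total, chunk, (size_t)n);
        total += (size_t)n;
        offset += n;
    }
    return total;
}
```
]]></programlisting>

<para>
If <function>model_read</function> ignored the offset and always returned a positive count, the loop in
<function>model_drain</function> would only stop once the output buffer was full. A reader such as cat has no such limit.
</para>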

<screen>
% cat /proc/helloworld
HelloWorld!
</screen>

<example>
<title>procfs1.c</title>
<programlisting>
<inlinegraphic fileref="lkmpg-examples/05-TheProcFileSystem/procfs1.c" format="linespecific"/></inlinegraphic>
</programlisting>
</example>

</sect1>


<sect1><title>Read and Write a /proc File</title>

<para>
We have seen a very simple example for a /proc file where we only read the file <filename>/proc/helloworld</filename>.
It's also possible to write to a /proc file. It works the same way as read: a function is called when the /proc
file is written. There is one difference from read, however: the data comes from the user, so you have to import it from
user space to kernel space (with <function>copy_from_user</function> or <function>get_user</function>).
</para>

<indexterm><primary><function>get_user</function></primary></indexterm>
<indexterm><primary><function>put_user</function></primary></indexterm>
<indexterm><primary><function>copy_from_user</function></primary></indexterm>
<indexterm><primary><function>copy_to_user</function></primary></indexterm>
<indexterm><primary>memory segments</primary></indexterm>
<indexterm><primary>segment</primary><secondary>memory</secondary></indexterm>

<para>
The reason for <function>copy_from_user</function> or <function>get_user</function> is that Linux memory (on Intel
architecture; it may be different on some other processors) is segmented. This means that a pointer, by itself,
does not reference a unique location in memory, only a location in a memory segment, and you need to know which
memory segment it is to be able to use it. There is one memory segment for the kernel, and one for each of the processes.
</para>

<para>
The only memory segment accessible to a process is its own, so when writing regular programs to run as processes,
there's no need to worry about segments. When you write a kernel module, normally you want to access the kernel memory
segment, which is handled automatically by the system. However, when the content of a memory buffer needs to be passed between
the currently running process and the kernel, the kernel function receives a pointer to the memory buffer which is in the
process segment. The <function>put_user</function> and <function>get_user</function> macros allow you to access that
memory. These macros handle only one character; you can handle several characters with <function>copy_to_user</function> and
<function>copy_from_user</function>. As the buffer (in the read or write function) is in kernel space, the write function
needs to import data because it comes from user space, but the read function does not, because the data is already
in kernel space.
</para>
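
<para>
A write handler usually has to do two things: limit the amount of data it accepts to the size of its own buffer, and then
copy the data across. The user space sketch below illustrates only that pattern. All names in it
(<function>model_write</function>, <varname>model_buffer</varname>, <literal>MODEL_MAX_SIZE</literal>) are hypothetical,
and <function>memcpy</function> merely stands in for <function>copy_from_user</function>, which cannot be called outside
the kernel.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_SIZE 16   /* hypothetical capacity of the module's buffer */

static char   model_buffer[MODEL_MAX_SIZE];
static size_t model_buffer_len;

/*
 * User space model of a /proc write handler. `src` plays the role of the
 * pointer into the process's memory; memcpy stands in for copy_from_user.
 * Returns the number of bytes actually stored.
 */
static size_t model_write(const char *src, size_t count)
{
    assert(src != NULL);

    /* Never trust the length supplied by the writer: clamp it first. */
    if (count > MODEL_MAX_SIZE)
        count = MODEL_MAX_SIZE;

    memcpy(model_buffer, src, count);   /* kernel code: copy_from_user() */
    model_buffer_len = count;
    return count;
}
```
]]></programlisting>

<para>
In a real module, <function>copy_from_user</function> returns the number of bytes it could not copy, and a non-zero result
is normally reported to the caller as <literal>-EFAULT</literal>.
</para>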

<indexterm><primary>read</primary><secondary>in the kernel</secondary></indexterm>
<indexterm><primary>write</primary><secondary>in the kernel</secondary></indexterm>

<example>
<title>procfs2.c</title>
<programlisting>
<inlinegraphic fileref="lkmpg-examples/05-TheProcFileSystem/procfs2.c" format="linespecific"/></inlinegraphic>
</programlisting>
</example>

</sect1>


<sect1><title>Manage /proc file with standard filesystem</title>

<para>
We have seen how to read and write a /proc file with the /proc interface. But it's also possible
to manage a /proc file with inodes. The main benefit is access to advanced features, such as permissions.
</para>

<para>
In Linux, there is a standard mechanism for file system registration. Since every file system has to have its own
functions to handle inode and file operations<footnote><para>The difference between the two is that file operations deal with
the file itself, and inode operations deal with ways of referencing the file, such as creating links to it.</para></footnote>,
there is a special structure to hold pointers to all those functions, <varname>struct inode_operations</varname>, which
includes a pointer to <varname>struct file_operations</varname>. In /proc, whenever we register a new file, we're allowed to
specify which <varname>struct inode_operations</varname> will be used to access it. This is the mechanism we use: a
<varname>struct inode_operations</varname> which includes a pointer to a <varname>struct file_operations</varname>, which
in turn includes pointers to our <function>procfs_read</function> and <function>procfs_write</function> functions.
</para>

<indexterm><primary>filesystem</primary><secondary>registration</secondary></indexterm>
<indexterm><primary>filesystem registration</primary></indexterm>
<indexterm><primary><varname>struct inode_operations</varname></primary></indexterm>
<indexterm><primary><varname>inode_operations</varname> structure</primary></indexterm>
<indexterm><primary><varname>struct file_operations</varname></primary></indexterm>
<indexterm><primary><varname>file_operations</varname> structure</primary></indexterm>

<para>
Another interesting point here is the <function>module_permission</function> function. This function is called whenever
a process tries to do something with the <filename role="directory">/proc</filename> file, and it can decide whether to allow
access or not. Right now the decision is based only on the operation and the uid of the current user (as available in
<varname>current</varname>, a pointer to a structure which includes information on the currently running process), but it
could be based on anything we like, such as what other processes are doing with the same file, the time of day, or the last
input we received.</para>
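
<para>
The decision logic of such a permission function can be sketched in user space. Everything here is an assumption made for
illustration: the name <function>model_permission</function>, the simplified parameters (the real function obtains the uid
from <varname>current</varname>), and the policy itself, which lets anyone read and only uid 0 write. The numeric values of
the access bits match the kernel's <literal>MAY_EXEC</literal>, <literal>MAY_WRITE</literal> and <literal>MAY_READ</literal>
masks.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <errno.h>

/* Access-mode bits with the same values as the kernel's MAY_* masks. */
#define MODEL_MAY_EXEC  1
#define MODEL_MAY_WRITE 2
#define MODEL_MAY_READ  4

/*
 * User space model of a permission check: returns 0 to allow the
 * operation and -EACCES to refuse it. The policy is only an example:
 * anyone may read, only uid 0 (root) may write, nothing else is allowed.
 */
static int model_permission(int op, unsigned int uid)
{
    assert(op != 0);            /* a request must name an operation */

    if (op == MODEL_MAY_READ)
        return 0;
    if (op == MODEL_MAY_WRITE && uid == 0)
        return 0;
    return -EACCES;
}
```
]]></programlisting>

<para>
Because the function is ordinary code, the policy could just as well consult any other state the module keeps, as noted
above.
</para>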

<indexterm><primary>pointer</primary><secondary>current</secondary></indexterm>
<indexterm><primary>permission</primary></indexterm>
<indexterm><primary><varname>module_permission</varname></primary></indexterm>

<para>
It's important to note that the standard roles of read and write are reversed in the kernel. Read functions are used for
output, whereas write functions are used for input. The reason is that read and write refer to the user's point of
view: if a process reads something from the kernel, then the kernel needs to output it, and if a process writes something
to the kernel, then the kernel receives it as input.
</para>

<example>
<title>procfs3.c</title>
<programlisting>
<inlinegraphic fileref="lkmpg-examples/05-TheProcFileSystem/procfs3.c" format="linespecific"/></inlinegraphic>
</programlisting>
</example>

<para>
Still hungry for procfs examples? Well, first of all, keep in mind that there are rumors around claiming
that procfs is on its way out; consider using sysfs instead. Second, if you really can't get enough,
there's a highly recommendable bonus level for procfs below <filename>linux/Documentation/DocBook/</filename>.
Use <command>make help</command> in your toplevel kernel directory for instructions about how to convert it into
your favourite format. Example: <command>make htmldocs</command>. Consider using this mechanism
in case you want to document something kernel related yourself.
</para>

</sect1>

<sect1><title>Manage /proc file with seq_file</title>

<indexterm><primary>seq_file</primary></indexterm>

<para>
As we have seen, writing a /proc file may be quite "complex". So to help people writing
/proc files, there is an API named seq_file that helps with formatting a /proc file for output.
It is based on a sequence, which is composed of three functions: start(), next(), and stop().
The seq_file API starts a sequence when a user reads the /proc file.
</para>

<para>
A sequence begins with a call to the function start(). If it returns a non-NULL value,
the function next() is called. This function is an iterator; its goal is to go through
all the data. Each time next() is called, the function show() is also called. It writes
data values into the buffer read by the user. The function next() is called repeatedly until it returns
NULL. At that point the sequence ends and the function stop() is called.
</para>

<para>
BE CAREFUL: when a sequence is finished, another one starts. That means that at the end
of function stop(), the function start() is called again. This loop finishes when the
function start() returns NULL. You can see a diagram of this in the figure "How seq_file works".
</para>

<figure>
<title>How seq_file works</title>
<graphic fileref="figures/seq_file.png"></graphic>
</figure>
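
<para>
The loop described above can be imitated with a small user space program. This is only a sketch of the control flow, not
the kernel implementation: all the <function>model_</function> names and the three-element data set are invented for the
illustration, and the real callbacks take a <varname>struct seq_file *</varname> argument that is left out here. Note that
show() is also run for the element returned by start(), before next() is called for the first time.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MODEL_ITEMS 3

static const int model_data[MODEL_ITEMS] = { 10, 20, 30 };

/* start(): return the element at *pos, or NULL when there is none left. */
static const int *model_start(long *pos)
{
    return (*pos < MODEL_ITEMS) ? &model_data[*pos] : NULL;
}

/* next(): advance the position and return the following element or NULL. */
static const int *model_next(long *pos)
{
    ++*pos;
    return (*pos < MODEL_ITEMS) ? &model_data[*pos] : NULL;
}

/* show(): append one formatted element to the output buffer. */
static void model_show(const int *item, char *out, size_t size)
{
    size_t used = strlen(out);

    snprintf(out + used, size - used, "%d\n", *item);
}

/* stop(): nothing to release in this model. */
static void model_stop(void)
{
}

/*
 * Driver loop: run sequences until start() returns NULL.
 * Returns how many times start() was called.
 */
static int model_read_all(char *out, size_t size)
{
    long pos = 0;
    int starts = 0;
    const int *item;

    assert(out != NULL && size > 0);
    out[0] = '\0';

    for (;;) {
        item = model_start(&pos);
        starts++;
        if (item == NULL)
            break;              /* start() returned NULL: the loop ends */
        while (item != NULL) {
            model_show(item, out, size);
            item = model_next(&pos);
        }
        model_stop();
    }
    return starts;
}
```
]]></programlisting>

<para>
With this data set, start() is called twice: once to begin the only sequence, and once more after stop(), when it returns
NULL because the position has moved past the last element.
</para>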

<para>
Seq_file provides basic functions for file_operations, such as seq_read, seq_lseek, and some others,
but nothing for writing to the /proc file. Of course, you can still handle writes the same way as in the
previous example.
</para>

<example>
<title>procfs4.c</title>
<programlisting>
<inlinegraphic fileref="lkmpg-examples/05-TheProcFileSystem/procfs4.c" format="linespecific"/></inlinegraphic>
</programlisting>
</example>

<para>
If you want more information, you can read these web pages:
</para>

<itemizedlist>
<listitem><para><ulink url="http://lwn.net/Articles/22355/"></ulink></para></listitem>
<listitem><para><ulink url="http://www.kernelnewbies.org/documents/seq_file_howto.txt"></ulink></para></listitem>
</itemizedlist>

<para>
You can also read the code of <filename>fs/seq_file.c</filename> in the Linux kernel.
</para>

</sect1>

<!--
vim:textwidth=128
-->