This commit is contained in:
gferg 2002-07-22 14:27:09 +00:00
parent 77b94ddeee
commit f61249f8a1
9 changed files with 1360 additions and 478 deletions

View File

@ -22,7 +22,7 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Barr</surname>
<affiliation>
<address>
<email>tavis@mahler.econ.columbia.edu</email>
<email>tavis dot barr at liu dot edu</email>
</address>
</affiliation>
</author>
@ -31,7 +31,7 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Langfeldt</surname>
<affiliation>
<address>
<email>janl@linpro.no</email>
<email>janl at linpro dot no</email>
</address>
</affiliation>
</author>
@ -40,7 +40,16 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Vidal</surname>
<affiliation>
<address>
<email>skvidal@phy.duke.edu</email>
<email>skvidal at phy dot duke dot edu</email>
</address>
</affiliation>
</author>
<author>
<firstname>Tom</firstname>
<surname>McNeal</surname>
<affiliation>
<address>
<email>trmcneal at attbi dot com</email>
</address>
</affiliation>
</author>

View File

@ -9,19 +9,24 @@
standard distributions do. If you are using a 2.2 or later kernel
with the <filename>/proc</filename> filesystem you can check the latter by reading the
file <filename>/proc/filesystems</filename> and making sure there is a line containing
nfs. If not, you will need to build (or download) a kernel that has
NFS support built in.
nfs. If not, typing <userinput>insmod nfs</userinput> may make it
magically appear if NFS has been compiled as a module; otherwise,
you will need to build (or download) a kernel that has
NFS support built in. In general, kernels that do not have NFS
compiled in will give a very specific error when the
<command>mount</command> command below is run.
</para>
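<para>
For example, assuming NFS support was built as a module, checking for it
and loading it by hand would look something like this (the module should
simply be called <userinput>nfs</userinput> on stock kernels):
<screen>
# grep nfs /proc/filesystems
# insmod nfs
# grep nfs /proc/filesystems
</screen>
</para>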
<para>
To begin using machine as an NFS client, you will need the portmapper
running on that machine, and to use NFS file locking, you will
also need <filename>rpc.statd</filename> and <filename>rpc.lockd</filename>
also need <command>rpc.statd</command> and <command>rpc.lockd</command>
running on both the client and the server. Most recent distributions
start those services by default at boot time; if yours doesn't, see
<xref linkend="config"> for information on how to start them up.
</para>
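<para>
A quick way to verify that these services are registered is to query the
portmapper with <command>rpcinfo</command>; the exact service names
listed may vary slightly, but you should see entries for the portmapper,
for <command>statd</command> (status), and for <command>lockd</command>
(nlockmgr):
<screen>
# rpcinfo -p | egrep 'portmap|status|nlockmgr'
</screen>
</para>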
<para>
With portmapper, lockd, and statd running, you should now be able to
With <command>portmap</command>, <command>lockd</command>,
and <command>statd</command> running, you should now be able to
mount the remote directory from your server just the way you mount
a local hard drive, with the mount command. Continuing our example
from the previous section, suppose our server above is called
@ -32,7 +37,9 @@
# mount master.foo.com:/home /mnt/home
</screen>
and the directory <filename>/home</filename> on master will appear as the directory
<filename>/mnt/home</filename> on <emphasis>slave1</emphasis>.
<filename>/mnt/home</filename> on <emphasis>slave1</emphasis>. (Note that
this assumes we have created the directory <filename>/mnt/home</filename>
as an empty mount point beforehand.)
</para>
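<para>
If you would like the mount to happen automatically at boot time, you can
also add a corresponding entry to <filename>/etc/fstab</filename> on the
client; a minimal example (mount options are discussed below) would be:
<screen>
master.foo.com:/home  /mnt/home  nfs  rw  0  0
</screen>
</para>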
<para>
If this does not work, see the Troubleshooting section (<xref linkend="troubleshooting">).
@ -106,9 +113,10 @@
The program accessing a file on a NFS mounted file system
will hang when the server crashes. The process cannot be
interrupted or killed (except by a "sure kill") unless you also
specify intr. When the NFS server is back online the program will
continue undisturbed from where it was. We recommend using hard,
intr on all NFS mounted file systems.
specify <userinput>intr</userinput>. When the
NFS server is back online the program will
continue undisturbed from where it was. We recommend using
<userinput>hard,intr</userinput> on all NFS mounted file systems.
</para>
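<para>
As a brief illustration, reusing the server and mount point from the
example above, these options can be given directly on the command line:
<screen>
# mount -o rw,hard,intr master.foo.com:/home /mnt/home
</screen>
</para>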
</glossdef>
</glossentry>

View File

@ -53,8 +53,10 @@
<orderedlist numeration="lowerroman">
<listitem>
<para>
Version 4.3.2 of AIX requires that file systems be exported with
the insecure option, which causes NFS to listen to requests from
Version 4.3.2 of AIX, and possibly earlier versions as well,
requires that file systems be exported with
the <userinput>insecure</userinput> option, which
causes NFS to listen to requests from
insecure ports (i.e., ports above 1024, to which non-root users can
bind). Older versions of AIX do not seem to require this.
</para>
@ -63,12 +65,14 @@
<para>
AIX clients will default to mounting version 3 NFS over TCP.
If your Linux server does not support this, then you may need
to specify vers=2 and/or proto=udp in your mount options.
to specify <userinput>vers=2</userinput> and/or
<userinput>proto=udp</userinput> in your mount options.
</para>
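<para>
As a rough sketch only (the server name, export, and mount point here are
placeholders, and the exact syntax may differ between AIX releases), such
a mount might look like:
<programlisting>
# mount -o vers=2,proto=udp server:/export /mnt
</programlisting>
</para>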
</listitem>
<listitem>
<para>
Using netmasks in <filename>/etc/exports</filename> seems to sometimes cause clients
Using netmasks in <filename>/etc/exports</filename>
seems to sometimes cause clients
to lose mounts when another client is reset. This can be fixed
by listing out hosts explicitly.
</para>
@ -95,13 +99,15 @@
<title>Linux servers and BSD clients</title>
<para>
Some versions of BSD may make requests to the server from insecure ports,
in which case you will need to export your volumes with the insecure
option. See the man page for <emphasis>exports(5)</emphasis> for more details.
in which case you will need to export your volumes with the
<userinput>insecure</userinput>
option. See the man page for <emphasis>exports(5)</emphasis>
for more details.
</para>
</sect3>
</sect2>
<sect2 id="tru64">
<title>Compaq Tru64 Unix</title>
<title>Tru64 Unix</title>
<sect3 id="tru64server">
<title>Tru64 Unix Servers and Linux Clients</title>
<para>
@ -117,6 +123,12 @@
-root=slave1.foo.com:slave2.foo.com
</programlisting>
</para>
<para>
(The <userinput>root</userinput> option is listed in the last
entry for informational purposes only; its use is not recommended
unless necessary.)
</para>
<para>
Tru64 checks the <filename>/etc/exports</filename> file every time there is a mount request
so you do not need to run the <command>exportfs</command> command; in fact on many
@ -129,7 +141,8 @@
There are two issues to watch out for here. First, Tru64 Unix mounts
using Version 3 NFS by default. You will see mount errors if your
Linux server does not support Version 3 NFS. Second, in Tru64 Unix
4.x, NFS locking requests are made by daemon. You will therefore
4.x, NFS locking requests are made by
<computeroutput>daemon</computeroutput>. You will therefore
need to specify the <userinput>insecure_locks</userinput> option on all volumes you export
to a Tru64 Unix 4.x client; see the <command>exports</command> man pages for details.
</para>
@ -153,7 +166,9 @@
<title>Linux Servers and HP-UX Clients</title>
<para>
HP-UX diskless clients will require at least a kernel version 2.2.19
(or patched 2.2.18) for device files to export correctly.
(or patched 2.2.18) for device files to export correctly. Also, any
exports to an HP-UX client will need to be exported with the
<userinput>insecure_locks</userinput> option.
</para>
</sect3>
</sect2>
@ -176,11 +191,112 @@
2.4 kernel. As a workaround, you can export and mount lower-down
file systems separately.
</para>
</sect3>
<para>
As of Kernel 2.4.17, there continue to be several minor interoperability
issues that may require a kernel upgrade. In particular:
<itemizedlist>
<listitem>
<para>
Make sure that Trond Myklebust's <application>seekdir</application>
(or <application>dir</application>) kernel patch is applied.
The latest version (for 2.4.17) is located at:
</para>
<para>
<ulink url="http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif">
http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif</ulink>
</para>
</listitem>
<listitem>
<para>
IRIX servers do not always use the same
<computeroutput>fsid</computeroutput> attribute field across
reboots, which results in <computeroutput>inode number mismatch</computeroutput>
errors on a Linux
client if the mounted IRIX server reboots. A patch is available from:
</para>
<para><ulink url="http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/">
http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/</ulink>
</para>
</listitem>
<listitem>
<para>
Linux kernels v2.4.9 and above have problems reading large directories
(hundreds of files) from exported IRIX XFS file systems that were made
with <userinput>naming version=1</userinput>.
The reason for the problem can be found at:
</para>
<para>
<ulink url="http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/">
http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/</ulink>
</para>
<para>
The naming version can be found by using (on the IRIX server):
</para>
<programlisting>
xfs_growfs -n mount_point
</programlisting>
<para>
The workaround is to export these file systems using the
<userinput>-32bitclients</userinput>
option in the <filename>/etc/exports</filename> file.
The fix is to convert the file system to 'naming version=2'.
Unfortunately the only way to do this is by a
<userinput>backup</userinput>/<userinput>mkfs</userinput>/<userinput>restore</userinput>.
</para>
<para>
<command>mkfs_xfs</command> on IRIX 6.5.14 (and above)
creates <userinput>naming version=2</userinput> XFS file
systems by default. On IRIX 6.5.5 to 6.5.13, use:
<programlisting>
mkfs_xfs -n version=2 device
</programlisting>
</para>
<para>
Versions of IRIX prior to 6.5.5 do not support
<userinput>naming version=2</userinput> XFS file systems.
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="irixclient">
<title>IRIX clients and Linux servers</title>
<para>
There are no known interoperability issues.
IRIX versions up to 6.5.12 have problems mounting file systems exported
from Linux boxes - the mount point "gets lost," e.g.,
<programlisting>
# mount linux:/disk1 /mnt
# cd /mnt/xyz/abc
# pwd
/xyz/abc
</programlisting>
</para>
<para>
This is a known IRIX bug (SGI bug 815265 - IRIX not liking file handles of
less than 32 bytes), which is fixed in <application>IRIX 6.5.13</application>.
If it is not possible
to upgrade to <application>IRIX 6.5.13</application>, then the unofficial
workaround is to force the Linux <command>nfsd</command>
to always use 32 byte file handles.
</para>
<para>
A number of patches exist - see:
<itemizedlist>
<listitem>
<para>
<ulink url="http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/">
http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/</ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html">
http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html</ulink>
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
@ -190,10 +306,11 @@
<title>Solaris Servers</title>
<para>
Solaris has a slightly different format on the server end from
other operating systems. Instead of <filename>/etc/exports</filename>, the configuration
file is <filename>/etc/dfs/dfstab</filename>. Entries are of the form of a "share"
command, where the syntax for the example in <xref linkend="server"> would
look like
other operating systems. Instead of
<filename>/etc/exports</filename>, the configuration
file is <filename>/etc/dfs/dfstab</filename>. Entries are of
the form of a <command>share</command> command, where the syntax
for the example in <xref linkend="server"> would look like
<programlisting>
share -o rw=slave1,slave2 -d "Master Usr" /usr
</programlisting>
@ -202,11 +319,13 @@ share -o rw=slave1,slave2 -d "Master Usr" /usr
<para>
Solaris servers are especially sensitive to packet size. If you
are using a Linux client with a Solaris server, be sure to set
<userinput>rsize</userinput> and <userinput>wsize</userinput> to 32768 at mount time.
<userinput>rsize</userinput> and <userinput>wsize</userinput>
to 32768 at mount time.
</para>
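<para>
For example, on the Linux client (the hostname and paths below are only
placeholders):
<programlisting>
# mount -o rsize=32768,wsize=32768 sol-server:/export/home /mnt/home
</programlisting>
</para>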
<para>
Finally, there is an issue with root squashing on Solaris: root gets
mapped to the user <emphasis>noone</emphasis>, which is not the same as the user <emphasis>nobody</emphasis>.
mapped to the user <computeroutput>noone</computeroutput>, which
is not the same as the user <computeroutput>nobody</computeroutput>.
If you are having trouble with file permissions as root on the client
machine, be sure to check that the mapping works as you expect.
</para>
@ -221,7 +340,7 @@ svc: unknown program 100227 (me 100003)
</screen>
<para>
This happens because Solaris clients, when they mount, try to obtain
ACL information - which linux obviously does not have. The messages
ACL information - which Linux obviously does not have. The messages
can safely be ignored.
</para>
<para>
@ -249,11 +368,16 @@ svc: unknown program 100227 (me 100003)
/home -rw=slave1.foo.com,slave2.foo.com, root=slave1.foo.com,slave2.foo.com
</programlisting>
</para>
<para>
Again, the <userinput>root</userinput> option is listed for informational
purposes and is not recommended unless necessary.
</para>
</sect3>
<sect3 id="sunosclient">
<title>SunOS Clients</title>
<para>
Be advised that SunOS makes all NFS locking requests as daemon, and
Be advised that SunOS makes all NFS locking requests
as <computeroutput>daemon</computeroutput>, and
therefore you will need to add the <userinput>insecure_locks</userinput> option to any
volumes you export to a SunOS machine. See the <command>exports</command> man page
for details.

View File

@ -17,7 +17,8 @@
</para>
<para>
There are other systems that provide similar functionality to NFS.
Samba provides file services to Windows clients. The Andrew File
Samba (<ulink url="http://www.samba.org">http://www.samba.org</ulink>)
provides file services to Windows clients. The Andrew File
System from IBM (<ulink url="http://www.transarc.com/Product/EFS/AFS/index.html">http://www.transarc.com/Product/EFS/AFS/index.html</ulink>),
recently open-sourced, provides a file sharing mechanism with some
additional security and performance features. The Coda File System
@ -42,10 +43,8 @@
<para>
This HOWTO is not a description of the guts and
underlying structure of NFS. For that you may wish to read
<emphasis>Managing NFS and NIS</emphasis> by Hal Stern, published by O'Reilly &
Associates, Inc. While that book is severely out of date, much
of the structure of NFS has not changed, and the book describes it
very articulately. A much more advanced and up-to-date technical
<emphasis>Linux NFS and Automounter Administration</emphasis> by Erez Zadok (Sybex, 2001). The classic NFS book, updated and still quite useful, is <emphasis>Managing NFS and NIS</emphasis> by Hal Stern, published by O'Reilly &
Associates, Inc. A much more advanced technical
description of NFS is available in <emphasis>NFS Illustrated</emphasis> by Brent Callaghan.
</para>
<para>
@ -58,7 +57,7 @@
</para>
<para>
It will also not cover PC-NFS, which is considered obsolete (users
are encouraged to use Samba to share files with PC's) or NFS
are encouraged to use Samba to share files with Windows machines) or NFS
Version 4, which is still in development.
</para>
</sect2>
@ -98,7 +97,8 @@
patches have been added because NFS Version 3 server support will be
a configuration option. However, unless you have some particular
reason to use an older kernel, you should upgrade because many bugs
have been fixed along the way.
have been fixed along the way. Kernel 2.2.19 contains some additional
locking improvements over 2.2.18.
</para>
<para>
Version 3 functionality will also require the nfs-utils package of
@ -111,11 +111,24 @@
<para>
All 2.4 and higher kernels have full NFS Version 3 functionality.
</para>
<para>
In all cases, if you are building your own kernel, you will need
to select NFS and NFS Version 3 support at compile time. Most
(but not all) standard distributions come with kernels that support
NFS version 3.
</para>
<para>
Handling files larger than 2 GB will require a 2.4.x kernel and a
2.2.x version of <application>glibc</application>.
</para>
<para>
All kernels after 2.2.18 support NFS over TCP on the client side.
As of this writing, server-side NFS over TCP only exists in the
later 2.2 series (but not yet in the 2.4 kernels), is considered
experimental, and is somewhat buggy.
As of this writing, server-side NFS over TCP only exists in a
buggy form as an experimental option in the post-2.2.18 series;
patches for 2.4 and 2.5 kernels have been introduced starting with
2.4.17 and 2.5.6. The patches are believed to be stable, though
as of this writing they are relatively new and have not seen
widespread use or integration into the mainstream 2.4 kernel.
</para>
<para>
Because so many of the above functionalities were introduced in
@ -125,8 +138,9 @@
correctly.
</para>
<para>
As we write this document, NFS version 4 is still in development
as a protocol, and it will not be dealt with here.
As we write this document, NFS version 4 has only recently been
finalized as a protocol, and no implementations are considered
production-ready. It will not be dealt with here.
</para>
</sect2>
<sect2 id="furtherhelp">
@ -137,7 +151,66 @@
mailing lists as well as the latest version of nfs-utils, NFS
kernel patches, and other NFS related packages.
</para>
<para>
<para>
When you encounter a problem or have a question not covered in this
manual, the FAQ, or the man pages, you should send a message to the NFS
mailing list (<email>nfs@lists.sourceforge.net</email>). To help the developers
and other users assess your problem, you should include:
</para>
<itemizedlist>
<listitem>
<para>
the version of <application>nfs-utils</application> you are using
</para>
</listitem>
<listitem>
<para>
the version of the kernel and any non-stock patches applied to it.
</para>
</listitem>
<listitem>
<para>
the distribution of Linux you are using
</para>
</listitem>
<listitem>
<para>
the version(s) of other operating systems involved.
</para>
</listitem>
</itemizedlist>
<para>
It is also useful to know the networking configuration connecting the
hosts.
</para>
<para>
If your problem involves the inability to mount or export shares, please
also include:
</para>
<itemizedlist>
<listitem>
<para>
a copy of your <filename>/etc/exports</filename> file
</para>
</listitem>
<listitem>
<para>
the output of <command>rpcinfo -p</command> <emphasis>localhost</emphasis> run on the server
</para>
</listitem>
<listitem>
<para>
the output of <command>rpcinfo -p</command> <emphasis>servername</emphasis> run on the client
</para>
</listitem>
</itemizedlist>
<para>
Sending all of this information with a specific question, after reading
all the documentation, is the best way to ensure a helpful response from
the list.
</para>
<para>
You may also wish to look at the man pages for <emphasis>nfs(5)</emphasis>,
<emphasis>exports(5)</emphasis>, <emphasis>mount(8)</emphasis>, <emphasis>fstab(5)</emphasis>,
<emphasis>nfsd(8)</emphasis>, <emphasis>lockd(8)</emphasis>, <emphasis>statd(8)</emphasis>,

View File

@ -1,252 +1,652 @@
<sect1 id="performance">
<title>Optimizing NFS Performance</title>
<para>
Getting network settings right can improve NFS performance many times
over -- a tenfold increase in transfer speeds is not unheard of.
The most important things to get right are the <userinput>rsize</userinput>
and <userinput>wsize</userinput> <command>mount</command> options. Other factors listed below
may affect people with particular hardware setups.
</para>
Careful analysis of your environment, both from the client and from the server
point of view, is the first step necessary for optimal NFS performance. The
first sections will address issues that are generally important to the client.
Later (<xref linkend="frag-overflow"> and beyond), server side issues
will be discussed. In both
cases, these issues will not be limited exclusively to one side or the other,
but it is useful to separate the two in order to get a clearer picture of
cause and effect.
</para>
<para>
Aside from the general network configuration - appropriate network capacity,
faster NICs, full duplex settings in order to reduce collisions, agreement in
network speed among the switches and hubs, etc. - one of the most important
client optimization settings are the NFS data transfer buffer sizes, specified
by the <command>mount</command> command options <userinput>rsize</userinput>
and <userinput>wsize</userinput>.
</para>
<sect2 id="blocksizes">
<title>Setting Block Size to Optimize Transfer Speeds</title>
<para>
The <userinput>rsize</userinput> and <userinput>wsize</userinput>
<command>mount</command> options specify the size of the chunks of data
that the client and server pass back and forth to each other. If no
<userinput>rsize</userinput> and <userinput>wsize</userinput> options
are specified, the default varies by which version of NFS we are using.
4096 bytes is the most common default, although for TCP-based mounts
in 2.2 kernels, and for all mounts beginning with 2.4 kernels, the
server specifies the default block size.
</para>
<para>
The defaults may be too big or too small. On the one hand, some
combinations of Linux kernels and network cards (largely on older
machines) cannot handle blocks that large. On the other hand, if they
can handle larger blocks, a bigger size might be faster.
</para>
<para>
So we'll want to experiment and find an rsize and wsize that works
and is as fast as possible. You can test the speed of your options
with some simple commands.
</para>
<para>
The first of these commands transfers 16384 blocks of 16k each from
the special file <filename>/dev/zero</filename> (which if you read it
just spits out zeros _really_ fast) to the mounted partition. We will
time it to see how long it takes. So, from the client machine, type:
<screen>
The <command>mount</command> command options <userinput>rsize</userinput>
and <userinput>wsize</userinput> specify the size of the chunks of
data that the client and server pass back and forth
to each other. If no <userinput>rsize</userinput>
and <userinput>wsize</userinput> options are specified,
the default varies by which version of NFS we
are using. The most common default is 4K (4096 bytes), although for TCP-based
mounts in 2.2 kernels, and for all mounts beginning with 2.4 kernels, the
server specifies the default block size.
</para>
<para>
The theoretical limit for the NFS V2 protocol is 8K. For the V3 protocol, the
limit is specific to the server. On the Linux server, the maximum block size
is defined by the value of the kernel constant
<userinput>NFSSVC_MAXBLKSIZE</userinput>, found in the
Linux kernel source file <filename>./include/linux/nfsd/const.h</filename>.
The current maximum block size for the kernel, as of 2.4.17, is 8K (8192 bytes),
but the patch set implementing NFS over TCP/IP transport in the 2.4
series, as of this writing, uses a value of 32K (defined in the
patch as 32*1024) for the maximum block size.
</para>
<para>
All 2.4 clients currently support up to 32K block transfer sizes, allowing the
standard 32K block transfers across NFS mounts from other servers, such as
Solaris, without client modification.
</para>
<para>
The defaults may be too big or too small, depending on the specific
combination of hardware and kernels. On the one hand, some combinations of
Linux kernels and network cards (largely on older machines) cannot handle
blocks that large. On the other hand, if they can handle larger blocks, a
bigger size might be faster.
</para>
<para>
You will want to experiment and find an <userinput>rsize</userinput>
and <userinput>wsize</userinput> that works and is as
fast as possible. You can test the speed of your options with some simple
commands, if your network environment is not heavily used. Note that your
results may vary widely unless you resort to using more complex benchmarks,
such as <application>Bonnie</application>, <application>Bonnie++</application>,
or <application>IOzone</application>.
</para>
<para>
The first of these commands transfers 16384 blocks of 16k each from the
special file <filename>/dev/zero</filename> (which if
you read it just spits out zeros <emphasis>really</emphasis>
fast) to the mounted partition. We will time it to see how long it takes. So,
from the client machine, type:
</para>
<programlisting>
# time dd if=/dev/zero of=/mnt/home/testfile bs=16k count=16384
</screen>
</para>
<para>
This creates a 256Mb file of zeroed bytes. In general, you should
create a file that's at least twice as large as the system RAM
on the server, but make sure you have enough disk space! Then read
back the file into the great black hole on the client machine
(<filename>/dev/null</filename>) by typing the following:
<screen>
</programlisting>
<para>
This creates a 256Mb file of zeroed bytes. In general, you should create a
file that's at least twice as large as the system RAM on the server, but make
sure you have enough disk space! Then read back the file into the great black
hole on the client machine (<filename>/dev/null</filename>) by
typing the following:
</para>
<programlisting>
# time dd if=/mnt/home/testfile of=/dev/null bs=16k
</screen>
</para>
<para>
Repeat this a few times and average how long it takes. Be sure to
unmount and remount the filesystem each time (both on the client and,
if you are zealous, locally on the server as well), which should clear
out any caches.
</para>
<para>
Then unmount, and mount again with a larger and smaller block size.
They should probably be multiples of 1024, and not larger than
8192 bytes since that's the maximum size in NFS version 2. (Though
if you are using Version 3 you might want to try up to 32768.)
Wisdom has it that the block size should be a power of two since most
of the parameters that would constrain it (such as file system block
sizes and network packet size) are also powers of two. However, some
users have reported better successes with block sizes that are not
powers of two but are still multiples of the file system block size
and the network packet size.
</para>
<para>
Directly after mounting with a larger size, cd into the mounted
file system and do things like ls, explore the fs a bit to make
sure everything is as it should. If the rsize/wsize is too large
the symptoms are very odd and not 100% obvious. A typical symptom
is incomplete file lists when doing 'ls', and no error messages.
Or reading files failing mysteriously with no error messages. After
establishing that the given rsize/wsize works you can do the speed
tests again. Different server platforms are likely to have different
optimal sizes. SunOS and Solaris is reputedly a lot faster with 4096
byte blocks than with anything else.
</para>
<para>
<emphasis>Remember to edit <filename>/etc/fstab</filename> to reflect the rsize/wsize you found.</emphasis>
</para>
</programlisting>
<para>
Repeat this a few times and average how long it takes. Be sure to unmount and
remount the filesystem each time (both on the client and, if you are zealous,
locally on the server as well), which should clear out any caches.
</para>
<para>
Then unmount, and mount again with a larger and smaller block size. They
should be multiples of 1024, and not larger than the maximum block size
allowed by your system. Note that NFS Version 2 is limited to a maximum of 8K,
regardless of the maximum block size defined by
<userinput>NFSSVC_MAXBLKSIZE</userinput>; Version 3
will support up to 64K, if permitted. The block size should be a power of two
since most of the parameters that would constrain it (such as file system
block sizes and network packet size) are also powers of two. However, some
users have reported better successes with block sizes that are not powers of
two but are still multiples of the file system block size and the network
packet size.
</para>
<para>
Directly after mounting with a larger size, cd into the mounted
file system and do things like <command>ls</command>, explore
the filesystem a bit to make sure everything is as it
should be. If the <userinput>rsize</userinput>/<userinput>wsize</userinput>
is too large the symptoms are very odd and not 100%
obvious. A typical symptom is incomplete file lists when doing
<command>ls</command>, and no
error messages, or reading files failing mysteriously with no error messages.
After establishing that the given <userinput>rsize</userinput>/
<userinput>wsize</userinput> works you can do the speed tests
again. Different server platforms are likely to have different optimal sizes.
</para>
<para>
Remember to edit <filename>/etc/fstab</filename> to reflect the
<userinput>rsize</userinput>/<userinput>wsize</userinput> you found
to be the most desirable.
</para>
<para>
If your results seem inconsistent, or doubtful, you may need to analyze your
network more extensively while varying the <userinput>rsize</userinput>
and <userinput>wsize</userinput> values. In that
case, here are several pointers to benchmarks that may prove useful:
</para>
<itemizedlist>
<listitem>
<para>
Bonnie <ulink
url="http://www.textuality.com/bonnie/">http://www.textuality.com/bonnie/</ulink>
</para>
</listitem>
<listitem>
<para>
Bonnie++ <ulink
url="http://www.coker.com.au/bonnie++/">http://www.coker.com.au/bonnie++/</ulink>
</para>
</listitem>
<listitem>
<para>
IOzone file system benchmark <ulink url="http://www.iozone.org/">http://www.iozone.org/</ulink>
</para>
</listitem>
<listitem>
<para>
The official NFS benchmark,
SPECsfs97 <ulink url="http://www.spec.org/osg/sfs97/">http://www.spec.org/osg/sfs97/</ulink>
</para>
</listitem>
</itemizedlist>
<para>
The easiest benchmark with the widest coverage, including an extensive spread
of file sizes and of I/O types (reads and writes, rereads and rewrites, random
access, etc.), seems to be IOzone. A recommended invocation of IOzone (for
which you must have root privileges) includes unmounting and remounting the
directory under test, in order to clear out the caches between tests, and
including the file close time in the measurements. Assuming you've already
exported <filename>/tmp</filename> to everyone from the server
<computeroutput>foo</computeroutput>,
and that you've installed IOzone in the local directory, this should work:
</para>
<programlisting>
# echo "foo:/tmp /mnt/foo nfs rw,hard,intr,rsize=8192,wsize=8192 0 0"
>> /etc/fstab
# mkdir /mnt/foo
# mount /mnt/foo
# ./iozone -a -R -c -U /mnt/foo -f /mnt/foo/testfile > logfile
</programlisting>
<para>
The benchmark should take 2-3 hours at most, but of course you will need to
run it for each value of rsize and wsize that is of interest. The web site
gives full documentation of the parameters, but the specific options used
above are:
</para>
<itemizedlist>
<listitem>
<para>
<userinput>-a</userinput> Full automatic mode, which tests file sizes of 64K to 512M, using
record sizes of 4K to 16M
</para>
</listitem>
<listitem>
<para>
<userinput>-R</userinput> Generate report in excel spreadsheet form (The "surface plot"
option for graphs is best)
</para>
</listitem>
<listitem>
<para>
<userinput>-c</userinput> Include the file close time in the tests, which will pick up the
NFS version 3 commit time
</para>
</listitem>
<listitem>
<para>
<userinput>-U</userinput> Use the given mount point to unmount and remount between tests;
it clears out caches
</para>
</listitem>
<listitem>
<para>
<userinput>-f</userinput> When using unmount, you have to locate the test file in the
mounted file system
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2 id="packet-and-network">
<title>Packet Size and Network Drivers</title>
<para>
There are many shoddy network drivers available for Linux,
including for some fairly standard cards.
</para>
While many Linux network card drivers are excellent, some are quite shoddy,
including a few drivers for some fairly standard cards. It is worth
experimenting with your network card directly to find out how it can
best handle traffic.
</para>
<para>
Try <command>ping</command>ing back and forth
between the two machines with large packets using
the <userinput>-f</userinput> and <userinput>-s</userinput>
options with <command>ping</command> (see <emphasis>ping(8)</emphasis>
for more details) and see if a
lot of packets get dropped, or if they take a long time for a reply. If so,
you may have a problem with the performance of your network card.
</para>
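<para>
A quick, unscientific sketch of such a test, sending a burst of moderately
large packets, might be the following; adjust the count and packet size to
suit your own network:
<programlisting>
# ping -f -c 1000 -s 4096 remote-server
</programlisting>
</para>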
<para>
For a more extensive analysis of NFS behavior in particular, use the <command>
nfsstat</command> command to look at nfs transactions, client and server statistics, network
statistics, and so forth. The <userinput>"-o net"</userinput> option will show you the number of
dropped packets in relation to the total number of transactions. In UDP
transactions, the most important statistic is the number of retransmissions,
due to dropped packets, socket buffer overflows, general server congestion,
timeouts, etc. This will have a tremendously important effect on NFS
performance, and should be carefully monitored.
Note that <command>nfsstat</command> does not yet
implement the <userinput>-z</userinput> option, which would zero out all counters, so you must look
at the current <command>nfsstat</command> counter values prior to running the benchmarks.
</para>
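<para>
For example, you might record the network-related counters, and the client
RPC counters (which include retransmissions), before and after a benchmark
run:
<programlisting>
# nfsstat -o net
# nfsstat -c -r
</programlisting>
</para>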
<para>
To correct network problems, you may wish to reconfigure the packet size that
your network card uses. Very often there is a constraint somewhere else in the
network (such as a router) that causes a smaller maximum packet size between
two machines than what the network cards on the machines are actually capable
of. TCP should autodiscover the appropriate packet size for a network, but UDP
will simply stay at a default value. So determining the appropriate packet
size is especially important if you are using NFS over UDP.
</para>
<para>
You can test for the network packet size using the <command>tracepath</command> command: From
the client machine, just type <userinput>tracepath</userinput>
<emphasis>server</emphasis> <userinput>2049</userinput> and the path MTU should
be reported at the bottom. You can then set the MTU on your network card equal
to the path MTU, by using the <userinput>MTU</userinput>
option to <command>ifconfig</command>, and see if fewer packets
get dropped. See the <command>ifconfig</command> man pages for details on how to reset the MTU.
</para>
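<para>
Putting this together, a rough sequence would be the following, where
<userinput>eth0</userinput> and the MTU value of 1492 are only placeholders
for your own interface and discovered path MTU:
<programlisting>
# tracepath server 2049
# ifconfig eth0 mtu 1492
</programlisting>
</para>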
<para>
In addition, <command>netstat -s</command> will give the statistics collected for traffic across
all supported protocols. You may also look at
<filename>/proc/net/snmp</filename> for information
about current network behavior; see the next section for more details.
</para>
</sect2>
<sect2 id="frag-overflow">
<title>Overflow of Fragmented Packets</title>
<para>
Try pinging back and forth between the two machines with large
packets using the <option>-f</option> and <option>-s</option>
options with <command>ping</command> (see <command>man ping</command>)
for more details and see if a lot of packets get or if they
take a long time for a reply. If so, you may have a problem
with the performance of your network card.
</para>
Using an <userinput>rsize</userinput> or <userinput>wsize</userinput>
larger than your network's MTU (often set to 1500, in
many networks) will cause IP packet fragmentation when using NFS over UDP. IP
packet fragmentation and reassembly require a significant amount of CPU
resource at both ends of a network connection. In addition, packet
fragmentation also exposes your network traffic to greater unreliability,
since a complete RPC request must be retransmitted if a UDP packet fragment is
dropped for any reason. Any increase in RPC retransmissions, along with the
possibility of increased timeouts, is the single worst impediment to
performance for NFS over UDP.
</para>
<para>
Packets may be dropped for many reasons. If your network topology is
complex, fragment routes may differ, and may not all arrive at the Server for
reassembly. NFS Server capacity may also be an issue, since the kernel has a
limit of how many fragments it can buffer before it starts throwing away
packets. With kernels that support the <filename>/proc</filename>
filesystem, you can monitor the
files <filename>/proc/sys/net/ipv4/ipfrag_high_thresh</filename> and
<filename>/proc/sys/net/ipv4/ipfrag_low_thresh</filename>. Once the number of unprocessed,
fragmented packets reaches the number specified by <filename>ipfrag_high_thresh</filename> (in
bytes), the kernel will simply start throwing away fragmented packets until
the number of incomplete packets reaches the number specified by
<filename>ipfrag_low_thresh.</filename>
</para>
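<para>
As a sketch, you can inspect the current thresholds and, if you decide to
experiment, raise them; the values below are purely illustrative and not a
recommendation:
<programlisting>
# cat /proc/sys/net/ipv4/ipfrag_high_thresh
# cat /proc/sys/net/ipv4/ipfrag_low_thresh
# echo 524288 > /proc/sys/net/ipv4/ipfrag_high_thresh
# echo 393216 > /proc/sys/net/ipv4/ipfrag_low_thresh
</programlisting>
</para>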
<para>
Another counter to monitor is <userinput>IP: ReasmFails</userinput>
in the file <filename>/proc/net/snmp</filename>; this
is the number of fragment reassembly failures. If it goes up too quickly
during heavy file activity, you may have a problem.
</para>
</sect2>
<sect2 id="nfs-tcp">
<title>NFS over TCP</title>
<para>
To correct such a problem, you may wish to reconfigure the packet
size that your network card uses. Very often there is a constraint
somewhere else in the network (such as a router) that causes a
smaller maximum packet size between two machines than what the
network cards on the machines are actually capable of. TCP should
autodiscover the appropriate packet size for a network, but UDP
will simply stay at a default value. So determining the appropriate
packet size is especially important if you are using NFS over UDP.
</para>
A new feature, available for both 2.4 and 2.5 kernels but not yet
integrated into the mainstream kernel at the time of
this writing, is NFS over TCP. Using TCP
has a distinct advantage and a distinct disadvantage over UDP. The advantage
is that it works far better than UDP on lossy networks.
When using TCP, a single dropped packet can be retransmitted, without
the retransmission of the entire RPC request, resulting in better performance
on lossy networks. In addition, TCP will handle network speed differences
better than UDP, due to the underlying flow control at the network level.
</para>
<para>
The disadvantage of using TCP is that it is not a stateless protocol like
UDP. If your server crashes in the middle of a packet transmission,
the client will hang and any shares will need to be unmounted and remounted.
</para>
<para>
The overhead incurred by the TCP protocol will result in
somewhat slower performance than UDP under ideal network
conditions, but the cost is not severe, and is often not
noticeable without careful measurement. If you are using
gigabit ethernet from end to end, you might also investigate the
usage of jumbo frames, since the high speed network may
allow the larger frame sizes without encountering increased
collision rates, particularly if you have set the network
to full duplex.
</para>
</sect2>
<sect2 id="timeout">
<title>Timeout and Retransmission Values</title>
<para>
You can test for the network packet size using the tracepath command:
From the client machine, just type <command>tracepath [server] 2049</command>
and the path MTU should be reported at the bottom. You can then set the
MTU on your network card equal to the path MTU, by using the MTU option
to <command>ifconfig</command>, and see if fewer packets get dropped.
See the <command>ifconfig</command> man pages for details on how to reset the MTU.
</para>
Two mount command options, <userinput>timeo</userinput>
and <userinput>retrans</userinput>, control the behavior of UDP
requests when encountering client timeouts due to dropped packets, network
congestion, and so forth. The <userinput>-o timeo</userinput>
option allows designation of the length
of time, in tenths of seconds, that the client will wait until it decides it
will not get a reply from the server, and must try to send the request again.
The default value is 7 tenths of a second. The
<userinput>-o retrans</userinput> option allows
designation of the number of timeouts allowed before the client gives up, and
displays the <computeroutput>Server not responding</computeroutput>
message. The default value is 3 attempts.
Once the client displays this message, it will continue to try to send
the request, but only once before displaying the error message if
another timeout occurs. When the client reestablishes contact, it
will fall back to using the correct <userinput>retrans</userinput>
value, and will display the <computeroutput>Server OK</computeroutput> message.
</para>
<para>
If you are already encountering excessive retransmissions (see the output of
the <command>nfsstat</command> command), or want to increase the block transfer size without
encountering timeouts and retransmissions, you may want to adjust these
values. The specific adjustment will depend upon your environment, and in most
cases, the current defaults are appropriate.
</para>
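<para>
As an illustration only (these particular numbers are not a recommendation),
a mount that sets the timeout to 1.4 seconds and allows five retransmissions
before the <computeroutput>Server not responding</computeroutput> message
would look like:
<programlisting>
# mount -o timeo=14,retrans=5 server:/export /mnt
</programlisting>
</para>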
</sect2>
<sect2 id="nfsd-instance">
<title>Number of Instances of NFSD</title>
<title>Number of Instances of the NFSD Server Daemon</title>
<para>
Most startup scripts, Linux and otherwise, start 8 instances of nfsd.
In the early days of NFS, Sun decided on this number as a rule of thumb,
and everyone else copied. There are no good measures of how many
instances are optimal, but a more heavily-trafficked server may require
more. If you are using a 2.4 or higher kernel and you want to see how
heavily each nfsd thread is being used, you can look at the file
<filename>/proc/net/rpc/nfsd</filename>. The last ten numbers on the
<emphasis>th</emphasis> line in that file indicate the number of seconds
that the thread usage was at that percentage of the maximum allowable.
If you have a large number in the top three deciles, you may wish to
increase the number of <command>nfsd</command> instances. This is done
upon starting <command>nfsd</command> using the number of instances as
the command line option. See the <command>nfsd</command> man page for
more information.
</para>
Most startup scripts, Linux and otherwise, start 8 instances of
<command>nfsd</command>. In the
early days of NFS, Sun decided on this number as a rule of thumb, and everyone
else copied. There are no good measures of how many instances are optimal, but
a more heavily-trafficked server may require more.
You should use at the very least one daemon per processor, but
four to eight per processor may be a better rule of thumb.
If you are using a 2.4 or
higher kernel and you want to see how heavily each
<command>nfsd</command> thread is being used,
you can look at the file <filename>/proc/net/rpc/nfsd</filename>.
The last ten numbers on the <userinput>th</userinput>
line in that file indicate the number of seconds that the thread usage was at
that percentage of the maximum allowable. If you have a large number in the
top three deciles, you may wish to increase the number
of <command>nfsd</command> instances. This
is done upon starting <command>nfsd</command> using the
number of instances as the command line
option, and is specified in the NFS startup script
(<filename>/etc/rc.d/init.d/nfs</filename> on
Red Hat) as <userinput>RPCNFSDCOUNT</userinput>.
See the <emphasis>nfsd(8)</emphasis> man page for more information.
</para>
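<para>
A minimal sketch of checking the thread usage and raising the count on a
Red Hat style system (the value 16 is only an example) would be:
<programlisting>
# grep th /proc/net/rpc/nfsd
# vi /etc/rc.d/init.d/nfs        # set RPCNFSDCOUNT=16
# /etc/rc.d/init.d/nfs restart
</programlisting>
</para>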
</sect2>
<sect2 id="memlimits">
<title>Memory Limits on the Input Queue</title>
<para>
On 2.2 and 2.4 kernels, the socket input queue, where requests
sit while they are currently being processed, has a small default
size limit of 64k. This means that if you are running 8 instances of
<command>nfsd</command>, each will only have 8k to store requests while it processes
them.
</para>
<para>
You should consider increasing this number to at least 256k for <command>nfsd</command>.
This limit is set in the proc file system using the files
<filename>/proc/sys/net/core/rmem_default</filename> and <filename>/proc/sys/net/core/rmem_max</filename>.
It can be increased in three steps; the following method is a bit of
a hack but should work and should not cause any problems:
</para>
<para>
<orderedlist Numeration="loweralpha">
<listitem>
<para>Increase the size listed in the file:
<programlisting>
echo 262144 > /proc/sys/net/core/rmem_default
echo 262144 > /proc/sys/net/core/rmem_max
</programlisting>
</para>
</listitem>
<listitem>
<para>
Restart <command>nfsd</command>, e.g., type <command>/etc/rc.d/init.d/nfsd restart</command> on Red Hat
</para>
</listitem>
<listitem>
<para>
Return the size limits to their normal size in case other kernel systems depend on it:
<programlisting>
echo 65536 > /proc/sys/net/core/rmem_default
echo 65536 > /proc/sys/net/core/rmem_max
</programlisting>
</para>
<para>
<emphasis>
Be sure to perform this last step because machines have been reported
to crash if these values are left changed for long periods of time.
</emphasis>
</para>
</listitem>
</orderedlist>
On 2.2 and 2.4 kernels, the socket input queue, where requests sit while they
are currently being processed, has a small default size limit (<filename>rmem_default</filename>)
of 64k. This queue is important for clients with heavy read loads, and servers
with heavy write loads. As an example, if you are running 8 instances of nfsd
on the server, each will only have 8k to store write requests while it
processes them. In addition, the socket output queue - important for clients
with heavy write loads and servers with heavy read loads - also has a small
default size (<filename>wmem_default</filename>).
</para>
<para>
Several published runs of the NFS benchmark
<ulink url="http://www.spec.org/osg/sfs97/">SPECsfs</ulink>
specify usage of a much higher value for both
the read and write value sets, <filename>[rw]mem_default</filename> and
<filename>[rw]mem_max</filename>. You might
consider increasing these values to at least 256k. The read and write limits
are set in the proc file system using (for example) the files
<filename>/proc/sys/net/core/rmem_default</filename> and
<filename>/proc/sys/net/core/rmem_max</filename>. The
<filename>rmem_default</filename> value can be increased in three steps; the following method is a
bit of a hack but should work and should not cause any problems:
</para>
<itemizedlist>
<listitem>
<para>
Increase the size listed in the file:
</para>
<programlisting>
# echo 262144 > /proc/sys/net/core/rmem_default
# echo 262144 > /proc/sys/net/core/rmem_max
</programlisting>
</listitem>
<listitem>
<para>
Restart NFS. For example, on Red Hat systems,
</para>
<programlisting>
# /etc/rc.d/init.d/nfs restart
</programlisting>
</listitem>
<listitem>
<para>
You might return the size limits to their normal size in case other
kernel systems depend on it:
</para>
<programlisting>
# echo 65536 > /proc/sys/net/core/rmem_default
# echo 65536 > /proc/sys/net/core/rmem_max
</programlisting>
</listitem>
</itemizedlist>
<para>
This last step may be necessary because machines have been reported to
crash if these values are left changed for long periods of time.
</para>
</sect2>
<sect2 id="frag-overflow">
<title>Overflow of Fragmented Packets</title>
<para>
The NFS protocol uses fragmented UDP packets. The kernel has
a limit of how many fragments of incomplete packets it can
buffer before it starts throwing away packets. With 2.2 kernels
that support the <filename>/proc</filename> filesystem, you can
specify how many by editing the files
<filename>/proc/sys/net/ipv4/ipfrag_high_thresh</filename> and
<filename>/proc/sys/net/ipv4/ipfrag_low_thresh</filename>.
</para>
<para>
Once the number of unprocessed, fragmented packets reaches the
number specified by <userinput>ipfrag_high_thresh</userinput> (in bytes), the kernel
will simply start throwing away fragmented packets until the number
of incomplete packets reaches the number specified
by <userinput>ipfrag_low_thresh</userinput>. (With 2.2 kernels, the default is usually 256K).
This will look like packet loss, and if the high threshold is
reached your server performance drops a lot.
</para>
<para>
One way to monitor this is to look at the field IP: ReasmFails in the
file <filename>/proc/net/snmp</filename>; if it goes up too quickly during heavy file
activity, you may have problem. Good alternative values for
<userinput>ipfrag_high_thresh</userinput> and <userinput>ipfrag_low_thresh</userinput>
have not been reported; if you have a good experience with a
particular value, please let the maintainers and development team know.
</para>
</sect2>
<sect2 id="autonegotiation">
<title>Turning Off Autonegotiation of NICs and Hubs</title>
<para>
Sometimes network cards will auto-negotiate badly with
hubs and switches and this can have strange effects.
Moreover, hubs may lose packets if they have different
ports running at different speeds. Try playing around
with the network speed and duplex settings.
</para>
If network cards auto-negotiate badly with hubs and switches, and ports run at
different speeds, or with different duplex configurations, performance will be
severely impacted due to excessive collisions, dropped packets, etc. If you
see excessive numbers of dropped packets in the
<command>nfsstat</command> output, or poor
network performance in general, try playing around with the network speed and
duplex settings. If possible, concentrate on establishing a 100BaseT full
duplex subnet; the virtual elimination of collisions in full duplex will
remove the most severe performance inhibitor for NFS over UDP. Be careful
when turning off autonegotiation on a card: The hub or switch that the card
is attached to will then resort to other mechanisms (such as parallel detection)
to determine the duplex settings, and some cards default to half duplex
because it is more likely to be supported by an old hub. The best solution,
if the driver supports it, is to force the card to negotiate 100BaseT
full duplex.
</para>
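<para>
If your card is supported by the <command>mii-tool</command> utility,
forcing the setting might look like the following; the interface name is a
placeholder, and not all drivers support being forced this way:
<programlisting>
# mii-tool -F 100baseTx-FD eth0
</programlisting>
</para>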
</sect2>
<sect2 id="sync-async">
<title>Synchronous vs. Asynchronous Behavior in NFS</title>
<para>
The default export behavior for both NFS Version 2 and Version 3 protocols,
used by <command>exportfs</command> in <application>nfs-utils</application>
versions prior to Version 1.11 (the latter is in the CVS tree,
but not yet released in a package, as of January, 2002) is
"asynchronous". This default permits the server to reply to client requests as
soon as it has processed the request and handed it off to the local file
system, without waiting for the data to be written to stable storage. This is
indicated by the <userinput>async</userinput> option denoted in the server's export list. It yields
better performance at the cost of possible data corruption if the server
reboots while still holding unwritten data and/or metadata in its caches. This
possible data corruption is not detectable at the time of occurrence, since
the <userinput>async</userinput> option instructs the server to lie to the client, telling the
client that all data has indeed been written to the stable storage, regardless
of the protocol used.
</para>
<para>
In order to conform with "synchronous" behavior, used as the default for most
proprietary systems supporting NFS (Solaris, HP-UX, RS/6000, etc.), and now
used as the default in the latest version of <command>exportfs</command>, the Linux Server's
file system must be exported with the <userinput>sync</userinput> option. Note that specifying
synchronous exports will result in no option being seen in the server's export
list:
</para>
<itemizedlist>
<listitem>
<para>
Export a couple of file systems to everyone, using slightly different
options:
</para>
<para>
<programlisting>
# /usr/sbin/exportfs -o rw,sync *:/usr/local
# /usr/sbin/exportfs -o rw *:/tmp
</programlisting>
</para>
</listitem>
<listitem>
<para>
Now we can see what the exported file system parameters look like:
</para>
<para>
<programlisting>
# /usr/sbin/exportfs -v
/usr/local *(rw)
/tmp *(rw,async)
</programlisting>
</para>
</listitem>
</itemizedlist>
<para>
If your kernel is compiled with the <filename>/proc</filename> filesystem,
then the file <filename>/proc/fs/nfs/exports</filename> will also show the
full list of export options.
</para>
<para>
When synchronous behavior is specified, the server will not complete (that is,
reply to the client) an NFS version 2 protocol request until the local file
system has written all data/metadata to the disk. The server
<emphasis>will</emphasis> complete a
synchronous NFS version 3 request without this delay, and will return the
status of the data in order to inform the client as to what data should be
maintained in its caches, and what data is safe to discard. There are 3
possible status values, defined in an enumerated type, <userinput>nfs3_stable_how</userinput>, in
<filename>include/linux/nfs.h</filename>. The values, along with the subsequent actions taken due
to these results, are as follows:
</para>
<itemizedlist>
<listitem>
<para>
NFS_UNSTABLE - Data/Metadata was not committed to stable storage on the
server, and must be cached on the client until a subsequent client commit
request assures that the server does send data to stable storage.
</para>
</listitem>
<listitem>
<para>
NFS_DATA_SYNC - Metadata was not sent to stable storage, and must be cached
on the client. A subsequent commit is necessary, as is required above.
</para>
</listitem>
<listitem>
<para>
NFS_FILE_SYNC - No data/metadata need be cached, and a subsequent commit
need not be sent for the range covered by this request.
</para>
</listitem>
</itemizedlist>
<para>
In addition to the above definition of synchronous behavior, the client may
explicitly insist on total synchronous behavior, regardless of the protocol,
by opening all files with the <userinput>O_SYNC</userinput> option. In this case, all replies to
client requests will wait until the data has hit the server's disk, regardless
of the protocol used (meaning that, in NFS version 3, all requests will be
<userinput>NFS_FILE_SYNC</userinput> requests, and will require that the Server returns this status).
In that case, the performance of NFS Version 2 and NFS Version 3 will be
virtually identical.
</para>
<para>
If, however, the old default <userinput>async</userinput>
behavior is used, the <userinput>O_SYNC</userinput> option has
no effect at all in either version of NFS, since the server will reply to the
client without waiting for the write to complete. In that case the performance
differences between versions will also disappear.
</para>
<para>
Finally, note that, for NFS version 3 protocol requests, a subsequent commit
request from the NFS client at file close time, or at <command>fsync()</command> time, will force
the server to write any previously unwritten data/metadata to the disk, and
the server will not reply to the client until this has been completed, as long
as <userinput>sync</userinput> behavior is followed. If <userinput>async</userinput> is used, the commit is essentially
a no-op, since the server once again lies to the client, telling the client that
the data has been sent to stable storage. This again exposes the client and
server to data corruption, since cached data may be discarded on the client
due to its belief that the server now has the data maintained in stable
storage.
</para>
</sect2>
<sect2 id="non-nfs-performance">
<title>Non-NFS-Related Means of Enhancing Server Performance</title>
<para>
Offering general guidelines for setting up a well-functioning
file server is outside the scope of this document, but a few
hints may be worth mentioning: First, RAID 5 gives you good
read speeds but lousy write speeds; consider RAID 1/0 if both
write speed and redundancy are important. Second, using a
journalling filesystem will drastically reduce your reboot
time in the event of a system crash; as of this writing, ext3
(<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/">ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/</ulink>) was the only
journalling filesystem that worked correctly with
NFS version 3, but no doubt that will change soon.
In particular, it looks like <ulink url="http://www.namesys.com">Reiserfs</ulink>
should work with NFS version 3 on 2.4 kernels, though not yet
on 2.2 kernels. Finally, using an automounter (such as autofs
or amd) may prevent hangs if you cross-mount files
on your machines (whether on purpose or by oversight) and one of those
machines goes down. See the
<ulink url="http://www.linuxdoc.org/HOWTO/mini/Automount.html">Automount Mini-HOWTO</ulink>
for details.
</para>
In general, server performance and server disk access speed will have an
important effect on NFS performance.
Offering general guidelines for setting up a well-functioning file server is
outside the scope of this document, but a few hints may be worth mentioning:
</para>
<itemizedlist>
<listitem>
<para>
If you have access to RAID arrays, use RAID 1/0 for both write speed and
redundancy; RAID 5 gives you good read speeds but lousy write speeds.
</para>
</listitem>
<listitem>
<para>
A journalling filesystem will drastically reduce your reboot time in the
event of a system crash. Currently,
<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/">ext3
</ulink> will work correctly with NFS
version 3. In addition, Reiserfs version 3.6 will work with NFS version 3 on
2.4.7 or later kernels (patches are available for previous kernels). Earlier versions
of Reiserfs did not include room for generation numbers in the inode, exposing
the possibility of undetected data corruption during a server reboot.
</para>
</listitem>
<listitem>
<para>
Additionally, journalled file systems can be configured to maximize
performance by taking advantage of the fact that journal updates are all that
is necessary for data protection. One example is using ext3 with <userinput>data=journal</userinput>
so that all updates go first to the journal, and later to the main file
system. Once the journal has been updated, the NFS server can safely issue the
reply to the clients, and the main file system update can occur at the
server's leisure. (A sample mount command illustrating this option follows
this list.)
</para>
<para>
The journal in a journalling file system may also reside on a separate device
such as a flash memory card so that journal updates normally require no seeking. With only rotational
delay imposing a cost, this gives reasonably good synchronous IO performance.
Note that ext3 currently supports journal relocation, and ReiserFS will
(officially) support it soon. The Reiserfs tool package found at <ulink
url="ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz">
ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz </ulink>
contains
the <command>reiserfstune</command> tool, which will allow journal relocation. It does, however,
require a kernel patch which has not yet been officially released as of
January, 2002.</para>
</listitem>
<listitem>
<para>
Using an automounter (such as <application>autofs</application> or <application>amd</application>) may prevent hangs if you
cross-mount files on your machines (whether on purpose or by oversight) and
one of those machines goes down. See the <ulink url="http://www.linuxdoc.org/HOWTO/mini/Automount.html">Automount Mini-HOWTO</ulink> for details.
</para>
</listitem>
<listitem>
<para>
Some manufacturers (Network Appliance, Hewlett Packard, and others) provide NFS
accelerators in the form of Non-Volatile RAM. NVRAM will boost access speed to
stable storage up to the equivalent of <userinput>async</userinput> access.
</para>
</listitem>
</itemizedlist>
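<para>
As a minimal sketch (the device name and mount point below are placeholders,
not anything prescribed by NFS), an ext3 partition intended for export could
be mounted with full data journalling like this:
</para>
<programlisting>
# mount -t ext3 -o data=journal /dev/sdb1 /export
</programlisting>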
</sect2>
</sect1>

View File

@ -3,7 +3,8 @@
<sect2 id="legal">
<title>Legal stuff</title>
<para>
Copyright (c) <2001> by Tavis Barr, Nicolai Langfeldt, and Seth Vidal.
Copyright (c) <2002> by Tavis Barr, Nicolai Langfeldt,
Seth Vidal, and Tom McNeal.
This material may be distributed only subject to the terms and conditions set
forth in the Open Publication License, v1.0 or later (the latest version
is presently available at <ulink url="http://www.opencontent.org/openpub/">http://www.opencontent.org/openpub/</ulink>).
@ -23,7 +24,7 @@ divorce, or any other calamity.
<sect2 id="feedback">
<title>Feedback</title>
<para>This will never be a finished document; we welcome feedback about
how it can be improved. As of October 2000, the Linux NFS home
how it can be improved. As of February 2002, the Linux NFS home
page is being hosted at <ulink url="http://nfs.sourceforge.net">http://nfs.sourceforge.net</ulink>. Check there
for mailing lists, bug fixes, and updates, and also to verify
who currently maintains this document.
@ -57,6 +58,8 @@ The original version of this document was developed by Nicolai
Langfeldt. It was heavily rewritten in 2000 by Tavis Barr
and Seth Vidal to reflect substantial changes in the workings
of NFS for Linux developed between the 2.0 and 2.4 kernels.
It was edited again in February 2002, when Tom McNeal made substantial
additions to the performance section.
Thomas Emmel, Neil Brown, Trond Myklebust, Erez Zadok, and Ion Badulescu
also provided valuable comments and contributions.
</para>

View File

@ -2,48 +2,55 @@
<title>Security and NFS</title>
<para>
This list of security tips and explanations will not make your site
completely secure. <emphasis>NOTHING</emphasis> will make your site completely secure. This
completely secure. <emphasis>NOTHING</emphasis> will make your site completely secure. Reading this section
may help you get an idea of the security problems with NFS. This is not
a comprehensive guide and it will always be undergoing changes. If you
have any tips or hints to give us please send them to the HOWTO
maintainer.
</para>
<para>
If you're on a network with no access to the outside world (not even a
If you are on a network with no access to the outside world (not even a
modem) and you trust all the internal machines and all your users then
this section will be of no use to you. However, it is our belief that
there are relatively few networks in this situation, so we would suggest
that anyone setting up NFS read this section thoroughly.
</para>
<para>
There are two steps to file/mount access in NFS. The first step is mount
With NFS, there are two steps required for a client to gain access to
a file contained in a remote directory on the server. The first step is mount
access. Mount access is achieved by the client machine attempting to
attach to the server. The security for this is provided by the
<filename>/etc/exports</filename> file. This file lists the names or ip addresses for machines
<filename>/etc/exports</filename> file. This file lists the names or IP addresses for machines
that are allowed to access a share point. If the client's IP address
matches one of the entries in the access list then it will be allowed to
mount. This is not terribly secure. If someone is capable of spoofing or
taking over a trusted address then they can access your mount points. To
give a real-world example of this type of "authentication": This is
equivalent to someone introducing themselves to you and you believe they
equivalent to someone introducing themselves to you and you believing they
are who they claim to be because they are wearing a sticker that says
"Hello, My Name is ...."
"Hello, My Name is ...." Once the machine has mounted a volume, its
operating system will have access to all files on the volume (with the
possible exception of those owned by root; see below) and write access
to those files as well, if the volume was exported with the
<userinput>rw</userinput> option.
</para>
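<para>
A minimal <filename>/etc/exports</filename> entry illustrating such an access
list might look like the following (the subnet shown is just an example):
</para>
<programlisting>
/home 192.168.0.0/255.255.255.0(ro)
</programlisting>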
<para>
The second step is file access. This is a function of normal file system
access controls and not a specialized function of NFS. Once the drive is
mounted the user and group permissions on the files take over access
control.
access controls on the client and not a specialized function of NFS.
Once the drive is mounted the user and group permissions on the files
determine access control.
</para>
<para>
An example: bob on the server maps to the UserID 9999. Bob
makes a file on the server that is only accessible the user (0600 in
octal). A client is allowed to mount the drive where the file is stored.
makes a file on the server that is only accessible by the user
(the equivalent to typing
<userinput>chmod 600</userinput> <emphasis>filename</emphasis>).
A client is allowed to mount the drive where the file is stored.
On the client mary maps to UserID 9999. This means that the client
user mary can access bob's file that is marked as only accessible by him.
It gets worse, if someone has root on the client machine they can
<command>su - [username]</command> and become ANY user. NFS will be none
the wiser.
It gets worse: If someone has become superuser on the client machine they can
<command>su - </command> <emphasis>username</emphasis>
and become <emphasis>any</emphasis> user. NFS will be none the wiser.
</para>
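<para>
A hypothetical session as root on the client shows how easily the file
permissions are bypassed (the mount point and file name are invented for
this example):
</para>
<screen>
# su - mary
$ cat /mnt/home/bob/private-file
</screen>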
<para>
It's not all terrible. There are a few measures you can take on the server
@ -92,8 +99,8 @@
be careful and keep up diligent monitoring of those systems.
</para>
<para>
Not all Linux distributions were created equal. Some seemingly up-to-
date distributions do not include a securable portmapper.
Not all Linux distributions were created equal. Some seemingly up-to-date
distributions do not include a securable portmapper.
The easy way to check if your portmapper is good or not is to run
<emphasis>strings(1)</emphasis> and see if it reads the relevant files, <filename>/etc/hosts.deny</filename> and
<filename>/etc/hosts.allow</filename>. Assuming your portmapper is <filename>/sbin/portmap</filename> you can
@ -137,22 +144,28 @@
the mill Linux system there are very few machines that need any access
for any reason. The portmapper administers <command>nfsd</command>,
<command>mountd</command>, <command>ypbind</command>/<command>ypserv</command>,
<command>pcnfsd</command>, and 'r' services like <command>ruptime</command> and <command>rusers</command>.
<command>rquotad</command>, <command>lockd</command> (which shows up
as <computeroutput>nlockmgr</computeroutput>), <command>statd</command>
(which shows up as <computeroutput>status</computeroutput>)
and 'r' services like <command>ruptime</command>
and <command>rusers</command>.
Of these only <command>nfsd</command>, <command>mountd</command>,
<command>ypbind</command>/<command>ypserv</command> and perhaps
<command>pcnfsd</command> are of any consequence. All machines that need
<command>rquotad</command>, <command>lockd</command>,
and <command>statd</command> are of any consequence. All machines that need
to access services on your machine should be allowed to do that. Let's
say that your machine's address is <emphasis>192.168.0.254</emphasis> and
that it lives on the subnet <emphasis>192.168.0.0</emphasis>, and that all
machines on the subnet should have access to it (those are terms introduced
by the <ulink url="http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html">Networking-Overview-HOWTO</ulink>,
go back and refresh your memory if you need to). Then we write:
machines on the subnet should have access to it (for an overview of those
terms see the <ulink url="http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html">Networking-Overview-HOWTO</ulink>). Then we write:
<screen>
portmap: 192.168.0.0/255.255.255.0
</screen>
in <filename>/etc/hosts.allow</filename>. This is the same as the network
address you give to route and the subnet mask you give to <command>ifconfig</command>. For the
device eth0 on this machine <command>ifconfig</command> should show:
in <filename>/etc/hosts.allow</filename>. If you are not sure what your
network or netmask are, you can use the <command>ifconfig</command> command to
determine the netmask and the <command>netstat</command> command to
determine the network. For example, for the
device eth0 on the above machine <command>ifconfig</command> should show:
</para>
<para>
<screen>
@ -175,7 +188,7 @@
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 174412 eth0
...
</screen>
(Network address in first column).
(The network address is in the first column).
</para>
<para>
The <filename>/etc/hosts.deny</filename> and <filename>/etc/hosts.allow</filename> files are
@ -194,28 +207,29 @@
<filename>hosts.allow</filename> and <filename>hosts.deny</filename>
files, so you should put in entries for <command>lockd</command>,
<command>statd</command>, <command>mountd</command>, and
<command>rquotad</command> in these files too.
<command>rquotad</command> in these files too. For a complete example,
see <xref linkend="hosts">.
</para>
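<para>
A sketch of what such entries might look like in
<filename>/etc/hosts.allow</filename>, reusing the example subnet from above:
</para>
<screen>
lockd: 192.168.0.0/255.255.255.0
rquotad: 192.168.0.0/255.255.255.0
mountd: 192.168.0.0/255.255.255.0
statd: 192.168.0.0/255.255.255.0
</screen>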
<para>
The above things should make your server tighter. The only remaining
problem (Yeah, right!) is someone breaking root (or boot MS-DOS) on a
trusted machine and using that privilege to send requests from a
secure port as any user they want to be.
problem is if someone gains administrative access to one of your trusted
client machines and is able to send bogus NFS requests. The next section
deals with safeguards against this problem.
</para>
</sect2>
<sect2 id="server.security">
<title>Server security: nfsd and mountd</title>
<para>
On the server we can decide that we don't want to trust the client's
root account. We can do that by using the <userinput>root_squash</userinput> option in
<filename>/etc/exports</filename>:
On the server we can decide that we don't want to trust any requests
made as root on the client. We can do that by using the
<userinput>root_squash</userinput> option in <filename>/etc/exports</filename>:
<programlisting>
/home slave1(rw,root_squash)
</programlisting>
</para>
<para>
This is, in fact, the default. It should always be turned on unless you
have a VERY good reason to turn it off. To turn it off use the
have a <emphasis>very</emphasis> good reason to turn it off. To turn it off use the
<userinput>no_root_squash</userinput> option.
</para>
<para>
@ -239,10 +253,11 @@
<para>
The TCP ports 1-1024 are reserved for root's use (and therefore sometimes
referred to as "secure ports") A non-root user cannot bind these ports.
Adding the secure option to an <filename>/etc/exports</filename> entry forces it to run on a
port below 1024, so that a malicious non-root user cannot come along and
open up a spoofed NFS dialogue on a non-reserved port. This option is set
by default.
Adding the <userinput>secure</userinput> option to an
<filename>/etc/exports</filename> entry means that the server will only listen to
requests coming from ports 1-1024 on the client, so that a malicious
non-root user on the client cannot come along and open up a spoofed
NFS dialogue on a non-reserved port. This option is set by default.
</para>
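<para>
Since <userinput>secure</userinput> is the default you will rarely need to
write it out, but as an illustration an entry using it explicitly would
simply read:
</para>
<programlisting>
/home slave1(rw,root_squash,secure)
</programlisting>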
</sect2>
<sect2 id="client.security">
@ -252,19 +267,23 @@
<para>
On the client we can decide that we don't want to trust the server too
much, in a couple of ways, with options to mount. For example, we can
forbid suid programs to work off the NFS file system with the nosuid
forbid suid programs to work off the NFS file system with the
<userinput>nosuid</userinput>
option. Some unix programs, such as passwd, are called "suid" programs:
They set the ID of the person running them to whoever is the owner of
the file. If a file is owned by root and is suid, then the program will
execute as root, so that they can perform operations (such as writing to
the password file) that only root is allowed to do. Using the nosuid
the password file) that only root is allowed to do. Using the
<userinput>nosuid</userinput>
option is a good idea and you should consider using this with all NFS
mounted disks. It means that the server's root user cannot make a suid-root
mounted disks. It means that the server's root user
cannot make a suid-root
program on the file system, log in to the client as a normal user
and then use the suid-root program to become root on the client too.
One could also forbid execution of files on the mounted file system
altogether with the <userinput>noexec</userinput> option.
But this is more likely to be impractical than nosuid since a file
But this is more likely to be impractical than
<userinput>nosuid</userinput> since a file
system is likely to at least contain some scripts or programs that need
to be executed.
</para>
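<para>
As a sketch, these options are simply added to the mount options on the
command line or in <filename>/etc/fstab</filename> (the server name and
paths below are placeholders):
</para>
<programlisting>
# mount -o rw,nosuid server.example.com:/home /mnt/home
</programlisting>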
@ -295,7 +314,7 @@
<sect3 id="securing-daemons">
<title>Securing portmapper, rpc.statd, and rpc.lockd on the client</title>
<para>
In the current (2.2.18+) implementation of nfs, full file locking is
In the current (2.2.18+) implementation of NFS, full file locking is
supported. This means that <command>rpc.statd</command> and <command>rpc.lockd</command>
must be running on the client in order for locks to function correctly.
These services require the portmapper to be running. So, most of the
@ -310,20 +329,21 @@
<para>
IPchains (under the 2.2.X kernels) and netfilter (under the 2.4.x
kernels) allow a good level of security - instead of relying on the
daemon (or in this case the tcp wrapper) to determine who can connect,
daemon (or perhaps its TCP wrapper) to
determine which machines can connect,
the connection attempt is allowed or disallowed at a lower level. In
this case you canstop the connection much earlier and more globaly which
this case, you can stop the connection much earlier and more globally, which
can protect you from all sorts of attacks.
</para>
<para>
Describing how to set up a Linux firewall is well beyond the scope of
this document. Interested readers may wish to read the Firewall-HOWTO
this document. Interested readers may wish to read the <ulink url="http://www.linuxdoc.org/HOWTO/Firewall-HOWTO.html">Firewall-HOWTO</ulink>
or the <ulink url="http://www.linuxdoc.org/HOWTO/IPCHAINS-HOWTO.HTML">IPCHAINS-HOWTO</ulink>.
For users of kernel 2.4 and above you might want to visit the netfilter webpage at:
<ulink url="http://netfilter.filewatcher.org">http://netfilter.filewatcher.org</ulink>.
If you are already familiar with the workings of ipchains or netfilter
this section will give you a few tips on how to better set up your
firewall to work with NFS.
NFS daemons to more easily firewall and protect them.
</para>
<para>
A good rule to follow for your firewall configuration is to deny all, and
@ -331,103 +351,290 @@
than you intended.
</para>
<para>
Ports to be concerned with:
<orderedlist numeration="loweralpha">
<listitem>
<para>The portmapper is on 111. (tcp and udp)</para>
</listitem>
<listitem>
<para>
nfsd is on 2049 and it can be TCP and UDP. Although NFS over TCP
is currently experimental on the server end and you will usually
just see UDP on the server, using TCP is quite stable on the
client end.
</para>
</listitem>
<listitem>
<para>
<command>mountd</command>, <command>lockd</command>, and <command>statd</command>
float around (which is why we need the portmapper to begin with) - this causes
problems. You basically have two options to deal with it:
<orderedlist numeration="lowerroman">
<listitem>
<para>
You more can more or less do a deny all on connecting ports
but explicitly allow most ports certain ips.
</para>
</listitem>
<listitem>
<para>
More recent versions of these utilities have a "-p" option
that allows you to assign them to a certain port. See the
man pages to be sure if your version supports this. You can
then allow access to the ports you have specified for your
NFS client machines, and seal off all other ports, even for
your local network.
</para>
</listitem>
</orderedlist>
</para>
</listitem>
</orderedlist>
</para>
<para>
Using IPCHAINS, a simply firewall using the first option would look
something like this:
<programlisting>
ipchains -A input -f -j ACCEPT
ipchains -A input -s trusted.net.here/trusted.netmask -d host.ip/255.255.255.255 -j ACCEPT
ipchains -A input -s 0/0 -d 0/0 -p 6 -j DENY -y -l
ipchains -A input -s 0/0 -d 0/0 -p 17 -j DENY -l
</programlisting>
</para>
<para>
The equivalent set of commands in netfilter (the firewalling tool in 2.4) is:
<programlisting>
iptables -A INPUT -f -j ACCEPT
iptables -A INPUT -s trusted.net.here/trusted.netmask -d \
host.ip/255.255.255.255 -j ACCEPT
iptables -A INPUT -s 0/0 -d 0/0 -p 6 -j DENY --syn --log-level 5
iptables -A INPUT -s 0/0 -d 0/0 -p 17 -j DENY --log-level 5
</programlisting>
</para>
<para>
The first line says to accept all packet fragments (except the first
packet fragment which will be treated as a normal packet). In theory
no packet will pass through until it is reassembled, and it won't be
reassembled unless the first packet fragment is passed. Of course
there are attacks that can be generated by overloading a machine
with packet fragments. But NFS won't work correctly unless you
let fragments through. See <xref linkend="troubleshooting"> for details.
</para>
<para>
The other three lines say trust your local networks and deny and log
everything else. It's not great and more specific rules pay off, but
more specific rules are outside of the scope of this discussion.
</para>
<para>
Some pointers if you'd like to be more paranoid or strict about your
rules. If you choose to reset your firewall rules each time <command>statd</command>,
<command>rquotad</command>, <command>mountd</command> or <command>lockd</command>
move (which is possible) you'll want to make sure you allow fragments to
your nfs server FROM your nfs client(s). If you don't you will get some very
interesting reports from the kernel regarding packets being denied. The messages
will say that a packet from port 65535 on the client to 65535 on the server
is being denied. Allowing fragments will solve this.
</para>
In order to understand how to firewall the NFS daemons, it will help
to briefly review how they bind to ports.
</para>
<para>
When a daemon starts up, it requests a free port from the portmapper.
The portmapper gets the port for the daemon and keeps track of
the port currently used by that daemon. When other hosts or processes
need to communicate with the daemon, they request the port number
from the portmapper in order to find the
daemon. So the ports will perpetually float because different ports may
be free at different times and so the portmapper will allocate them
differently each time. This is a pain for setting up a firewall. If
you never know where the daemons are going to be then you don't
know precisely which ports to allow access to. This might not be a big deal
for many people running on a protected or isolated LAN. For those
people on a public network, though, this is horrible.
</para>
<para>
In kernels 2.4.13 and later with nfs-utils 0.3.3 or later you no
longer have to worry about the floating of ports in the portmapper.
Now all of the daemons pertaining to NFS can be "pinned" to a port.
Most of them nicely take a <command>-p</command> option when they are started;
those daemons that are started by the kernel take some kernel arguments
or module options. They are described below.
</para>
<para>
Some of the daemons involved in sharing data via NFS are already
bound to a port. <command>portmap</command> is always on port
111 tcp and udp. <command>nfsd</command> is
always on port 2049 TCP and UDP (however, as of kernel 2.4.17, NFS over
TCP is considered experimental and is not for use on production machines).
</para>
<para>
The other daemons, <command>statd</command>, <command>mountd</command>,
<command>lockd</command>, and <command>rquotad</command>, will normally move
around to the first available port they are informed of by the portmapper.
</para>
<para>
To force <command>statd</command> to bind to a particular port, use the
<userinput>-p</userinput>
<emphasis>portnum</emphasis> option. To force <command>statd</command> to
respond on a particular port, additionally use the
<userinput>-o</userinput> <emphasis>portnum</emphasis> option when starting it.
</para>
<para>
To force <command>mountd</command> to bind to a particular port use the
<userinput>-p</userinput> <emphasis>portnum</emphasis> option.
</para>
<para>
For example, to have <command>statd</command> listen on port 32765 for
incoming requests and answer on port 32766, and to have
<command>mountd</command> listen on port 32767, you would type:
</para>
<programlisting>
# statd -p 32765 -o 32766
# mountd -p 32767
</programlisting>
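<para>
Assuming the port assignments above, you can verify that the daemons have
actually registered on those ports by querying the portmapper:
</para>
<screen>
# rpcinfo -p | egrep 'status|mountd'
</screen>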
<para>
<command>lockd</command> is started by the kernel when it is needed.
Therefore you need
to pass module options (if you have it built as a module) or kernel
options to force <command>lockd</command> to listen and respond
only on certain ports.
</para>
<para>
If you are using loadable modules and you would like to specify these
options in your <filename>/etc/modules.conf</filename> file, add
a line like this to the file:
</para>
<programlisting>
options lockd nlm_udpport=32768 nlm_tcpport=32768
</programlisting>
<para>
The above line would set both the UDP and the TCP port for
<command>lockd</command> to 32768.
</para>
<para>
If you are not using loadable modules or if you have compiled
<command>lockd</command> into the kernel instead of building it
as a module then you will need to pass it an option on the kernel boot line.
</para>
<para>
It should look something like this:
</para>
<programlisting>
vmlinuz 3 root=/dev/hda1 lockd.udpport=32768 lockd.tcpport=32768
</programlisting>
<para>
The port numbers do not have to match but it would simply add
unnecessary confusion if they didn't.
</para>
<para>
If you are using quotas and using <command>rpc.rquotad</command> to make these
quotas viewable over NFS, you will also need to take it into
account when setting up your firewall. There are two
<command>rpc.rquotad</command>
source trees. One of those is maintained in the
<application>nfs-utils</application> tree.
The other in the <application>quota-tools</application> tree.
They do not operate identically.
The one provided with <application>nfs-utils</application> supports
binding the daemon to a port with the <userinput>-p</userinput>
directive. The one in <application>quota-tools</application> does not.
Consult your distribution's documentation to determine if yours does.
</para>
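<para>
If your <command>rpc.rquotad</command> comes from
<application>nfs-utils</application>, pinning it looks just like pinning the
other daemons (the port number here is arbitrary):
</para>
<programlisting>
# rpc.rquotad -p 32769
</programlisting>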
<para>
For the sake of this discussion, let's describe a network and set up a
firewall to protect our NFS server.
Our NFS server is 192.168.0.42 and our only client is 192.168.0.45.
As in the example above, <command>statd</command> has been
started so that it only
binds to port 32765 for incoming requests and it must answer on
port 32766. <command>mountd</command> is forced to bind to port 32767.
<command>lockd</command>'s module parameters have been set to bind to 32768.
<command>nfsd</command> is, of course, on port 2049 and the portmapper is on port 111.
</para>
<para>
We are not using quotas.
</para>
<para>
Using <application>IPCHAINS</application>, a simple firewall
might look something like this:
</para>
<programlisting>
ipchains -A input -f -j ACCEPT -s 192.168.0.45
ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 17 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 17 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 17 -j ACCEPT
ipchains -A input -s 0/0 -d 0/0 -p 6 -j DENY -y -l
ipchains -A input -s 0/0 -d 0/0 -p 17 -j DENY -l
</programlisting>
<para>
The equivalent set of commands in <application>netfilter</application> is:
</para>
<programlisting>
iptables -A INPUT -f -j ACCEPT -s 192.168.0.45
iptables -A INPUT -s 192.168.0.45 -d 0/0 32765:32768 -p 6 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -d 0/0 32765:32768 -p 17 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -d 0/0 2049 -p 17 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -d 0/0 2049 -p 6 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -d 0/0 111 -p 6 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -d 0/0 111 -p 17 -j ACCEPT
iptables -A INPUT -s 0/0 -d 0/0 -p tcp --syn -j LOG --log-level 5
iptables -A INPUT -s 0/0 -d 0/0 -p tcp --syn -j DROP
iptables -A INPUT -s 0/0 -d 0/0 -p udp -j LOG --log-level 5
iptables -A INPUT -s 0/0 -d 0/0 -p udp -j DROP
</programlisting>
<para>
The first line says to accept all packet fragments (except the
first packet fragment which will be treated as a normal packet).
In theory no packet will pass through until it is reassembled,
and it won't be reassembled unless the first packet fragment
is passed. Of course there are attacks that can be generated
by overloading a machine with packet fragments. But NFS won't
work correctly unless you let fragments through. See <xref linkend="symptom8">
for details.
</para>
<para>
The other lines allow specific connections from any port on our
client host to the specific ports we have made available on
our server. This means that if, say, 192.168.0.46 attempts to contact
the NFS server, it will not be able to mount or see what mounts
are available.
</para>
<para>
With the new port pinning capabilities it is obviously much easier
to control what hosts are allowed to mount your NFS shares. It is
worth mentioning that NFS is not an encrypted protocol and anyone
on the same physical network could sniff the traffic and reassemble
the information being passed back and forth.
</para>
</sect2>
<sect2 id="nfs-ssh">
<title>Tunneling NFS through SSH</title>
<para>
One method of encrypting NFS traffic over a network is to
use the port-forwarding capabilities of <command>ssh</command>.
However, as we shall see, doing so has a serious drawback if you do not
utterly and completely trust the local users on your server.
</para>
<para>
The first step will be to export files to the localhost. For example, to
export the <filename>/home</filename> partition, enter the following into
<filename>/etc/exports</filename>:
<programlisting>
/home 127.0.0.1(rw)
</programlisting>
</para>
<para>
The next step is to use <command>ssh</command> to forward ports. For example,
<command>ssh</command> can tell the server to forward to any port on any
machine from a port on the client. Let us assume, as in the previous
section, that our server is 192.168.0.42, and that we have pinned
<command>mountd</command> to port 32767
using the argument <userinput>-p 32767</userinput>. Then, on the client,
we'll type:
<programlisting>
# ssh root@192.168.0.42 -L 250:localhost:2049 -f sleep 60m
# ssh root@192.168.0.42 -L 251:localhost:32767 -f sleep 60m
</programlisting>
</para>
<para>
The above command causes <command>ssh</command> on the client to take
any request directed at the client's port 250 and forward it,
first through <command>sshd</command> on the server, and then on
to the server's port 2049. The second line
causes a similar type of forwarding between requests to port 251 on
the client and port 32767 on the server. The
<userinput>localhost</userinput> is relative to the server; that is,
the forwarding will be done to the server itself. The port could otherwise
have been made to forward to any other machine, and the requests would look to
the outside world as if they were coming from the server. Thus, the requests
will appear to NFSD on the server as if they are coming from the server itself.
Note that in order to bind to a port below 1024 on the client, we have
to run this command as root on the client. Doing this will be necessary
if we have exported our filesystem with the default
<userinput>secure</userinput> option.
</para>
<para>
Finally, we are pulling a little trick with the last option,
<userinput>-f sleep 60m</userinput>. Normally, when
we use <command>ssh</command>, even with the <userinput>-L</userinput> option,
we will open up a shell on the remote machine. But instead, we just want
the port forwarding to execute in the background so that we get our shell
on the client back. So, we tell <command>ssh</command> to execute a command
in the background on the server to sleep for 60 minutes. This will cause
the port to be forwarded for 60 minutes until it gets a connection; at that
point, the port will continue to be forwarded until the connection dies or
until the 60 minutes are up, whichever happens later. The above command
could be put in our startup scripts on the client, right after the network
is started.
</para>
<para>
Next, we have to mount the filesystem on the client. To do this, we tell
the client to mount a filesystem on the localhost, but at a different
port from the usual 2049. Specifically, an entry in <filename>/etc/fstab</filename>
would look like:
<programlisting>
localhost:/home /mnt/home nfs rw,hard,intr,port=250,mountport=251 0 0
</programlisting>
</para>
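<para>
Equivalently, for a quick test you can pass the same options directly to
<command>mount</command> on the command line:
</para>
<screen>
# mount -t nfs -o rw,hard,intr,port=250,mountport=251 localhost:/home /mnt/home
</screen>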
<para>Having done this, we can see why the above will be incredibly insecure
if we have <emphasis>any</emphasis> ordinary users who are able to log in
to the server locally. If they can, there is nothing preventing them from
doing what we did and using <command>ssh</command> to forward a privileged
port on their own client machine (where they are legitimately root) to ports
2049 and 32767 on the server. Thus, any ordinary user on the server can
mount our filesystems with the same rights as root on our client.
</para>
<para>
If you are using an NFS server that does not have a way for ordinary users
to log in, and you wish to use this method, there are two additional caveats:
First, the connection travels from the client to the server via
<command>sshd</command>; therefore you will have to leave port 22 (where
<command>sshd</command> listens) open to your client on the firewall. However
you do not need to leave the other ports, such as 2049 and 32767, open
anymore. Second, file locking will no longer work. It is not possible
to ask <command>statd</command> or the locking manager to make requests
to a particular port for a particular mount; therefore, any locking requests
will cause <command>statd</command> to connect to <command>statd</command>
on localhost, i.e., itself, and it will fail with an error. Any attempt
to correct this would require a major rewrite of NFS.
</para>
<para>
It may also be possible to use <application>IPSec</application> to encrypt
network traffic between your client and your server, without compromising
any local security on the server; this will not be taken up here.
See the <ulink url="http://www.freeswan.org/">FreeS/WAN</ulink> home page
for details on using IPSec under Linux.
</para>
</sect2>
<sect2 id="summary">
<title>Summary</title>
<para>
If you use the <filename>hosts.allow</filename>, <filename>hosts.deny</filename>,
<filename>root_squash</filename>, <userinput>nosuid</userinput> and privileged
port features in the portmapper/nfs software you avoid many of the
presently known bugs in nfs and can almost feel secure about that at
<userinput>root_squash</userinput>, <userinput>nosuid</userinput> and privileged
port features in the portmapper/NFS software, you avoid many of the
presently known bugs in NFS and can almost feel secure about that at
least. But still, after all that: When an intruder has access to your
network, s/he can make strange commands appear in your <filename>.forward</filename> or
read your mail when <filename>/home</filename> or <filename>/var/mail</filename> is
NFS exported. For the same reason, you should never access your PGP private key
over nfs. Or at least you should know the risk involved. And now you know a bit
over NFS. Or at least you should know the risk involved. And now you know a bit
of it.
</para>
<para>
@ -438,4 +645,3 @@
</para>
</sect2>
</sect1>

View File

@ -59,9 +59,11 @@
<glossdef>
<para>
client machines that will have access to the directory. The machines
may be listed by their IP address or their DNS address
may be listed by their DNS address or their IP address
(e.g., <emphasis>machine.company.com</emphasis> or <emphasis>192.168.0.8</emphasis>).
Using IP addresses is more reliable and more secure.
Using IP addresses is more reliable and more secure. If you need to
use DNS addresses, and they do not seem to be resolving to the right
machine, see <xref linkend="symptom3">.
</para>
</glossdef>
</glossentry>
@ -85,11 +87,14 @@
</listitem>
<listitem>
<para>
<userinput>no_root_squash</userinput>: By default, any file request made by user root
<userinput>no_root_squash</userinput>: By default,
any file request made by user <computeroutput>root</computeroutput>
on the client machine is treated as if it is made by user
nobody on the server. (Excatly which UID the request is
<computeroutput>nobody</computeroutput> on the
server. (Exactly which UID the request is
mapped to depends on the UID of user "nobody" on the server,
not the client.) If no_root_squash is selected, then
not the client.) If <userinput>no_root_squash</userinput>
is selected, then
root on the client machine will have the same level of access
to the files on the system as root on the server. This
can have serious security implications, although it may be
@ -109,19 +114,17 @@
</listitem>
<listitem>
<para>
<userinput>sync</userinput>: By default, a Version 2 NFS server will tell a client
machine that a file write is complete when NFS has finished
handing the write over to the filesysytem; however, the file
system may not sync it to the disk, even if the client makes
a sync() call on the file system. The default behavior may
therefore cause file corruption if the server reboots. This
option forces the filesystem to sync to disk every time NFS
completes a write operation. It slows down write times
substantially but may be necessary if you are running NFS
Version 2 in a production environment. Version 3 NFS has
a commit operation that the client can call that
actually will result in a disk sync on the server end.
</para>
<userinput>sync</userinput>:
By default, all but the most recent version (version 1.11)
of the <command>exportfs</command> command will use
<userinput>async</userinput> behavior, telling a client
machine that a file write is complete - that is, has been written
to stable storage - when NFS has finished handing the write over to
the filesystem. This behavior may cause data corruption if the
server reboots, and the <userinput>sync</userinput> option prevents
this. See <xref linkend="sync-async"> for a complete discussion of
<userinput>sync</userinput> and <userinput>async</userinput> behavior.
</para>
</listitem>
</itemizedlist>
</para>
@ -174,7 +177,9 @@
</para>
<para>
Third, you can use wildcards such as <emphasis>*.foo.com</emphasis> or
<emphasis>192.168.</emphasis> instead of hostnames.
<emphasis>192.168.</emphasis> instead of hostnames. There were problems
with wildcard implementation in the 2.2 kernel series that were fixed
in kernel 2.2.19.
</para>
<para>
However, you should keep in mind that any of these simplifications
@ -205,7 +210,8 @@
<title>/etc/hosts.allow and /etc/hosts.deny</title>
<para>
These two files specify which computers on the network can use
services on your machine. Each line of the file is an entry listing
services on your machine. Each line of the file
contains a single entry listing
a service and a set of machines. When the server gets a request
from a machine, it does the following:
<itemizedlist>
@ -232,7 +238,8 @@
</itemizedlist>
</para>
<para>
In addition to controlling access to services handled by inetd (such
In addition to controlling access to services
handled by <command>inetd</command> (such
as telnet and FTP), this file can also control access to NFS
by restricting connections to the daemons that provide NFS services.
Restrictions are done on a per-service basis.
@ -257,8 +264,8 @@
</para>
<para>
In general it is a good idea with NFS (as with most internet services)
to explicitly deny access to hosts that you don't need to allow access
to.
to explicitly deny access to IP addresses that you don't need
to allow access to.
</para>
<para>
The first step in doing this is to add the following entry to
@ -270,10 +277,10 @@
</screen>
</para>
<para>
Starting with nfs-utils 0.2.0, you can be a bit more careful by
Starting with <application>nfs-utils</application> 0.2.0, you can be a bit more careful by
controlling access to individual daemons. It's a good precaution
since an intruder will often be able to weasel around the portmapper.
If you have a newer version of NFS-utils, add entries for each of the
If you have a newer version of <application>nfs-utils</application>, add entries for each of the
NFS daemons (see the next section to find out what these daemons are;
for now just put entries for them in hosts.deny):
</para>
@ -286,7 +293,7 @@
</screen>
</para>
<para>
Even if you have an older version of <emphasis>nfs-utils</emphasis>, adding these entries
Even if you have an older version of <application>nfs-utils</application>, adding these entries
is at worst harmless (since they will just be ignored) and at best
will save you some trouble when you upgrade. Some sys admins choose
to put the entry <userinput>ALL:ALL</userinput> in the file <filename>/etc/hosts.deny</filename>,
@ -310,7 +317,7 @@
<para>
Here, host is the IP address of a potential client; it may be possible
in some versions to use the DNS name of the host, but it is strongly
deprecated.
discouraged.
</para>
<para>
Suppose we have the setup above and we just want to allow access
@ -351,7 +358,8 @@
The NFS server should now be configured and we can start it running.
First, you will need to have the appropriate packages installed.
This consists mainly of a new enough kernel and a new enough version
of the nfs-utils package. See <xref linkend="swprereq"> if you are in doubt.
of the <application>nfs-utils</application> package.
See <xref linkend="swprereq"> if you are in doubt.
</para>
<para>
Next, before you can start NFS, you will need to have TCP/IP
@ -366,7 +374,8 @@
Verifying that NFS is running. If this does not work, or if
you are not in a position to reboot your machine, then the following
section will tell you which daemons need to be started in order to
run NFS services. If for some reason nfsd was already running when
run NFS services. If for some reason <command>nfsd</command>
was already running when
you edited your configuration files above, you will have to flush
your configuration; see <xref linkend="later"> for details.
</para>
@ -385,13 +394,15 @@
<sect3 id="daemons">
<title>The Daemons</title>
<para>
NFS serving is taken care of by five daemons: rpc.nfsd, which does
most of the work; rpc.lockd and rpc.statd, which handle file locking;
rpc.mountd, which handles the initial mount requests, and
rpc.rquotad, which handles user file quotas on exported volumes.
Starting with 2.2.18, lockd is called by nfsd upon demand, so you do
not need to worry about starting it yourself. statd will need to be
started separately. Most recent Linux distributions will
NFS serving is taken care of by five daemons: <command>rpc.nfsd</command>,
which does most of the work; <command>rpc.lockd</command> and
<command>rpc.statd</command>, which handle file locking;
<command>rpc.mountd</command>, which handles the initial mount requests,
and <command>rpc.rquotad</command>, which handles user file quotas on
exported volumes. Starting with 2.2.18, <command>lockd</command>
is called by <command>nfsd</command> upon demand, so you do
not need to worry about starting it yourself. <command>statd</command>
will need to be started separately. Most recent Linux distributions will
have startup scripts for these daemons.
</para>
<para>
@ -403,9 +414,9 @@
then you should add them, configured to start in the following
order:
<simplelist>
<member>rpc.portmap</member>
<member>rpc.mountd, rpc.nfsd</member>
<member>rpc.statd, rpc.lockd (if necessary), rpc.rquotad</member>
<member><command>rpc.portmap</command></member>
<member><command>rpc.mountd</command>, <command>rpc.nfsd</command></member>
<member><command>rpc.statd</command>, <command>rpc.lockd</command> (if necessary), and <command>rpc.rquotad</command></member>
</simplelist>
</para>
<para>
@ -461,9 +472,12 @@
such as Solaris default to TCP.
</para>
<para>
If you do not at least see a line that says "portmapper", a line
that says "nfs", and a line that says "mountd" then you will need
to backtrack and try again to start up the daemons (see <xref linkend="troubleshooting">,
If you do not at least see a line that says
<computeroutput>portmapper</computeroutput>, a line that says
<computeroutput>nfs</computeroutput>, and a line that says
<computeroutput>mountd</computeroutput> then you will need
to backtrack and try again to start up the daemons
(see <xref linkend="troubleshooting">,
Troubleshooting, if this still doesn't work).
</para>
<para>
@ -476,16 +490,16 @@
<para>
If you come back and change your <filename>/etc/exports</filename> file, the changes you
make may not take effect immediately. You should run the command
<command>exportfs -ra</command> to force nfsd to re-read the <filename>/etc/exports</filename>
  file. If you can't find the <command>exportfs</command> command, then you can kill nfsd with the
<command>exportfs -ra</command> to force <command>nfsd</command> to re-read the <filename>/etc/exports</filename>
  file. If you can't find the <command>exportfs</command> command, then you can kill <command>nfsd</command> with the
<userinput> -HUP</userinput> flag (see the man pages for kill for details).
</para>
<para>
If that still doesn't work, don't forget to check <filename>hosts.allow</filename> to
make sure you haven't forgotten to list any new client machines
there. Also check the host listings on any firewalls you may have
set up (see <xref linkend="troubleshooting"> for more details on firewalls
and NFS).
set up (see <xref linkend="troubleshooting"> and
<xref linkend="security"> for more details on firewalls and NFS).
</para>
</sect2>
</sect1>

View File

@ -15,9 +15,10 @@
There are several ways of doing this. The most reliable
way is to look at the file <filename>/proc/mounts</filename>,
which will list all mounted filesystems and give details about them. If
this doesn't work (for example if you don't have the /proc
this doesn't work (for example if you don't
have the <filename>/proc</filename>
filesystem compiled into your kernel), you can type
'mount -f' although you get less information.
<userinput>mount -f</userinput> although you get less information.
</para>
<para>
If the file system appears to be mounted, then you may
@ -50,7 +51,8 @@
<orderedlist numeration="loweralpha">
<listitem>
<para>
failed, reason given by server: Permission denied
failed, reason given by server:
<computeroutput>Permission denied</computeroutput>
</para>
<para>
This means that the server does not recognize that you
@ -63,7 +65,8 @@
volume is exported and that your client has the right
kind of access to it. For example, if a client only
has read access then you have to mount the volume
with the ro option rather than the rw option.
with the <userinput>ro</userinput> option rather
than the <userinput>rw</userinput> option.
</para>
</listitem>
<listitem>
@ -88,16 +91,33 @@
in <filename>/etc/hosts</filename> that is throwing off the server, or
you may not have listed the client's complete address
and it may be resolving to a machine in a different
domain. Try to ping the client from the server, and try
to ping the server from the client. If this doesn't work,
domain. One trick is to log in to the server from the
client via <command>ssh</command> or <command>telnet</command>;
if you then type <command>who</command>, one of the listings
should be your login session and the name of your client
machine as the server sees it. Try using this machine name
in your <filename>/etc/exports</filename> entry.
Finally, try to ping the client from the server, and try
to <command>ping</command> the server from the client. If this doesn't work,
or if there is packet loss, you may have lower-level network
problems.
</para>
</listitem>
<listitem>
<para>
It is not possible to export both a directory and its child
(for example both
<filename>/usr</filename> and <filename>/usr/local</filename>).
You should export the parent directory with the necessary
permissions, and all of its subdirectories can then be
mounted with those same permissions.
</para>
</listitem>
</orderedlist>
</listitem>
<listitem>
<para>RPC: Program Not Registered (or another "RPC" error):</para>
<para><computeroutput>
RPC: Program Not Registered</computeroutput>: (or another "RPC" error):</para>
<para>
This means that the client does not detect NFS running
on the server. This could be for several reasons.
@ -136,13 +156,15 @@
This says that we have NFS versions 2 and 3, rpc.statd
version 1, network lock manager (the service name for
rpc.lockd) versions 1, 3, and 4. There are also different
<command>rpc.lockd</command>) versions 1, 3, and 4.
There are also different
service listings depending on whether NFS is travelling over
TCP or UDP. UDP is usually (but not always) the default
unless TCP is explicitly requested.
</para>
<para>
If you do not see at least portmapper, nfs, and mountd, then
If you do not see at least <computeroutput>portmapper</computeroutput>, <computeroutput>nfs</computeroutput>, and
<computeroutput>mountd</computeroutput>, then
you need to restart NFS. If you are not able to restart
successfully, proceed to <xref linkend="symptom9" endterm="sym9short">.
</para>
@ -150,8 +172,9 @@
<listitem>
<para>
Now check to make sure you can see it from the client.
On the client, type <command>rpcinfo -p [server]</command> where
<command>[server]</command> is the DNS name or IP address of your server.
On the client, type <command>rpcinfo -p </command>
<emphasis>server</emphasis> where <emphasis>server</emphasis>
is the DNS name or IP address of your server.
</para>
<para>
If you get a listing, then make sure that the type
@ -160,13 +183,14 @@
NFS, make sure Version 3 is listed; if you are trying
to mount using NFS over TCP, make sure that is
registered. (Some non-Linux clients default to TCP).
See man rpcinfo for more details on how
Type <userinput>man rpcinfo</userinput> for more details on how
to read the output. If the type of mount you are
trying to perform is not listed, try a different
type of mount.
</para>
<para>
If you get the error No Remote Programs Registered,
If you get the error
<computeroutput>No Remote Programs Registered</computeroutput>,
then you need to check your <filename>/etc/hosts.allow</filename> and
<filename>/etc/hosts.deny</filename> files on the server and make sure
your client actually is allowed access. Again, if the
@ -176,9 +200,11 @@
the client. Also check the error logs on the system
for helpful messages: Authentication errors from bad
<filename>/etc/hosts.allow</filename> entries will usually appear in
<filename>/var/log/messages</filename>, but may appear somewhere else depending
<filename>/var/log/messages</filename>,
but may appear somewhere else depending
on how your system logs are set up. The man pages
for syslog can help you figure out how your logs are
for <computeroutput>syslog</computeroutput> can
help you figure out how your logs are
set up. Finally, some older operating systems may
behave badly when routes between the two machines
are asymmetric. Try typing <command>tracepath [server]</command> from
@ -188,8 +214,9 @@
not usually a problem on recent linux distributions.
</para>
<para>
If you get the error Remote system error - No route
to host, but you can ping the server correctly,
If you get the error
<computeroutput>Remote system error - No route to host</computeroutput>,
but you can ping the server correctly,
then you are the victim of an overzealous
firewall. Check any firewalls that may be set up,
either on the server or on any routers in between
@ -221,7 +248,7 @@
<filename>/proc/mounts</filename> and make sure the volume
is mounted read/write (although if it is mounted read-only
you ought to get a more specific error message). If not then
you need to re-mount with the rw option.
you need to re-mount with the <userinput>rw</userinput> option.
</para>
<para>
The second problem has to do with username mappings, and is
@ -292,7 +319,8 @@
</screen>
</para>
<para>
These happen when a NFS setattr operation is attempted on a
These happen when a NFS <computeroutput>setattr</computeroutput>
operation is attempted on a
file you don't have write access to. The messages are
harmless.
</para>
@ -309,8 +337,8 @@
</screen>
</para>
<para>
The "can't get a request slot" message means that the client-
side RPC code has detected a lot of timeouts (perhaps due to
The "can't get a request slot" message means that the client-side
RPC code has detected a lot of timeouts (perhaps due to
network congestion, perhaps due to an overloaded server), and
is throttling back the number of concurrent outstanding
requests in an attempt to lighten the load. The cause of
@ -336,7 +364,7 @@ nfs warning: mount version older than kernel
</listitem>
<listitem>
<para>
Errors in startup/shutdown log for lockd
Errors in startup/shutdown log for <command>lockd</command>
</para>
<para>
You may see a message of the following kind in your boot log:
@ -345,10 +373,12 @@ nfslock: rpc.lockd startup failed
</screen>
</para>
<para>
They are harmless. Older versions of rpc.lockd needed to be
They are harmless. Older versions of <command>rpc.lockd</command> needed to be
started up manually, but newer versions are started automatically
by knfsd. Many of the default startup scripts still try to start
up lockd by hand, in case it is necessary. You can alter your
by <command>nfsd</command>. Many of the
default startup scripts still try to start
up <command>lockd</command> by hand, in case
it is necessary. You can alter your
startup scripts if you want the messages to go away.
</para>
</listitem>
@ -376,19 +406,19 @@ kmem_create: forcing size word alignment - nfs_fh
</title>
<titleabbrev id="sym7short">Symptom 7</titleabbrev>
<para>
<emphasis>
<filename>/etc/exports</filename> is VERY sensitive to whitespace - so the
<filename>/etc/exports</filename> is <emphasis>very</emphasis> sensitive to whitespace - so the
following statements are not the same:
</emphasis>
<programlisting>
/export/dir hostname(rw,no_root_squash)
/export/dir hostname (rw,no_root_squash)
</programlisting>
The first will grant hostname rw access to <filename>/export/dir</filename>
The first will grant <userinput>hostname rw</userinput>
access to <filename>/export/dir</filename>
without squashing root privileges. The second will grant
hostname rw privs w/root squash and it will grant EVERYONE
else read-write access, without squashing root privileges.
Nice huh?
<userinput>hostname rw</userinput> privileges with
<userinput>root squash</userinput> and it will grant
<emphasis>everyone</emphasis> else read/write access, without
squashing root privileges. Nice huh?
</para>
</sect2>
<sect2 id="symptom8">
@ -412,8 +442,10 @@ kmem_create: forcing size word alignment - nfs_fh
</listitem>
<listitem>
<para>
You may be using a larger rsize and wsize in your mount options
than the server supports. Try reducing rsize and wsize to 1024 and
You may be using a larger <userinput>rsize</userinput>
and <userinput>wsize</userinput> in your mount options
than the server supports. Try reducing <userinput>rsize</userinput>
and <userinput>wsize</userinput> to 1024 and
seeing if the problem goes away. If it does, then increase them
slowly to a more reasonable value.
</para>
@ -430,6 +462,19 @@ kmem_create: forcing size word alignment - nfs_fh
to reinstall your binaries if none of these ideas helps.
</para>
</sect2>
<sect2 id="symptom10">
<title>File Corruption When Using Multiple Clients</title>
<titleabbrev id="sym10short">Symptom 10</titleabbrev>
<para>
If a file has been modified within one second of its
previous modification and left the same size, it will
continue to generate the same inode number. Because
of this, constant reads and writes to a file by
multiple clients may cause file corruption. Fixing
this bug requires changes deep within the filesystem
layer, and therefore it is a 2.5 item.
</para>
</sect2>
</sect1>