mirror of https://github.com/tLDP/LDP
updated
This commit is contained in: parent 77b94ddeee, commit f61249f8a1
@ -22,7 +22,7 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Barr</surname>
<affiliation>
<address>
<email>tavis dot barr at liu dot edu</email>
</address>
</affiliation>
</author>
@ -31,7 +31,7 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Langfeldt</surname>
<affiliation>
<address>
<email>janl at linpro dot no</email>
</address>
</affiliation>
</author>
@ -40,7 +40,16 @@ jade -t sgml -i html -d /usr/lib/sgml/stylesheets/ldp.dsl\#html ../nfs-howto.sgm
<surname>Vidal</surname>
<affiliation>
<address>
<email>skvidal at phy dot duke dot edu</email>
</address>
</affiliation>
</author>
<author>
<firstname>Tom</firstname>
<surname>McNeal</surname>
<affiliation>
<address>
<email>trmcneal at attbi dot com</email>
</address>
</affiliation>
</author>
@ -9,19 +9,24 @@
standard distributions do. If you are using a 2.2 or later kernel
with the <filename>/proc</filename> filesystem you can check the latter by reading the
file <filename>/proc/filesystems</filename> and making sure there is a line containing
nfs. If not, typing <userinput>insmod nfs</userinput> may make it
magically appear if NFS has been compiled as a module; otherwise,
you will need to build (or download) a kernel that has
NFS support built in. In general, kernels that do not have NFS
compiled in will give a very specific error when the
<command>mount</command> command below is run.
</para>
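The check described above can be scripted. This is a minimal sketch that greps a sample copy of the file so it is self-contained; on a real 2.2-or-later client you would grep <filename>/proc/filesystems</filename> itself (the sample contents below are illustrative):

```shell
# Check for NFS support in the kernel's filesystem list. A sample file
# stands in for /proc/filesystems here so the snippet is self-contained.
cat > /tmp/filesystems.sample <<'EOF'
nodev	proc
	ext2
nodev	nfs
EOF
if grep -qw nfs /tmp/filesystems.sample; then
    echo "NFS support present"
else
    echo "no NFS support: try 'insmod nfs' or rebuild the kernel"
fi
```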
<para>
To begin using a machine as an NFS client, you will need the portmapper
running on that machine, and to use NFS file locking, you will
also need <command>rpc.statd</command> and <command>rpc.lockd</command>
running on both the client and the server. Most recent distributions
start those services by default at boot time; if yours doesn't, see
<xref linkend="config"> for information on how to start them up.
</para>
<para>
With <command>portmap</command>, <command>lockd</command>,
and <command>statd</command> running, you should now be able to
mount the remote directory from your server just the way you mount
a local hard drive, with the mount command. Continuing our example
from the previous section, suppose our server above is called
@ -32,7 +37,9 @@
# mount master.foo.com:/home /mnt/home
</screen>
and the directory <filename>/home</filename> on master will appear as the directory
<filename>/mnt/home</filename> on <emphasis>slave1</emphasis>. (Note that
this assumes we have created the directory <filename>/mnt/home</filename>
as an empty mount point beforehand.)
</para>
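The steps above can be sketched end to end. This uses a path under /tmp so it is harmless to try, and the actual mount is commented out since it needs a live NFS server; substitute /mnt/home and your real server name in practice:

```shell
# Create the (empty) mount point first, then mount. master.foo.com is
# the example server from the text; the mount itself is commented out
# because it requires a live NFS server.
mkdir -p /tmp/mnt/home
# mount master.foo.com:/home /tmp/mnt/home
test -d /tmp/mnt/home && echo "mount point ready"
```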
<para>
If this does not work, see the Troubleshooting section (<xref linkend="troubleshooting">).
@ -106,9 +113,10 @@
The program accessing a file on a NFS mounted file system
will hang when the server crashes. The process cannot be
interrupted or killed (except by a "sure kill") unless you also
specify <userinput>intr</userinput>. When the
NFS server is back online the program will
continue undisturbed from where it was. We recommend using
<userinput>hard,intr</userinput> on all NFS mounted file systems.
</para>
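In <filename>/etc/fstab</filename>, the recommended options look like the following entry. The server name and paths are the examples used earlier in this document; the sketch writes to a scratch file rather than the real <filename>/etc/fstab</filename>:

```shell
# An fstab entry using the recommended hard,intr options; written to a
# scratch file here -- append to the real /etc/fstab on an actual client.
cat > /tmp/fstab.nfs <<'EOF'
master.foo.com:/home  /mnt/home  nfs  rw,hard,intr  0  0
EOF
cat /tmp/fstab.nfs
```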
</glossdef>
</glossentry>
@ -53,8 +53,10 @@
<orderedlist numeration="lowerroman">
<listitem>
<para>
Version 4.3.2 of AIX, and possibly earlier versions as well,
requires that file systems be exported with
the <userinput>insecure</userinput> option, which
causes NFS to listen to requests from
insecure ports (i.e., ports above 1024, to which non-root users can
bind). Older versions of AIX do not seem to require this.
</para>
@ -63,12 +65,14 @@
<para>
AIX clients will default to mounting version 3 NFS over TCP.
If your Linux server does not support this, then you may need
to specify <userinput>vers=2</userinput> and/or
<userinput>proto=udp</userinput> in your mount options.
</para>
</listitem>
<listitem>
<para>
Using netmasks in <filename>/etc/exports</filename>
seems to sometimes cause clients
to lose mounts when another client is reset. This can be fixed
by listing out hosts explicitly.
</para>
@ -95,13 +99,15 @@
<title>Linux servers and BSD clients</title>
<para>
Some versions of BSD may make requests to the server from insecure ports,
in which case you will need to export your volumes with the
<userinput>insecure</userinput>
option. See the man page for <emphasis>exports(5)</emphasis>
for more details.
</para>
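An exports entry with the <userinput>insecure</userinput> option looks like the following sketch. The client hostname is illustrative, and the entry is written to a scratch file rather than the real <filename>/etc/exports</filename>:

```shell
# An /etc/exports entry on the Linux server allowing requests from
# unprivileged ports (the hostname is illustrative); written to a
# scratch file rather than the real /etc/exports.
cat > /tmp/exports.sample <<'EOF'
/home   bsdbox.foo.com(rw,insecure)
EOF
cat /tmp/exports.sample
```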
</sect3>
</sect2>
<sect2 id="tru64">
<title>Tru64 Unix</title>
<sect3 id="tru64server">
<title>Tru64 Unix Servers and Linux Clients</title>
<para>
@ -117,6 +123,12 @@
-root=slave1.foo.com:slave2.foo.com
</programlisting>
</para>
<para>
(The <userinput>root</userinput> option is listed in the last
entry for informational purposes only; its use is not recommended
unless necessary.)
</para>
<para>
Tru64 checks the <filename>/etc/exports</filename> file every time there is a mount request
so you do not need to run the <command>exportfs</command> command; in fact on many
@ -129,7 +141,8 @@
There are two issues to watch out for here. First, Tru64 Unix mounts
using Version 3 NFS by default. You will see mount errors if your
Linux server does not support Version 3 NFS. Second, in Tru64 Unix
4.x, NFS locking requests are made by
<computeroutput>daemon</computeroutput>. You will therefore
need to specify the <userinput>insecure_locks</userinput> option on all volumes you export
to a Tru64 Unix 4.x client; see the <command>exports</command> man pages for details.
</para>
@ -153,7 +166,9 @@
<title>Linux Servers and HP-UX Clients</title>
<para>
HP-UX diskless clients will require at least kernel version 2.2.19
(or patched 2.2.18) for device files to export correctly. Also, any
exports to an HP-UX client will need to be exported with the
<userinput>insecure_locks</userinput> option.
</para>
</sect3>
</sect2>
@ -176,11 +191,112 @@
2.4 kernel. As a workaround, you can export and mount lower-down
file systems separately.
</para>
<para>
As of Kernel 2.4.17, there continue to be several minor interoperability
issues that may require a kernel upgrade. In particular:
<itemizedlist>
<listitem>
<para>
Make sure that Trond Myklebust's <application>seekdir</application>
(or <application>dir</application>) kernel patch is applied.
The latest version (for 2.4.17) is located at:
</para>
<para>
<ulink url="http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif">
http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif</ulink>
</para>
</listitem>
<listitem>
<para>
IRIX servers do not always use the same
<computeroutput>fsid</computeroutput> attribute field across
reboots, which results in <computeroutput>inode number mismatch</computeroutput>
errors on a Linux
client if the mounted IRIX server reboots. A patch is available from:
</para>
<para><ulink url="http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/">
http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/</ulink>
</para>
</listitem>
<listitem>
<para>
Linux kernels v2.4.9 and above have problems reading large directories
(hundreds of files) from exported IRIX XFS file systems that were made
with <userinput>naming version=1</userinput>.
The reason for the problem can be found at:
</para>
<para>
<ulink url="http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/">
http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/</ulink>
</para>
<para>
The naming version can be found by using (on the IRIX server):
</para>
<programlisting>
xfs_growfs -n mount_point
</programlisting>
<para>
The workaround is to export these file systems using the
<userinput>-32bitclients</userinput>
option in the <filename>/etc/exports</filename> file.
The fix is to convert the file system to 'naming version=2'.
Unfortunately the only way to do this is by a
<userinput>backup</userinput>/<userinput>mkfs</userinput>/<userinput>restore</userinput>.
</para>
<para>
<command>mkfs_xfs</command> on IRIX 6.5.14 (and above)
creates <userinput>naming version=2</userinput> XFS file
systems by default. On IRIX 6.5.5 to 6.5.13, use:
<programlisting>
mkfs_xfs -n version=2 device
</programlisting>
</para>
<para>
Versions of IRIX prior to 6.5.5 do not support
<userinput>naming version=2</userinput> XFS file systems.
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="irixclient">
<title>IRIX clients and Linux servers</title>
<para>
IRIX versions up to 6.5.12 have problems mounting file systems exported
from Linux boxes - the mount point "gets lost," e.g.,
<programlisting>
# mount linux:/disk1 /mnt
# cd /mnt/xyz/abc
# pwd
/xyz/abc
</programlisting>
</para>
<para>
This is a known IRIX bug (SGI bug 815265 - IRIX not liking file handles of
less than 32 bytes), which is fixed in <application>IRIX 6.5.13</application>.
If it is not possible
to upgrade to <application>IRIX 6.5.13</application>, then the unofficial
workaround is to force the Linux <command>nfsd</command>
to always use 32 byte file handles.
</para>
<para>
A number of patches exist - see:
<itemizedlist>
<listitem>
<para>
<ulink url="http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/">
http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/</ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html">
http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html</ulink>
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
@ -190,10 +306,11 @@
<title>Solaris Servers</title>
<para>
Solaris has a slightly different format on the server end from
other operating systems. Instead of
<filename>/etc/exports</filename>, the configuration
file is <filename>/etc/dfs/dfstab</filename>. Entries are of
the form of a <command>share</command> command, where the syntax
for the example in <xref linkend="server"> would look like
<programlisting>
share -o rw=slave1,slave2 -d "Master Usr" /usr
</programlisting>
@ -202,11 +319,13 @@ share -o rw=slave1,slave2 -d "Master Usr" /usr
<para>
Solaris servers are especially sensitive to packet size. If you
are using a Linux client with a Solaris server, be sure to set
<userinput>rsize</userinput> and <userinput>wsize</userinput>
to 32768 at mount time.
</para>
<para>
Finally, there is an issue with root squashing on Solaris: root gets
mapped to the user <computeroutput>noone</computeroutput>, which
is not the same as the user <computeroutput>nobody</computeroutput>.
If you are having trouble with file permissions as root on the client
machine, be sure to check that the mapping works as you expect.
</para>
@ -221,7 +340,7 @@ svc: unknown program 100227 (me 100003)
</screen>
<para>
This happens because Solaris clients, when they mount, try to obtain
ACL information - which Linux obviously does not have. The messages
can safely be ignored.
</para>
<para>
@ -249,11 +368,16 @@ svc: unknown program 100227 (me 100003)
/home -rw=slave1.foo.com,slave2.foo.com, root=slave1.foo.com,slave2.foo.com
</programlisting>
</para>
<para>
Again, the <userinput>root</userinput> option is listed for informational
purposes and is not recommended unless necessary.
</para>
</sect3>
<sect3 id="sunosclient">
<title>SunOS Clients</title>
<para>
Be advised that SunOS makes all NFS locking requests
as <computeroutput>daemon</computeroutput>, and
therefore you will need to add the <userinput>insecure_locks</userinput> option to any
volumes you export to a SunOS machine. See the <command>exports</command> man page
for details.
@ -17,7 +17,8 @@
</para>
<para>
There are other systems that provide similar functionality to NFS.
Samba (<ulink url="http://www.samba.org">http://www.samba.org</ulink>)
provides file services to Windows clients. The Andrew File
System from IBM (<ulink url="http://www.transarc.com/Product/EFS/AFS/index.html">http://www.transarc.com/Product/EFS/AFS/index.html</ulink>),
recently open-sourced, provides a file sharing mechanism with some
additional security and performance features. The Coda File System
@ -42,10 +43,8 @@
<para>
This HOWTO is not a description of the guts and
underlying structure of NFS. For that you may wish to read
<emphasis>Linux NFS and Automounter Administration</emphasis> by Erez Zadok (Sybex, 2001). The classic NFS book, updated and still quite useful, is <emphasis>Managing NFS and NIS</emphasis> by Hal Stern, published by O'Reilly &
Associates, Inc. A much more advanced technical
description of NFS is available in <emphasis>NFS Illustrated</emphasis> by Brent Callaghan.
</para>
<para>
@ -58,7 +57,7 @@
</para>
<para>
It will also not cover PC-NFS, which is considered obsolete (users
are encouraged to use Samba to share files with Windows machines) or NFS
Version 4, which is still in development.
</para>
</sect2>
@ -98,7 +97,8 @@
patches have been added because NFS Version 3 server support will be
a configuration option. However, unless you have some particular
reason to use an older kernel, you should upgrade because many bugs
have been fixed along the way. Kernel 2.2.19 contains some additional
locking improvements over 2.2.18.
</para>
<para>
Version 3 functionality will also require the nfs-utils package of
@ -111,11 +111,24 @@
<para>
All 2.4 and higher kernels have full NFS Version 3 functionality.
</para>
<para>
In all cases, if you are building your own kernel, you will need
to select NFS and NFS Version 3 support at compile time. Most
(but not all) standard distributions come with kernels that support
NFS version 3.
</para>
<para>
Handling files larger than 2 GB will require a 2.4.x kernel and a
2.2.x version of <application>glibc</application>.
</para>
<para>
All kernels after 2.2.18 support NFS over TCP on the client side.
As of this writing, server-side NFS over TCP only exists in a
buggy form as an experimental option in the post-2.2.18 series;
patches for 2.4 and 2.5 kernels have been introduced starting with
2.4.17 and 2.5.6. The patches are believed to be stable, though
as of this writing they are relatively new and have not seen
widespread use or integration into the mainstream 2.4 kernel.
</para>
<para>
Because so many of the above functionalities were introduced in
@ -125,8 +138,9 @@
correctly.
</para>
<para>
As we write this document, NFS version 4 has only recently been
finalized as a protocol, and no implementations are considered
production-ready. It will not be dealt with here.
</para>
</sect2>
<sect2 id="furtherhelp">
@ -137,7 +151,66 @@
mailing lists as well as the latest version of nfs-utils, NFS
kernel patches, and other NFS related packages.
</para>
<para>
When you encounter a problem or have a question not covered in this
manual, the FAQ, or the man pages, you should send a message to the nfs
mailing list (<email>nfs@lists.sourceforge.net</email>). To help the developers
and other users assess your problem, you should include:
</para>
<itemizedlist>
<listitem>
<para>
the version of <application>nfs-utils</application> you are using
</para>
</listitem>
<listitem>
<para>
the version of the kernel and any non-stock kernel patches applied
</para>
</listitem>
<listitem>
<para>
the distribution of Linux you are using
</para>
</listitem>
<listitem>
<para>
the version(s) of other operating systems involved.
</para>
</listitem>
</itemizedlist>
<para>
It is also useful to know the networking configuration connecting the
hosts.
</para>
<para>
If your problem involves the inability to mount or export shares please
also include:
</para>
<itemizedlist>
<listitem>
<para>
a copy of your <filename>/etc/exports</filename> file
</para>
</listitem>
<listitem>
<para>
the output of <command>rpcinfo -p</command> <emphasis>localhost</emphasis> run on the server
</para>
</listitem>
<listitem>
<para>
the output of <command>rpcinfo -p</command> <emphasis>servername</emphasis> run on the client
</para>
</listitem>
</itemizedlist>
<para>
Sending all of this information with a specific question, after reading
all the documentation, is the best way to ensure a helpful response from
the list.
</para>
<para>
You may also wish to look at the man pages for <emphasis>nfs(5)</emphasis>,
<emphasis>exports(5)</emphasis>, <emphasis>mount(8)</emphasis>, <emphasis>fstab(5)</emphasis>,
<emphasis>nfsd(8)</emphasis>, <emphasis>lockd(8)</emphasis>, <emphasis>statd(8)</emphasis>,
@ -1,252 +1,652 @@
<sect1 id="performance">
<title>Optimizing NFS Performance</title>
<para>
Careful analysis of your environment, both from the client and from the server
point of view, is the first step necessary for optimal NFS performance. The
first sections will address issues that are generally important to the client.
Later (<xref linkend="frag-overflow"> and beyond), server side issues
will be discussed. In both
cases, these issues will not be limited exclusively to one side or the other,
but it is useful to separate the two in order to get a clearer picture of
cause and effect.
</para>
<para>
Aside from the general network configuration - appropriate network capacity,
faster NICs, full duplex settings in order to reduce collisions, agreement in
network speed among the switches and hubs, etc. - one of the most important
client optimization settings are the NFS data transfer buffer sizes, specified
by the <command>mount</command> command options <userinput>rsize</userinput>
and <userinput>wsize</userinput>.
</para>
<sect2 id="blocksizes">
<title>Setting Block Size to Optimize Transfer Speeds</title>
<para>
The <command>mount</command> command options <userinput>rsize</userinput>
and <userinput>wsize</userinput> specify the size of the chunks of
data that the client and server pass back and forth
to each other. If no <userinput>rsize</userinput>
and <userinput>wsize</userinput> options are specified,
the default varies by which version of NFS we
are using. The most common default is 4K (4096 bytes), although for TCP-based
mounts in 2.2 kernels, and for all mounts beginning with 2.4 kernels, the
server specifies the default block size.
</para>
<para>
The theoretical limit for the NFS V2 protocol is 8K. For the V3 protocol, the
limit is specific to the server. On the Linux server, the maximum block size
is defined by the value of the kernel constant
<userinput>NFSSVC_MAXBLKSIZE</userinput>, found in the
Linux kernel source file <filename>./include/linux/nfsd/const.h</filename>.
The current maximum block size for the kernel, as of 2.4.17, is 8K (8192 bytes),
but the patch set implementing NFS over TCP/IP transport in the 2.4
series, as of this writing, uses a value of 32K (defined in the
patch as 32*1024) for the maximum block size.
</para>
<para>
All 2.4 clients currently support up to 32K block transfer sizes, allowing the
standard 32K block transfers across NFS mounts from other servers, such as
Solaris, without client modification.
</para>
<para>
The defaults may be too big or too small, depending on the specific
combination of hardware and kernels. On the one hand, some combinations of
Linux kernels and network cards (largely on older machines) cannot handle
blocks that large. On the other hand, if they can handle larger blocks, a
bigger size might be faster.
</para>
<para>
You will want to experiment and find an <userinput>rsize</userinput>
and <userinput>wsize</userinput> that works and is as
fast as possible. You can test the speed of your options with some simple
commands, if your network environment is not heavily used. Note that your
results may vary widely unless you resort to using more complex benchmarks,
such as <application>Bonnie</application>, <application>Bonnie++</application>,
or <application>IOzone</application>.
</para>
<para>
The first of these commands transfers 16384 blocks of 16k each from the
special file <filename>/dev/zero</filename> (which if
you read it just spits out zeros <emphasis>really</emphasis>
fast) to the mounted partition. We will time it to see how long it takes. So,
from the client machine, type:
</para>
<programlisting>
# time dd if=/dev/zero of=/mnt/home/testfile bs=16k count=16384
</programlisting>
<para>
This creates a 256Mb file of zeroed bytes. In general, you should create a
file that's at least twice as large as the system RAM on the server, but make
sure you have enough disk space! Then read back the file into the great black
hole on the client machine (<filename>/dev/null</filename>) by
typing the following:
</para>
<programlisting>
# time dd if=/mnt/home/testfile of=/dev/null bs=16k
</programlisting>
<para>
Repeat this a few times and average how long it takes. Be sure to unmount and
remount the filesystem each time (both on the client and, if you are zealous,
locally on the server as well), which should clear out any caches.
</para>
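A scaled-down version of the write/read pair can be run against any local directory just to see the mechanics (16 blocks instead of 16384, so it finishes instantly); for a real measurement, substitute the NFS mount point and the full count:

```shell
# Write then read back a small test file with 16k blocks, timing each
# step. For real benchmarking use of=/mnt/home/testfile and count=16384.
time dd if=/dev/zero of=/tmp/testfile bs=16k count=16 2>/dev/null
time dd if=/tmp/testfile of=/dev/null bs=16k 2>/dev/null
ls -l /tmp/testfile
```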
<para>
Then unmount, and mount again with larger and smaller block sizes. They
should be multiples of 1024, and not larger than the maximum block size
allowed by your system. Note that NFS Version 2 is limited to a maximum of 8K,
regardless of the maximum block size defined by
<userinput>NFSSVC_MAXBLKSIZE</userinput>; Version 3
will support up to 64K, if permitted. The block size should be a power of two
since most of the parameters that would constrain it (such as file system
block sizes and network packet size) are also powers of two. However, some
users have reported better success with block sizes that are not powers of
two but are still multiples of the file system block size and the network
packet size.
</para>
<para>
Directly after mounting with a larger size, cd into the mounted
file system and do things like <command>ls</command>, explore
the filesystem a bit to make sure everything is as it
should be. If the <userinput>rsize</userinput>/<userinput>wsize</userinput>
is too large, the symptoms are very odd and not always
obvious: a typical symptom is incomplete file lists when doing
<command>ls</command> with no
error messages, or reads that fail mysteriously, again with no error messages.
After establishing that a given <userinput>rsize</userinput>/
<userinput>wsize</userinput> works, you can run the speed tests
again. Different server platforms are likely to have different optimal sizes.
</para>
<para>
Remember to edit <filename>/etc/fstab</filename> to reflect the
<userinput>rsize</userinput>/<userinput>wsize</userinput> you found
to be the most desirable.
</para>
<para>
If your results seem inconsistent, or doubtful, you may need to analyze your
network more extensively while varying the <userinput>rsize</userinput>
and <userinput>wsize</userinput> values. In that
case, here are several pointers to benchmarks that may prove useful:
</para>
<itemizedlist>
<listitem>
<para>
Bonnie <ulink
url="http://www.textuality.com/bonnie/">http://www.textuality.com/bonnie/</ulink>
</para>
</listitem>
<listitem>
<para>
Bonnie++ <ulink
url="http://www.coker.com.au/bonnie++/">http://www.coker.com.au/bonnie++/</ulink>
</para>
</listitem>
<listitem>
<para>
IOzone file system benchmark <ulink url="http://www.iozone.org/">http://www.iozone.org/</ulink>
</para>
</listitem>
<listitem>
<para>
The official NFS benchmark,
SPECsfs97 <ulink url="http://www.spec.org/osg/sfs97/">http://www.spec.org/osg/sfs97/</ulink>
</para>
</listitem>
</itemizedlist>
<para>
The easiest benchmark with the widest coverage, including an extensive spread
of file sizes and of IO types (reads and writes, rereads and rewrites, random
access, and so on), seems to be IOzone. A recommended invocation of IOzone (for
which you must have root privileges) includes unmounting and remounting the
directory under test, in order to clear out the caches between tests, and
including the file close time in the measurements. Assuming you've already
exported <filename>/tmp</filename> to everyone from the server
<computeroutput>foo</computeroutput>,
and that you've installed IOzone in the local directory, this should work:
</para>
<programlisting>
# echo "foo:/tmp /mnt/foo nfs rw,hard,intr,rsize=8192,wsize=8192 0 0" >> /etc/fstab
# mkdir /mnt/foo
# mount /mnt/foo
# ./iozone -a -R -c -U /mnt/foo -f /mnt/foo/testfile > logfile
</programlisting>
<para>
The benchmark should take 2-3 hours at most, but of course you will need to
run it for each value of rsize and wsize that is of interest. The web site
gives full documentation of the parameters, but the specific options used
above are:
</para>
<itemizedlist>
<listitem>
<para>
<userinput>-a</userinput> Full automatic mode, which tests file sizes of 64K to 512M, using
record sizes of 4K to 16M
</para>
</listitem>
<listitem>
<para>
<userinput>-R</userinput> Generate report in Excel spreadsheet form (the "surface plot"
option for graphs is best)
</para>
</listitem>
<listitem>
<para>
<userinput>-c</userinput> Include the file close time in the tests, which will pick up the
NFS version 3 commit time
</para>
</listitem>
<listitem>
<para>
<userinput>-U</userinput> Use the given mount point to unmount and remount between tests;
it clears out caches
</para>
</listitem>
<listitem>
<para>
<userinput>-f</userinput> Specifies the location of the test file; when unmounting
between tests, the test file must reside in the mounted file system
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2 id="packet-and-network">
<title>Packet Size and Network Drivers</title>
<para>
While many Linux network card drivers are excellent, some are quite shoddy,
including a few drivers for some fairly standard cards. It is worth
experimenting with your network card directly to find out how it can
best handle traffic.
</para>
<para>
Try <command>ping</command>ing back and forth
between the two machines with large packets using
the <userinput>-f</userinput> and <userinput>-s</userinput>
options with <command>ping</command> (see <emphasis>ping(8)</emphasis>
for more details) and see if a
lot of packets get dropped, or if they take a long time for a reply. If so,
you may have a problem with the performance of your network card.
</para>
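<para>
As an illustrative sketch (the server name <computeroutput>foo</computeroutput>
and the sizes are hypothetical; adjust them to your network), the following
sends 100 large packets in flood mode and reports the loss statistics:
</para>
<programlisting>
# ping -f -s 4096 -c 100 foo
</programlisting>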
<para>
For a more extensive analysis of NFS behavior in particular, use the
<command>nfsstat</command> command to look at nfs transactions, client and server statistics, network
statistics, and so forth. The <userinput>-o net</userinput> option will show you the number of
dropped packets in relation to the total number of transactions. In UDP
transactions, the most important statistic is the number of retransmissions,
due to dropped packets, socket buffer overflows, general server congestion,
timeouts, etc. This will have a tremendously important effect on NFS
performance, and should be carefully monitored.
Note that <command>nfsstat</command> does not yet
implement the <userinput>-z</userinput> option, which would zero out all counters, so you must look
at the current <command>nfsstat</command> counter values prior to running the benchmarks.
</para>
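<para>
For example, since the counters cannot be zeroed, you can snapshot them before
and after a test run and compare the two readings by hand (the file names here
are arbitrary):
</para>
<programlisting>
# nfsstat -o net > nfsstat-before.log
  (run the test load)
# nfsstat -o net > nfsstat-after.log
</programlisting>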
<para>
To correct network problems, you may wish to reconfigure the packet size that
your network card uses. Very often there is a constraint somewhere else in the
network (such as a router) that causes a smaller maximum packet size between
two machines than what the network cards on the machines are actually capable
of. TCP should autodiscover the appropriate packet size for a network, but UDP
will simply stay at a default value. So determining the appropriate packet
size is especially important if you are using NFS over UDP.
</para>
<para>
You can test for the network packet size using the <command>tracepath</command> command: from
the client machine, just type <userinput>tracepath</userinput>
<emphasis>server</emphasis> <userinput>2049</userinput> and the path MTU should
be reported at the bottom. You can then set the MTU on your network card equal
to the path MTU, by using the <userinput>mtu</userinput>
option to <command>ifconfig</command>, and see if fewer packets
get dropped. See the <command>ifconfig</command> man pages for details on how to reset the MTU.
</para>
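<para>
A short sketch of this procedure, assuming a server named
<computeroutput>foo</computeroutput> and a client interface
<computeroutput>eth0</computeroutput> (both names are hypothetical), where the
reported path MTU turned out to be 1492:
</para>
<programlisting>
# tracepath foo 2049
# ifconfig eth0 mtu 1492
</programlisting>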
<para>
In addition, <command>netstat -s</command> will give the statistics collected for traffic across
all supported protocols. You may also look at
<filename>/proc/net/snmp</filename> for information
about current network behavior; see the next section for more details.
</para>
</sect2>
<sect2 id="frag-overflow">
<title>Overflow of Fragmented Packets</title>
<para>
Using an <userinput>rsize</userinput> or <userinput>wsize</userinput>
larger than your network's MTU (often set to 1500, in
many networks) will cause IP packet fragmentation when using NFS over UDP. IP
packet fragmentation and reassembly require a significant amount of CPU
resource at both ends of a network connection. In addition, packet
fragmentation also exposes your network traffic to greater unreliability,
since a complete RPC request must be retransmitted if a UDP packet fragment is
dropped for any reason. Any increase of RPC retransmissions, along with the
possibility of increased timeouts, is the single worst impediment to
performance for NFS over UDP.
</para>
<para>
Packets may be dropped for many reasons. If your network topography is
complex, fragment routes may differ, and may not all arrive at the Server for
reassembly. NFS Server capacity may also be an issue, since the kernel has a
limit of how many fragments it can buffer before it starts throwing away
packets. With kernels that support the <filename>/proc</filename>
filesystem, you can monitor the
files <filename>/proc/sys/net/ipv4/ipfrag_high_thresh</filename> and
<filename>/proc/sys/net/ipv4/ipfrag_low_thresh</filename>. Once the number of unprocessed,
fragmented packets reaches the number specified by <filename>ipfrag_high_thresh</filename> (in
bytes), the kernel will simply start throwing away fragmented packets until
the number of incomplete packets reaches the number specified by
<filename>ipfrag_low_thresh</filename>.
</para>
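<para>
You can inspect, and if necessary raise, these thresholds from the shell; the
value below is only an example, not a recommendation:
</para>
<programlisting>
# cat /proc/sys/net/ipv4/ipfrag_high_thresh
# echo 524288 > /proc/sys/net/ipv4/ipfrag_high_thresh
</programlisting>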
<para>
Another counter to monitor is <userinput>IP: ReasmFails</userinput>
in the file <filename>/proc/net/snmp</filename>; this
is the number of fragment reassembly failures. If it goes up too quickly
during heavy file activity, you may have a problem.
</para>
</sect2>
<sect2 id="nfs-tcp">
<title>NFS over TCP</title>
<para>
A new feature, available for both 2.4 and 2.5 kernels but not yet
integrated into the mainstream kernel at the time of
this writing, is NFS over TCP. Using TCP
has a distinct advantage and a distinct disadvantage over UDP. The advantage
is that it works far better than UDP on lossy networks.
When using TCP, a single dropped packet can be retransmitted, without
the retransmission of the entire RPC request, resulting in better performance
on lossy networks. In addition, TCP will handle network speed differences
better than UDP, due to the underlying flow control at the network level.
</para>
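<para>
On a client whose kernel includes the TCP patches, a TCP mount is requested
with the <userinput>tcp</userinput> mount option; for example (the server and
mount point names are illustrative):
</para>
<programlisting>
# mount -o tcp,rsize=8192,wsize=8192 foo:/tmp /mnt/foo
</programlisting>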
<para>
The disadvantage of using TCP is that it is not a stateless protocol like
UDP. If your server crashes in the middle of a packet transmission,
the client will hang and any shares will need to be unmounted and remounted.
</para>
<para>
The overhead incurred by the TCP protocol will result in
somewhat slower performance than UDP under ideal network
conditions, but the cost is not severe, and is often not
noticeable without careful measurement. If you are using
gigabit ethernet from end to end, you might also investigate the
usage of jumbo frames, since the high speed network may
allow the larger frame sizes without encountering increased
collision rates, particularly if you have set the network
to full duplex.
</para>
</sect2>
<sect2 id="timeout">
<title>Timeout and Retransmission Values</title>
<para>
Two mount command options, <userinput>timeo</userinput>
and <userinput>retrans</userinput>, control the behavior of UDP
requests when encountering client timeouts due to dropped packets, network
congestion, and so forth. The <userinput>-o timeo</userinput>
option allows designation of the length
of time, in tenths of seconds, that the client will wait until it decides it
will not get a reply from the server, and must try to send the request again.
The default value is 7 tenths of a second. The
<userinput>-o retrans</userinput> option allows
designation of the number of timeouts allowed before the client gives up, and
displays the <computeroutput>Server not responding</computeroutput>
message. The default value is 3 attempts.
Once the client displays this message, it will continue to try to send
the request, but only once before displaying the error message if
another timeout occurs. When the client reestablishes contact, it
will fall back to using the correct <userinput>retrans</userinput>
value, and will display the <computeroutput>Server OK</computeroutput> message.
</para>
<para>
If you are already encountering excessive retransmissions (see the output of
the <command>nfsstat</command> command), or want to increase the block transfer size without
encountering timeouts and retransmissions, you may want to adjust these
values. The specific adjustment will depend upon your environment, and in most
cases, the current defaults are appropriate.
</para>
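<para>
For example, to double the initial timeout and allow five retries on a UDP
mount (the values and names here are illustrative only, not recommendations):
</para>
<programlisting>
# mount -o timeo=14,retrans=5,rsize=8192,wsize=8192 foo:/tmp /mnt/foo
</programlisting>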
</sect2>
<sect2 id="nfsd-instance">
<title>Number of Instances of the NFSD Server Daemon</title>
<para>
Most startup scripts, Linux and otherwise, start 8 instances of
<command>nfsd</command>. In the
early days of NFS, Sun decided on this number as a rule of thumb, and everyone
else copied. There are no good measures of how many instances are optimal, but
a more heavily-trafficked server may require more.
You should use at the very least one daemon per processor, but
four to eight per processor may be a better rule of thumb.
If you are using a 2.4 or
higher kernel and you want to see how heavily each
<command>nfsd</command> thread is being used,
you can look at the file <filename>/proc/net/rpc/nfsd</filename>.
The last ten numbers on the <userinput>th</userinput>
line in that file indicate the number of seconds that the thread usage was at
that percentage of the maximum allowable. If you have a large number in the
top three deciles, you may wish to increase the number
of <command>nfsd</command> instances. This
is done upon starting <command>nfsd</command> using the
number of instances as the command line
option, and is specified in the NFS startup script
(<filename>/etc/rc.d/init.d/nfs</filename> on
Red Hat) as <userinput>RPCNFSDCOUNT</userinput>.
See the <emphasis>nfsd(8)</emphasis> man page for more information.
</para>
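<para>
For example, to check the thread-usage histogram and, if needed, raise the
instance count on a Red Hat style system (16 is only an illustrative value):
</para>
<programlisting>
# grep th /proc/net/rpc/nfsd
  (edit RPCNFSDCOUNT in /etc/rc.d/init.d/nfs, e.g. RPCNFSDCOUNT=16)
# /etc/rc.d/init.d/nfs restart
</programlisting>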
</sect2>
<sect2 id="memlimits">
<title>Memory Limits on the Input Queue</title>
<para>
On 2.2 and 2.4 kernels, the socket input queue, where requests sit while they
are currently being processed, has a small default size limit (<filename>rmem_default</filename>)
of 64k. This queue is important for clients with heavy read loads, and servers
with heavy write loads. As an example, if you are running 8 instances of nfsd
on the server, each will only have 8k to store write requests while it
processes them. In addition, the socket output queue - important for clients
with heavy write loads and servers with heavy read loads - also has a small
default size (<filename>wmem_default</filename>).
</para>
<para>
Several published runs of the NFS benchmark
<ulink url="http://www.spec.org/osg/sfs97/">SPECsfs</ulink>
specify usage of a much higher value for both
the read and write value sets, <filename>[rw]mem_default</filename> and
<filename>[rw]mem_max</filename>. You might
consider increasing these values to at least 256k. The read and write limits
are set in the proc file system using (for example) the files
<filename>/proc/sys/net/core/rmem_default</filename> and
<filename>/proc/sys/net/core/rmem_max</filename>. The
<filename>rmem_default</filename> value can be increased in three steps; the following method is a
bit of a hack but should work and should not cause any problems:
</para>
<itemizedlist>
<listitem>
<para>
Increase the size listed in the file:
</para>
<programlisting>
# echo 262144 > /proc/sys/net/core/rmem_default
# echo 262144 > /proc/sys/net/core/rmem_max
</programlisting>
</listitem>
<listitem>
<para>
Restart NFS. For example, on Red Hat systems,
</para>
<programlisting>
# /etc/rc.d/init.d/nfs restart
</programlisting>
</listitem>
<listitem>
<para>
You might return the size limits to their normal size in case other
kernel systems depend on it:
</para>
<programlisting>
# echo 65536 > /proc/sys/net/core/rmem_default
# echo 65536 > /proc/sys/net/core/rmem_max
</programlisting>
</listitem>
</itemizedlist>
<para>
This last step may be necessary because machines have been reported to
crash if these values are left changed for long periods of time.
</para>
</sect2>
<sect2 id="autonegotiation">
<title>Turning Off Autonegotiation of NICs and Hubs</title>
<para>
If network cards auto-negotiate badly with hubs and switches, and ports run at
different speeds, or with different duplex configurations, performance will be
severely impacted due to excessive collisions, dropped packets, etc. If you
see excessive numbers of dropped packets in the
<command>nfsstat</command> output, or poor
network performance in general, try playing around with the network speed and
duplex settings. If possible, concentrate on establishing a 100BaseT full
duplex subnet; the virtual elimination of collisions in full duplex will
remove the most severe performance inhibitor for NFS over UDP. Be careful
when turning off autonegotiation on a card: The hub or switch that the card
is attached to will then resort to other mechanisms (such as parallel detection)
to determine the duplex settings, and some cards default to half duplex
because it is more likely to be supported by an old hub. The best solution,
if the driver supports it, is to force the card to negotiate 100BaseT
full duplex.
</para>
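<para>
The tools for forcing a setting vary by driver; <command>mii-tool</command>
and <command>ethtool</command> are two commonly available ones. For example,
to force 100BaseT full duplex on <computeroutput>eth0</computeroutput>
(an illustrative interface name; check that your driver supports these
commands):
</para>
<programlisting>
# mii-tool -F 100baseTx-FD eth0
# ethtool -s eth0 speed 100 duplex full autoneg off
</programlisting>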
</sect2>
<sect2 id="sync-async">
<title>Synchronous vs. Asynchronous Behavior in NFS</title>
<para>
The default export behavior for both NFS Version 2 and Version 3 protocols,
used by <command>exportfs</command> in <application>nfs-utils</application>
versions prior to Version 1.11 (the latter is in the CVS tree,
but not yet released in a package, as of January, 2002) is
"asynchronous". This default permits the server to reply to client requests as
soon as it has processed the request and handed it off to the local file
system, without waiting for the data to be written to stable storage. This is
indicated by the <userinput>async</userinput> option denoted in the server's export list. It yields
better performance at the cost of possible data corruption if the server
reboots while still holding unwritten data and/or metadata in its caches. This
possible data corruption is not detectable at the time of occurrence, since
the <userinput>async</userinput> option instructs the server to lie to the client, telling the
client that all data has indeed been written to the stable storage, regardless
of the protocol used.
</para>
<para>
In order to conform with "synchronous" behavior, used as the default for most
proprietary systems supporting NFS (Solaris, HP-UX, RS/6000, etc.), and now
used as the default in the latest version of <command>exportfs</command>, the Linux Server's
file system must be exported with the <userinput>sync</userinput> option. Note that specifying
synchronous exports will result in no option being seen in the server's export
list:
</para>
<itemizedlist>
<listitem>
<para>
Export a couple of file systems to everyone, using slightly different
options:
</para>
<para>
<programlisting>
# /usr/sbin/exportfs -o rw,sync *:/usr/local
# /usr/sbin/exportfs -o rw *:/tmp
</programlisting>
</para>
</listitem>
<listitem>
<para>
Now we can see what the exported file system parameters look like:
</para>
<para>
<programlisting>
# /usr/sbin/exportfs -v
/usr/local *(rw)
/tmp *(rw,async)
</programlisting>
</para>
</listitem>
</itemizedlist>
<para>
If your kernel is compiled with the <filename>/proc</filename> filesystem,
then the file <filename>/proc/fs/nfs/exports</filename> will also show the
full list of export options.
</para>
<para>
When synchronous behavior is specified, the server will not complete (that is,
reply to the client) an NFS version 2 protocol request until the local file
system has written all data/metadata to the disk. The server
<emphasis>will</emphasis> complete a
synchronous NFS version 3 request without this delay, and will return the
status of the data in order to inform the client as to what data should be
maintained in its caches, and what data is safe to discard. There are three
possible status values, defined in an enumerated type, <userinput>nfs3_stable_how</userinput>, in
<filename>include/linux/nfs.h</filename>. The values, along with the subsequent actions taken due
to these results, are as follows:
</para>
<itemizedlist>
<listitem>
<para>
NFS_UNSTABLE - Data/Metadata was not committed to stable storage on the
server, and must be cached on the client until a subsequent client commit
request assures that the server does send data to stable storage.
</para>
</listitem>
<listitem>
<para>
NFS_DATA_SYNC - Metadata was not sent to stable storage, and must be cached
on the client. A subsequent commit is necessary, as is required above.
</para>
</listitem>
<listitem>
<para>
NFS_FILE_SYNC - No data/metadata need be cached, and a subsequent commit
need not be sent for the range covered by this request.
</para>
</listitem>
</itemizedlist>
<para>
In addition to the above definition of synchronous behavior, the client may
explicitly insist on total synchronous behavior, regardless of the protocol,
by opening all files with the <userinput>O_SYNC</userinput> option. In this case, all replies to
client requests will wait until the data has hit the server's disk, regardless
of the protocol used (meaning that, in NFS version 3, all requests will be
<userinput>NFS_FILE_SYNC</userinput> requests, and will require that the Server returns this status).
In that case, the performance of NFS Version 2 and NFS Version 3 will be
virtually identical.
</para>
<para>
If, however, the old default <userinput>async</userinput>
behavior is used, the <userinput>O_SYNC</userinput> option has
no effect at all in either version of NFS, since the server will reply to the
client without waiting for the write to complete. In that case the performance
differences between versions will also disappear.
</para>
<para>
Finally, note that, for NFS version 3 protocol requests, a subsequent commit
request from the NFS client at file close time, or at <command>fsync()</command> time, will force
the server to write any previously unwritten data/metadata to the disk, and
the server will not reply to the client until this has been completed, as long
as <userinput>sync</userinput> behavior is followed. If <userinput>async</userinput> is used, the commit is essentially
a no-op, since the server once again lies to the client, telling the client that
the data has been sent to stable storage. This again exposes the client and
server to data corruption, since cached data may be discarded on the client
due to its belief that the server now has the data maintained in stable
storage.
</para>
</sect2>
<sect2 id="non-nfs-performance">
<title>Non-NFS-Related Means of Enhancing Server Performance</title>
<para>
In general, server performance and server disk access speed will have an
|
||||
important effect on NFS performance.
|
||||
Offering general guidelines for setting up a well-functioning file server is
|
||||
outside the scope of this document, but a few hints may be worth mentioning:
|
||||
</para>
|
||||
<itemizedlist>
<listitem>
<para>
If you have access to RAID arrays, use RAID 1/0 for both write speed and
redundancy; RAID 5 gives you good read speeds but lousy write speeds.
</para>
</listitem>
<listitem>
<para>
A journalling filesystem will drastically reduce your reboot time in the
event of a system crash. Currently,
<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/">ext3</ulink> will work correctly with NFS
version 3. In addition, Reiserfs version 3.6 will work with NFS version 3 on
2.4.7 or later kernels (patches are available for previous kernels). Earlier versions
of Reiserfs did not include room for generation numbers in the inode, exposing
the possibility of undetected data corruption during a server reboot.
</para>
</listitem>
<listitem>
<para>
Additionally, journalled file systems can be configured to maximize
performance by taking advantage of the fact that journal updates are all that
is necessary for data protection. One example is using ext3 with <userinput>data=journal</userinput>
so that all updates go first to the journal, and later to the main file
system. Once the journal has been updated, the NFS server can safely issue the
reply to the clients, and the main file system update can occur at the
server's leisure.
</para>
<para>
The journal in a journalling file system may also reside on a separate device
such as a flash memory card so that journal updates normally require no seeking.
With only rotational delay imposing a cost, this gives reasonably good
synchronous IO performance.
Note that ext3 currently supports journal relocation, and ReiserFS will
(officially) support it soon. The Reiserfs tool package found at <ulink
url="ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz">
ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz</ulink>
contains the <command>reiserfstune</command> tool, which will allow journal
relocation. It does, however, require a kernel patch which has not yet been
officially released as of January, 2002.
</para>
</listitem>
<listitem>
<para>
Using an automounter (such as <application>autofs</application> or <application>amd</application>) may prevent hangs if you
cross-mount files on your machines (whether on purpose or by oversight) and
one of those machines goes down. See the <ulink url="http://www.linuxdoc.org/HOWTO/mini/Automount.html">Automount Mini-HOWTO</ulink> for details.
</para>
</listitem>
<listitem>
<para>
Some manufacturers (Network Appliance, Hewlett Packard, and others) provide NFS
accelerators in the form of Non-Volatile RAM. NVRAM will boost access speed to
stable storage up to the equivalent of <userinput>async</userinput> access.
</para>
</listitem>
</itemizedlist>
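As a sketch of the data=journal point above, the journalling mode can be selected per filesystem in /etc/fstab; the device and mount point below are assumed purely for illustration:

```shell
# /etc/fstab entry -- mounts an exported ext3 filesystem with full
# data journalling, so a journal write alone protects the data:
/dev/sda2   /export/home   ext3   data=journal   1 2
```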
</sect2>
</sect1>
<sect2 id="legal">
<title>Legal stuff</title>
<para>
Copyright (c) <2002> by Tavis Barr, Nicolai Langfeldt,
Seth Vidal, and Tom McNeal.
This material may be distributed only subject to the terms and conditions set
forth in the Open Publication License, v1.0 or later (the latest version
is presently available at <ulink url="http://www.opencontent.org/openpub/">http://www.opencontent.org/openpub/</ulink>).
<sect2 id="feedback">
<title>Feedback</title>
<para>This will never be a finished document; we welcome feedback about
how it can be improved. As of February 2002, the Linux NFS home
page is being hosted at <ulink url="http://nfs.sourceforge.net">http://nfs.sourceforge.net</ulink>. Check there
for mailing lists, bug fixes, and updates, and also to verify
who currently maintains this document.
The original version of this document was developed by Nicolai
Langfeldt. It was heavily rewritten in 2000 by Tavis Barr
and Seth Vidal to reflect substantial changes in the workings
of NFS for Linux developed between the 2.0 and 2.4 kernels.
It was edited again in February 2002, when Tom McNeal made substantial
additions to the performance section.
Thomas Emmel, Neil Brown, Trond Myklebust, Erez Zadok, and Ion Badulescu
also provided valuable comments and contributions.
</para>
<title>Security and NFS</title>
<para>
This list of security tips and explanations will not make your site
completely secure. <emphasis>NOTHING</emphasis> will make your site completely secure. Reading this section
may help you get an idea of the security problems with NFS. This is not
a comprehensive guide and it will always be undergoing changes. If you
have any tips or hints to give us please send them to the HOWTO
maintainer.
</para>
<para>
If you are on a network with no access to the outside world (not even a
modem) and you trust all the internal machines and all your users then
this section will be of no use to you. However, it is our belief that
there are relatively few networks in this situation, so we would suggest
that anyone setting up NFS read this section thoroughly.
</para>
<para>
With NFS, there are two steps required for a client to gain access to
a file contained in a remote directory on the server. The first step is mount
access. Mount access is achieved by the client machine attempting to
attach to the server. The security for this is provided by the
<filename>/etc/exports</filename> file. This file lists the names or IP addresses for machines
that are allowed to access a share point. If the client's IP address
matches one of the entries in the access list then it will be allowed to
mount. This is not terribly secure. If someone is capable of spoofing or
taking over a trusted address then they can access your mount points. To
give a real-world example of this type of "authentication": This is
equivalent to someone introducing themselves to you and you believing they
are who they claim to be because they are wearing a sticker that says
"Hello, My Name is ...." Once the machine has mounted a volume, its
operating system will have access to all files on the volume (with the
possible exception of those owned by root; see below) and write access
to those files as well, if the volume was exported with the
<userinput>rw</userinput> option.
</para>
<para>
The second step is file access. This is a function of normal file system
access controls on the client and not a specialized function of NFS.
Once the drive is mounted the user and group permissions on the files
determine access control.
</para>
<para>
An example: bob on the server maps to the UserID 9999. Bob
makes a file on the server that is only accessible by the user
(the equivalent to typing
<userinput>chmod 600</userinput> <emphasis>filename</emphasis>).
A client is allowed to mount the drive where the file is stored.
On the client mary maps to UserID 9999. This means that the client
user mary can access bob's file that is marked as only accessible by him.
It gets worse: If someone has become superuser on the client machine they can
<command>su - </command> <emphasis>username</emphasis>
and become <emphasis>any</emphasis> user. NFS will be none the wiser.
</para>
<para>
It's not all terrible. There are a few measures you can take on the server
be careful and keep up diligent monitoring of those systems.
</para>
<para>
Not all Linux distributions were created equal. Some seemingly up-to-date
distributions do not include a securable portmapper.
The easy way to check if your portmapper is good or not is to run
<emphasis>strings(1)</emphasis> and see if it reads the relevant files, <filename>/etc/hosts.deny</filename> and
<filename>/etc/hosts.allow</filename>. Assuming your portmapper is <filename>/sbin/portmap</filename> you can
the mill Linux system there are very few machines that need any access
for any reason. The portmapper administers <command>nfsd</command>,
<command>mountd</command>, <command>ypbind</command>/<command>ypserv</command>,
<command>rquotad</command>, <command>lockd</command> (which shows up
as <computeroutput>nlockmgr</computeroutput>), <command>statd</command>
(which shows up as <computeroutput>status</computeroutput>)
and 'r' services like <command>ruptime</command>
and <command>rusers</command>.
Of these only <command>nfsd</command>, <command>mountd</command>,
<command>ypbind</command>/<command>ypserv</command> and perhaps
<command>rquotad</command>, <command>lockd</command>
and <command>statd</command> are of any consequence. All machines that need
to access services on your machine should be allowed to do that. Let's
say that your machine's address is <emphasis>192.168.0.254</emphasis> and
that it lives on the subnet <emphasis>192.168.0.0</emphasis>, and that all
machines on the subnet should have access to it (for an overview of those
terms see the <ulink url="http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html">Networking-Overview-HOWTO</ulink>). Then we write:
<screen>
portmap: 192.168.0.0/255.255.255.0
</screen>
in <filename>/etc/hosts.allow</filename>. If you are not sure what your
network or netmask are, you can use the <command>ifconfig</command> command to
determine the netmask and the <command>netstat</command> command to
determine the network. For example, for the
device eth0 on the above machine <command>ifconfig</command> should show:
</para>
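The network address in that entry is simply each octet of the machine's IP address ANDed with the corresponding octet of the netmask, which you can check with a few lines of shell (using the example address above):

```shell
# Derive the hosts.allow network address from an IP and netmask by
# ANDing the octets together.
ip="192.168.0.254"; mask="255.255.255.0"
set -- $(echo "$ip" | tr '.' ' ')
i1=$1; i2=$2; i3=$3; i4=$4
set -- $(echo "$mask" | tr '.' ' ')
m1=$1; m2=$2; m3=$3; m4=$4
net="$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
echo "portmap: $net/$mask"
# prints: portmap: 192.168.0.0/255.255.255.0
```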
<para>
<screen>
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0     174412 eth0
...
</screen>
(The network address is in the first column).
</para>
<para>
The <filename>/etc/hosts.deny</filename> and <filename>/etc/hosts.allow</filename> files are
<filename>hosts.allow</filename> and <filename>hosts.deny</filename>
files, so you should put in entries for <command>lockd</command>,
<command>statd</command>, <command>mountd</command>, and
<command>rquotad</command> in these files too. For a complete example,
see <xref linkend="hosts">.
</para>
<para>
The above things should make your server tighter. The only remaining
problem is if someone gains administrative access to one of your trusted
client machines and is able to send bogus NFS requests. The next section
deals with safeguards against this problem.
</para>
</sect2>
<sect2 id="server.security">
<title>Server security: nfsd and mountd</title>
<para>
On the server we can decide that we don't want to trust any requests
made as root on the client. We can do that by using the
<userinput>root_squash</userinput> option in <filename>/etc/exports</filename>:
<programlisting>
/home        slave1(rw,root_squash)
</programlisting>
</para>
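After changing /etc/exports, the new options can be applied and inspected with exportfs from nfs-utils; a sketch (it requires root, and assumes the export above):

```shell
# Re-export everything listed in /etc/exports:
exportfs -ra
# Show the active exports with their effective options; root_squash
# should appear in the option list for /home:
exportfs -v
```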
<para>
This is, in fact, the default. It should always be turned on unless you
have a <emphasis>very</emphasis> good reason to turn it off. To turn it off use the
<userinput>no_root_squash</userinput> option.
</para>
<para>
<para>
The TCP ports 1-1024 are reserved for root's use (and therefore sometimes
referred to as "secure ports"). A non-root user cannot bind these ports.
Adding the <userinput>secure</userinput> option to an
<filename>/etc/exports</filename> entry means that it will only listen to
requests coming from ports 1-1024 on the client, so that a malicious
non-root user on the client cannot come along and open up a spoofed
NFS dialogue on a non-reserved port. This option is set by default.
</para>
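You can see which ports the RPC services have registered, including the privileged ones below 1024, by querying the portmapper; a sketch (the output varies by system):

```shell
# List the registered RPC programs and their ports; nfsd should be
# on 2049 and the portmapper itself on 111:
rpcinfo -p localhost
```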
</sect2>
<sect2 id="client.security">
<para>
On the client we can decide that we don't want to trust the server too
much a couple of ways with options to mount. For example we can
forbid suid programs to work off the NFS file system with the
<userinput>nosuid</userinput>
option. Some unix programs, such as passwd, are called "suid" programs:
They set the id of the person running them to whomever is the owner of
the file. If a file is owned by root and is suid, then the program will
execute as root, so that they can perform operations (such as writing to
the password file) that only root is allowed to do. Using the
<userinput>nosuid</userinput>
option is a good idea and you should consider using this with all NFS
mounted disks. It means that the server's root user
cannot make a suid-root
program on the file system, log in to the client as a normal user
and then use the suid-root program to become root on the client too.
One could also forbid execution of files on the mounted file system
altogether with the <userinput>noexec</userinput> option.
But this is more likely to be impractical than
<userinput>nosuid</userinput> since a file
system is likely to at least contain some scripts or programs that need
to be executed.
</para>
<sect3 id="securing-daemons">
<title>Securing portmapper, rpc.statd, and rpc.lockd on the client</title>
<para>
In the current (2.2.18+) implementation of NFS, full file locking is
supported. This means that <command>rpc.statd</command> and <command>rpc.lockd</command>
must be running on the client in order for locks to function correctly.
These services require the portmapper to be running. So, most of the
<para>
IPchains (under the 2.2.X kernels) and netfilter (under the 2.4.x
kernels) allow a good level of security - instead of relying on the
daemon (or perhaps its TCP wrapper) to
determine which machines can connect,
the connection attempt is allowed or disallowed at a lower level. In
this case, you can stop the connection much earlier and more globally, which
can protect you from all sorts of attacks.
</para>
<para>
Describing how to set up a Linux firewall is well beyond the scope of
this document. Interested readers may wish to read the <ulink url="http://www.linuxdoc.org/HOWTO/Firewall-HOWTO.html">Firewall-HOWTO</ulink>
or the <ulink url="http://www.linuxdoc.org/HOWTO/IPCHAINS-HOWTO.HTML">IPCHAINS-HOWTO</ulink>.
Users of kernel 2.4 and above might also want to visit the netfilter webpage at
<ulink url="http://netfilter.filewatcher.org">http://netfilter.filewatcher.org</ulink>.
If you are already familiar with the workings of ipchains or netfilter
this section will give you a few tips on how to better set up your
NFS daemons to more easily firewall and protect them.
</para>
<para>
A good rule to follow for your firewall configuration is to deny all, and
than you intended.
</para>
<para>
In order to understand how to firewall the NFS daemons, it will help
to briefly review how they bind to ports.
</para>
<para>
When a daemon starts up, it requests a free port from the portmapper.
The portmapper gets the port for the daemon and keeps track of
the port currently used by that daemon. When other hosts or processes
need to communicate with the daemon, they request the port number
from the portmapper in order to find the
daemon. So the ports will perpetually float because different ports may
be free at different times and so the portmapper will allocate them
differently each time. This is a pain for setting up a firewall. If
you never know where the daemons are going to be then you don't
know precisely which ports to allow access to. This might not be a big deal
for many people running on a protected or isolated LAN. For those
people on a public network, though, this is horrible.
</para>
<para>
In kernels 2.4.13 and later with nfs-utils 0.3.3 or later you no
longer have to worry about the floating of ports in the portmapper.
Now all of the daemons pertaining to NFS can be "pinned" to a port.
Most of them nicely take a <userinput>-p</userinput> option when they are started;
those daemons that are started by the kernel take some kernel arguments
or module options. They are described below.
</para>
<para>
Some of the daemons involved in sharing data via NFS are already
bound to a port. <command>portmap</command> is always on port
111 TCP and UDP. <command>nfsd</command> is
always on port 2049 TCP and UDP (however, as of kernel 2.4.17, NFS over
TCP is considered experimental and is not for use on production machines).
</para>
<para>
The other daemons, <command>statd</command>, <command>mountd</command>,
<command>lockd</command>, and <command>rquotad</command>, will normally move
around to the first available port they are informed of by the portmapper.
</para>
<para>
To force <command>statd</command> to bind to a particular port, use the
<userinput>-p</userinput> <emphasis>portnum</emphasis> option. To force <command>statd</command> to
respond on a particular port, additionally use the
<userinput>-o</userinput> <emphasis>portnum</emphasis> option when starting it.
</para>
<para>
To force <command>mountd</command> to bind to a particular port use the
<userinput>-p</userinput> <emphasis>portnum</emphasis> option.
</para>
<para>
For example, to have <command>statd</command> listen on port 32765 and
respond on port 32766, and <command>mountd</command> listen on port 32767,
you would type:
</para>
<programlisting>
# statd -p 32765 -o 32766
# mountd -p 32767
</programlisting>
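You can then ask the portmapper whether the daemons actually took the requested ports; a sketch (statd registers under the name "status"):

```shell
# Confirm the pinned ports: expect 32765/32766 for status (statd)
# and 32767 for mountd.
rpcinfo -p localhost | egrep 'status|mountd'
```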
<para>
<command>lockd</command> is started by the kernel when it is needed.
Therefore you need
to pass module options (if you have it built as a module) or kernel
options to force <command>lockd</command> to listen and respond
only on certain ports.
</para>
<para>
If you are using loadable modules and you would like to specify these
options in your <filename>/etc/modules.conf</filename> file add
a line like this to the file:
</para>
<programlisting>
options lockd nlm_udpport=32768 nlm_tcpport=32768
</programlisting>
<para>
The above line would specify the UDP and TCP port for
<command>lockd</command> to be 32768.
</para>
<para>
If you are not using loadable modules or if you have compiled
<command>lockd</command> into the kernel instead of building it
as a module then you will need to pass it an option on the kernel boot line.
</para>
<para>
It should look something like this:
</para>
<programlisting>
vmlinuz 3 root=/dev/hda1 lockd.udpport=32768 lockd.tcpport=32768
</programlisting>
<para>
The port numbers do not have to match but it would simply add
unnecessary confusion if they didn't.
</para>
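After rebooting with these options you can confirm lockd's pinned port the same way as for the other daemons (lockd registers with the portmapper as "nlockmgr"):

```shell
# Both the udp and tcp nlockmgr entries should show port 32768:
rpcinfo -p localhost | grep nlockmgr
```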
<para>
If you are using quotas and using <command>rpc.rquotad</command> to make these
quotas viewable over NFS, you will need to also take it into
account when setting up your firewall. There are two
<command>rpc.rquotad</command>
source trees. One of those is maintained in the
<application>nfs-utils</application> tree.
The other is in the <application>quota-tools</application> tree.
They do not operate identically.
The one provided with <application>nfs-utils</application> supports
binding the daemon to a port with the <userinput>-p</userinput>
directive. The one in <application>quota-tools</application> does not.
Consult your distribution's documentation to determine if yours does.
</para>
<para>
For the sake of this discussion, let's describe a network and set up a
firewall to protect our NFS server.
Our NFS server is 192.168.0.42 and our only client is 192.168.0.45.
As in the example above, <command>statd</command> has been
started so that it only
binds to port 32765 for incoming requests and it must answer on
port 32766. <command>mountd</command> is forced to bind to port 32767.
<command>lockd</command>'s module parameters have been set to bind to 32768.
<command>nfsd</command> is, of course, on port 2049 and the portmapper is on port 111.
</para>
<para>
We are not using quotas.
</para>
<para>
Using <application>IPCHAINS</application>, a simple firewall
might look something like this:
</para>
<programlisting>
ipchains -A input -f -j ACCEPT -s 192.168.0.45
ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 17 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 17 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 6 -j ACCEPT
ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 17 -j ACCEPT
ipchains -A input -s 0/0 -d 0/0 -p 6 -j DENY -y -l
ipchains -A input -s 0/0 -d 0/0 -p 17 -j DENY -l
</programlisting>
<para>
The equivalent set of commands in <application>netfilter</application> is:
</para>
<programlisting>
iptables -A INPUT -f -s 192.168.0.45 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p tcp --dport 32765:32768 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p udp --dport 32765:32768 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p udp --dport 2049 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p tcp --dport 111 -j ACCEPT
iptables -A INPUT -s 192.168.0.45 -p udp --dport 111 -j ACCEPT
iptables -A INPUT -p tcp --syn -j LOG --log-level 5
iptables -A INPUT -p tcp --syn -j DROP
iptables -A INPUT -p udp -j LOG --log-level 5
iptables -A INPUT -p udp -j DROP
</programlisting>
|
||||
|
||||
<para>
The first line says to accept all packet fragments (except the
first packet fragment which will be treated as a normal packet).
In theory no packet will pass through until it is reassembled,
and it won't be reassembled unless the first packet fragment
is passed. Of course there are attacks that can be generated
by overloading a machine with packet fragments. But NFS won't
work correctly unless you let fragments through. See <xref linkend="symptom8">
for details.
</para>
<para>
The other lines allow specific connections from any port on our
client host to the specific ports we have made available on
our server. This means that if, say, 192.168.0.46 attempts to contact
the NFS server it will not be able to mount or see what mounts
are available.
</para>
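Since the ACCEPT rules above follow one mechanical pattern (one rule per protocol for each pinned port), they can be generated rather than typed out. A minimal sketch, assuming the <userinput>--dport</userinput> style of iptables rule and using this section's example client address and ports:

```shell
#!/bin/sh
# Emit one ACCEPT rule per protocol for each NFS-related port.
# CLIENT and the port list are the example values from this section;
# substitute your own client address and pinned ports.
CLIENT=192.168.0.45
for port in 32765:32768 2049 111; do
    for proto in tcp udp; do
        echo "iptables -A INPUT -s $CLIENT -p $proto --dport $port -j ACCEPT"
    done
done
```

Review the generated list, then pipe it through <command>sh</command> as root once it matches your policy.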
<para>
With the new port pinning capabilities it is obviously much easier
to control what hosts are allowed to mount your NFS shares. It is
worth mentioning that NFS is not an encrypted protocol and anyone
on the same physical network could sniff the traffic and reassemble
the information being passed back and forth.
</para>
</sect2>
<sect2 id="nfs-ssh">
<title>Tunneling NFS through SSH</title>
<para>
One method of encrypting NFS traffic over a network is to
use the port-forwarding capabilities of <command>ssh</command>.
However, as we shall see, doing so has a serious drawback if you do not
utterly and completely trust the local users on your server.
</para>
<para>
The first step will be to export files to the localhost. For example, to
export the <filename>/home</filename> partition, enter the following into
<filename>/etc/exports</filename>:
<programlisting>
/home 127.0.0.1(rw)
</programlisting>
</para>
<para>
The next step is to use <command>ssh</command> to forward ports. For example,
<command>ssh</command> can tell the server to forward to any port on any
machine from a port on the client. Let us assume, as in the previous
section, that our server is 192.168.0.42, and that we have pinned
<command>mountd</command> to port 32767
using the argument <userinput>-p 32767</userinput>. Then, on the client,
we'll type:
<programlisting>
# ssh root@192.168.0.42 -L 250:localhost:2049 -f sleep 60m
# ssh root@192.168.0.42 -L 251:localhost:32767 -f sleep 60m
</programlisting>
</para>
<para>
The first command causes <command>ssh</command> on the client to take
any request directed at the client's port 250 and forward it,
first through <command>sshd</command> on the server, and then on
to the server's port 2049. The second line
causes a similar type of forwarding between requests to port 251 on
the client and port 32767 on the server. The
<userinput>localhost</userinput> is relative to the server; that is,
the forwarding will be done to the server itself. The port could otherwise
have been made to forward to any other machine, and the requests would look to
the outside world as if they were coming from the server. Thus, the requests
will appear to NFSD on the server as if they are coming from the server itself.
Note that in order to bind to a port below 1024 on the client, we have
to run this command as root on the client. Doing this will be necessary
if we have exported our filesystem with the default
<userinput>secure</userinput> option.
</para>
<para>
Finally, we are pulling a little trick with the last option,
<userinput>-f sleep 60m</userinput>. Normally, when
we use <command>ssh</command>, even with the <userinput>-L</userinput> option,
we will open up a shell on the remote machine. But instead, we just want
the port forwarding to execute in the background so that we get our shell
on the client back. So, we tell <command>ssh</command> to execute a command
in the background on the server to sleep for 60 minutes. This will cause
the port to be forwarded for 60 minutes until it gets a connection; at that
point, the port will continue to be forwarded until the connection dies or
until the 60 minutes are up, whichever happens later. The above commands
could be put in our startup scripts on the client, right after the network
is started.
</para>
<para>
Next, we have to mount the filesystem on the client. To do this, we tell
the client to mount a filesystem on the localhost, but at a different
port from the usual 2049. Specifically, an entry in <filename>/etc/fstab</filename>
would look like:
<programlisting>
localhost:/home  /mnt/home  nfs  rw,hard,intr,port=250,mountport=251  0 0
</programlisting>
</para>
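For a one-off test before committing the entry to <filename>/etc/fstab</filename>, the same options can be given directly to <command>mount</command>. A sketch, to be run as root on the client with the two <command>ssh</command> forwards above already running:

```
# mount -t nfs -o rw,hard,intr,port=250,mountport=251 localhost:/home /mnt/home
```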
<para>Having done this, we can see why the above will be incredibly insecure
if we have <emphasis>any</emphasis> ordinary users who are able to log in
to the server locally. If they can, there is nothing preventing them from
doing what we did and using <command>ssh</command> to forward a privileged
port on their own client machine (where they are legitimately root) to ports
2049 and 32767 on the server. Thus, any ordinary user on the server can
mount our filesystems with the same rights as root on our client.
</para>
<para>
If you are using an NFS server that does not have a way for ordinary users
to log in, and you wish to use this method, there are two additional caveats:
First, the connection travels from the client to the server via
<command>sshd</command>; therefore you will have to leave port 22 (where
<command>sshd</command> listens) open to your client on the firewall. However
you do not need to leave the other ports, such as 2049 and 32767, open
anymore. Second, file locking will no longer work. It is not possible
to ask <command>statd</command> or the locking manager to make requests
to a particular port for a particular mount; therefore, any locking requests
will cause <command>statd</command> to connect to <command>statd</command>
on localhost, i.e., itself, and it will fail with an error. Any attempt
to correct this would require a major rewrite of NFS.
</para>
<para>
It may also be possible to use <application>IPSec</application> to encrypt
network traffic between your client and your server, without compromising
any local security on the server; this will not be taken up here.
See the <ulink url="http://www.freeswan.org/">FreeS/WAN</ulink> home page
for details on using IPSec under Linux.
</para>
</sect2>
<sect2 id="summary">
<title>Summary</title>
<para>
If you use the <filename>hosts.allow</filename>, <filename>hosts.deny</filename>,
<userinput>root_squash</userinput>, <userinput>nosuid</userinput> and privileged
port features in the portmapper/NFS software, you avoid many of the
presently known bugs in NFS and can almost feel secure about that at
least. But still, after all that: when an intruder has access to your
network, s/he can make strange commands appear in your <filename>.forward</filename> or
read your mail when <filename>/home</filename> or <filename>/var/mail</filename> is
NFS exported. For the same reason, you should never access your PGP private key
over NFS. Or at least you should know the risk involved. And now you know a bit
of it.
</para>
<para>
@@ -438,4 +645,3 @@
</para>
</sect2>
</sect1>
@@ -59,9 +59,11 @@
<glossdef>
<para>
client machines that will have access to the directory. The machines
may be listed by their DNS address or their IP address
(e.g., <emphasis>machine.company.com</emphasis> or <emphasis>192.168.0.8</emphasis>).
Using IP addresses is more reliable and more secure. If you need to
use DNS addresses, and they do not seem to be resolving to the right
machine, see <xref linkend="symptom3">.
</para>
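For instance, the two addressing styles from the parenthetical above would appear in <filename>/etc/exports</filename> like this (the exported directory and the <userinput>rw</userinput> option are illustrative):

```
/home   machine.company.com(rw)
/home   192.168.0.8(rw)
```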
</glossdef>
</glossentry>
@@ -85,11 +87,14 @@
</listitem>
<listitem>
<para>
<userinput>no_root_squash</userinput>: By default,
any file request made by user <computeroutput>root</computeroutput>
on the client machine is treated as if it is made by user
<computeroutput>nobody</computeroutput> on the
server. (Exactly which UID the request is
mapped to depends on the UID of user "nobody" on the server,
not the client.) If <userinput>no_root_squash</userinput>
is selected, then
root on the client machine will have the same level of access
to the files on the system as root on the server. This
can have serious security implications, although it may be
@@ -109,19 +114,17 @@
</listitem>
<listitem>
<para>
<userinput>sync</userinput>:
By default, all but the most recent version (version 1.11)
of the <command>exportfs</command> command will use
<userinput>async</userinput> behavior, telling a client
machine that a file write is complete - that is, has been written
to stable storage - when NFS has finished handing the write over to
the filesystem. This behavior may cause data corruption if the
server reboots, and the <userinput>sync</userinput> option prevents
this. See <xref linkend="sync-async"> for a complete discussion of
<userinput>sync</userinput> and <userinput>async</userinput> behavior.
</para>
</listitem>
</itemizedlist>
</para>
@@ -174,7 +177,9 @@
</para>
<para>
Third, you can use wildcards such as <emphasis>*.foo.com</emphasis> or
<emphasis>192.168.</emphasis> instead of hostnames. There were problems
with wildcard implementation in the 2.2 kernel series that were fixed
in kernel 2.2.19.
</para>
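As a sketch, both wildcard forms in <filename>/etc/exports</filename> would look like this (the directory and the <userinput>ro</userinput> option are illustrative):

```
/pub   *.foo.com(ro)
/pub   192.168.(ro)
```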
<para>
However, you should keep in mind that any of these simplifications
@@ -205,7 +210,8 @@
<title>/etc/hosts.allow and /etc/hosts.deny</title>
<para>
These two files specify which computers on the network can use
services on your machine. Each line of the file
contains a single entry listing
a service and a set of machines. When the server gets a request
from a machine, it does the following:
<itemizedlist>
@@ -232,7 +238,8 @@
</itemizedlist>
</para>
<para>
In addition to controlling access to services
handled by <command>inetd</command> (such
as telnet and FTP), this file can also control access to NFS
by restricting connections to the daemons that provide NFS services.
Restrictions are done on a per-service basis.
@@ -257,8 +264,8 @@
</para>
<para>
In general it is a good idea with NFS (as with most internet services)
to explicitly deny access to IP addresses that you don't need
to allow access to.
</para>
<para>
The first step in doing this is to add the following entry to
@@ -270,10 +277,10 @@
</screen>
</para>
<para>
Starting with <application>nfs-utils</application> 0.2.0, you can be a bit more careful by
controlling access to individual daemons. It's a good precaution
since an intruder will often be able to weasel around the portmapper.
If you have a newer version of <application>nfs-utils</application>, add entries for each of the
NFS daemons (see the next section to find out what these daemons are;
for now just put entries for them in <filename>hosts.deny</filename>):
</para>
@@ -286,7 +293,7 @@
</screen>
</para>
<para>
Even if you have an older version of <application>nfs-utils</application>, adding these entries
is at worst harmless (since they will just be ignored) and at best
will save you some trouble when you upgrade. Some sys admins choose
to put the entry <userinput>ALL:ALL</userinput> in the file <filename>/etc/hosts.deny</filename>,
@@ -310,7 +317,7 @@
<para>
Here, host is the IP address of a potential client; it may be possible
in some versions to use the DNS name of the host, but it is strongly
discouraged.
</para>
<para>
Suppose we have the setup above and we just want to allow access
@@ -351,7 +358,8 @@
The NFS server should now be configured and we can start it running.
First, you will need to have the appropriate packages installed.
This consists mainly of a new enough kernel and a new enough version
of the <application>nfs-utils</application> package.
See <xref linkend="swprereq"> if you are in doubt.
</para>
<para>
Next, before you can start NFS, you will need to have TCP/IP
@@ -366,7 +374,8 @@
Verifying that NFS is running. If this does not work, or if
you are not in a position to reboot your machine, then the following
section will tell you which daemons need to be started in order to
run NFS services. If for some reason <command>nfsd</command>
was already running when
you edited your configuration files above, you will have to flush
your configuration; see <xref linkend="later"> for details.
</para>
@@ -385,13 +394,15 @@
<sect3 id="daemons">
<title>The Daemons</title>
<para>
NFS serving is taken care of by five daemons: <command>rpc.nfsd</command>,
which does most of the work; <command>rpc.lockd</command> and
<command>rpc.statd</command>, which handle file locking;
<command>rpc.mountd</command>, which handles the initial mount requests;
and <command>rpc.rquotad</command>, which handles user file quotas on
exported volumes. Starting with 2.2.18, <command>lockd</command>
is called by <command>nfsd</command> upon demand, so you do
not need to worry about starting it yourself. <command>statd</command>
will need to be started separately. Most recent Linux distributions will
have startup scripts for these daemons.
</para>
<para>
@@ -403,9 +414,9 @@
then you should add them, configured to start in the following
order:
<simplelist>
<member><command>rpc.portmap</command></member>
<member><command>rpc.mountd</command>, <command>rpc.nfsd</command></member>
<member><command>rpc.statd</command>, <command>rpc.lockd</command> (if necessary), and <command>rpc.rquotad</command></member>
</simplelist>
</para>
<para>
@@ -461,9 +472,12 @@
such as Solaris default to TCP.
</para>
<para>
If you do not at least see a line that says
<computeroutput>portmapper</computeroutput>, a line that says
<computeroutput>nfs</computeroutput>, and a line that says
<computeroutput>mountd</computeroutput>, then you will need
to backtrack and try again to start up the daemons
(see <xref linkend="troubleshooting">,
Troubleshooting, if this still doesn't work).
</para>
<para>
@@ -476,16 +490,16 @@
<para>
If you come back and change your <filename>/etc/exports</filename> file, the changes you
make may not take effect immediately. You should run the command
<command>exportfs -ra</command> to force <command>nfsd</command> to re-read the <filename>/etc/exports</filename>
file. If you can't find the <command>exportfs</command> command, then you can kill <command>nfsd</command> with the
<userinput>-HUP</userinput> flag (see the man pages for kill for details).
</para>
<para>
If that still doesn't work, don't forget to check <filename>hosts.allow</filename> to
make sure you haven't forgotten to list any new client machines
there. Also check the host listings on any firewalls you may have
set up (see <xref linkend="troubleshooting"> and
<xref linkend="security"> for more details on firewalls and NFS).
</para>
</sect2>
</sect1>
@@ -15,9 +15,10 @@
There are several ways of doing this. The most reliable
way is to look at the file <filename>/proc/mounts</filename>,
which will list all mounted filesystems and give details about them. If
this doesn't work (for example if you don't
have the <filename>/proc</filename>
filesystem compiled into your kernel), you can type
<userinput>mount -f</userinput>, although you get less information.
</para>
<para>
If the file system appears to be mounted, then you may
@@ -50,7 +51,8 @@
<orderedlist numeration="loweralpha">
<listitem>
<para>
failed, reason given by server:
<computeroutput>Permission denied</computeroutput>
</para>
<para>
This means that the server does not recognize that you
@@ -63,7 +65,8 @@
volume is exported and that your client has the right
kind of access to it. For example, if a client only
has read access then you have to mount the volume
with the <userinput>ro</userinput> option rather
than the <userinput>rw</userinput> option.
</para>
</listitem>
<listitem>
@@ -88,16 +91,33 @@
in <filename>/etc/hosts</filename> that is throwing off the server, or
you may not have listed the client's complete address
and it may be resolving to a machine in a different
domain. One trick is to log in to the server from the
client via <command>ssh</command> or <command>telnet</command>;
if you then type <command>who</command>, one of the listings
should be your login session and the name of your client
machine as the server sees it. Try using this machine name
in your <filename>/etc/exports</filename> entry.
Finally, try to <command>ping</command> the client from the server, and try
to <command>ping</command> the server from the client. If this doesn't work,
or if there is packet loss, you may have lower-level network
problems.
</para>
</listitem>
<listitem>
<para>
It is not possible to export both a directory and its child
(for example both
<filename>/usr</filename> and <filename>/usr/local</filename>).
You should export the parent directory with the necessary
permissions, and all of its subdirectories can then be
mounted with those same permissions.
</para>
</listitem>
</orderedlist>
</listitem>
<listitem>
<para><computeroutput>
RPC: Program Not Registered</computeroutput> (or another "RPC" error):</para>
<para>
This means that the client does not detect NFS running
on the server. This could be for several reasons.
@@ -136,13 +156,15 @@
This says that we have NFS versions 2 and 3, <command>rpc.statd</command>
version 1, and network lock manager (the service name for
<command>rpc.lockd</command>) versions 1, 3, and 4.
There are also different
service listings depending on whether NFS is travelling over
TCP or UDP. UDP is usually (but not always) the default
unless TCP is explicitly requested.
</para>
<para>
If you do not see at least <computeroutput>portmapper</computeroutput>, <computeroutput>nfs</computeroutput>, and
<computeroutput>mountd</computeroutput>, then
you need to restart NFS. If you are not able to restart
successfully, proceed to <xref linkend="symptom9" endterm="sym9short">.
</para>
@@ -150,8 +172,9 @@
<listitem>
<para>
Now check to make sure you can see it from the client.
On the client, type <command>rpcinfo -p</command>
<emphasis>server</emphasis>, where <emphasis>server</emphasis>
is the DNS name or IP address of your server.
</para>
<para>
If you get a listing, then make sure that the type
@@ -160,13 +183,14 @@
NFS, make sure Version 3 is listed; if you are trying
to mount using NFS over TCP, make sure that is
registered. (Some non-Linux clients default to TCP.)
Type <userinput>man rpcinfo</userinput> for more details on how
to read the output. If the type of mount you are
trying to perform is not listed, try a different
type of mount.
</para>
<para>
If you get the error
<computeroutput>No Remote Programs Registered</computeroutput>,
then you need to check your <filename>/etc/hosts.allow</filename> and
<filename>/etc/hosts.deny</filename> files on the server and make sure
your client actually is allowed access. Again, if the
@@ -176,9 +200,11 @@
the client. Also check the error logs on the system
for helpful messages: Authentication errors from bad
<filename>/etc/hosts.allow</filename> entries will usually appear in
<filename>/var/log/messages</filename>,
but may appear somewhere else depending
on how your system logs are set up. The man pages
for <computeroutput>syslog</computeroutput> can
help you figure out how your logs are
set up. Finally, some older operating systems may
behave badly when routes between the two machines
are asymmetric. Try typing <command>tracepath [server]</command> from
@@ -188,8 +214,9 @@
not usually a problem on recent Linux distributions.
</para>
<para>
If you get the error
<computeroutput>Remote system error - No route to host</computeroutput>,
but you can ping the server correctly,
then you are the victim of an overzealous
firewall. Check any firewalls that may be set up,
either on the server or on any routers in between
@@ -221,7 +248,7 @@
<filename>/proc/mounts</filename> and make sure the volume
is mounted read/write (although if it is mounted read-only
you ought to get a more specific error message). If not, then
you need to re-mount with the <userinput>rw</userinput> option.
</para>
<para>
The second problem has to do with username mappings, and is
@@ -292,7 +319,8 @@
</screen>
</para>
<para>
These happen when an NFS <computeroutput>setattr</computeroutput>
operation is attempted on a
file you don't have write access to. The messages are
harmless.
</para>
@@ -309,8 +337,8 @@
</screen>
</para>
<para>
The "can't get a request slot" message means that the client-side
RPC code has detected a lot of timeouts (perhaps due to
network congestion, perhaps due to an overloaded server), and
is throttling back the number of concurrent outstanding
requests in an attempt to lighten the load. The cause of
@@ -336,7 +364,7 @@ nfs warning: mount version older than kernel
</listitem>
<listitem>
<para>
Errors in startup/shutdown log for <command>lockd</command>
</para>
<para>
You may see a message of the following kind in your boot log:
@@ -345,10 +373,12 @@ nfslock: rpc.lockd startup failed
</screen>
</para>
<para>
They are harmless. Older versions of <command>rpc.lockd</command> needed to be
started up manually, but newer versions are started automatically
by <command>nfsd</command>. Many of the
default startup scripts still try to start
up <command>lockd</command> by hand, in case
it is necessary. You can alter your
startup scripts if you want the messages to go away.
</para>
</listitem>
@@ -376,19 +406,19 @@ kmem_create: forcing size word alignment - nfs_fh
</title>
<titleabbrev id="sym7short">Symptom 7</titleabbrev>
<para>
<filename>/etc/exports</filename> is <emphasis>very</emphasis> sensitive to whitespace - so the
following statements are not the same:
<programlisting>
/export/dir hostname(rw,no_root_squash)
/export/dir hostname (rw,no_root_squash)
</programlisting>
The first will grant <userinput>hostname</userinput> <userinput>rw</userinput>
access to <filename>/export/dir</filename>
without squashing root privileges. The second will grant
<userinput>hostname</userinput> <userinput>rw</userinput> privileges with
<userinput>root squash</userinput>, and it will grant
<emphasis>everyone</emphasis> else read/write access, without
squashing root privileges. Nice huh?
</para>
</sect2>
<sect2 id="symptom8">
@@ -412,8 +442,10 @@ kmem_create: forcing size word alignment - nfs_fh
</listitem>
<listitem>
<para>
You may be using a larger <userinput>rsize</userinput>
and <userinput>wsize</userinput> in your mount options
than the server supports. Try reducing <userinput>rsize</userinput>
and <userinput>wsize</userinput> to 1024 and
seeing if the problem goes away. If it does, then increase them
slowly to a more reasonable value.
</para>
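In an <filename>/etc/fstab</filename> entry, the reduced sizes would look like this (the server name and mount point are illustrative):

```
server:/home   /mnt/home   nfs   rw,hard,intr,rsize=1024,wsize=1024   0 0
```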
@@ -430,6 +462,19 @@ kmem_create: forcing size word alignment - nfs_fh
to reinstall your binaries if none of these ideas helps.
</para>
</sect2>
<sect2 id="symptom10">
<title>File Corruption When Using Multiple Clients</title>
<titleabbrev id="sym10short">Symptom 10</titleabbrev>
<para>
If a file has been modified within one second of its
previous modification and left the same size, it will
continue to generate the same inode number. Because
of this, constant reads and writes to a file by
multiple clients may cause file corruption. Fixing
this bug requires changes deep within the filesystem
layer, and therefore it is a 2.5 item.
</para>
</sect2>
</sect1>