From 04cfa193ca3bca460474a1dc835d8d6779597835 Mon Sep 17 00:00:00 2001 From: binh <> Date: Wed, 9 Feb 2005 13:51:07 +0000 Subject: [PATCH] More consolidation, hence the removal of several files. Binh. --- LDP/guide/docbook/Linux-Networking/DHCP.xml | 104 - LDP/guide/docbook/Linux-Networking/DNS.xml | 4740 ------------- .../docbook/Linux-Networking/Database.xml | 20 - .../Linux-Networking/Email-Hosting.xml | 17 - LDP/guide/docbook/Linux-Networking/FTP.xml | 34 - LDP/guide/docbook/Linux-Networking/LDAP.xml | 4397 ------------ LDP/guide/docbook/Linux-Networking/NFS.xml | 2558 ------- LDP/guide/docbook/Linux-Networking/NTP.xml | 416 -- .../Protocols-and-Standards.xml | 82 + .../Linux-Networking/Proxy-Caching.xml | 2223 ------ LDP/guide/docbook/Linux-Networking/SSH.xml | 45 - LDP/guide/docbook/Linux-Networking/STRIP.xml | 49 - LDP/guide/docbook/Linux-Networking/Samba.xml | 76 - .../docbook/Linux-Networking/Services.xml | 5940 +++++++++++++++++ LDP/guide/docbook/Linux-Networking/TFTP.xml | 92 - LDP/guide/docbook/Linux-Networking/Telnet.xml | 35 - LDP/guide/docbook/Linux-Networking/VNC.xml | 138 - .../docbook/Linux-Networking/WaveLAN.xml | 31 - .../docbook/Linux-Networking/Web-Serving.xml | 76 - LDP/guide/docbook/Linux-Networking/X11.xml | 61 - 20 files changed, 6022 insertions(+), 15112 deletions(-) delete mode 100644 LDP/guide/docbook/Linux-Networking/DHCP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/DNS.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Database.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Email-Hosting.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/FTP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/LDAP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/NFS.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/NTP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Proxy-Caching.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/SSH.xml delete mode 100644 
LDP/guide/docbook/Linux-Networking/STRIP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Samba.xml create mode 100644 LDP/guide/docbook/Linux-Networking/Services.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/TFTP.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Telnet.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/VNC.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/WaveLAN.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/Web-Serving.xml delete mode 100644 LDP/guide/docbook/Linux-Networking/X11.xml diff --git a/LDP/guide/docbook/Linux-Networking/DHCP.xml b/LDP/guide/docbook/Linux-Networking/DHCP.xml deleted file mode 100644 index 8807d0c7..00000000 --- a/LDP/guide/docbook/Linux-Networking/DHCP.xml +++ /dev/null @@ -1,104 +0,0 @@ - - -DHCP - - -Endeavouring to maintain static IP addressing -information, such as IP addresses, subnet masks, DNS names and other -information, on client machines can be difficult. Documentation becomes lost or -out-of-date, and network reconfigurations require details to be modified -manually on every machine. - - - -DHCP (Dynamic Host Configuration Protocol) solves this problem by providing -arbitrary information (including IP addressing) to clients upon request. -Almost all client OSes support it and it is standard in most large networks. - - - -Its most prevalent impact is that it eases network administration, -especially in large networks or networks which have lots of mobile users. - - -2. DHCP protocol - - DHCP (Dynamic Host Configuration Protocol) is used to control - vital networking parameters of hosts (running clients) with the help - of a server. DHCP is backward compatible with BOOTP. For more - information see RFC 2131 (old RFC 1541) and others. (See the Internet - Resources section at the end of the document.) You can also read - [32]http://web.syr.edu/~jmwobus/comfaqs/dhcp.faq.html. - -4.5. 
Other interesting documents - - Linux Magazine has a pretty good article in their April issue called - [62]Network Nirvana: How to make Network Configuration as easy as DHCP - that discusses the setup for DHCP. - -References - - 1. DHCP.html#AEN17 - 2. DHCP.html#AEN19 - 3. DHCP.html#AEN24 - 4. DHCP.html#AEN41 - 5. DHCP.html#AEN45 - 6. DHCP.html#AEN64 - 7. DHCP.html#AEN69 - 8. DHCP.html#AEN74 - 9. DHCP.html#AEN77 - 10. DHCP.html#SLACKWARE - 11. DHCP.html#REDHAT6 - 12. DHCP.html#AEN166 - 13. DHCP.html#AEN183 - 14. DHCP.html#DEBIAN - 15. DHCP.html#AEN230 - 16. DHCP.html#NAMESERVER - 17. DHCP.html#AEN293 - 18. DHCP.html#TROUBLESHOOTING - 19. DHCP.html#AEN355 - 20. DHCP.html#AEN369 - 21. DHCP.html#DHCPSERVER - 22. DHCP.html#AEN382 - 23. DHCP.html#AEN403 - 24. DHCP.html#AEN422 - 25. DHCP.html#AEN440 - 26. http://www.oswg.org/oswg-nightly/DHCP.html - 27. http://www.linux.org.tw/CLDP/mini/DHCP.html - 28. http://www.linux.or.jp/JF/JFdocs/DHCP.html - 29. ftp://cuates.pue.upaep.mx/pub/linux/LuCAS/DHCP-mini-Como/ - 30. mailto:vuksan-feedback@veus.hr - 31. http://www.opencontent.org/opl.shtml - 32. http://web.syr.edu/~jmwobus/comfaqs/dhcp.faq.html - 33. mailto:sergei@phystech.com - 34. ftp://ftp.phystech.com/pub/ - 35. http://www.cps.msu.edu/~dunham/out/ - 36. ftp://metalab.unc.edu/pub/Linux/system/network/daemons - 37. ftp://ftp.phystech.com/pub/ - 38. DHCP.html#NAMESERVER - 39. DHCP.html#LINUXPPC-RH6 - 40. mailto:alexander.stevenson@home.com - 41. DHCP.html#NAMESERVER - 42. ftp://ftp.redhat.com/pub/redhat/redhat-4.2/i386/RedHat/RPMS/dhcpcd-0.6-2.i386.rpm - 43. DHCP.html#SLACKWARE - 44. mailto:nothing@cc.gatech.edu - 45. DHCP.html#NAMESERVER - 46. http://ftp.debian.org/debian/dists/slink/main/binary-i386/net/ - 47. DHCP.html#SLACKWARE - 48. mailto:heiko@os.inf.tu-dresden.de - 49. DHCP.html#NAMESERVER - 50. DHCP.html#REDHAT6 - 51. ftp://ftp.linuxppc.org/ - 52. ftp://ftp.phystech.com/pub/dhcpcd-1.3.17-pl9.tar.gz - 53. DHCP.html#TROUBLESHOOTING - 54. 
mailto:nothing@cc.gatech.edu - 55. DHCP.html#ERROR3 - 56. ftp://vanbuer.ddns.org/pub/ - 57. DHCP.html#DHCPSERVER - 58. mailto:mellon@isc.org - 59. ftp://ftp.isc.org/isc/dhcp/ - 60. http://www.kde.org/ - 61. ftp://ftp.us.kde.org/pub/kde/unstable/apps/network/ - 62. http://www.linux-mag.com/2000-04/networknirvana_01.html - - diff --git a/LDP/guide/docbook/Linux-Networking/DNS.xml b/LDP/guide/docbook/Linux-Networking/DNS.xml deleted file mode 100644 index ae571dde..00000000 --- a/LDP/guide/docbook/Linux-Networking/DNS.xml +++ /dev/null @@ -1,4740 +0,0 @@ - - -DNS - - Setting Up Your New Domain Mini-HOWTO. - - This document outlines the things you will probably have to do when - you want to set up a network of computers under your own domain. It - covers configuration of network parameters, network services, and - security settings. - - 2. Introduction - - This is a guide to setting up your own domain of Linux machines, or - mixed Linux and Windows machines, on an always-up connection with a - static IP and a named domain. It is not really intended for setups - which use dynamic IPs, or which are regularly disconnected from their - provider for long periods of time, though some basic hints for - operating such a setup are available in section ``Using A Dynamic - IP''. - - - With the increasing availability of permanent connections and static - IPs, it's becoming easier for people and organizations to set up a - real domain, with the associated Internet presence. Proper planning at - the outset can reduce problems later. - - - Much of this document describes techniques for implementing - unobtrusive security on the newly exposed network. This deals with - protection from external attack, and from casual internal attack. It - does not claim to provide an extremely secure setup, but is usually - enough to discourage the less determined attacker. 
- - - This document is primarily directed at small organizations which have - an existing network of computers, possibly with a shared dialup line, - which are trying to move to a permanent, relatively high-speed - connection, either to improve data transfer with the outside world, or - to create a WWW or FTP site. The document is also directed at new - organizations which want to skip the early stage and start out with - higher speed networking and services under their own domain name. - - - Throughout this document, I will discuss the configuration of a newly - registered domain, example.com. Note that the name example.com is - reserved by the Internet Assigned Numbers Authority for use in - documentation, and so will never correspond to an actual domain. - - - Much of the information in this document is available in other places. - I have tried to distill the material relevant to the creation of a new - domain. Where detail on a specific subject is lacking, you may want to - consult one of the more comprehensive documents. - - - This document will also assume a mixed OS environment. Specifically, I - will assume that some desktop machines are running some version of - Microsoft Windows, while servers and the private network gateway are - running Linux. - - - - 3. Planning Your Network Topology - - While there are arguments which can be made for many different network - layouts, the requirements of many organizations can be met by putting - the desktop machines and private servers on a private masqueraded - subnet, and the publicly accessible machines on valid external IPs. - The machines on valid external IPs will be referred to in this - document as ``exposed hosts''. 
This leads to the following (example) -topology:
-
-
-   +--------------+
-   |              |               +---------------+
-   | ISP-supplied |---------------| FTP server    |
-   | router       |           |   +---------------+
-   |              |           |
-   +--------------+           |   +---------------+
-                              |---| WWW server #1 |
-                              |   +---------------+
-                              |
-                              |   +---------------+
-                              |---| WWW server #2 |
-                              |   +---------------+
-                              |
-                              ~
-                              ~
-                              |
-                              |   +---------------+
-                              |---| Private       |
-                                  | Network       |
-                                  | Gateway       |
-                                  +---------------+
-                                       |
-                                       |
-                                       |
-                                       |
-   +------------+                      |      +-------------------+
-   | Desktop #1 |----------------------|------| Private server #1 |
-   +------------+                      |      +-------------------+
-                                       |
-     . --------------------------------|-------- .
-     .                                 |         .
-     . --------------------------------|-------- .
-                                       |
-   +------------+                      |      +-------------------+
-   | Desktop #N |----------------------|------| Private server #N |
-   +------------+                             +-------------------+
-
-
- In this example, the router provided by the ISP (Internet Service - Provider), FTP server, WWW servers, and the machine labelled ``private - network gateway'' all have externally visible IP numbers, while the - desktop and private server machines have IP numbers allocated from RFC - 1918, reserved for private use. - The IP numbers you choose for use within the private network - (everything below the private network gateway machine) should be - unique, not only among the hosts under your control, but also with - respect to numbers assigned on similar private subnets at other sites - or partner companies with whom you might, at some time, want to - implement a virtual private network; this reduces confusion and - reconfiguration when the networks are eventually merged. As outlined - in the RFC, you can choose from any class C network from 192.168.0.* - to 192.168.255.*, or any class B network from 172.16.*.* to - 172.31.*.*, or the class A network 10.*.*.*. 
In the rest - of this document I will assume that your private network (if you've - chosen to create one) is on the class C network 192.168.1.*, and your - private network gateway machine is at IP number 10.1.1.9, one of the - IP numbers provided to you by your provider (note that this is not a - valid external IP; I use it as an example only). I will also assume - that there is a machine, betty.example.com, at 10.1.1.10, which will - handle both www and FTP services. - - - Take note of the number of external IP numbers which you need for your - own machines. You will need one IP number for each machine which lies - outside the private network gateway, plus one for the gateway itself. - This count does not include any IP numbers which may be taken by - routers, broadcast addresses, and so on. You should ask your provider - for a block of addresses large enough to accommodate the given number - of machines. For example, in my office network, of the 8 IP numbers - allocated from the ISP, three were not usable by my computers, leaving - enough IP numbers for four machines outside the gateway, plus the - gateway itself. - - - This network topology is not correct for everybody, but it is a - reasonable starting point for many configurations which don't have - special needs. The advantages of this configuration include: - - · Easy expandability. If you suddenly double your number of private - nodes, you don't have to worry about getting a new IP block from - your provider and reconfiguring all of the interfaces on your - machines. - - · Local network control. Adding a new workstation to your private - network requires no communication with your provider, unlike - exposed nodes, which need both forward and reverse DNS (domain name - service) mappings if they are to perform certain tasks (ssh and - ftpd may complain if they can't perform reverse and forward DNS on - incoming connections). A reverse DNS query is an attempt to obtain - the host name from the IP number. 
- - · Centralized security. The private network gateway can enforce - security over the whole private network, filtering packets and - logging attacks, rather than having to install such measures on - each desktop and server on the private network. This can be - enforced not only on incoming packets, but also on outgoing - packets, so that a misconfigured desktop machine doesn't - inadvertently broadcast data to the outside world which ought to - remain internal. - - · Easy transplantability. Because the IP numbers within the private - network are yours for as long as you want them, you can move the - entire network to a new range of IP numbers without having to make - any changes to the network configuration on the private network. - The publicly exposed hosts still have to be reconfigured, of - course. - - · Transparent Internet access. The machines on your private network - can still use FTP, telnet, WWW, and other services with minimal - obstruction, assuming a Linux masquerading router. The users may - not even be aware that their machines are not on externally visible - IP numbers. - - - Some of the potential disadvantages of such a configuration are: - - - · Some services will not be available directly to the machines on the - internal network. NTP synchronization against an outside host, - certain obscure services which may not have masquerading rules in - the kernel, and .shosts authentication for logging in to external - nodes are all difficult or impossible, but simple workarounds are - almost always available. - - · More network hardware costs. The private network gateway machine - needs two network cards, and you need at least two hubs / switches, - one on the visible network and one on the private network. - - · Machines outside the private network cannot easily make direct - connections to machines within the private network. They may have - to open a session first on the private network gateway machine, - then log through to the internal host. 
It is possible to route - packets transparently through the firewall, but this is not - recommended for security reasons which will be discussed in a later - section. - - - You should consider these points in planning your network topology, - and decide if a fully visible network is more appropriate for your - situation. In the rest of this document I will assume that you have - configured your network as shown above. If you have chosen to have a - fully visible network, some details will differ, and I will try to - point out such differences in this document. - - - As a special case, if you do not need any external servers, the - ISP-supplied router can be attached directly to your external interface on - the private network gateway machine, rather than with a hub. - - - - 4. Obtaining Your Connection - - - 4.1. Choosing Your Provider - - As with anything, shop around. Determine which services are available - in your area, as well as the costs associated with those services. Not - all locations are wired to accept DSL, and some locations may not be - suitable for wireless connections due to constraints of the landscape, - architecture, or environment. Be prepared to provide the street - address of the location where your hookup will be installed, as DSL - speeds are strongly dependent on your distance from the switch, and - ask specifically about such details as bandwidth between your machine - and the provider, what has to be done to install the connection, and - what hardware is provided in the quoted monthly rate. Also, you should - have some idea of how many IP numbers you need for your own machines - (remember that not all IP numbers in the block you get from the - provider will be available for attaching your computers). Ask the - provider what their total bandwidth is out to the outside world, as - the quoted speed is only between your site and theirs. 
If the provider - has insufficient bandwidth to the outside, the customers will suffer - bottlenecks within the provider's network. - - - Once you have narrowed down a list of candidates, ask around, and see - if anybody can provide you with recommendations for the services you're - considering. Ask them what sort of bandwidth they get to unloaded - sites. Also, if you intend to have fast connections between the new - domain and local ISP accounts from home, for telecommuting, or just - remote administration, it is essential that you do a traceroute from - your home ISP account to a host operating on the service you're - considering. This will tell you how many hops, and how much latency - you should expect, between home and the new domain. Latencies much - above 100 to 200 milliseconds can make the connection difficult to use - for extended periods of time. The traceroute should be run around the time of day - that you expect to make use of the network connection between home and - the new domain. - - - - 4.2. Preparing For Hardware Installation - - After you have chosen the provider and service type for the new - domain, ask about installation details. You may require service calls - from the telephone company as well as from the ISP in order to install - the service, and the technicians may need access to controlled areas - of your building, so inform the building engineer of the installation - requirements. - - - Before the ISP technician arrives, ask for the network parameters, - specifically the IP number, netmask, broadcast address, gateway - routing address, DNS server address, and also what cabling you need to - connect to the hardware delivered by the technician (i.e. - straight-through or crossover RJ45 cabling, etc.). - - - Have one machine available for testing, and put it close to where the - network connection hardware will be installed. 
If possible, configure - it before the service technician arrives, setting the IP number and - netmask, and have the appropriate cabling ready so that the - installation and testing can be done quickly. - - - - 4.3. Testing The Connection - - With your test machine attached to the ISP's hardware, make sure that - you can ping sites beyond the ISP. If not, a traceroute to the outside - can help to show where the connection is failing. If traceroute shows - no successful hops it indicates that your test machine's network - configuration (default route, interface address, NIC drivers, DNS, - etc.) is incorrectly set. If it shows one hop, that could mean that - your router is not correctly configured to communicate with the ISP. - If it shows several hops before failing, the problem is almost - certainly in the ISP or in the outside world, and beyond your - immediate control. - - - - 4.4. Using A Dynamic IP - - The benefits of a corporate connection, with a static IP block and - various hosted services, come at a cost. It can be more than ten - times as expensive as a high speed home connection on DSL or cable - modem. If the budget can't support a corporate connection, or if no - such connections are available in your area, you might want to try to - set up a domain on a dynamic IP. Instead of a range of IP numbers, you - typically get exactly one, which means that your private network - gateway machine will also have to host any incoming services from the - outside. - - - First, you might want to check the legality of it. Many companies' - user agreements explicitly forbid setting up externally-accessible - servers on personal accounts. They may enforce this with packet - filters blocking incoming connections on the http and FTP ports. You - should also be aware that the quoted connection speeds for personal - accounts such as home DSL or cable modem are the downlink speeds, and - that the uplink speeds might be much slower. 
The uplink speed is what - is important for serving up FTP or web content. - - - If you have a dynamic IP, and you want to have incoming connections, - you will have to subscribe to a dynamic IP hosting service, such as - one of those listed at Dynamic DNS Providers - . These services typically work - by running software on your machine which passes your current IP - number on to the company's servers. When your current IP number - arrives at the servers, their DNS tables are updated to reflect the - new value. You can either get a domain name under their domain name, - such as ``example.dynip.com'' or ``example.dynhost.com'', or you can - register your own domain and set the primary DNS authority to point to - the company providing this service (usually at a higher cost). - - - There is also a free hosting service, at Domain Host Services - . They seem fairly new, and there are few details - on their web site at the moment, but you might find it worth a look. - - - If you have set up a dynamic IP, and subscribed to one of these - services, it will affect some of the decisions you make in section - ``Deciding Which Domain Services You Will Host''. In particular, there - is little point subscribing to a dynamic IP hosting service if you do - not plan to host at least one of web or FTP services. You will have to - set primary DNS authority to point to the company you've chosen. You - should not have a named daemon answering requests from outside your - private network. Other details, such as handling of email, will depend - on the specifics of the service you've subscribed to, and can best be - answered by the support staff of that company. 
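The update software such a service supplies is typically just a small script run periodically from cron. The sketch below shows the general idea only; it assumes the classic Linux ``ifconfig'' output format, and the interface name (ppp0), state file path, and drop-box address are illustrative, not part of any particular service.

```shell
#!/bin/sh
# Sketch of a dynamic-IP update helper.  Assumptions: the external
# interface is ppp0, and "ifconfig" prints an "inet addr:" line in
# the classic Linux format.  Adjust both for your own system.

IFACE=${IFACE:-ppp0}
STATE=${STATE:-/var/run/current-ip}

# Pull the first IPv4 address out of ifconfig-style text on stdin,
# e.g. "          inet addr:10.1.1.9  Bcast:10.1.1.255  ..."
extract_ip () {
    sed -n 's/.*inet addr:\([0-9.][0-9.]*\).*/\1/p' | head -n 1
}

current=$(ifconfig "$IFACE" 2>/dev/null | extract_ip)
previous=$(cat "$STATE" 2>/dev/null)

if [ -n "$current" ] && [ "$current" != "$previous" ]; then
    echo "$current" > "$STATE"
    # A real client would notify the hosting service or drop box
    # here, e.g.:  echo "$current" | mail drop-box@example.com
    echo "IP changed to $current"
fi
```

Run from cron every few minutes, this reports (and could forward) the address only when it actually changes, which is essentially all a commercial update client does for you.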
- - - One final note: if you want to have remote access to a machine with a - dynamic IP, but don't need it for hosting other services, the - inexpensive solution is to create a ``drop box'' on a publicly - accessible machine with a static IP, and have your dynamic IP host - send its IP number there, either in email or simply by writing it into - a file on a shell account. When you want to access your machine - remotely, first extract the current IP number from the drop box, then - use slogin to attach directly to that IP number. This is, after all, - really all that a dynamic IP hosting service does; they just do it - automatically over standard services, saving you some steps. - - - - 5. Registering A Domain Name - - In order for people in the outside world to locate your servers under - the domain name of your choice, whether for web, FTP, or email - delivery, you will have to register the domain name for insertion into - the relevant top level domain database. - - - Exercise some simple prudence in choosing your domain name. Certain - words or phrases may be forbidden on the grounds of community - standards, or may be offensive to visitors whose language or slang - differs from that of your region. Domain names can contain only the 26 - letters of the Roman alphabet (without accents), the hyphen (though - not at the beginning or end of the name), and the 10 digits. Domain - names are not case-sensitive, and can be at least 26 characters long - (this limit is subject to change). Be careful not to register a name - which you could reasonably be expected to know infringes on the - trademarks of an existing company; the courts are not kind to - cybersquatters. Some information on the circumstances under which your - poorly-chosen domain name might be stripped from your control is - available in this Uniform Domain Name Dispute Resolution Policy - . - - - There are many companies which register names in the ``.com'', - ``.net'', and ``.org'' top level domains. 
For a current list, check - the list of accredited registrars - . - - - To register a name under a country top level domain, such as a - ``.ca'', ``.de'', ``.uk'', etc., check with the appropriate authority, - which can be located in the Country Code Top-Level Domains database - . - - - Typically, you have to provide the registrar with contact information, - primary and secondary DNS IP numbers, a change request validation - scheme (you wouldn't want just anybody changing your domain for you), - and money in the form of an annual fee. If you're not comfortable with - the change request validation schemes offered by a registrar, let them - know that you're not willing to use the service until they address - your security concerns. - - - - 6. Deciding Which Domain Services You Will Host - - Most full-service ISPs will provide a variety of domain services for - their customers. This is largely because of the problems associated - with hosting these services under certain other, more popular desktop - and server operating systems. These services are much easier to - provide under Linux, and can be hosted on fairly inexpensive hardware, - so you should decide what services you want to take on for yourself. - Some of these services include: - - · Primary DNS authority on your domain. See section ``Primary DNS - Authority''. - - · Electronic mail. See section ``Electronic Mail''. - - · Web space hosting. See section ``Web Space Hosting''. - - · FTP space hosting. See section ``FTP Site Hosting''. - - · Packet filtering. See section ``Packet Filtering''. - - In each of these, you basically have to weigh convenience against - control. When your ISP performs one or more of these services, you can - usually be fairly sure that they have people with experience - maintaining the service, so you have less to learn, and less to worry - about. At the same time, you lose control over these services. 
Any - changes require that you go through the technical support of your ISP, - something which may sometimes be inconvenient or cause longer delays - than you would like. There's also a security issue involved: the ISP - is a much more tempting target to attackers than your own site. Since - an ISP's servers might host email and/or web space for the dozens of - companies which are their customers, an attacker who compromises one - of those servers gets a much higher return for his efforts than one - who attacks your personal servers, where only one company's data is - kept. - - - - 6.1. Primary DNS Authority - - When a person somewhere in the outside world attempts to connect to a - machine in the new example.com domain, queries are sent between - various servers on the Internet, ultimately resulting in the IP number - of that machine being returned to the software of the person - attempting the connection. The details of this sequence are beyond the - scope of this document. Neglecting many details, when a request is - made for the machine fred.example.com, a centralized database is - consulted to determine the IP number of the machine which - holds primary DNS authority for the example.com domain. This IP number - is then queried for the IP number of the machine fred.example.com. - - - There must be a primary and a secondary DNS server for every domain - name. The names and IP numbers of these two servers are stored in a - centralized database whose entries are controlled by domain - registration authorities such as Network Solutions - . - - - If you elect to have primary DNS authority hosted by your ISP, these - two servers will probably both be machines controlled by the ISP. Any - time you want to add an externally visible machine to your network, - you will have to contact the ISP and ask them to put the new machine - in their database. 
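The lookup chain just described bottoms out in a zone file on whichever server holds primary authority for example.com. As a rough sketch only, assuming BIND zone file syntax and reusing the illustrative addresses from this document (the name server, mail exchanger, and secondary host names here are hypothetical), such a file might look like:

```
; Sketch of a minimal example.com zone file (BIND syntax).
; All names and addresses are illustrative and must be replaced.
$TTL 86400
@       IN  SOA  ns.example.com. hostmaster.example.com. (
                 2000010101  ; serial, bumped on every change
                 10800       ; refresh
                 3600        ; retry
                 604800      ; expire
                 86400 )     ; negative-caching TTL
        IN  NS   ns.example.com.
        IN  NS   ns.isp.example.net.  ; secondary, e.g. at your ISP
        IN  MX   10 betty.example.com.
ns      IN  A    10.1.1.9
betty   IN  A    10.1.1.10
www     IN  CNAME betty
ftp     IN  CNAME betty
```

Remember that secondaries only pick up a change after the serial number increases.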
- - - If you elect to hold primary DNS authority on your own host, you will - still use another machine as your secondary. Technically, you should - use one on a redundant Internet connection, but it is very common that - the secondary is held on one of your ISP's machines. If you want to - add an externally visible machine to your network, you will have to - update your own database, and then wait for the change to propagate - (something which takes, typically, a small number of hours). This - allows you to add barney.example.com without having to go through your - ISP. - - - It is a good idea to set up secondary DNS on a geographically distant - host, so that a single cable cut near your ISP doesn't take both your - primary and secondary DNS servers off line. The domain registrar you - used to register your domain name may provide secondary DNS service. - There is also a free service, Granite Canyon - , available to anybody who asks. - - - Regardless of whether or not you choose to act as primary DNS - authority for your domain, see section ``Setting Up Name Resolution'' - for configuration help. You will want some sort of name resolution - system for your private network, even if you delegate primary DNS - authority to the ISP. - - - - 6.2. Electronic Mail - - When you subscribe with your ISP, they will typically supply a number - of email boxes. You can elect to use this service exclusively, in - which case all incoming email is stored on the ISP's servers and your - users read their mail with POP3 clients which connect to the ISP's - servers. Alternately, you may decide to set up email on your own - machines. Once again, you should weigh the merits of the two - approaches, and choose the one which you prefer. - - - Things to remember if you use the ISP for all email: - - · It may be easier to access the email from home, or from other - locations when you're on a business trip, depending on the security - which you use to protect your domain. 
- - · Email is routinely stored on the ISP's servers, which may be a - problem if sensitive material is sent unencrypted. - - · You have a limited number of email accounts, and may have to pay if - you exceed this limit. - - · To create a new email address, you have to go through the ISP. - - - Things to remember if you provide your own email: - - · Email is routinely stored on your own servers, with backup storage - on your ISP if your mail host goes down or its disk fills up. - - · You have an essentially unlimited number of email accounts, which - you can create and delete yourself. - - · You have to support the email clients used on your private network, - and possibly by people trying to read their email from home. - - - One possible approach is to host email yourself, but also use the - several email addresses provided by the ISP. People who need email - accessible from outside the private network can have an email address - in your domain which gets redirected to one of the ISP-supplied email - addresses. Others can have local email on the private network. This - requires a bit more coordination and configuration, but gives more - flexibility than either of the other approaches. - - - Should you choose to host email for your domain, see section ``Setting - Up Email For Your Domain'' for configuration help. - - - If you decide not to host email for your domain, refer to section - ``DNS Configuration If You Are Not Hosting Email'' for important notes - on the name resolution configuration. - - - - 6.3. Web Space Hosting - - Your ISP may allocate you a certain amount of space on their web - servers. You might decide to use that, or you might have a web hosting - machine which you put on your external network, in one of your - external IP numbers. - - - Points to remember if you choose to use the ISP's web space hosting: - - · You have a certain disk space allocation which you should not - exceed. 
This will include not only web space contents, but also - data collected from people visiting the site. - - · The bandwidth between your web server and the outside world will - almost certainly be higher than it would be if you hosted it on - your own hardware. In any case, it will not be slower. - - · It may be difficult to install custom CGI scripts or commercial - packages on your web site. - - · Your bandwidth between your network and your web server will almost - certainly be lower than it would be if you hosted it on your own - network. - - - Points to remember if you choose to host your own web space: - - · You have much more control over the hosting machine. You can tailor - your security more precisely for your application. - - · Potentially sensitive data, such as credit card numbers or mailing - addresses, remains on machines which you control. - - · Your backup strategy is probably not as comprehensive as your - ISP's. - - - Notice that I do not mention anything about the ISP having more - powerful hardware, higher peak data rates, and so on. By the time - these things become important, you're talking about very high data - rate network connections, and, quite frankly, you had better be - delegating these decisions to a skilled consultant, not looking in a - Linux HOWTO. - - - Should you choose to host web space for your domain on your own - server(s), refer to other documents, such as the WWW-HOWTO, for - configuration help. I strongly recommend that this service be run on a - different machine from the private network gateway machine, for - security reasons. - - - - 6.4. FTP Site Hosting - - Basically, the same arguments apply to FTP hosting as apply to WWW - hosting, with the exception that active content is not an issue for - FTP, and CGI scripts don't appear.
Most of the recent ftpd exploits - have come from buffer overruns resulting from the creation of large - directory names in anonymously-writable upload directories, so if your - ISP allows uploads and is lax in keeping up with security updates on - the FTP daemon, you might be better off hosting this service yourself. - - - Should you choose to host FTP for your domain on your own server(s), - make sure to get the latest version of your FTP daemon, and consult - the configuration instructions there. Once more, I strongly recommend - that this service be run on a different machine from the private - network gateway machine, for security reasons. - - - For wu-ftpd, I would recommend the following configuration options: - - · --disable-upload - unless you need anonymous uploads - - · --enable-anononly - encourage your local users to use scp to - transfer files between machines. - - · --enable-paranoid - disable whatever features of the current - release might be considered questionable. - - - - 6.5. Packet Filtering - - Some ISPs will put packet filters on their network, to protect the - users of the system from each other, or from external attackers. Cable - modem networks and similar broadcast networks have had embarrassing - problems when users of Windows 95 or 98 inadvertently set up disk - shares, exporting the full contents of their hard drives to anybody on - the network segment who cared to browse for active servers in the - neighbourhood. In some cases, the solution has been to tell the users - not to do that, but some providers have put filtering into the access - hardware to prevent people from exporting their data by accident. - - - Packet filtering is really something which you ought to do yourself. - It fits in easily into the kernel running on your private network - gateway machine and gives you a better idea of what's happening around - you.
You often will find that you have to make small tweaks to the - firewall to optimize it during the initial setup, and this is much - easier to do in real time than through a technical support contact. - - - Should you choose to do packet filtering for your domain, see section - ``Setting Up Packet Filtering'' for configuration help. - - - - 7. Configuring Your Hosted Services - - 7.1. Setting up Name Resolution - - You will want some way for the computers on your network to refer to - one another by name, and also a way for people in the outside world to - refer to your exposed hosts by name. There are several ways to go - about doing this. - - - 7.1.1. DNS On Private Network, ISP Handles Domain - - [ Note: if you have chosen not to implement a private network, go to - section ``Fully Exposed Network, Hosted By ISP''. ] - - - In this configuration, you have delegated responsibility for the - primary DNS authority on your domain to the ISP. You still use DNS - within your private network when hosts there want to talk to one - another. You have given your ISP a list of the names and IP numbers of - all exposed hosts. If you want one externally visible machine, for - instance betty.example.com, to act both as web and FTP server, you - should ask the ISP to make CNAME entries for www.example.com and - ftp.example.com pointing to betty.example.com. - - - Set up DNS on your private network gateway machine. This can be done - securely, and makes upgrading easier, should you later decide to host - primary DNS authority for your domain. - - - I will assume that you have decided to host DNS from the machine - dns.example.com, which is on the private network gateway and is an alias - for fred.example.com at 192.168.1.1. Some small modifications have to - be made to this configuration if this is not the case. I will not - cover that in this HOWTO unless there is significant interest.
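The CNAME requests mentioned above amount to just two records in the ISP's zone for example.com. A sketch of what you would ask the ISP to add (the names come from the running example; the surrounding zone-file boilerplate is the ISP's concern):

```
; in the ISP's zone file for example.com
www  IN CNAME betty.example.com.
ftp  IN CNAME betty.example.com.
```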
- - - - You will have to download and compile a recent version of BIND, the - Berkeley Internet Name Domain. It is available at the BIND web site. - Next, you have to configure the - daemon. Create the following file, /etc/named.conf: - - - ______________________________________________________________________ - options { - directory "/var/named"; - listen-on { 192.168.1.1; }; - }; - - zone "." { - type hint; - file "root.hints"; - }; - - zone "0.0.127.in-addr.arpa" { - type master; - file "pz/127.0.0"; - }; - - - zone "1.168.192.in-addr.arpa" { - type master; - file "pz/1.168.192"; - }; - - zone "example.com" { - type master; - notify no; - file "pz/example.com"; - }; - ______________________________________________________________________ - - - - Note that we are declaring ourselves the master for the example.com - domain. Meanwhile, our ISP is also declaring itself to be the master - for the same domain. This is not a problem, as long as you are careful - about the setup. All of the machines on the private network must use - dns.example.com to perform their name resolution. They must not use - the name resolvers of the ISP, as the ISP name server believes itself - to be authoritative over your entire domain, but it doesn't know the - IP numbers or names of any machines on your private network. - Similarly, hosts on exposed IP numbers in your domain must use the ISP - name server, not the private name server on dns.example.com. - - - The various files under /var/named must now be created. - - - The root.hints file is exactly as described in the BIND documentation, - or in the DNS HOWTO. At the time of this writing, the following is a valid - root.hints file: - - - - ______________________________________________________________________ - H.ROOT-SERVERS.NET. 6d15h26m24s IN A 128.63.2.53 - C.ROOT-SERVERS.NET. 6d15h26m24s IN A 192.33.4.12 - G.ROOT-SERVERS.NET. 6d15h26m24s IN A 192.112.36.4 - F.ROOT-SERVERS.NET. 6d15h26m24s IN A 192.5.5.241 - B.ROOT-SERVERS.NET.
6d15h26m24s IN A 128.9.0.107 - J.ROOT-SERVERS.NET. 6d15h26m24s IN A 198.41.0.10 - K.ROOT-SERVERS.NET. 6d15h26m24s IN A 193.0.14.129 - L.ROOT-SERVERS.NET. 6d15h26m24s IN A 198.32.64.12 - M.ROOT-SERVERS.NET. 6d15h26m24s IN A 202.12.27.33 - I.ROOT-SERVERS.NET. 6d15h26m24s IN A 192.36.148.17 - E.ROOT-SERVERS.NET. 6d15h26m24s IN A 192.203.230.10 - D.ROOT-SERVERS.NET. 6d15h26m24s IN A 128.8.10.90 - A.ROOT-SERVERS.NET. 6d15h26m24s IN A 198.41.0.4 - ______________________________________________________________________ - - - - The pz/127.0.0 file is as follows: - - - ______________________________________________________________________ - $TTL 86400 - - @ IN SOA example.com. root.example.com. ( - 1 ; Serial - 8H ; Refresh - 2H ; Retry - 1W ; Expire - 1D) ; Minimum TTL - NS dns.example.com. - 1 PTR localhost. - ______________________________________________________________________ - - - - The pz/1.168.192 file is as follows: - - - ______________________________________________________________________ - $TTL 86400 - - @ IN SOA dns.example.com. root.dns.example.com. ( - 1 ; Serial - 8H ; Refresh 8 hours - 2H ; Retry 2 hours - 1W ; Expire 1 week - 1D ; Minimum 1 day - ) - NS dns.example.com. - - 1 PTR fred.example.com. - PTR dns.example.com. - PTR mail.example.com. - 2 PTR barney.example.com. - 3 PTR wilma.example.com. - ______________________________________________________________________ - - - - and so on, where you create one PTR record for each machine with an - interface on the private network. In this example, fred.example.com is - on IP number 192.168.1.1, and is pointed to by the dns.example.com and - mail.example.com aliases. The machine barney.example.com is on IP number 192.168.1.2, and so on. - - - The pz/example.com file is as follows: - - - ______________________________________________________________________ - $TTL 86400 - - @ IN SOA example.com. root.dns.example.com.
( - 1 ; Serial - 8H ; Refresh 8 hours - 2H ; Retry 2 hours - 1W ; Expire 1 week - 1D ; Minimum 1 day - ) - NS dns.example.com. - IN A 192.168.1.1 - IN MX 10 mail.example.com. - IN MX 20 . - - - localhost A 127.0.0.1 - fred A 192.168.1.1 - A 10.1.1.9 - dns CNAME fred - mail CNAME fred - barney A 192.168.1.2 - wilma A 192.168.1.3 - betty A 10.1.1.10 - www CNAME betty - ftp CNAME betty - ______________________________________________________________________ - - - - Note that we create entries for machines both within the private network and on external IPs, since machines within the private network - will not query the ISP's name servers for a request on, say, - betty.example.com. We also provide both IP numbers for fred, the private and external IP numbers. - - - One line in the ``options'' section of /etc/named.conf bears - discussion: - - - listen-on { 192.168.1.1; }; - - - - This will prevent your named daemon from answering DNS requests on the - outside interface (all requests from the outside must go through the - ISP's name resolver, not yours). - - - - 7.1.2. Non-DNS Resolution On Private Network, ISP Handles Domain - - [ Note: if you have chosen not to implement a private network, go to - section ``Fully Exposed Network, Hosted By ISP''. ] - - - In this configuration, you have decided that your private network is - fairly small and unlikely to change often. You have decided not to use - the centralized database of a DNS server, and instead to maintain the - host resolution separately on each machine. All machines should use - the ISP's DNS server for their host name resolution for machines - beyond the private network gateway. For name resolution on the private - network, a hosts table has to be created. For Linux, this means - entering the names and IP numbers of all of the machines on the - private network into the /etc/hosts on each machine.
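For the example network used throughout this section, a minimal /etc/hosts sketch would look like this (names and addresses taken from the zone files above; add one line per private host):

```
127.0.0.1    localhost
192.168.1.1  fred.example.com    fred  dns  mail
192.168.1.2  barney.example.com  barney
192.168.1.3  wilma.example.com   wilma
```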
Any time a new - machine is added, or a name or IP number is changed, this file has to - be updated on each Linux box. - - - As in section ``DNS On Private Network, ISP Handles - Domain'', the list of host names on exposed IP numbers must be sent to - the ISP, and any aliases (such as for www and ftp names) should be - specified so that a CNAME entry can be created by the ISP. - - - - 7.1.3. You Are Primary DNS Authority For Domain - - While you could set up named resolution on the exposed hosts, and - private database resolution for the private network, I will not cover - that case. If you're going to be running named for one service, you - ought really to do it for both, just to simplify the configuration. In - this section I will assume that the private network gateway machine is - handling name resolution both for the private network and for outside - requests. - - - At the time of this writing, under version 8.2.2 of the BIND package, - there is no way for a single named daemon to produce different answers - to requests, depending on which interface the request arrives on. We - want name resolution to act differently if the query comes from the - outside world, because IP numbers on the private network shouldn't be - sent out, but have to be available in answer to requests from within - the private network. There is some discussion of a new ``views'' - keyword which may be added to BIND to fill this need at a later date, - but until that happens, the solution is to run two named daemons with - different configurations. - - - First, set up the private network domain name server as described in - section ``DNS On Private Network, ISP Handles Domain''. - This will be the name resolver visible from within your private - network. - - - Next, you have to set up DNS for your domain, as visible to hosts in - the outside world. First, check with your provider to see if they will - delegate reverse lookups of your IP numbers to you.
While the - original DNS standard didn't account for the possibility of - controlling reverse DNS on subnets smaller than a class C network, a - workaround has been developed which works with all compliant DNS - clients, and has been outlined in RFC 2317. If your provider is willing to - delegate control of reverse DNS on your IP block, you will have to - determine from them the exact name of the in-addr pseudo-domain they - have chosen to delegate to (the RFC does not offer a convention they - recommend for everyday use), and you will have to register control for - that pseudo-domain. I will assume that the provider has delegated - control to you, and the name of the pseudo-domain is 8.1.1.10.in-addr.arpa. The provider would create CNAME entries of the form - - - 8.1.1.10.in-addr.arpa. 2H IN CNAME 8.8.1.1.10.in-addr.arpa. - 9.1.1.10.in-addr.arpa. 2H IN CNAME 9.8.1.1.10.in-addr.arpa. - 10.1.1.10.in-addr.arpa. 2H IN CNAME 10.8.1.1.10.in-addr.arpa. - etc. - - - - in their zone file for the 1.1.10.in-addr.arpa domain. The configuration of your 8.1.1.10.in-addr.arpa zone file is given later in this - section. - - - If your provider is willing to delegate control of the reverse DNS to - you, they will create CNAME entries in their reverse DNS zone table - for those IP numbers you control, pointing to the corresponding - records in your pseudo-domain, as shown above. If they are not willing - to delegate control to you, you will have to ask them to update their - reverse DNS entries any time you add, delete, or change the name of an - externally visible host in your domain. If the reverse DNS table is - not synchronized with your forward DNS entries, certain services may - generate warnings, or refuse to handle requests issued by machines - affected by the mismatch. - - - You now have to create a second named setup, this one to handle - requests issued by machines outside the private network gateway.
This - setup lists only those hosts and IP numbers which are externally - visible, and responds only to requests on the outside interface of the - private network gateway machine. - - - First, create a second configuration file, for instance - /etc/named.ext.conf for requests from the external interface. In our - example, it might be as follows: - - - ______________________________________________________________________ - options { - directory "/var/named"; - listen-on { 10.1.1.9; }; - }; - - zone "." { - type hint; - file "root.hints"; - }; - - zone "0.0.127.in-addr.arpa" { - type master; - file "pz/127.0.0"; - }; - - - zone "8.1.1.10.in-addr.arpa" { - type master; - file "ext/8.1.1.10"; - }; - - zone "example.com" { - type master; - notify no; - file "ext/example.com"; - }; - ______________________________________________________________________ - - The root.hints and pz/127.0.0 files, both under /var/named are shared - with the other running daemon. The file ext/8.1.1.10 is as follows: - - - ______________________________________________________________________ - $TTL 86400 - - @ IN SOA fred.example.com. root.fred.example.com. ( - 1 ; Serial - 10800 ; Refresh 3 hours - 3600 ; Retry 1 hour - 3600000 ; Expire 1000 hours - 86400 ) ; Minimum 24 hours - NS dns.example.com. - 9 IN PTR fred.example.com. - PTR dns.example.com. - PTR mail.example.com. - 10 IN PTR betty.example.com. - PTR www.example.com. - PTR ftp.example.com. - ______________________________________________________________________ - - - - The file ext/example.com contains the following: - - - ______________________________________________________________________ - - $TTL 86400 - - @ IN SOA example.com. root.fred.example.com. ( - 10021 ; Serial - 8H ; Refresh 8 hours - 2H ; Retry 2 hours - 1W ; Expire 1 week - 1D ; Minimum 1 day - ) - NS fred.example.com. - IN A 10.1.1.9 - IN MX 10 mail.example.com. - IN MX 20 . 
- - - localhost A 127.0.0.1 - fred A 10.1.1.9 - betty A 10.1.1.10 - dns CNAME fred - mail CNAME fred - www CNAME betty - ftp CNAME betty - ______________________________________________________________________ - - - - Start the two daemons on the private network gateway machine. Put the - following into your network daemon initialization scripts: - - - /usr/sbin/named -u dnsuser -g dnsgroup /etc/named.conf - /usr/sbin/named -u dnsuser -g dnsgroup /etc/named.ext.conf - - I've assumed here that you have created the unprivileged user - ``dnsuser'', and the corresponding unprivileged group ``dnsgroup''. If a - bug in BIND turns up, which allows an attacker to execute code from - within named, the attacker will find himself restricted to those operations available to the unprivileged user. The /var/named directory - and the files within should not be writable by ``dnsuser''. - - - The machines on the private network must have their name resolution - configured to ask dns.example.com (at IP 192.168.1.1 in our example), - while the externally visible machines can either query the network - gateway's outside interface (at IP 10.1.1.9 in our example), or the - ISP's DNS servers. - - - - 7.1.4. Fully Exposed Network, Hosted By ISP - - In this configuration, you have chosen to expose all of your hosts. - You have a real IP number for each machine in your domain, and you've - given your ISP the list of machine names and IP numbers. The ISP has - given you at least one IP number for their DNS host(s). Your Linux - boxes are now configured for name resolution in /etc/resolv.conf: - - - ______________________________________________________________________ - search example.com - nameserver - nameserver - ______________________________________________________________________ - - - - Windows boxes are configured with the same parameters, in the network - settings dialogues. - - - - 7.1.5.
Preparing DNS Before Moving Your Domain - - If you decide to move your domain to a new IP number, either because - you have to change your ISP or because you've changed some details of - your service which require you to move to a new IP number from the - same ISP, you will have to make a few preparations ahead of the move. - - - You want to set things up so that the IP number fetched by a DNS - lookup somewhere in the outside world points properly to the original - IP number until you move, and then quickly points to the new IP number - after you move. Remote sites may have cached your IP number, and - subsequent queries may be answered locally from the cache, rather than - querying the appropriate servers. The effect of this might be that - people who had visited your site recently are unable to connect, while - new visitors have no problems, because only the new visitors are - getting valid uncached data. Complicating things further is the fact - that the root-level servers are only updated twice a day, so it's - difficult to time a change to the identities of your primary and - secondary DNS servers in the root servers. - - - The easiest way to make the transition is probably to duplicate the - entire site, or at least the publicly visible components of it, on the - new IP number, submit the changes, and then wait for the traffic to - shift completely to the new IP number. This is probably not very - practical, though. - - - What you should do first is to arrange with your new ISP (or your - current ISP if you're just changing IP numbers within a single ISP) to - host primary and secondary DNS during the transition. This should be - done at least a day before the move. Ask them to set the TTL on this - record to something appropriately small (for instance, five minutes). - The sample DNS files given earlier in this section all have TTL values - set to 86400 seconds (1 day).
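The timing rule here is simple arithmetic, but easy to get backwards; a small sketch of it (the dates and the dns_change_deadline helper are hypothetical, not part of any HOWTO tooling):

```python
# Sketch: how far before a planned cutover your DNS changes must be
# complete, given the zone's TTL. Changes made after this deadline may
# still be invisible to sites holding cached records at move time.
from datetime import datetime, timedelta

def dns_change_deadline(move_time, ttl_seconds):
    # Remote caches can hold a record for up to one full TTL, so
    # everything must be in place at least one TTL before the move.
    return move_time - timedelta(seconds=ttl_seconds)

# With the 1-day (86400 s) TTL used in the sample zone files:
move = datetime(2005, 3, 1, 9, 0)
print(dns_change_deadline(move, 86400))   # 2005-02-28 09:00:00
```

This is also why the text suggests dropping the TTL to a few minutes before the move: a 300-second TTL pulls the deadline to within five minutes of the cutover.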
If your TTL is longer than this, you - will have to arrange the change that much more in advance of the move. - Ultimately, here's what you have to achieve. If your current domain - information TTL is, say, N hours, then the following have to be - finished more than N hours before the move: - - · Your domain registration entry must show primary and secondary DNS - on the new ISP's machines in the root database. Allow at least a - day between the time you submit the change and the time the change - enters the database. - - · The new primary and secondary DNS servers should point to the original IP - numbers of your site, with a fairly small TTL. - - Note that you cannot accelerate this process by reducing your - current domain TTL value, unless you've also done this at least N - hours before the move. - - - Now, you're ready for the move. Move your machines over to the new IP - numbers. Synchronize this with an update of the DNS records on your - ISP to point to the new numbers. Within five minutes (the small TTL - you set for the move), the traffic should have switched over to the - new site. You can now rearrange the DNS authority to your liking, - making yourself primary if that's how you want it, and putting the TTL - back up to a reasonably large value. - - - - 7.2. DNS Configuration If You Are Not Hosting Email - - The configurations described in section ``Setting Up Name Resolution'' - have MX records pointing to a machine ``mail.example.com''. The MX - record with the lowest priority number tells remote sites - where to send email. Other MX records with higher priority numbers are - used as backup email receivers. These backups will hold the mail for a - certain period of time if the primary email receiver is not able to - accept the messages for some reason. In the examples in that section, - I have assumed that fred.example.com, under its alias of - mail.example.com, is handling email for the domain.
If you have chosen - to let the ISP handle all of your email hosting, you should change - those MX records to point to the appropriate ISP machines. Ask your - ISP technical support representative what host names you should use - for the MX records in the various files. - - - - 7.3. Setting up Electronic Mail - - If you have chosen to do full electronic mail hosting for your domain, - you'll have to take special actions for email coming from hosts on the - private network, and for allowing transparent mail reading from - anywhere within the private network. Unless you're careful, messages - are likely to sit around for a long time if they are waiting on one - host while the intended recipient is logged in on another machine. For - security reasons, I recommend that the incoming email not be - accessible from the externally visible hosts (this might help to - discourage a PHB who wants his desktop machine to be on a real IP, - then wonders why he gets brought down by a ping of death twice a day). - A transparent email sharing system on the private network is fairly - straightforward to set up with sendmail. If anybody wants to provide tested - solutions for other mail handling daemons, I welcome additions. - - - 7.3.1. A Solution Using "sendmail" - - In order that email delivered to one host be visible on all machines, - the simplest solution is to export the mail spool directory with read-write privileges over the entire private network. The private network - gateway machine will also act as mail collector and forwarder for the - entire private network, and so must have root write privileges to the - mail spool drive. The other clients may or may not squash root, at - your discretion. My general security philosophy is not to grant - privileges unless there is a clear reason for it, so I squash root on - the mail spool network drive for all hosts except the private network - gateway machine.
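As a concrete sketch, the export just described might look like this on the spool host (an /etc/exports fragment for the Linux kernel NFS server; the spool path and the use of fred as the gateway follow the running example, and the exact option spellings should be checked against your NFS server's exports(5) documentation):

```
# /etc/exports on the mail spool host.
# fred (the private network gateway) keeps root write access;
# all other private-network hosts have root squashed (the default).
/var/spool/mail  fred.example.com(rw,no_root_squash)  192.168.1.0/255.255.255.0(rw,root_squash)
```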
This has the effect that root can only read his mail - from that machine, but this is not a particularly serious handicap. - Note that the mail spool drive can be a directory on the private - network gateway machine, exported via NFS, or it can be a directory on - one of the internal servers, exported to the entire private network. - If the mail spool drive is resident on the private network gateway, - there is no issue of squashing root for that machine. If it is on - another server, then note that email will be undeliverable if that - server, the gateway machine, or the network connecting them, is down. - - - For Windows machines on your private network, you may either set up a - POP server on the mail spool host, or use Samba to export the mail - spool to those machines. The Windows machines should be configured to - send and retrieve mail under a Linux username, such as - joeuser@example.com, so that the email address host name is the bare - domain name, not a machine name like barney.example.com. The outgoing - SMTP host should be set to the private network gateway machine, which - will be responsible for forwarding the mail and doing any address - rewriting. - - - Next, you should configure sendmail to forward email from the machines - on the private network, rewriting the addresses if necessary. Obtain - the latest sources to sendmail from the sendmail.org WWW site. - Compile the binaries, then go to the - cf/domain subdirectory within the sendmail source tree, and create the - following new file: example.com.m4 - - - - ______________________________________________________________________ - divert(-1) - # - # Copyright (c) 1998 Sendmail, Inc. All rights reserved. - # Copyright (c) 1983 Eric P. Allman. All rights reserved. - # Copyright (c) 1988, 1993 - # The Regents of the University of California. All rights reserved.
- # - # By using this file, you agree to the terms and conditions set - # forth in the LICENSE file which can be found at the top level of - # the sendmail distribution. - # - # - - # - # The following is a generic domain file. You should be able to - # use it anywhere. If you want to customize it, copy it to a file - # named with your domain and make the edits; then, copy the appropriate - # .mc files and change `DOMAIN(generic)' to reference your updated domain - # files. - # - divert(0) - define(`confFORWARD_PATH', `$z/.forward.$w+$h:$z/.forward+$h:$z/.forward.$w:$z/.forward')dnl - FEATURE(redirect)dnl - MASQUERADE_AS(example.com)dnl - FEATURE(masquerade_envelope)dnl - ______________________________________________________________________ - - - - This defines the domain ``example.com''. Next, you have to create the - sendmail.cf files which will be used on the mail host (the private - network gateway), and on the other Linux nodes on the private network. - - - Create the following file in the sendmail source tree, under cf/cf: - example.master.m4 - - - - ______________________________________________________________________ - divert(-1) - # - # Copyright (c) 1998 Sendmail, Inc. All rights reserved. - # Copyright (c) 1983 Eric P. Allman. All rights reserved. - # Copyright (c) 1988, 1993 - # The Regents of the University of California. All rights reserved. - # - # By using this file, you agree to the terms and conditions set - # forth in the LICENSE file which can be found at the top level of - # the sendmail distribution. - # - # - - # - # This is the prototype file for a configuration that supports nothing - # but basic SMTP connections via TCP. - # - # You MUST change the `OSTYPE' macro to specify the operating system - # on which this will run; this will set the location of various - # support files for your operating system environment. You MAY - # create a domain file in ../domain and reference it by adding a - # `DOMAIN' macro after the `OSTYPE' macro. 
I recommend that you - # first copy this to another file name so that new sendmail releases - # will not trash your changes. - # - - divert(0)dnl - OSTYPE(linux)dnl - DOMAIN(example.com)dnl - FEATURE(nouucp) - FEATURE(relay_entire_domain) - FEATURE(`virtusertable', `hash /etc/sendmail/virtusertable')dnl - FEATURE(`genericstable', `hash /etc/sendmail/genericstable')dnl - define(`confPRIVACY_FLAGS', ``noexpn,novrfy'')dnl - MAILER(local) - MAILER(smtp) - Cw fred.example.com - Cw example.com - ______________________________________________________________________ - - - - In this example we have disabled the ``expn'' and ``vrfy'' commands. - An attacker could troll for aliases with ``expn'', trying names like - ``staff'', ``allstaff'', ``office'', and so on, until he hits an alias - which expands out several usernames for him. He can then try the usernames against certain weak passwords in hopes of getting in (assuming - he can get a login prompt - the security settings described in section - ``Securing Your Domain'' are set up so that no login prompt is available for off-site attackers). - - - The other file you should create will define the sendmail.cf for the - slave machines: example.slave.m4 - - - - ______________________________________________________________________ - divert(-1) - # - # Copyright (c) 1998 Sendmail, Inc. All rights reserved. - # Copyright (c) 1983 Eric P. Allman. All rights reserved. - # Copyright (c) 1988, 1993 - # The Regents of the University of California. All rights reserved. - # - # By using this file, you agree to the terms and conditions set - # forth in the LICENSE file which can be found at the top level of - # the sendmail distribution. - # - # - - # - # This is the prototype for a "null client" -- that is, a client that - # does nothing except forward all mail to a mail hub. IT IS NOT - # USABLE AS IS!!! - # - # To use this, you MUST use the nullclient feature with the name of - # the mail hub as its argument.
You MUST also define an `OSTYPE' to - # define the location of the queue directories and the like. - # In addition, you MAY select the nocanonify feature. This causes - # addresses to be sent unqualified via the SMTP connection; normally - # they are qualified with the masquerade name, which defaults to the - # name of the hub machine. - # Other than these, it should never contain any other lines. - # - - divert(0)dnl - - OSTYPE(linux) - FEATURE(nullclient, fred.$m) - Cm example.com - ______________________________________________________________________ - - - - You build the appropriate sendmail.cf files with the command: - - - make example.master.cf example.slave.cf - - - - and then copy the files to the appropriate machines under the name - sendmail.cf. - - - This configuration puts most of the sendmail configuration files under - the /etc/sendmail/ subdirectory, and causes sendmail to - parse and use two special files, virtusertable.db and - genericstable.db. To use these special files, create their parent - files. First, virtusertable.src: - - - ______________________________________________________________________ - John.Public@example.com jpublic - Jane.Doe@example.com jdoe@somemachine.somedomain - abuse@example.com root - Pointyhaired.Boss@example.com #phb#@hotmail.com - ______________________________________________________________________ - - This maps the email addresses on incoming email to new destinations. - Mail sent to John.Public@example.com is delivered locally to the Linux - account jpublic. Mail to Jane.Doe@example.com is redirected to another - email account, possibly in a different domain. Mail to abuse@example.com is sent to root, and so on.
The other file is genericstable.src:
 -
 -
 - ______________________________________________________________________
 - jpublic John.Public@example.com
 - janedoe Jane.Doe@example.com
 - whgiii Pointyhaired.Boss@example.com
 - ______________________________________________________________________
 -
 -
 -
 - This file renames the sender on outgoing email from locally-sourced
 - mail. While it clearly can't affect the return address for mail sent
 - directly from jdoe@somemachine.somedomain, it allows you to rewrite
 - the sender's email address from the internal usernames to whatever
 - email address convention you've chosen. Finally, create the following
 - Makefile in /etc/sendmail/:
 -
 -
 - ______________________________________________________________________
 - all : genericstable.db virtusertable.db
 -
 - virtusertable.db : virtusertable.src
 - 	makemap hash virtusertable < virtusertable.src
 -
 - genericstable.db : genericstable.src
 - 	makemap hash genericstable < genericstable.src
 - ______________________________________________________________________
 -
 -
 -
 - Run make to create the hashed files which sendmail can use, and
 - remember to re-run make and restart sendmail (or send it a SIGHUP)
 - after any changes to either of these ``.src'' files.
 -
 -
 -
 - 7.3.2. Solutions Using Other Mail Transfer Agents
 -
 - My experience is only with sendmail. If anybody would like to write
 - this section, please contact me. Otherwise, I may, at some later time,
 - try to provide details myself on such MTAs as Postfix, Exim, or smail.
 - I'd really rather somebody wrote these sections who uses those
 - programs.
 -
 -
 -
 - 7.4. Setting up Web Space Hosting
 -
 - You should set up your externally visible web server on a machine
 - outside the private network, and not on the private network gateway
 - machine, for security reasons.
If the web server needs access to
 - databases or other resources stored on the private network, the
 - situation becomes more complicated, both from a network and a security
 - standpoint. Such configurations are beyond the scope of this document.
 -
 -
 - The details of setting up the server itself can be found in the Apache
 - documentation, and in the Linux WWW HOWTO
 - document.
 -
 -
 -
 - 7.5. Setting up FTP Hosting
 -
 - Once again, your FTP host should be an externally visible machine, and
 - not the private network gateway machine. Follow the setup directions
 - which ship with your FTP daemon package. Be sure to download the most
 - recent version of the daemon, as there are security vulnerabilities in
 - some older versions of many daemons. If your FTP site does not require
 - anonymous users to upload files, be sure to disable that feature in
 - the daemon. I recommend that user (non-anonymous) FTP logins not be
 - permitted on the FTP host, and that you require your regular users to
 - use scp, the secure shell remote copy command, for any file updating
 - they may have to do on the FTP host. This is to help build secure
 - habits in the users, and to protect against the ``hostile router''
 - problem described in section ``Securing Your Domain''.
 -
 -
 -
 - 7.6. Setting up Packet Filtering
 -
 - This is discussed in detail in section ``Configuring Your Firewall''.
 -
 -
 -
 - 8. Securing Your Domain
 -
 - This section deals with setting up security for your new domain. The
 - emphasis is on user-transparent features. If your security is too
 - obtrusive, and interferes strongly with the actions of the users, the
 - users will develop their own workarounds which may compromise the
 - entire domain. The best way to avoid this is to make the security as
 - transparent as possible, and to encourage users to come to you first
 - when they have difficulties which might be related to the security
 - measures of the site. A certain flexibility in attitude is important.
 -
 - I know from personal experience that if the security policy is too
 - rigid, the users will simply set up their own network tunnels through
 - the firewall so they can log in from outside the domain. It's better
 - that remote login procedures, or whatever the users are trying to do,
 - be set up, inspected, and approved by you.
 -
 -
 - This section deals with securing your network against outside attack,
 - and against casual snooping from within. Securing your site against
 - determined attack from validated users within the private network is a
 - more difficult and involved task, and is beyond the scope of this
 - document.
 -
 -
 - One of the security considerations used in this section is protecting
 - against the ``hostile router''. The router provided by your ISP may be
 - a remotely configurable computer in its own right, with the
 - administrative password held by your provider. There have been
 - security problems in the past when the router's manufacturer override
 - password (the one used when your ISP forgets the password they
 - programmed in) has become known to system crackers. When possible, you
 - should design your security around the assumption that the router is
 - potentially hostile. That is, it could be using any IP number in your
 - public or private network blocks, it could be redirecting traffic on
 - outgoing packets to another site, and it could be recording anything
 - which goes through.
 -
 - 8.1. Configuring Your Firewall
 -
 - This section deals with configuring an ipchains-based masquerading,
 - forwarding, filtering router. You should probably read the
 - IPCHAINS-HOWTO document first, then look here for additional hints.
 - That HOWTO describes the steps necessary to compile a kernel with
 - masquerading support, and describes the use of the ipchains binary in
 - detail. You should enable firewalling on all machines with exposed IP
 - numbers.
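The pattern used throughout this section (create a chain, flush it so the script can be re-run, append rules, then bind an interface to the chain) can be sketched in a dry-run form. This is not part of the original scripts: the echo prefix prints each ipchains command instead of executing it, since applying real rules requires root and a 2.2-era kernel, and the addresses are the example ones used in this document.

```shell
#!/bin/sh
# Dry-run sketch of the chain-building pattern described in this section.
# Drop the "echo " prefix to apply the rules for real (root required).
emit_rules() {
    IPCHAINS="echo /sbin/ipchains"
    LOCALNET="192.168.1.0/24"   # the private network, as above
    OUTSIDEIF=eth1              # the externally visible interface

    $IPCHAINS -N outside                        # create the chain for external packets
    $IPCHAINS -F outside                        # flush it, so re-running is safe
    $IPCHAINS -A outside -s $LOCALNET -j DENY   # outsiders may not claim our addresses
    $IPCHAINS -A input -i $OUTSIDEIF -j outside # send external packets to the chain
}

emit_rules
```

Running the sketch simply prints the four commands, which makes it easy to review the generated rule set before committing it to a live firewall.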
- - - Check your startup scripts to make sure that the sequence is as - follows on the private network gateway machine: - - 1. Outside Ethernet card is initialized. - - 2. Firewall rules are run through ipchains. - - 3. Forwarding is turned on. - - 4. Network service daemons are started. - - So, as an example, on a Slackware-based system, the firewall - configuration should come between the execution of rc.inet1 and - rc.inet2. Further, if any problems arise during the firewall - configuration steps, a warning should be printed, and the external - Ethernet card taken off line before the network service daemons are - run. - - - One common problem with ipchains-based firewalls is the tedium of - making sure that your rules are correctly set for packets arriving - from the loopback interface, or arriving from either of the internal - or external interfaces on the firewall machine. These locally-sourced - packets can be blocked by a firewall. All too often, this is fixed by - a sort of shotgun debugging approach, whereby the rules for the - firewall are tweaked until all applications seem to run properly on - the firewall host again. Unfortunately, this can sometimes result in a - firewall which has unintended holes. With ipchains it is possible to - write a firewall script which is easily debugged, and which avoids - many of the packet source problems. Here is a sample script, - /sbin/firewall.sh: - - - - ______________________________________________________________________ - #! /bin/sh - # - # New firewalling script using IP chains. Creates a filtering router - # with network masquerading. 
 - #
 -
 - # define a few variables
 -
 - IPCHAINS=/sbin/ipchains
 -
 - LOCALNET="192.168.1.0/24" # the private network
 - ETHINSIDE="192.168.1.1" # fred.example.com's private IP #
 - ETHOUTSIDE="10.1.1.9" # fred.example.com's public IP #
 - LOOPBACK="127.0.0.1/8"
 - ANYWHERE="0/0"
 - OUTSIDEIF=eth1 # fred.example.com's external interface
 -
 - FORWARD_PROCENTRY=/proc/sys/net/ipv4/ip_forward
 -
 - #
 - # These two commands will return error codes if the rules
 - # already exist (which happens if you run the firewall
 - # script more than once). We put the commands before "set -e"
 - # so that the script doesn't abort in that case.
 -
 - $IPCHAINS -N outside
 - $IPCHAINS -N portmap
 -
 - set -e # Abort immediately on error setting
 - # up the rules.
 -
 -
 - #
 - # Turn off forwarding and clear the tables
 -
 - echo "0" > ${FORWARD_PROCENTRY}
 -
 - $IPCHAINS -F forward
 - $IPCHAINS -F input
 - $IPCHAINS -F output
 - $IPCHAINS -F outside
 - $IPCHAINS -F portmap
 -
 -
 - #
 - # Masquerade packets from within our local network destined for the
 - # outside world. Don't masquerade packets which are local to local
 -
 - $IPCHAINS -A forward -s $LOCALNET -d $LOCALNET -j ACCEPT
 - $IPCHAINS -A forward -s $ETHOUTSIDE -d $ANYWHERE -j ACCEPT
 - $IPCHAINS -A forward -s $LOCALNET -d $ANYWHERE -j MASQ
 -
 - #
 - # Set the priority flags. Minimum delay connections for www, telnet,
 - # ftp, and ssh (outgoing packets only).
 -
 - $IPCHAINS -A output -p tcp -d $ANYWHERE www -t 0x01 0x10
 - $IPCHAINS -A output -p tcp -d $ANYWHERE telnet -t 0x01 0x10
 - $IPCHAINS -A output -p tcp -d $ANYWHERE ftp -t 0x01 0x10
 - $IPCHAINS -A output -p tcp -d $ANYWHERE ssh -t 0x01 0x10
 -
 -
 - #
 - # Anything from our local class C is to be accepted, as are
 - # packets from the loopback and fred's external IP.
- $IPCHAINS -A input -s $LOCALNET -j ACCEPT - $IPCHAINS -A input -s $LOOPBACK -j ACCEPT - $IPCHAINS -A input -s $ETHOUTSIDE -j ACCEPT - - - - # We'll create a set of rules for packets coming from the big, bad - # outside world, and then bind all external interfaces to it. This - # rule will be called "outside" - # - # We also create a "portmap" chain. The sockets used by daemons - # registered with the RPC portmapper are not fixed, and so it is - # a bit difficult to set up filter rules for them. The portmap - # chain is configured in a separate script. - - - # - # Send packets from any outside interface to the "outside" - # rules chain. This includes the $OUTSIDEIF interface and any - # ppp interfaces we create for dialout (or dialin). - - $IPCHAINS -A input -i ${OUTSIDEIF} -j outside - $IPCHAINS -A input -i ppp+ -j outside - - - ################################################## - # - # Set up the "outside" rules chain # - # - ################################################## - - # - # Nobody from the outside should claim to be coming from our localnet - # or loopback - - $IPCHAINS -A outside -s $LOCALNET -j DENY - $IPCHAINS -A outside -s $LOOPBACK -j DENY - - # - # No packets routed to our local net should come in from outside - # because the outside isn't supposed to know about our private - # IP numbers. - - $IPCHAINS -A outside -d $LOCALNET -j DENY - - # - # Block incoming connections on the X port. Block 6000 to 6010. 
- - $IPCHAINS -l -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 6000:6010 -j DENY - - # - # Block NFS ports 111 and 2049 - - $IPCHAINS -l -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 111 -j DENY - $IPCHAINS -l -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 2049 -j DENY - $IPCHAINS -l -A outside -p UDP -s $ANYWHERE -d $ANYWHERE 111 -j DENY - $IPCHAINS -l -A outside -p UDP -s $ANYWHERE -d $ANYWHERE 2049 -j DENY - - # - # Block XDM packets from outside, port 177 UDP - - $IPCHAINS -l -A outside -p UDP -s $ANYWHERE -d $ANYWHERE 177 -j DENY - - - # - # Block the YP/NIS port 653 - $IPCHAINS -l -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 653 -j DENY - - # - # Don't bother logging accesses on TCP port 80, the www port. - - $IPCHAINS -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 80 -j DENY - - # - # Accept FTP data and control connections. - - $IPCHAINS -A outside -p TCP -s $ANYWHERE 20:21 -d $ANYWHERE 1024: -j ACCEPT - - # - # Accept ssh packets - - $IPCHAINS -A outside -p TCP -s $ANYWHERE -d $ANYWHERE ssh -j ACCEPT - - # - # Accept DNS packets from outside - - $IPCHAINS -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 53 -j ACCEPT - $IPCHAINS -A outside -p UDP -s $ANYWHERE -d $ANYWHERE 53 -j ACCEPT - - # - # Accept SMTP from the world - - $IPCHAINS -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 25 -j ACCEPT - - # - # Accept NTP packets - - $IPCHAINS -A outside -p UDP -s $ANYWHERE -d $ANYWHERE 123 -j ACCEPT - - # - # Accept no tap ident packets, we don't use them - - $IPCHAINS -A outside -p TCP -s $ANYWHERE -d $ANYWHERE 113 -j DENY - - # - # Turn off and log all other packets incoming, TCP or UDP, on privileged ports - - $IPCHAINS -l -A outside -p TCP -s $ANYWHERE -d $ANYWHERE :1023 -y -j DENY - $IPCHAINS -l -A outside -p UDP -s $ANYWHERE -d $ANYWHERE :1023 -j DENY - - # - # Check against the portmapper ruleset - - $IPCHAINS -A outside -j portmap - - - ############################################## - # - # End of "outside" rules chain # - # - ############################################## - 
 -
 -
 - #
 - # Block outgoing rwho packets
 -
 - $IPCHAINS -A output -p UDP -i $OUTSIDEIF -s $ANYWHERE 513 -d $ANYWHERE -j DENY
 -
 - #
 - # Prevent netbios packets from leaving
 -
 - $IPCHAINS -A output -p UDP -i $OUTSIDEIF -s $ANYWHERE 137 -d $ANYWHERE -j DENY
 - #
 - # Turn on forwarding
 -
 - echo "1" > ${FORWARD_PROCENTRY}
 - ______________________________________________________________________
 -
 -
 -
 - Notice that the firewall can be used not only to block incoming
 - packets, but also outgoing packets which might leak information about
 - your private network, such as rwho and netbios packets.
 -
 -
 - As noted earlier, the portmapper rules are a bit different, because
 - the portmap daemons register themselves with the portmapper and are
 - told which ports to listen on. The ports used by a particular daemon
 - may change as you change the RPC services used, or change their order
 - of startup. The following script, /sbin/firewall.portmap.sh, generates
 - rule sets for the portmapped daemons:
 -
 -
 - ______________________________________________________________________
 - #! /bin/sh
 - #
 - ANYWHERE=0/0
 -
 - IPCHAINS=/sbin/ipchains
 -
 - $IPCHAINS -F portmap
 -
 - # Rules for preventing access to portmapped services by people on the outside
 - #
 - /usr/bin/rpcinfo -p | tail -n +2 | \
 - { while read program vers proto port remainder
 - do
 - prot=`echo $proto | tr "a-z" "A-Z"`
 - $IPCHAINS -l -A portmap -p $prot -s $ANYWHERE -d $ANYWHERE $port -j DENY || exit 1
 - done
 - }
 - ______________________________________________________________________
 -
 -
 -
 - We didn't have to worry about whether packets coming in were
 - legitimate packets from the private network; the portmap chain is only
 - checked when the packets come in from the outside.
 -
 -
 - This firewall configuration logs most suspicious packets through klogd
 - with the kern.info logging priority. It will log normal connection
 - attempts, as well as all known ``stealth'' probes.
 -
 -
 - Now, we put these all together.
We'd like to make sure that there - isn't a small window of vulnerability while the system is starting up, - so you should configure your startup sequence as follows: - - - - ______________________________________________________________________ - #! /bin/sh - # - # Get the network started, securely - # - # - /etc/rc.d/rc.inet1 # Configure the network interfaces - # and set up routing. - /sbin/firewall.sh || { echo "Firewall configuration failed" - /sbin/ifconfig eth1 down } - - /sbin/ipchains -I outside 1 -j DENY # Deny all incoming packets - - /etc/rc.d/rc.inet2 # Start the network daemons - - sleep 5 # Let them stabilize - - # Secure the portmapped services - /sbin/firewall.portmap.sh || { echo "Portmap firewall configuration failed" - /sbin/ifconfig eth1 down } - - /sbin/ipchains -D outside 1 # Allow incoming packets - ______________________________________________________________________ - - - - This assumes that eth1 is the interface on the externally visible IP - number. If any of the ipchains rule sets fail to install, a warning is - issued and that interface is taken off line. The ``outside'' chain is - set to deny all packets before the network service daemons are - started, because the firewalling rules are not yet in place for the - portmapped services. Once the portmapped services are firewalled, the - ``outside'' chain is restored to its proper behaviour. - - - - 8.2. Configuring OpenSSH or SSH1 - - At the time of this writing, OpenSSH, like SSH1, now offers a - configuration setting which allows you to insert scp, ssh, and slogin - as binaries named rcp, rsh, and rlogin, with transparent fall-through - in the ssh client programs to the original rsh, rcp, or rlogin when - the remote site isn't running sshd. Making an invocation of rsh run, - instead, the ssh client program is, in my opinion, important for - keeping the security easy to use and out of the way of the users. 
 - Everybody's scripts, rdist configurations, and so on will continue to
 - work without modification if the remote site is running sshd, but data
 - will be sent encrypted, with strong host authentication. The converse
 - will not always be true. Specifically, if the remote machine is not
 - running sshd, the rsh program will echo a diagnostic to the screen
 - warning that the connection is unencrypted. This message breaks rdist,
 - and possibly other programs. The message cannot be suppressed with
 - command line or compile time switches. For rdist, one solution is to
 - invoke the program with -p /usr/lib/rsh/rsh.
 -
 -
 - Obtain ssh1 from the ssh web site , or OpenSSH
 - from the OpenSSH web site , and compile it to
 - replace the unencrypted r-programs (rsh, rlogin, and rcp). First, copy
 - those three files to /usr/lib/rsh/, then configure the ssh package
 - with:
 -
 -
 - ./configure --with-rsh=/usr/lib/rsh/rsh --program-transform-name='s/^s/r/' --prefix=/usr
 -
 - Install the binaries, and configure according to the directions. On
 - the private network gateway machine, make sure that the sshd
 - configuration has the following entries defined:
 -
 -
 - ListenAddress 192.168.1.1 # fred's internal IP
 - IgnoreRhosts no
 - X11Forwarding yes
 - X11DisplayOffset 10
 - RhostsAuthentication no
 - RhostsRSAAuthentication yes
 - RSAAuthentication yes
 - PasswordAuthentication yes
 -
 -
 -
 - You will have to do further configuration of other entries in the
 - /etc/sshd_config file, but try not to change these fields. Once you
 - have all of the entries in the file set to your satisfaction, copy
 - this entire file into a new file, /etc/sshd_config.ext, for the
 - external network. Change two fields in the new file: the
 - ``ListenAddress'' should be changed to the private network gateway's
 - external IP number (10.1.1.9 in our fred.example.com case), and
 - ``PasswordAuthentication'' should be set to ``no'' in
 - /etc/sshd_config.ext.
In your network services startup script, start sshd
 - twice, once with
 -
 -
 - /usr/sbin/sshd
 -
 -
 -
 - and once with
 -
 -
 - /usr/sbin/sshd -f /etc/sshd_config.ext
 -
 -
 -
 - This will create two running sshd daemons. The one operating on the
 - internal interface will allow logins with passwords, but the external
 - interface will require an RSA key validation before anybody can log
 - on.
 -
 -
 - Next, turn off incoming telnet and shell services in the inetd
 - configuration file (note that the firewall configuration listed in
 - section ``Configuring Your Firewall'' already prevents access from
 - outside, but it's best to defend in depth, don't rely on everything
 - working correctly).
 -
 -
 - People who want to be able to log in from home, or from out of town,
 - will need an RSA key. Make sure they know how to do this, so they
 - don't spend their energies trying to figure out another way to do it,
 - like running a telnetd on an unprivileged port on your firewall
 - machine.
 -
 -
 - An RSA key is generated by the command:
 -
 -
 - ssh-keygen -b 1024 -f new_rsa_key
 -
 - You will be prompted for a pass phrase. This should not be blank. A
 - person with access to the file new_rsa_key, and knowledge of the pass
 - phrase, has everything necessary to pass an RSA authentication
 - challenge. The pass phrase can be an ``unguessable'' password, or a
 - long sentence, but make it something non-trivial. The file new_rsa_key
 - can be copied to a floppy disk, or onto a laptop, and, along with the
 - pass phrase, can be used to log into accounts which are set to grant
 - access to that particular RSA key.
 -
 -
 - To configure an account to allow access by a particular RSA key,
 - simply create a $HOME/.ssh/ directory for that user on the private
 - network gateway machine (i.e. the machine which will be receiving the
 - login attempt), and copy the file new_rsa_key.pub which was created by
 - the "ssh-keygen" command into the file $HOME/.ssh/authorized_keys.
See - the section ``AUTHORIZED_KEYS FILE FORMAT'' in the sshd man page for - details on other options you can add to the key, such as requiring the - login to come from a certain IP or host name, or authorizing the key - only to permit the remote invocation of certain commands (for - instance, an RSA key which commands a backup to take place, or - commands a status report to be emailed somewhere off site). - - - Only one thing remains to make the RSA key mechanism as gentle as - possible to the users. If a user is forced to enter the pass phrase - more than once or twice in a session, they are likely to become bored - and take security matters into their own hands. Under Linux, arrange - their login shell to be invoked under ssh-agent. For instance, if the - company laptop used on business trips runs xdm, and drops users into - an X session, go into the /var/X11R6/lib/xdm/Xsession_0 file and - change the lines which invoke the startup, which are probably of the - form: - - - exec "$startup" - - - - into lines of the form: - - - exec ssh-agent "$startup" - - - - In my xdm setup, there are three such lines which should be altered in - that one file. Now, when the user logs onto the laptop, he enters the - command - - - ssh-add new_rsa_key - - - - at any prompt, enters the pass phrase when prompted, and all windows - will have pass phrase-free access to the account on the private net­ - work gateway until the user logs off his X session on the laptop. - - - Run sshd on all of the machines on your private network, as well as on - any exposed hosts. For machines other than the private network gateway - machine, the ListenAddress entry in /etc/sshd_config can be set to - ``0.0.0.0''. You should set up the host keys with the command: - ssh-keygen -b 1024 -f /etc/ssh_host_key -N "" - - - - then run make-ssh-known-hosts and distribute the /etc/ssh_known_hosts - file among all of the machines on the private and public networks. 
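Once /etc/ssh_known_hosts has been distributed, it is worth checking that every machine in the domain actually has a host key recorded. A small helper along these lines can list the host names covered by the file; this is a sketch, not part of the original HOWTO, and the hostnames in the usage note are the document's examples.

```shell
#!/bin/sh
# List the host names that have a key recorded in a known_hosts-style file.
# SSH1-era lines look like: "name1,name2 bits exponent modulus"; the first
# field may carry several comma-separated names for the same machine.
list_known_hosts() {
    awk '!/^#/ && NF { split($1, names, ","); for (i in names) print names[i] }' "$1" \
        | sort -u
}
```

For example, `list_known_hosts /etc/ssh_known_hosts` on fred.example.com should print every private-network host (and any aliases or IP numbers recorded for them), which can then be compared against the machine list kept for the domain.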
 -
 -
 - Disable incoming telnet and the unencrypted r-services. Don't delete
 - the telnet binary; it's useful for things other than simple telnet
 - sessions on port 23. You should allow password authentication on the
 - private network, and disable it on the exposed machines, requiring an
 - RSA key to log onto the exposed hosts.
 -
 -
 - It is convenient for the users if the hosts on the private network are
 - mentioned in each other's /etc/hosts.equiv files. The sshd daemons
 - will respect those, and allow people to rlogin and rsh between
 - machines without passwords or pass phrases. On every connection, the
 - machines will be verifying each other's identities with host-level RSA
 - keys.
 -
 -
 - One difficulty arises when a user logged onto a machine on the private
 - network wants to log onto a box on an exposed IP number. You can't use
 - /etc/hosts.equiv or $HOME/.shosts to allow password-less validation,
 - because the user is coming from a machine whose IP number cannot be
 - determined - it will appear to be coming from the masquerading
 - firewall machine, but the host keys won't match. There are two
 - solutions to this. First, if you insist on using the /etc/hosts.equiv
 - or $HOME/.shosts methods, the user will have to log onto the private
 - network gateway machine (fred.example.com in our example here), and
 - then log through to the exposed machine from there. The other
 - technique is to use RSA key authentication, which always works
 - regardless of what games are going on with IP numbers and host name
 - lookups.
 -
 -
 -
 - 8.3. Configuring X
 -
 - In the user's continuing quest to prove that he values convenience
 - over security, it has become common for people to put
 -
 -
 - xhost +
 -
 -
 -
 - commands right into their X initialization scripts. This grants X
 - server access to everybody in the world. Now the random outsider can
 - change your root window graphic to something embarrassing while your
 - boss is showing his mother around your office.
Alternately, this
 - outsider can quietly monitor every keystroke you issue, and dump the
 - contents of your screen to his desktop. Needless to say, this doesn't
 - bode well for passwords used to log into other sites, or for sensitive
 - documents being edited on screen. The xhost protocol itself is
 - inherently limited, as it is not possible to grant permissions to use
 - the screen on a user basis, only on a machine basis.
 -
 -
 - Enter xauth authentication. If you have xdm you probably already are
 - running xauth authentication, but xhost still works, and might still
 - be what people are using to run X processes between machines. Once
 - again, the goal is to make the security easy enough to use that the
 - users aren't tempted to run the xhost command anymore.
 -
 -
 - The sshd setup described in section ``Configuring OpenSSH or SSH1'',
 - with the ``X11Forwarding'' flag set, is actually simpler to use than
 - the xhost technique. Once you have logged into your terminal, you can
 - simply rlogin to a remote machine, and run netscape, xv, or whatever
 - you like, without having to set the $DISPLAY variable name or allow
 - explicit permissions. During ssh login, it configures the system in a
 - way transparent to the end user, and even encrypts all of your X
 - packets before they go over the network.
 -
 -
 - If you are unable to use the sshd X11 forwarding for some reason, you
 - should use xauth when you want to authorize other machines to have
 - access to your X server. Document this for the users, or create
 - specialized shell scripts to help them out. The relevant command to
 - authorize a particular login, ``jpublic'', on machine ``barney'', to
 - have access to your X server is:
 -
 -
 -
 - /usr/X11/bin/xauth extract - $DISPLAY | rsh -l jpublic barney /usr/X11/bin/xauth merge -
 -
 -
 -
 - This sequence is not necessary to authorize X connections from
 - machines which share a common NFS-mounted home directory.
The xauth - key will be immediately available to that user on all machines which - mount the same home directory. - - - I'd be tempted to delete xhost from your machines entirely. If it - causes problems with any programs, you will at least know that those - programs had poorly-designed security. It's simple enough to build a - shell script as a drop-in replacement for xhost which uses the xauth - sequence listed above. - - - Note that if rsh is not the encrypting ssh program, the xauth key is - sent plaintext. Anybody who holds the plaintext of the key can access - your server, so you do not gain much security if you don't use ssh for - these transactions. Note, also, that if the users' home directories - are exported via NFS (the Network File System), the xauth key is - available in plaintext to anybody able to snoop those NFS packets, - regardless of whether you're running ssh on your systems. - - - - 8.4. Configuring Disk Sharing - - With email coming to a central machine, the read/send from any host - setup described here is very convenient, but some care has to be taken - to protect against trivial snooping by bored local users. NFS without - AUTH_DES implemented is inherently insecure. NFS relies on the client - machine to authenticate access, there is no password verification on - the server to make sure that the client should be permitted to access - the private files of a particular user. A Windows box can be - configured to read NFS-exported volumes as any numeric uid, completely - bypassing UNIX file permissions. Consequently, NFS exports should - only be made to machines which are always Linux (or UNIX) boxes under - your direct control, and never ones which can be dual-booted into - Windows. If you want to export the mail spool directory, or any other - directory, to machines which can sometimes be used as Windows boxes, - export them with samba, setting the authentication mode to - ``security=USER''. 
Connecting the
 - machines on your network with a switch rather than a hub will also
 - help, as it leaves very little of interest for sniffers on Windows
 - machines. Ultimately, though, it's very difficult to secure any disk
 - sharing over the network at the time of this writing.
 -
 -
 - Why bother, if you can't really secure the network disks? Mostly it's
 - an issue of credible defense. If you leave a sheet of paper on your
 - desk with confidential information, and somebody in the office reads
 - it, he can argue that he didn't realize what the paper was; his
 - natural curiosity just got the better of him when he saw it sitting on
 - the desk. If the sheet of paper were in a filing cabinet or desk
 - drawer, it's an entirely different story. The purpose of taking some
 - basic network security measures internally is to ensure that nobody
 - ``accidentally'' compromises security.
 -
 -
 -
 - 9. Acknowledgements
 -
 - This document was written as internal documentation for the DYNACAN
 - project, as part of the project's continuing development under the
 - control of the Ministry of Human Resources Development Canada.
 -
 -
 - This document has benefited considerably from the suggestions of
 -
 - · Rod Smith (rodsmith@rodsbooks.com), who suggested I provide details
 -   on registering a domain name and on setting up with a dynamic IP,
 -   and pointed me at the various dynamic IP hosting services and at
 -   Granite Canyon.
 -
 - · Greg Leblanc (gleblanc@my-deja.com) for useful suggestions on
 -   improving the clarity of the document.
 -
 - · Sami Yousif (syousif@iname.com).
 -
 - · Marc-André Dumas (m_a_dumas@hotmail.com), who suggested the section
 -   on moving your domain to a new IP number.
 -
 - · Osamu Aoki (aoki@pacbell.net).
 -
 - · Joao Ribeiro (sena@decoy.ath.cx).
 -
-12.1. Names and locations
-
-The first thing your browser has to do is to establish a network connection
-to the machine where the document lives.
To do that, it first has to find the
-network location of the host www.tldp.org (``host'' is short for ``host
-machine'' or ``network host''; www.tldp.org is a typical hostname). The
-corresponding location is actually a number called an IP address (we'll
-explain the ``IP'' part of this term later).
-
-To do this, your browser queries a program called a name server. The name
-server may live on your machine, but it's more likely to run on a service
-machine that yours talks to. When you sign up with an ISP, part of your setup
-procedure will almost certainly involve telling your Internet software the IP
-address of a nameserver on the ISP's network.
-
-The name servers on different machines talk to each other, exchanging and
-keeping up to date all the information needed to resolve hostnames (map them
-to IP addresses). Your nameserver may query three or four different sites
-across the network in the process of resolving www.tldp.org, but this usually
-happens very quickly (as in less than a second). We'll look at how
-nameservers work in detail in the next section.
-
-The nameserver will tell your browser that www.tldp.org's IP address is
-152.19.254.81; knowing this, your machine will be able to exchange bits with
-www.tldp.org directly.
------------------------------------------------------------------------------
-
-12.2. The Domain Name System
-
-The whole network of programs and databases that cooperates to translate
-hostnames to IP addresses is called ``DNS'' (Domain Name System). When you
-see references to a ``DNS server'', that means what we just called a
-nameserver. Now I'll explain how the overall system works.
-
-Internet hostnames are composed of parts separated by dots. A domain is a
-collection of machines that share a common name suffix. Domains can live
-inside other domains. For example, the machine www.tldp.org lives in the
-.tldp.org subdomain of the .org domain.
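This nesting can be made concrete with a small sketch (not part of the original text): peeling labels off a hostname one at a time yields exactly the chain of enclosing domains that the nameservers delegate along.

```shell
#!/bin/sh
# Print a hostname followed by each of its enclosing domains, outermost last.
parent_domains() {
    name=$1
    while [ -n "$name" ]; do
        echo "$name"
        case $name in
            *.*) name=${name#*.} ;;   # strip the leftmost label
            *)   name= ;;             # no dots left: that was the top domain
        esac
    done
}

parent_domains www.tldp.org   # prints www.tldp.org, tldp.org, org (one per line)
```

Each line of that output corresponds to one level of the delegation tree described in the next paragraphs: org is served by the top-level servers, tldp.org by its own nameserver, and www.tldp.org is a host record inside that zone.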
-
-Each domain is defined by an authoritative name server that knows the IP
-addresses of the other machines in the domain. The authoritative (or
-``primary'') name server may have backups in case it goes down; if you see
-references to a secondary name server (or ``secondary DNS''), it's talking
-about one of those. These secondaries typically refresh their information
-from their primaries every few hours, so a change made to the hostname-to-IP
-mapping on the primary will automatically be propagated.
-
-Now here's the important part. The nameservers for a domain do not have to
-know the locations of all the machines in other domains (including their own
-subdomains); they only have to know the location of the nameservers. In our
-example, the authoritative name server for the .org domain knows the IP
-address of the nameserver for .tldp.org but not the address of all the other
-machines in .tldp.org.
-
-The domains in the DNS system are arranged like a big inverted tree. At the
-top are the root servers. Everybody knows the IP addresses of the root
-servers; they're wired into your DNS software. The root servers know the IP
-addresses of the nameservers for the top-level domains like .com and .org,
-but not the addresses of machines inside those domains. Each top-level domain
-server knows where the nameservers for the domains directly beneath it are,
-and so forth.
-
-DNS is carefully designed so that each machine can get away with the minimum
-amount of knowledge it needs to have about the shape of the tree, and local
-changes to subtrees can be made simply by changing one authoritative server's
-database of name-to-IP-address mappings.
-
-When you query for the IP address of www.tldp.org, what actually happens is
-this: First, your nameserver asks a root server to tell it where it can find
-a nameserver for .org. Once it knows that, it then asks the .org server to
-tell it the IP address of a .tldp.org nameserver.
Once it has that, it asks
the .tldp.org nameserver to tell it the address of the host www.tldp.org.

Most of the time, your nameserver doesn't actually have to work that hard.
Nameservers do a lot of caching; when yours resolves a hostname, it keeps
the association with the resulting IP address around in memory for a while.
This is why, when you surf to a new website, you'll usually only see a
message from your browser about "Looking up" the host for the first page you
fetch. Eventually the name-to-address mapping expires and your DNS has to
re-query; this is important so you don't have invalid information hanging
around forever when a hostname changes addresses. Your cached IP address for
a site is also thrown out if the host is unreachable.
-----------------------------------------------------------------------------


  DNS HOWTO
  Nicolai Langfeldt (dns-howto(at)langfeldt.net), Jamie Norrish and others
  v9.0, 2001-12-20

  1.2. Credits and request for help.

  I want to thank all the people that I have bothered with reading this
  HOWTO (you know who you are) and all the readers that have e-mailed
  suggestions and notes.

  This will never be a finished document; please send me mail about your
  problems and successes. You can help make this a better HOWTO. So
  please send comments and/or questions or money to
  janl(at)langfeldt.net. Or buy my DNS book (it's titled "The Concise
  Guide to DNS and BIND"; the bibliography has ISBNs). If you send e-mail
  and want an answer please show the simple courtesy of making sure
  that the return address is correct and working. Also, please read the
  ``qanda'' section before mailing me. Another thing, I can only
  understand Norwegian and English.

  This is a HOWTO. I have maintained it as part of the LDP since 1995.
  I have, during 2000, written a book on the same subject.
I want to
  say that, though this HOWTO is in many ways much like the book, it is
  not a watered-down version concocted to market the book. The readers
  of this HOWTO have helped me understand what is difficult to
  understand about DNS. This has helped the book, but the book has also
  helped me to think more about what this HOWTO needs. The HOWTO begot
  the book. The book begot version 3 of this HOWTO. My thanks to the
  book publisher, Que, that took a chance on me :-)

  1.3. Dedication

  This HOWTO is dedicated to Anne Line Norheim Langfeldt. Though she
  will probably never read it since she's not that kind of girl.


DNS-Serving

6.6. Domain Name System

  A DNS server has the job of translating names (readable by humans) to
  IP addresses. A DNS server does not know all the IP addresses in the
  world; rather, it is able to ask other servers for the unknown
  addresses. The DNS server will either return the wanted IP address to
  the user or report that the name cannot be found in the tables.

  Name serving on Unix (and on the vast majority of the Internet) is
  done by a program called named. This is a part of the bind package of
  The Internet Software Consortium.

  * BIND

  * DNS HOWTO

  This section will describe how to become a totally small-time DNS admin.

  1. Preamble

  Keywords: DNS, BIND, BIND 4, BIND 8, BIND 9, named, dialup, PPP, slip,
  ISDN, Internet, domain, name, resolution, hosts, caching.

  This document is part of the Linux Documentation Project.

  2. Introduction.

  What this is and isn't.

  DNS is the Domain Name System. DNS converts machine names to the IP
  addresses that all machines on the net have. It translates (or "maps",
  as the jargon would have it) from name to address and from address to
  name, and some other things. This HOWTO documents how to define such
  mappings using a Unix system, with a few things specific to Linux.
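The name-to-address and address-to-name mappings just mentioned can be made concrete with a small sketch. The helper below is not part of the HOWTO or of any resolver; it only shows the purely mechanical part of a reverse mapping, namely how a dotted-quad address is turned into the in-addr.arpa name that a resolver actually queries.

```shell
# Sketch (assumed helper, not a real resolver function): build the
# in-addr.arpa name used for a reverse (address-to-name) DNS lookup.
# The four octets are reversed and the in-addr.arpa suffix is appended.
reverse_name() {
    ip=$1
    IFS=. read -r a b c d <<EOF
$ip
EOF
    echo "$d.$c.$b.$a.in-addr.arpa"
}

reverse_name 199.249.150.4   # prints 4.150.249.199.in-addr.arpa
```

A resolver doing a reverse lookup for 199.249.150.4 queries a PTR record at exactly this constructed name.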
- - A mapping is simply an association between two things, in this case a - machine name, like ftp.linux.org, and the machine's IP number (or - address) 199.249.150.4. DNS also contains mappings the other way, - from the IP number to the machine name; this is called a "reverse - mapping". - - DNS is, to the uninitiated (you ;-), one of the more opaque areas of - network administration. Fortunately DNS isn't really that hard. This - HOWTO will try to make a few things clearer. It describes how to set - up a simple DNS name server, starting with a caching only server and - going on to setting up a primary DNS server for a domain. For more - complex setups you can check the ``qanda'' section of this document. - If it's not described there you will need to read the Real - Documentation. I'll get back to what this Real Documentation consists - of in ``the last chapter''. - - - Before you start on this you should configure your machine so that you - can telnet in and out of it, and successfully make all kinds of - connections to the net, and you should especially be able to do telnet - 127.0.0.1 and get your own machine (test it now!). You also need good - /etc/nsswitch.conf, /etc/resolv.conf and /etc/hosts files as a - starting point, since I will not explain their function here. If you - don't already have all this set up and working the Networking-HOWTO - and/or the Networking-Overview-HOWTO explains how to set it up. Read - them. - - - When I say `your machine' I mean the machine you are trying to set up - DNS on, not any other machine you might have that's involved in your - networking effort. - - - I assume you're not behind any kind of firewall that blocks name - queries. If you are you will need a special configuration --- see the - section on ``qanda''. - - - Name serving on Unix is done by a program called named. This is a - part of the ``BIND'' package which is coordinated by The Internet - Software Consortium. 
Named is included in most Linux distributions
  and is usually installed as /usr/sbin/named, usually from a package
  called BIND, in upper or lower case depending on the whim of the
  packager.

  If you have a named you can probably use it; if you don't have one you
  can get a binary off a Linux ftp site, or get the latest and greatest
  source from . This HOWTO is about BIND
  version 9. The old versions of the HOWTO, about BIND 4 and 8, are
  still available at in case you use
  BIND 4 or 8 (incidentally, you will find this HOWTO there too). If
  the named man page talks about (at the very end, in the FILES section)
  named.conf you have BIND 8; if it talks about named.boot you have BIND
  4. If you have 4 and are security conscious you really ought to
  upgrade to the latest version of BIND 8. Now.

  DNS is a net-wide database. Take care about what you put into it. If
  you put junk into it, you, and others, will get junk out of it. Keep
  your DNS tidy and consistent and you will get good service from it.
  Learn to use it, admin it, debug it and you will be another good admin
  keeping the net from falling to its knees through mismanagement.

  Tip: Make backup copies of all the files I instruct you to change if
  you already have them, so that if after going through this nothing
  works you can get back to your old, working state.

  2.1. Other nameserver implementations.

  This section was written by Joost van Baal.

  Various packages exist for getting a DNS server on your box. There is
  the BIND package ( ), the
  implementation this HOWTO is about. It's the most popular nameserver
  around and is used on the vast majority of name-serving machines on
  the Internet, having been deployed since the 1980s. It's
  available under a BSD license. Since it's the most popular package,
  loads of documentation and knowledge about BIND are around. However,
  there have been security problems with BIND.
- - - Then there is djbdns ( ), a relatively new DNS - package written by Daniel J. Bernstein, who also wrote qmail. It's a - very modular suite: various small programs take care of the different - jobs a nameserver is supposed to handle. It's designed with security - in mind. It uses a simpler zone-file format, and is generally easier - to configure. However, since it's less well known, your local guru - might not be able to help you with this. Unfortunately, this software - is not Open Source. The author's advertisement is on - . - - - Whether DJBs software is really an improvement over the older - alternatives is a subject of much debate. A discussion (or is it a - flame-war?) of BIND vs djbdns, joined by ISC people, is on - - - - 3. A resolving, caching name server. - - A first stab at DNS config, very useful for dialup, cable-modem, ADSL - and similar users. - - - On Red Hat and Red Hat related distributions you can achieve the same - practical result as this HOWTO's first section by installing the - packages bind, bind-utils and caching-nameserver. If you use Debian - simply install bind (or bind9, as of this writing, BIND 9 is not - supported by Debian Stable (potato)) and bind-doc. Of course just - installing those packages won't teach you as much as reading this - HOWTO. So install the packages, and then read along verifying the - files they installed. - - - A caching only name server will find the answer to name queries and - remember the answer the next time you need it. This will shorten the - waiting time the next time significantly, especially if you're on a - slow connection. - - - First you need a file called /etc/named.conf (Debian: - /etc/bind/named.conf). This is read when named starts. 
For now it - should simply contain: - - - ______________________________________________________________________ - // Config file for caching only name server - // - // The version of the HOWTO you read may contain leading spaces - // (spaces in front of the characters on these lines ) in this and - // other files. You must remove them for things to work. - // - // Note that the filenames and directory names may differ, the - // ultimate contents of should be quite similar though. - - options { - directory "/var/named"; - - // Uncommenting this might help if you have to go through a - // firewall and things are not working out. But you probably - // need to talk to your firewall admin. - - // query-source port 53; - }; - - controls { - inet 127.0.0.1 allow { localhost; } keys { rndc_key; }; - }; - - key "rndc_key" { - algorithm hmac-md5; - secret "c3Ryb25nIGVub3VnaCBmb3IgYSBtYW4gYnV0IG1hZGUgZm9yIGEgd29tYW4K"; - }; - - zone "." { - type hint; - file "root.hints"; - }; - - zone "0.0.127.in-addr.arpa" { - type master; - file "pz/127.0.0"; - }; - ______________________________________________________________________ - - - - The Linux distribution packages may use different file names for each - kind of file mentioned here; they will still contain about the same - things. - - - The `directory' line tells named where to look for files. All files - named subsequently will be relative to this. Thus pz is a directory - under /var/named, i.e., /var/named/pz. /var/named is the right - directory according to the Linux File system Standard. - - - The file named /var/named/root.hints is named in this. - /var/named/root.hints should contain this: - - - - ______________________________________________________________________ - ; - ; There might be opening comments here if you already have this file. - ; If not don't worry. - ; - ; About any leading spaces in front of the lines here: remove them! - ; Lines should start in a ;, . or character, not blanks. - ; - . 
6D IN NS A.ROOT-SERVERS.NET. - . 6D IN NS B.ROOT-SERVERS.NET. - . 6D IN NS C.ROOT-SERVERS.NET. - . 6D IN NS D.ROOT-SERVERS.NET. - . 6D IN NS E.ROOT-SERVERS.NET. - . 6D IN NS F.ROOT-SERVERS.NET. - . 6D IN NS G.ROOT-SERVERS.NET. - . 6D IN NS H.ROOT-SERVERS.NET. - . 6D IN NS I.ROOT-SERVERS.NET. - . 6D IN NS J.ROOT-SERVERS.NET. - . 6D IN NS K.ROOT-SERVERS.NET. - . 6D IN NS L.ROOT-SERVERS.NET. - . 6D IN NS M.ROOT-SERVERS.NET. - A.ROOT-SERVERS.NET. 6D IN A 198.41.0.4 - B.ROOT-SERVERS.NET. 6D IN A 128.9.0.107 - C.ROOT-SERVERS.NET. 6D IN A 192.33.4.12 - D.ROOT-SERVERS.NET. 6D IN A 128.8.10.90 - E.ROOT-SERVERS.NET. 6D IN A 192.203.230.10 - F.ROOT-SERVERS.NET. 6D IN A 192.5.5.241 - G.ROOT-SERVERS.NET. 6D IN A 192.112.36.4 - H.ROOT-SERVERS.NET. 6D IN A 128.63.2.53 - I.ROOT-SERVERS.NET. 6D IN A 192.36.148.17 - J.ROOT-SERVERS.NET. 6D IN A 198.41.0.10 - K.ROOT-SERVERS.NET. 6D IN A 193.0.14.129 - L.ROOT-SERVERS.NET. 6D IN A 198.32.64.12 - M.ROOT-SERVERS.NET. 6D IN A 202.12.27.33 - ______________________________________________________________________ - - - - The file describes the root name servers in the world. The servers - change over time and must be maintained now and then. See the - ``maintenance section'' for how to keep it up to date. - - - The next section in named.conf is the last zone. I will explain its - use in a later chapter; for now just make this a file named 127.0.0 in - the subdirectory pz: (Again, please remove leading spaces if you cut - and paste this) - - - ______________________________________________________________________ - $TTL 3D - @ IN SOA ns.linux.bogus. hostmaster.linux.bogus. ( - 1 ; Serial - 8H ; Refresh - 2H ; Retry - 4W ; Expire - 1D) ; Minimum TTL - NS ns.linux.bogus. - 1 PTR localhost. 
  ______________________________________________________________________

  The sections called key and controls together specify that your named
  can be remotely controlled by a program called rndc if it connects
  from the local host and identifies itself with the encoded secret key.
  This key is like a password. For rndc to work you need /etc/rndc.conf
  to match this:

  ______________________________________________________________________
  key rndc_key {
        algorithm "hmac-md5";
        secret "c3Ryb25nIGVub3VnaCBmb3IgYSBtYW4gYnV0IG1hZGUgZm9yIGEgd29tYW4K";
  };

  options {
        default-server localhost;
        default-key rndc_key;
  };
  ______________________________________________________________________

  As you see the secret is identical. If you want to use rndc from
  other machines their clocks need to be within 5 minutes of each other.
  I recommend using the ntp (xntpd and ntpdate) software to do this.

  Next, you need a /etc/resolv.conf looking something like this: (Again:
  remove spaces!)

  ______________________________________________________________________
  search subdomain.your-domain.edu your-domain.edu
  nameserver 127.0.0.1
  ______________________________________________________________________

  The `search' line specifies what domains should be searched for any
  host names you want to connect to. The `nameserver' line specifies
  the address of your nameserver, in this case your own machine since
  that is where your named runs (127.0.0.1 is right, no matter whether
  your machine has another address too). If you want to list several
  name servers put in one `nameserver' line for each. (Note: Named never
  reads this file; the resolver that uses named does. Note 2: In some
  resolv.conf files you find a line saying "domain". That's fine, but
  don't use both "search" and "domain"; only one of them will work).
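The search behavior described above can be sketched with a small helper. This is an illustration only (the function is hypothetical, not part of the resolver): for a bare name, the resolver tries the name with each search-list suffix in order, and finally the name as typed.

```shell
# Sketch (hypothetical helper): print, in order, the candidate names the
# resolver tries for a bare name given a resolv.conf search list.
candidates() {
    name=$1
    shift
    for dom in "$@"; do
        echo "$name.$dom"     # name with each search suffix, in order
    done
    echo "$name"              # finally, the name as typed
}

candidates foo subdomain.your-domain.edu your-domain.edu
# prints:
#   foo.subdomain.your-domain.edu
#   foo.your-domain.edu
#   foo
```

This also shows why a long search line costs time: every candidate may require a separate query before the lookup fails or succeeds.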
  To illustrate what this file does: If a client tries to look up foo,
  then foo.subdomain.your-domain.edu is tried first, then
  foo.your-domain.edu, and finally foo. You may not want to put too many
  domains in the search line, as it takes time to search them all.

  The example assumes you belong in the domain subdomain.your-domain.edu;
  your machine, then, is probably called
  your-machine.subdomain.your-domain.edu. The search line should not
  contain your TLD (Top Level Domain, `edu' in this case). If you
  frequently need to connect to hosts in another domain you can add that
  domain to the search line like this: (Remember to remove the leading
  spaces, if any)

  ______________________________________________________________________
  search subdomain.your-domain.edu your-domain.edu other-domain.com
  ______________________________________________________________________

  and so on. Obviously you need to put real domain names in instead.
  Please note the lack of periods at the end of the domain names. This
  is important: the domain names on the search line must not end in
  periods.

  3.1. Starting named

  After all this it's time to start named. If you're using a dialup
  connection, connect first. Now run named, either by running the boot
  script: /etc/init.d/named start, or named directly: /usr/sbin/named.
  If you have tried previous versions of BIND you're probably used to
  ndc. In BIND 9 it has been replaced with rndc, which can control your
  named remotely, but it can't start named anymore.
If you view your - syslog message file (usually called /var/log/messages, Debian calls it - /var/log/daemon, another directory to look is the other files - /var/log) while starting named (do tail -f /var/log/messages) you - should see something like: - - - (the lines ending in \ continues on the next line) - - - - Dec 23 02:21:12 lookfar named[11031]: starting BIND 9.1.3 - Dec 23 02:21:12 lookfar named[11031]: using 1 CPU - Dec 23 02:21:12 lookfar named[11034]: loading configuration from \ - '/etc/named.conf' - Dec 23 02:21:12 lookfar named[11034]: the default for the \ - 'auth-nxdomain' option is now 'no' - Dec 23 02:21:12 lookfar named[11034]: no IPv6 interfaces found - Dec 23 02:21:12 lookfar named[11034]: listening on IPv4 interface lo, \ - 127.0.0.1#53 - Dec 23 02:21:12 lookfar named[11034]: listening on IPv4 interface eth0, \ - 10.0.0.129#53 - Dec 23 02:21:12 lookfar named[11034]: command channel listening on \ - 127.0.0.1#953 - Dec 23 02:21:13 lookfar named[11034]: running - - - - If there are any messages about errors then there is a mistake. Named - will name the file it is reading. Go back and check the file. Start - named over when it is fixed. - - - Now you can test your setup. Traditionally a program called nslookup - is used for this. These days dig is recommended: - - - - $ dig -x 127.0.0.1 - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26669 - ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0 - - ;; QUESTION SECTION: - ;1.0.0.127.in-addr.arpa. IN PTR - - ;; ANSWER SECTION: - 1.0.0.127.in-addr.arpa. 259200 IN PTR localhost. - - ;; AUTHORITY SECTION: - 0.0.127.in-addr.arpa. 259200 IN NS ns.linux.bogus. - - ;; Query time: 3 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 02:26:17 2001 - ;; MSG SIZE rcvd: 91 - - - - If that's what you get it's working. We hope. Anything very - different, go back and check everything. Each time you change a file - you need to run rndc reload. 
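Since every zone-file change must be followed by rndc reload, and a reload is only useful if the zone serial has been bumped, the edit step can be sketched as a small helper. This is an assumed convenience function, not part of BIND; it relies on the serial line carrying a "; Serial" comment, as in the pz/127.0.0 example above.

```shell
# Sketch (assumed helper, not part of BIND): bump the serial number in a
# zone file so secondaries notice the change.  Assumes the serial sits on
# a line whose comment reads "; Serial", as in the zone files above.
bump_serial() {
    zonefile=$1
    awk '/; Serial/ { $1 = $1 + 1 } { print }' "$zonefile" > "$zonefile.new" &&
        mv "$zonefile.new" "$zonefile"
}
```

After bumping the serial you would run rndc reload so the running named picks up the new zone contents.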
- - - Now you can enter a query. Try looking up some machine close to you. - pat.uio.no is close to me, at the University of Oslo: - - - - $ dig pat.uio.no - ; <<>> DiG 9.1.3 <<>> pat.uio.no - ;; global options: printcmd - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15574 - ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 0 - - ;; QUESTION SECTION: - ;pat.uio.no. IN A - - ;; ANSWER SECTION: - pat.uio.no. 86400 IN A 129.240.130.16 - - ;; AUTHORITY SECTION: - uio.no. 86400 IN NS nissen.uio.no. - uio.no. 86400 IN NS nn.uninett.no. - uio.no. 86400 IN NS ifi.uio.no. - - ;; Query time: 651 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 02:28:35 2001 - ;; MSG SIZE rcvd: 108 - - - - This time dig asked your named to look for the machine pat.uio.no. It - then contacted one of the name server machines named in your - root.hints file, and asked its way from there. It might take tiny - while before you get the result as it may need to search all the - domains you named in /etc/resolv.conf. - - If you ask the same again you get this: - - - - $ dig pat.uio.no - - ; <<>> DiG 8.2 <<>> pat.uio.no - ;; res options: init recurs defnam dnsrch - ;; got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4 - ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 3 - ;; QUERY SECTION: - ;; pat.uio.no, type = A, class = IN - - ;; ANSWER SECTION: - pat.uio.no. 23h59m58s IN A 129.240.130.16 - - ;; AUTHORITY SECTION: - UIO.NO. 23h59m58s IN NS nissen.UIO.NO. - UIO.NO. 23h59m58s IN NS ifi.UIO.NO. - UIO.NO. 23h59m58s IN NS nn.uninett.NO. - - ;; ADDITIONAL SECTION: - nissen.UIO.NO. 23h59m58s IN A 129.240.2.3 - ifi.UIO.NO. 1d23h59m58s IN A 129.240.64.2 - nn.uninett.NO. 
1d23h59m58s IN A 158.38.0.181

  ;; Total query time: 4 msec
  ;; FROM: lookfar to SERVER: default -- 127.0.0.1
  ;; WHEN: Sat Dec 16 00:23:09 2000
  ;; MSG SIZE  sent: 28  rcvd: 162

  As you can plainly see, this time it was much faster: 4 ms versus more
  than half a second earlier. The answer was cached. With cached
  answers there is the possibility that the answer is out of date, but
  the origin servers control how long cached answers should be
  considered valid, so there is a high probability that the answer you
  get is valid.

  3.2. Resolvers

  All OSes implementing the standard C API have the calls gethostbyname
  and gethostbyaddr. These can get information from several different
  sources. Which sources they use is configured in
  /etc/nsswitch.conf on Linux (and some other Unixes). This is a long
  file specifying from which file or database to get different kinds of
  data. It usually contains helpful comments at the top, which
  you should consider reading. After that, find the line starting with
  `hosts:'; it should read:

  ______________________________________________________________________
  hosts: files dns
  ______________________________________________________________________

  (You remembered about the leading spaces, right? I won't mention them
  again.)

  If there is no line starting with `hosts:' then put in the one above.
  It says that programs should first look in the /etc/hosts file, then
  check DNS according to resolv.conf.

  3.3. Congratulations

  Now you know how to set up a caching named. Take a beer, milk, or
  whatever you prefer to celebrate it.

  4. Forwarding

  In large, well-organized academic or ISP (Internet Service Provider)
  networks you will sometimes find that the network people have set up a
  forwarder hierarchy of DNS servers, which helps lighten the internal
  network load and the load on the outside servers as well.
It's not - easy to know if you're inside such a network or not. But by using the - DNS server of your network provider as a ``forwarder'' you can make - the responses to queries faster and less of a load on your network. - This works by your nameserver forwarding queries to your ISP - nameserver. Each time this happens you will dip into the big cache of - your ISPs nameserver, thus speeding your queries up, your nameserver - does not have to do all the work itself. If you use a modem this can - be quite a win. For the sake of this example we assume that your - network provider has two name servers they want you to use, with IP - numbers 10.0.0.1 and 10.1.0.1. Then, in your named.conf file, inside - the opening section called ``options'', insert these lines: - - - ______________________________________________________________________ - forward first; - forwarders { - 10.0.0.1; - 10.1.0.1; - }; - ______________________________________________________________________ - - - - There is also a nice trick for dialup machines using forwarders, it is - described in the ``qanda'' section. - - - Restart your nameserver and test it with dig. Should still work fine. - - - 5. A simple domain. - - How to set up your own domain. - - - 5.1. But first some dry theory - - First of all: you read all the stuff before here right? You have to. - - - Before we really start this section I'm going to serve you some theory - on and an example of how DNS works. And you're going to read it - because it's good for you. If you don't want to you should at least - skim it very quickly. Stop skimming when you get to what should go in - your named.conf file. - - DNS is a hierarchical, tree structured system. The top is written `.' - and pronounced `root', as is usual for tree data-structures. Under . - there are a number of Top Level Domains (TLDs); the best known ones - are ORG, COM, EDU and NET, but there are many more. Just like a tree - it has a root and it branches out. 
If you have any computer science
  background you will recognize DNS as a search tree, and you will be
  able to find nodes, leaf nodes and edges. The dots are nodes, and the
  edges are on the names.

  When looking for a machine the query proceeds recursively into the
  hierarchy, starting at the root. If you want to find the address of
  prep.ai.mit.edu., your nameserver has to start asking somewhere. It
  starts by looking in its cache. If it knows the answer, having cached
  it before, it will answer right away, as we saw in the last section.
  If it does not know, it will see how closely it can match the requested
  name and use whatever information it has cached. In the worst case
  there is no match but the `.' (root) of the name, and the root servers
  have to be consulted. It will remove the leftmost parts one at a
  time, checking if it knows anything about ai.mit.edu., then mit.edu.,
  then edu., and if not that, it does know about . because that was in
  the hints file. It will then ask a . server about prep.ai.mit.edu.
  This . server will not know the answer, but it will help your server
  on its way by giving a referral, telling it where to look instead.
  These referrals will eventually lead your server to a nameserver that
  knows the answer. I will illustrate that now. +norec means that dig
  is asking non-recursive questions so that we get to do the recursion
  ourselves. The other options reduce the amount of output dig produces
  so this won't go on for too many pages:

  $ dig +norec +noques +nostats +nocmd prep.ai.mit.edu.
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 980
  ;; flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 0

  ;; AUTHORITY SECTION:
  .                       518400  IN      NS      J.ROOT-SERVERS.NET.
  .                       518400  IN      NS      K.ROOT-SERVERS.NET.
  .                       518400  IN      NS      L.ROOT-SERVERS.NET.
  .                       518400  IN      NS      M.ROOT-SERVERS.NET.
  .                       518400  IN      NS      A.ROOT-SERVERS.NET.
  .                       518400  IN      NS      B.ROOT-SERVERS.NET.
  .                       518400  IN      NS      C.ROOT-SERVERS.NET.
  .                       518400  IN      NS      D.ROOT-SERVERS.NET.
  .
518400 IN NS E.ROOT-SERVERS.NET. - . 518400 IN NS F.ROOT-SERVERS.NET. - . 518400 IN NS G.ROOT-SERVERS.NET. - . 518400 IN NS H.ROOT-SERVERS.NET. - . 518400 IN NS I.ROOT-SERVERS.NET. - - - - This is a referral. It is giving us an "Authority section" only, no - "Answer section". Our own nameserver refers us to a nameserver. Pick - one at random: - - - - $ dig +norec +noques +nostats +nocmd prep.ai.mit.edu. @D.ROOT-SERVERS.NET. - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58260 - ;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 3 - - ;; AUTHORITY SECTION: - mit.edu. 172800 IN NS BITSY.mit.edu. - mit.edu. 172800 IN NS STRAWB.mit.edu. - mit.edu. 172800 IN NS W20NS.mit.edu. - - ;; ADDITIONAL SECTION: - BITSY.mit.edu. 172800 IN A 18.72.0.3 - STRAWB.mit.edu. 172800 IN A 18.71.0.151 - W20NS.mit.edu. 172800 IN A 18.70.0.160 - - - - It refers us to MIT.EDU servers at once. Again pick one at random: - - - - $ dig +norec +noques +nostats +nocmd prep.ai.mit.edu. @BITSY.mit.edu. - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29227 - ;; flags: qr ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 4 - - ;; ANSWER SECTION: - prep.ai.mit.edu. 10562 IN A 198.186.203.77 - - ;; AUTHORITY SECTION: - ai.mit.edu. 21600 IN NS FEDEX.ai.mit.edu. - ai.mit.edu. 21600 IN NS LIFE.ai.mit.edu. - ai.mit.edu. 21600 IN NS ALPHA-BITS.ai.mit.edu. - ai.mit.edu. 21600 IN NS BEET-CHEX.ai.mit.edu. - - ;; ADDITIONAL SECTION: - FEDEX.ai.mit.edu. 21600 IN A 192.148.252.43 - LIFE.ai.mit.edu. 21600 IN A 128.52.32.80 - ALPHA-BITS.ai.mit.edu. 21600 IN A 128.52.32.5 - BEET-CHEX.ai.mit.edu. 21600 IN A 128.52.32.22 - - - - This time we got a "ANSWER SECTION", and an answer for our question. - The "AUTHORITY SECTION" contains information about which servers to - ask about ai.mit.edu the next time. So you can ask them directly the - next time you wonder about ai.mit.edu names. 
Named also gathered
  information about mit.edu, so if www.mit.edu is requested it is much
  closer to being able to answer the question.

  So starting at . we found the successive name servers for each level
  in the domain name by referral. If you had used your own DNS server
  instead of using all those other servers, your named would of course
  cache all the information it found while digging this out for you, and
  it would not have to ask again for a while.

  In the tree analogy each ``.'' in the name is a branching point, and
  each part between the ``.''s is the name of an individual branch in
  the tree. One climbs the tree by taking the name we want
  (prep.ai.mit.edu) and asking the root (.), or whatever server farther
  from the root toward prep.ai.mit.edu we have information about in the
  cache. Once the cache limits are reached the recursive resolver goes
  out asking servers, pursuing referrals (edges) further into the name.

  A much less talked about, but just as important, domain is
  in-addr.arpa. It too is nested like the `normal' domains. in-addr.arpa
  allows us to get the host's name when we have its address. An
  important thing to note here is that the IP addresses are written in
  reverse order in the in-addr.arpa domain. If you have the address of
  a machine, 198.186.203.77, named proceeds to find the name
  77.203.186.198.in-addr.arpa just like it did for prep.ai.mit.edu.
  Example: Finding no cache entry for any match but `.', ask a root
  server; m.root-servers.net refers you to some other root servers.
  b.root-servers.net refers you directly to bitsy.mit.edu. You should
  be able to take it from there.

  5.2. Our own domain

  Now to define our own domain. We're going to make the domain
  linux.bogus and define machines in it. I use a totally bogus domain
  name to make sure we disturb no-one Out There.

  One more thing before we start: Not all characters are allowed in host
  names. We're restricted to the characters of the English alphabet:
  a-z, the numbers 0-9 and the character '-' (dash). Keep to those
  characters (BIND 9 will not bug you if you break this rule, but BIND 8
  will). Upper- and lower-case characters are the same to DNS, so
  pat.uio.no is identical to Pat.UiO.No.

  We've already started this part with this line in named.conf:

  ______________________________________________________________________
  zone "0.0.127.in-addr.arpa" {
        type master;
        file "pz/127.0.0";
  };
  ______________________________________________________________________

  Please note the lack of `.' at the end of the domain names in this
  file. This says that now we will define the zone 0.0.127.in-addr.arpa,
  that we're the master server for it and that it is stored
  in a file called pz/127.0.0. We've already set up this file; it
  reads:

  ______________________________________________________________________
  $TTL 3D
  @ IN SOA ns.linux.bogus. hostmaster.linux.bogus. (
        1       ; Serial
        8H      ; Refresh
        2H      ; Retry
        4W      ; Expire
        1D)     ; Minimum TTL
        NS      ns.linux.bogus.
  1     PTR     localhost.
  ______________________________________________________________________

  Please note the `.' at the end of all the full domain names in this
  file, in contrast to the named.conf file above. Some people like to
  start each zone file with a $ORIGIN directive, but this is
  superfluous. The origin (where in the DNS hierarchy it belongs) of a
  zone file is specified in the zone section of the named.conf file; in
  this case it's 0.0.127.in-addr.arpa.

  This `zone file' contains 3 `resource records' (RRs): a SOA RR, an NS
  RR and a PTR RR. SOA is short for Start Of Authority. The `@' is a
  special notation meaning the origin, and since the `domain' column for
  this file says 0.0.127.in-addr.arpa the first line really means

  0.0.127.in-addr.arpa. IN SOA ...

  NS is the Name Server RR.
There is no '@' at the start of this line; it is implicit, since the
 previous line started with a '@'. That saves some typing. So the NS
 line could also be written



 0.0.127.in-addr.arpa. IN NS ns.linux.bogus



 It tells DNS what machine is the name server of the domain 0.0.127.in-
 addr.arpa; it is ns.linux.bogus. 'ns' is a customary name for name
 servers, but as with web servers, which are customarily named
 www.something, the name may be anything.


 And finally the PTR (Domain Name Pointer) record says that the host at
 address 1 in the subnet 0.0.127.in-addr.arpa, i.e., 127.0.0.1, is
 named localhost.


 The SOA record is the preamble to all zone files, and there should be
 exactly one in each zone file, at the top (but after the $TTL
 directive). It describes the zone: where it comes from (a machine
 called ns.linux.bogus), who is responsible for its contents
 (hostmaster@linux.bogus; you should insert your e-mail address here),
 what version of the zone file this is (serial: 1), and other things
 having to do with caching and secondary DNS servers. For the rest of
 the fields (refresh, retry, expire and minimum) use the numbers used
 in this HOWTO and you should be safe. Before the SOA comes a
 mandatory line, the $TTL 3D line. Put it in all your zone files.


 Now restart your named (rndc stop; named) and use dig to examine your
 handiwork. -x asks for the inverse query:



 $ dig -x 127.0.0.1
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30944
 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0

 ;; QUESTION SECTION:
 ;1.0.0.127.in-addr.arpa.       IN      PTR

 ;; ANSWER SECTION:
 1.0.0.127.in-addr.arpa. 259200  IN      PTR     localhost.

 ;; AUTHORITY SECTION:
 0.0.127.in-addr.arpa.   259200  IN      NS      ns.linux.bogus.
 ;; Query time: 3 msec
 ;; SERVER: 127.0.0.1#53(127.0.0.1)
 ;; WHEN: Sun Dec 23 03:02:39 2001
 ;; MSG SIZE  rcvd: 91



 So it manages to get localhost from 127.0.0.1; good. Now for our main
 task, the linux.bogus domain. Insert a new 'zone' section in
 named.conf:


 ______________________________________________________________________
 zone "linux.bogus" {
     type master;
     notify no;
     file "pz/linux.bogus";
 };
 ______________________________________________________________________



 Note again the lack of ending `.' on the domain name in the named.conf
 file.


 In the linux.bogus zone file we'll put some totally bogus data:



 ______________________________________________________________________
 ;
 ; Zone file for linux.bogus
 ;
 ; The full zone file
 ;
 $TTL 3D
 @       IN      SOA     ns.linux.bogus. hostmaster.linux.bogus. (
                         199802151       ; serial, todays date + todays serial #
                         8H              ; refresh, seconds
                         2H              ; retry, seconds
                         4W              ; expire, seconds
                         1D )            ; minimum, seconds
 ;
                 NS      ns              ; Inet Address of name server
                 MX      10 mail.linux.bogus     ; Primary Mail Exchanger
                 MX      20 mail.friend.bogus.   ; Secondary Mail Exchanger
 ;
 localhost       A       127.0.0.1
 ns              A       192.168.196.2
 mail            A       192.168.196.4
 ______________________________________________________________________



 Two things must be noted about the SOA record. ns.linux.bogus must be
 an actual machine with an A record. It is not legal to have a CNAME
 record for the machine mentioned in the SOA record. Its name need not
 be `ns'; it could be any legal host name. Next,
 hostmaster.linux.bogus should be read as hostmaster@linux.bogus. This
 should be a mail alias, or a mailbox, where the person(s) maintaining
 DNS read mail frequently. Any mail regarding the domain will be sent
 to the address listed here. The name need not be `hostmaster'; it can
 be your normal e-mail address, but the e-mail address `hostmaster' is
 often expected to work as well.
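A note on the serial field: secondary servers only copy the zone when the
serial number has grown, so it must increase with every edit. BIND only
requires a growing number, but a date-based convention is common. A small
Python sketch of one such convention, YYYYMMDD plus a two-digit revision
counter (illustrative only, not part of BIND):

```python
from datetime import date

def next_serial(old_serial, today=None):
    """Suggest the next zone serial using the YYYYMMDDnn convention:
    today's date times 100 plus a two-digit revision counter. This is
    purely a convention; DNS only requires that the serial increase."""
    today = today or date.today()
    base = int(today.strftime("%Y%m%d")) * 100
    # Already edited the zone today? Bump the revision counter.
    # Otherwise start over at revision 00 for today's date.
    return old_serial + 1 if old_serial >= base else base
```

With two revision digits you get 100 edits per day before running out,
rather than the 10 a single digit allows.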
- - - There is one new RR type in this file, the MX, or Mail eXchanger RR. - It tells mail systems where to send mail that is addressed to - someone@linux.bogus, namely to mail.linux.bogus or mail.friend.bogus. - The number before each machine name is that MX RR's priority. The RR - with the lowest number (10) is the one mail should be sent to if - possible. If that fails the mail can be sent to one with a higher - number, a secondary mail handler, i.e., mail.friend.bogus which has - priority 20 here. - - - Reload named by running rndc reload. Examine the results with dig: - - - - $ dig any linux.bogus - ; <<>> DiG 9.1.3 <<>> any linux.bogus - ;; global options: printcmd - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55239 - ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 1, ADDITIONAL: 1 - - ;; QUESTION SECTION: - ;linux.bogus. IN ANY - - ;; ANSWER SECTION: - linux.bogus. 259200 IN SOA ns.linux.bogus. \ - hostmaster.linux.bogus. 199802151 28800 7200 2419200 86400 - linux.bogus. 259200 IN NS ns.linux.bogus. - linux.bogus. 259200 IN MX 20 mail.friend.bogus. - linux.bogus. 259200 IN MX 10 mail.linux.bogus.linux.bogus. - - ;; AUTHORITY SECTION: - linux.bogus. 259200 IN NS ns.linux.bogus. - - ;; ADDITIONAL SECTION: - ns.linux.bogus. 259200 IN A 192.168.196.2 - - ;; Query time: 4 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 03:06:45 2001 - ;; MSG SIZE rcvd: 184 - - - - Upon careful examination you will discover a bug. The line - - - - linux.bogus. 259200 IN MX 10 mail.linux.bogus.linux.bogus. - - - - is all wrong. It should be - - - - linux.bogus. 259200 IN MX 10 mail.linux.bogus. - - - - I deliberately made a mistake so you could learn from it :-) Looking - in the zone file we find this line: - - - - MX 10 mail.linux.bogus ; Primary Mail Exchanger - - - - It is missing a period. Or has a 'linux.bogus' too many. 
If a - machine name does not end in a period in a zone file the origin is - added to its end causing the double linux.bogus.linux.bogus. So - either - - - ______________________________________________________________________ - MX 10 mail.linux.bogus. ; Primary Mail Exchanger - ______________________________________________________________________ - - - - or - - - ______________________________________________________________________ - MX 10 mail ; Primary Mail Exchanger - ______________________________________________________________________ - - - - is correct. I prefer the latter form, it's less to type. There are - some BIND experts that disagree, and some that agree with this. In a - zone file the domain should either be written out and ended with a `.' - or it should not be included at all, in which case it defaults to the - origin. - - - I must stress that in the named.conf file there should not be `.'s - after the domain names. You have no idea how many times a `.' too - many or few have fouled up things and confused the h*ll out of people. - - - So having made my point here is the new zone file, with some extra - information in it as well: - - - - ______________________________________________________________________ - ; - ; Zone file for linux.bogus - ; - ; The full zone file - ; - $TTL 3D - @ IN SOA ns.linux.bogus. hostmaster.linux.bogus. ( - 199802151 ; serial, todays date + todays serial # - 8H ; refresh, seconds - 2H ; retry, seconds - 4W ; expire, seconds - 1D ) ; minimum, seconds - ; - TXT "Linux.Bogus, your DNS consultants" - NS ns ; Inet Address of name server - NS ns.friend.bogus. - MX 10 mail ; Primary Mail Exchanger - MX 20 mail.friend.bogus. ; Secondary Mail Exchanger - - localhost A 127.0.0.1 - - gw A 192.168.196.1 - TXT "The router" - - ns A 192.168.196.2 - MX 10 mail - MX 20 mail.friend.bogus. - www CNAME ns - - donald A 192.168.196.3 - MX 10 mail - MX 20 mail.friend.bogus. 
- TXT "DEK" - - mail A 192.168.196.4 - MX 10 mail - MX 20 mail.friend.bogus. - - ftp A 192.168.196.5 - MX 10 mail - MX 20 mail.friend.bogus. - ______________________________________________________________________ - - - - CNAME (Canonical NAME) is a way to give each machine several names. - So www is an alias for ns. CNAME record usage is a bit controversial. - But it's safe to follow the rule that a MX, CNAME or SOA record should - never refer to a CNAME record, they should only refer to something - with an A record, so it is inadvisable to have - - - ______________________________________________________________________ - foobar CNAME www ; NO! - ______________________________________________________________________ - - - - but correct to have - - - - ______________________________________________________________________ - foobar CNAME ns ; Yes! - ______________________________________________________________________ - - - - Load the new database by running rndc reload, which causes named to - read its files again. - - - - $ dig linux.bogus axfr - - ; <<>> DiG 9.1.3 <<>> linux.bogus axfr - ;; global options: printcmd - linux.bogus. 259200 IN SOA ns.linux.bogus. hostmaster.linux.bogus. 199802151 28800 7200 2419200 86400 - linux.bogus. 259200 IN NS ns.linux.bogus. - linux.bogus. 259200 IN MX 10 mail.linux.bogus. - linux.bogus. 259200 IN MX 20 mail.friend.bogus. - donald.linux.bogus. 259200 IN A 192.168.196.3 - donald.linux.bogus. 259200 IN MX 10 mail.linux.bogus. - donald.linux.bogus. 259200 IN MX 20 mail.friend.bogus. - donald.linux.bogus. 259200 IN TXT "DEK" - ftp.linux.bogus. 259200 IN A 192.168.196.5 - ftp.linux.bogus. 259200 IN MX 10 mail.linux.bogus. - ftp.linux.bogus. 259200 IN MX 20 mail.friend.bogus. - gw.linux.bogus. 259200 IN A 192.168.196.1 - gw.linux.bogus. 259200 IN TXT "The router" - localhost.linux.bogus. 259200 IN A 127.0.0.1 - mail.linux.bogus. 259200 IN A 192.168.196.4 - mail.linux.bogus. 259200 IN MX 10 mail.linux.bogus. - mail.linux.bogus. 
259200 IN MX 20 mail.friend.bogus. - ns.linux.bogus. 259200 IN MX 10 mail.linux.bogus. - ns.linux.bogus. 259200 IN MX 20 mail.friend.bogus. - ns.linux.bogus. 259200 IN A 192.168.196.2 - www.linux.bogus. 259200 IN CNAME ns.linux.bogus. - linux.bogus. 259200 IN SOA ns.linux.bogus. hostmaster.linux.bogus. 199802151 28800 7200 2419200 86400 - ;; Query time: 41 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 03:12:31 2001 - ;; XFR size: 23 records - - - - That's good. As you see it looks a bit like the zone file itself. - Let's check what it says for www alone: - - - - $ dig www.linux.bogus - - ; <<>> DiG 9.1.3 <<>> www.linux.bogus - ;; global options: printcmd - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16633 - ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 1, ADDITIONAL: 0 - - ;; QUESTION SECTION: - ;www.linux.bogus. IN A - - ;; ANSWER SECTION: - www.linux.bogus. 259200 IN CNAME ns.linux.bogus. - ns.linux.bogus. 259200 IN A 192.168.196.2 - - ;; AUTHORITY SECTION: - linux.bogus. 259200 IN NS ns.linux.bogus. - - ;; Query time: 5 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 03:14:14 2001 - ;; MSG SIZE rcvd: 80 - - - - In other words, the real name of www.linux.bogus is ns.linux.bogus, - and it gives you some of the information it has about ns as well, - enough to connect to it if you were a program. - - - Now we're halfway. - - - 5.3. The reverse zone - - Now programs can convert the names in linux.bogus to addresses which - they can connect to. But also required is a reverse zone, one making - DNS able to convert from an address to a name. This name is used by a - lot of servers of different kinds (FTP, IRC, WWW and others) to decide - if they want to talk to you or not, and if so, maybe even how much - priority you should be given. For full access to all services on the - Internet a reverse zone is required. 
- - - Put this in named.conf: - - - ______________________________________________________________________ - zone "196.168.192.in-addr.arpa" { - type master; - notify no; - file "pz/192.168.196"; - }; - ______________________________________________________________________ - - - - This is exactly as with the 0.0.127.in-addr.arpa, and the contents are - similar: - - - - ______________________________________________________________________ - $TTL 3D - @ IN SOA ns.linux.bogus. hostmaster.linux.bogus. ( - 199802151 ; Serial, todays date + todays serial - 8H ; Refresh - 2H ; Retry - 4W ; Expire - 1D) ; Minimum TTL - NS ns.linux.bogus. - - 1 PTR gw.linux.bogus. - 2 PTR ns.linux.bogus. - 3 PTR donald.linux.bogus. - 4 PTR mail.linux.bogus. - 5 PTR ftp.linux.bogus. - ______________________________________________________________________ - - - - Now you reload your named (rndc reload) and examine your work with dig - again: - - - ______________________________________________________________________ - $ dig -x 192.168.196.4 - ;; Got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58451 - ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1 - - ;; QUESTION SECTION: - ;4.196.168.192.in-addr.arpa. IN PTR - - ;; ANSWER SECTION: - 4.196.168.192.in-addr.arpa. 259200 IN PTR mail.linux.bogus. - - ;; AUTHORITY SECTION: - 196.168.192.in-addr.arpa. 259200 IN NS ns.linux.bogus. - - ;; ADDITIONAL SECTION: - ns.linux.bogus. 259200 IN A 192.168.196.2 - - ;; Query time: 4 msec - ;; SERVER: 127.0.0.1#53(127.0.0.1) - ;; WHEN: Sun Dec 23 03:16:05 2001 - ;; MSG SIZE rcvd: 107 - ______________________________________________________________________ - - - - so, it looks OK, dump the whole thing to examine that too: - - - - ______________________________________________________________________ - $ dig 196.168.192.in-addr.arpa. AXFR - - ; <<>> DiG 9.1.3 <<>> 196.168.192.in-addr.arpa. AXFR - ;; global options: printcmd - 196.168.192.in-addr.arpa. 
259200 IN SOA ns.linux.bogus. \
         hostmaster.linux.bogus. 199802151 28800 7200 2419200 86400
 196.168.192.in-addr.arpa. 259200 IN     NS      ns.linux.bogus.
 1.196.168.192.in-addr.arpa. 259200 IN   PTR     gw.linux.bogus.
 2.196.168.192.in-addr.arpa. 259200 IN   PTR     ns.linux.bogus.
 3.196.168.192.in-addr.arpa. 259200 IN   PTR     donald.linux.bogus.
 4.196.168.192.in-addr.arpa. 259200 IN   PTR     mail.linux.bogus.
 5.196.168.192.in-addr.arpa. 259200 IN   PTR     ftp.linux.bogus.
 196.168.192.in-addr.arpa. 259200 IN     SOA     ns.linux.bogus. \
         hostmaster.linux.bogus. 199802151 28800 7200 2419200 86400
 ;; Query time: 6 msec
 ;; SERVER: 127.0.0.1#53(127.0.0.1)
 ;; WHEN: Sun Dec 23 03:16:58 2001
 ;; XFR size: 9 records
 ______________________________________________________________________



 Looks good! If your output didn't look like that, look for error
 messages in your syslog; I explained how to do that in the first
 section, under the heading ``Starting named''.


 5.4. Words of caution

 There are some things I should add here. The IP numbers used in the
 examples above are taken from one of the blocks of 'private nets',
 i.e., they are not allowed to be used publicly on the Internet, so
 they are safe to use in an example in a HOWTO. The second thing is
 the notify no; line. It tells named not to notify its secondary
 (slave) servers when it has gotten an update to one of its zone files.
 In BIND 8 and later, named can notify the other servers listed in NS
 records in the zone file when a zone is updated. This is handy for
 ordinary use, but for private experiments with zones this feature
 should be off --- we don't want the experiment to pollute the
 Internet, do we?


 And, of course, this domain is highly bogus, and so are all the
 addresses in it. For a real example of a real-life domain see the
 next main section.


 5.5. Why reverse lookups don't work.
 There are a couple of ``gotchas'' that are normally avoided with
 forward name lookups but are often seen when setting up reverse zones.
 Before you go on, you need reverse lookups of your machines to be
 working on your own nameserver. If they aren't, go back and fix that
 before continuing.


 I will discuss two failures of reverse lookups as seen from outside
 your network:


 5.5.1. The reverse zone isn't delegated.

 When you ask a service provider for a network-address range and a
 domain name, the domain name is normally delegated as a matter of
 course. A delegation is the glue NS record that helps you get from
 one nameserver to another, as explained in the dry theory section
 above. You read that, right? If your reverse zone doesn't work, go
 back and read it. Now.


 The reverse zone also needs to be delegated. If you got the
 192.168.196 net with the linux.bogus domain from your provider, they
 need to put NS records in for your reverse zone as well as for your
 forward zone. If you follow the chain from in-addr.arpa down to your
 net you will probably find a break in the chain, most likely at your
 service provider. Having found the break, contact your service
 provider and ask them to correct the error.


 5.5.2. You've got a classless subnet

 This is a somewhat advanced topic, but classless subnets are very
 common these days and you probably have one if you're a small company.


 A classless subnet is what keeps the Internet going these days. Some
 years ago there was much ado about the shortage of IP numbers. The
 smart people in the IETF (the Internet Engineering Task Force, they
 keep the Internet working) stuck their heads together and solved the
 problem. At a price: you'll get less than a ``C'' subnet, and some
 things may break. Please see Ask Mr. DNS for a good explanation of
 this and how to handle it.


 Did you read it? I'm not going to explain it, so please read it.
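To give a rough idea of what the technique involves (Mr. DNS has the real
explanation): the ISP's reverse zone holds CNAME records that point each
address's PTR name into a sub-zone named after the network and prefix
length, and that sub-zone is delegated to you. The "net/prefix" label
below is a common convention, not a requirement, and this Python sketch
is only an illustration of the naming:

```python
def rfc2317_names(ip, prefix_len):
    """Illustrate classless in-addr.arpa delegation (RFC 2317): the
    PTR name of an address is made a CNAME for a name inside a
    sub-zone labelled after the network part of the last octet and
    the prefix length, which the customer's nameserver then serves."""
    a, b, c, d = (int(x) for x in ip.split("."))
    net = d & ~((1 << (32 - prefix_len)) - 1)  # network part of last octet
    parent = f"{d}.{c}.{b}.{a}.in-addr.arpa."
    target = f"{d}.{net}/{prefix_len}.{c}.{b}.{a}.in-addr.arpa."
    return parent, target
```

For 192.168.196.66 on a /26, the ISP would publish
66.196.168.192.in-addr.arpa. as a CNAME for
66.64/26.196.168.192.in-addr.arpa. and delegate the 64/26 zone to you.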
 The first part of the problem is that your ISP must understand the
 technique described by Mr. DNS. Not all small ISPs have a working
 understanding of this, so you might have to explain the technique to
 them and be persistent. But be sure you understand it first ;-).
 They will then set up a nice reverse zone at their server, which you
 can examine for correctness with dig.


 The second and last part of the problem is that you must understand
 the technique. If you're unsure, go back and read about it again.
 Then you can set up your own classless reverse zone as described by
 Mr. DNS.


 There is another trap lurking here. (Very) old resolvers will not be
 able to follow the CNAME trick in the resolving chain and will fail to
 reverse-resolve your machine. This can result in the service
 assigning it an incorrect access class, denying it access, or
 something along those lines. If you stumble into such a service the
 only solution (that I know of) is for your ISP to insert your PTR
 record directly into their trick classless zone file instead of the
 trick CNAME record.


 Some ISPs will offer other ways to handle this, like Web-based forms
 for you to input your reverse mappings in, or other automagical
 systems.


 5.6. Slave servers

 Once you have set up your zones correctly on the master server you
 need to set up at least one slave server. Slave servers are needed
 for robustness: if your master goes down, people out there on the net
 will still be able to get information about your domain from the
 slave. A slave should be as far away from you as possible. Your
 master and slave should share as little as possible of the following:
 power supply, LAN, ISP, city and country. If all of these things are
 different for your master and slave, you've found a really good slave.


 A slave is simply a nameserver that copies zone files from a master.
 You set it up like this:


 ______________________________________________________________________
 zone "linux.bogus" {
     type slave;
     file "sz/linux.bogus";
     masters { 192.168.196.2; };
 };
 ______________________________________________________________________



 A mechanism called zone transfer is used to copy the data. The zone
 transfer is controlled by your SOA record:


 ______________________________________________________________________
 @       IN      SOA     ns.linux.bogus. hostmaster.linux.bogus. (
                         199802151       ; serial, todays date + todays serial #
                         8H              ; refresh, seconds
                         2H              ; retry, seconds
                         4W              ; expire, seconds
                         1D )            ; minimum, seconds
 ______________________________________________________________________



 A zone is only transferred if the serial number on the master is
 larger than on the slave. Every refresh interval the slave will check
 whether the master has been updated. If the check fails (because the
 master is unavailable) it will retry the check every retry interval.
 If it continues to fail for as long as the expire interval, the slave
 will remove the zone from its filesystem and no longer be a server for
 it.



 6. Basic security options.

 By Jamie Norrish


 Setting configuration options to reduce the possibility of problems.


 There are a few simple steps that you can take which will both make
 your server more secure and potentially reduce its load. The material
 presented here is nothing more than a starting point; if you are
 concerned about security (and you should be), please consult other
 resources on the net (see ``the last chapter'').


 The following configuration directives occur in named.conf. If a
 directive occurs in the options section of the file, it applies to all
 zones listed in that file. If it occurs within a zone entry, it
 applies only to that zone. A zone entry overrides an options entry.


 6.1.
Restricting zone transfers - - In order for your slave server(s) to be able to answer queries about - your domain, they must be able to transfer the zone information from - your primary server. Very few others have a need to do so. Therefore - restrict zone transfers using the allow-transfer option, assuming - 192.168.1.4 is the IP address of ns.friend.bogus and adding yourself - for debugging purposes: - - - ______________________________________________________________________ - zone "linux.bogus" { - allow-transfer { 192.168.1.4; localhost; }; - }; - ______________________________________________________________________ - - - - By restricting zone transfers you ensure that the only information - available to people is that which they ask for directly - no one can - just ask for all the details about your set-up. - - - 6.2. Protecting against spoofing - - Firstly, disable any queries for domains you don't own, except from - your internal/local machines. This not only helps prevent malicious - use of your DNS server, but also reduces unnecessary use of your - server. - - - ______________________________________________________________________ - options { - allow-query { 192.168.196.0/24; localhost; }; - }; - - zone "linux.bogus" { - allow-query { any; }; - }; - - zone "196.168.192.in-addr.arpa" { - allow-query { any; }; - }; - ______________________________________________________________________ - - - - Further, disable recursive queries except from internal/local sources. - This reduces the risk of cache poisoning attacks (where false data is - fed to your server). - - - ______________________________________________________________________ - options { - allow-recursion { 192.168.196.0/24; localhost; }; - }; - ______________________________________________________________________ - - - - 6.3. 
Running named as non-root

 It is a good idea to run named as a user other than root, so that if
 it is compromised the privileges gained by the cracker are as limited
 as possible. You first have to create a user for named to run under,
 and then modify whatever init script you use to start named. Pass
 the new user name and group to named using the -u and -g flags.


 For example, in Debian GNU/Linux 2.2 you might modify your
 /etc/init.d/bind script to have the following line (where the user
 named has been created):


 ______________________________________________________________________
 start-stop-daemon --start --quiet --exec /usr/sbin/named -- -u named
 ______________________________________________________________________



 The same can be done with Red Hat and the other distributions.


 Dave Lugo has described a secure dual-chroot setup which you may find
 interesting to read; it makes the host you run your named on even
 more secure.


 7. A real domain example

 Where we list some real zone files


 Users have suggested that I include a real example of a working domain
 as well as the tutorial example.


 I use this example with permission from David Bullock of LAND-5.
 These files were current on the 24th of September 1996, and were then
 edited to fit BIND 8 restrictions and use extensions by me. So, what
 you see here differs a bit from what you find if you query LAND-5's
 name servers now.


 7.1. /etc/named.conf (or /var/named/named.conf)

 Here we find master zone sections for the two reverse zones needed:
 the 127.0.0 net, as well as LAND-5's 206.6.177 subnet, and a primary
 line for land-5's forward zone land-5.com. Also note that instead of
 stuffing the files in a directory called pz, as I do in this HOWTO, he
 puts them in a directory called zone.
- - - - ______________________________________________________________________ - // Boot file for LAND-5 name server - - options { - directory "/var/named"; - }; - - controls { - inet 127.0.0.1 allow { localhost; } keys { rndc_key; }; - }; - - key "rndc_key" { - algorithm hmac-md5; - secret "c3Ryb25nIGVub3VnaCBmb3IgYSBtYW4gYnV0IG1hZGUgZm9yIGEgd29tYW4K"; - }; - - zone "." { - type hint; - file "root.hints"; - }; - - zone "0.0.127.in-addr.arpa" { - type master; - file "zone/127.0.0"; - }; - - zone "land-5.com" { - type master; - file "zone/land-5.com"; - }; - - zone "177.6.206.in-addr.arpa" { - type master; - file "zone/206.6.177"; - }; - ______________________________________________________________________ - - - - If you put this in your named.conf file to play with PLEASE put - ``notify no;'' in the zone sections for the two land-5 zones so as to - avoid accidents. - - - 7.2. /var/named/root.hints - - Keep in mind that this file is dynamic, and the one listed here is - old. You're better off using a new one as explained earlier. - - - - ______________________________________________________________________ - ; <<>> DiG 8.1 <<>> @A.ROOT-SERVERS.NET. - ; (1 server found) - ;; res options: init recurs defnam dnsrch - ;; got answer: - ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10 - ;; flags: qr aa rd; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 13 - ;; QUERY SECTION: - ;; ., type = NS, class = IN - - ;; ANSWER SECTION: - . 6D IN NS G.ROOT-SERVERS.NET. - . 6D IN NS J.ROOT-SERVERS.NET. - . 6D IN NS K.ROOT-SERVERS.NET. - . 6D IN NS L.ROOT-SERVERS.NET. - . 6D IN NS M.ROOT-SERVERS.NET. - . 6D IN NS A.ROOT-SERVERS.NET. - . 6D IN NS H.ROOT-SERVERS.NET. - . 6D IN NS B.ROOT-SERVERS.NET. - . 6D IN NS C.ROOT-SERVERS.NET. - . 6D IN NS D.ROOT-SERVERS.NET. - . 6D IN NS E.ROOT-SERVERS.NET. - . 6D IN NS I.ROOT-SERVERS.NET. - . 6D IN NS F.ROOT-SERVERS.NET. - - ;; ADDITIONAL SECTION: - G.ROOT-SERVERS.NET. 5w6d16h IN A 192.112.36.4 - J.ROOT-SERVERS.NET. 
5w6d16h IN A 198.41.0.10 - K.ROOT-SERVERS.NET. 5w6d16h IN A 193.0.14.129 - L.ROOT-SERVERS.NET. 5w6d16h IN A 198.32.64.12 - M.ROOT-SERVERS.NET. 5w6d16h IN A 202.12.27.33 - A.ROOT-SERVERS.NET. 5w6d16h IN A 198.41.0.4 - H.ROOT-SERVERS.NET. 5w6d16h IN A 128.63.2.53 - B.ROOT-SERVERS.NET. 5w6d16h IN A 128.9.0.107 - C.ROOT-SERVERS.NET. 5w6d16h IN A 192.33.4.12 - D.ROOT-SERVERS.NET. 5w6d16h IN A 128.8.10.90 - E.ROOT-SERVERS.NET. 5w6d16h IN A 192.203.230.10 - I.ROOT-SERVERS.NET. 5w6d16h IN A 192.36.148.17 - F.ROOT-SERVERS.NET. 5w6d16h IN A 192.5.5.241 - - ;; Total query time: 215 msec - ;; FROM: roke.uio.no to SERVER: A.ROOT-SERVERS.NET. 198.41.0.4 - ;; WHEN: Sun Feb 15 01:22:51 1998 - ;; MSG SIZE sent: 17 rcvd: 436 - ______________________________________________________________________ - - - - 7.3. /var/named/zone/127.0.0 - - Just the basics, the obligatory SOA record, and a record that maps - 127.0.0.1 to localhost. Both are required. No more should be in this - file. It will probably never need to be updated, unless your - nameserver or hostmaster address changes. - - - - ______________________________________________________________________ - $TTL 3D - @ IN SOA land-5.com. root.land-5.com. ( - 199609203 ; Serial - 28800 ; Refresh - 7200 ; Retry - 604800 ; Expire - 86400) ; Minimum TTL - NS land-5.com. - - 1 PTR localhost. - ______________________________________________________________________ - - - - If you look at a random BIND installation you will probably find that - the $TTL line is missing as it is here. It was not used before, and - only version 8.2 of BIND has started to warn about its absence. BIND - 9 requires the $TTL. - - - 7.4. /var/named/zone/land-5.com - - Here we see the mandatory SOA record, the needed NS records. We can - see that he has a secondary name server at ns2.psi.net. This is as it - should be, always have a off site secondary server as backup. 
We can - also see that he has a master host called land-5 which takes care of - many of the different Internet services, and that he's done it with - CNAMEs (a alternative is using A records). - - - As you see from the SOA record, the zone file originates at - land-5.com, the contact person is root@land-5.com. hostmaster is - another oft used address for the contact person. The serial number is - in the customary yyyymmdd format with todays serial number appended; - this is probably the sixth version of zone file on the 20th of - September 1996. Remember that the serial number must increase - monotonically, here there is only one digit for todays serial#, so - after 9 edits he has to wait until tomorrow before he can edit the - file again. Consider using two digits. - - - - ______________________________________________________________________ - $TTL 3D - @ IN SOA land-5.com. root.land-5.com. ( - 199609206 ; serial, todays date + todays serial # - 8H ; refresh, seconds - 2H ; retry, seconds - 4W ; expire, seconds - 1D ) ; minimum, seconds - NS land-5.com. - NS ns2.psi.net. - MX 10 land-5.com. ; Primary Mail Exchanger - TXT "LAND-5 Corporation" - - localhost A 127.0.0.1 - - router A 206.6.177.1 - - land-5.com. A 206.6.177.2 - ns A 206.6.177.3 - www A 207.159.141.192 - - ftp CNAME land-5.com. - mail CNAME land-5.com. - news CNAME land-5.com. - - funn A 206.6.177.2 - - ; - ; Workstations - ; - ws-177200 A 206.6.177.200 - MX 10 land-5.com. ; Primary Mail Host - ws-177201 A 206.6.177.201 - MX 10 land-5.com. ; Primary Mail Host - ws-177202 A 206.6.177.202 - MX 10 land-5.com. ; Primary Mail Host - ws-177203 A 206.6.177.203 - MX 10 land-5.com. ; Primary Mail Host - ws-177204 A 206.6.177.204 - MX 10 land-5.com. ; Primary Mail Host - ws-177205 A 206.6.177.205 - MX 10 land-5.com. ; Primary Mail Host - ; {Many repetitive definitions deleted - SNIP} - ws-177250 A 206.6.177.250 - MX 10 land-5.com. ; Primary Mail Host - ws-177251 A 206.6.177.251 - MX 10 land-5.com. 
; Primary Mail Host - ws-177252 A 206.6.177.252 - MX 10 land-5.com. ; Primary Mail Host - ws-177253 A 206.6.177.253 - MX 10 land-5.com. ; Primary Mail Host - ws-177254 A 206.6.177.254 - MX 10 land-5.com. ; Primary Mail Host - ______________________________________________________________________ - - - - If you examine land-5s nameserver you will find that the host names - are of the form ws_number. As of late BIND 4 versions named started - enforcing the restrictions on what characters may be used in host - names. So that does not work with BIND 8 at all, and I substituted - '-' (dash) for '_' (underline) for use in this HOWTO. But, as - mentioned earlier, BIND 9 no longer enforces this restriction. - - - Another thing to note is that the workstations don't have individual - names, but rather a prefix followed by the two last parts of the IP - numbers. Using such a convention can simplify maintenance - significantly, but can be a bit impersonal, and, in fact, be a source - of irritation among your customers. - - - We also see that funn.land-5.com is an alias for land-5.com, but using - an A record, not a CNAME record. - - - 7.5. /var/named/zone/206.6.177 - - I'll comment on this file below - - - ______________________________________________________________________ - $TTL 3D - @ IN SOA land-5.com. root.land-5.com. ( - 199609206 ; Serial - 28800 ; Refresh - 7200 ; Retry - 604800 ; Expire - 86400) ; Minimum TTL - NS land-5.com. - NS ns2.psi.net. - ; - ; Servers - ; - 1 PTR router.land-5.com. - 2 PTR land-5.com. - 2 PTR funn.land-5.com. - ; - ; Workstations - ; - 200 PTR ws-177200.land-5.com. - 201 PTR ws-177201.land-5.com. - 202 PTR ws-177202.land-5.com. - 203 PTR ws-177203.land-5.com. - 204 PTR ws-177204.land-5.com. - 205 PTR ws-177205.land-5.com. - ; {Many repetitive definitions deleted - SNIP} - 250 PTR ws-177250.land-5.com. - 251 PTR ws-177251.land-5.com. - 252 PTR ws-177252.land-5.com. - 253 PTR ws-177253.land-5.com. - 254 PTR ws-177254.land-5.com. 
- ______________________________________________________________________ - - - - The reverse zone is the bit of the setup that seems to cause the most - grief. It is used to find the host name if you have the IP number of - a machine. Example: you are an FTP server and accept connections from - FTP clients. As you are a Norwegian FTP server you want to accept - more connections from clients in Norway and other Scandinavian - countries and less from the rest of the world. When you get a - connection from a client the C library is able to tell you the IP - number of the connecting machine because the IP number of the client - is contained in all the packets that are passed over the network. Now - you can call a function called gethostbyaddr that looks up the name of - a host given the IP number. Gethostbyaddr will ask a DNS server, - which will then traverse the DNS looking for the machine. Supposing - the client connection is from ws-177200.land-5.com. The IP number the - C library provides to the FTP server is 206.6.177.200. To find out - the name of that machine we need to find 200.177.6.206.in-addr.arpa. - The DNS server will first find the arpa. servers, then find in- - addr.arpa. servers, following the reverse trail through 206, then 6 - and at last finding the server for the 177.6.206.in-addr.arpa zone at - LAND-5. From which it will finally get the answer that for - 200.177.6.206.in-addr.arpa we have a ``PTR ws-177200.land-5.com'' - record, meaning that the name that goes with 206.6.177.200 is - ws-177200.land-5.com. - - - The FTP server prioritizes connections from the Scandinavian - countries, i.e., *.no, *.se, *.dk, the name ws-177200.land-5.com - clearly does not match any of those, and the server will put the - connection in a connection class with less bandwidth and fewer clients - allowed. 
If there were no reverse mapping of 206.6.177.200 through the - in-addr.arpa zone, the server would have been unable to find the name - at all and would have to settle for comparing 206.6.177.200 with *.no, - *.se and *.dk, none of which match; it might even deny the - connection for lack of classification. - - - Some people will tell you that reverse lookup mappings are only - important for servers, or not important at all. Not so: many FTP, - news, IRC and even some HTTP (WWW) servers will not accept connections - from machines whose names they cannot find. So reverse - mappings for machines are in fact mandatory. - - - 8. Maintenance - - Keeping it working. - - - There is one maintenance task you have to do on named, other than - keeping it running: keeping the root.hints file updated. - The easiest way is using dig. First, run dig with no arguments; you will - get the root.hints according to your own server. Then ask one of the - listed root servers with dig @rootserver. You will note that the - output looks very much like a root.hints file. Save it to a file (dig - @e.root-servers.net . ns >root.hints.new) and replace the old - root.hints with it. - - - Remember to reload named after replacing the cache file. - - - Al Longyear sent me this script that can be run automatically to - update root.hints. Install a crontab entry to run it once a month and - forget it. The script assumes you have mail working and that the - mail alias `hostmaster' is defined. You must hack it to suit your - setup. - - - - ______________________________________________________________________ - #!/bin/sh - # - # Update the nameserver cache information file once per month. - # This is run automatically by a cron entry. - # - # Original by Al Longyear - # Updated for BIND 8 by Nicolai Langfeldt - # Miscellaneous error conditions reported by David A. Ranch - # Ping test suggested by Martin Foster - # named up-test suggested by Erik Bryer.
- # - ( - echo "To: hostmaster " - echo "From: system " - - # Is named up? Check the status of named. - case `rndc status 2>&1` in - *refused*) - echo "named is DOWN. root.hints was NOT updated" - echo - exit 0 - ;; - esac - - PATH=/sbin:/usr/sbin:/bin:/usr/bin: - export PATH - # NOTE: /var/named must be writable only by trusted users or this script - # will cause root compromise/denial of service opportunities. - cd /var/named 2>/dev/null || { - echo "Subject: Cannot cd to /var/named, error $?" - echo - echo "The subject says it all" - exit 1 - } - - # Are we online? Ping a server at your ISP - case `ping -qnc 1 some.machine.net 2>&1` in - *'100% packet loss'*) - echo "Subject: root.hints NOT updated. The network is DOWN." - echo - echo "The subject says it all" - exit 1 - ;; - esac - - dig @e.root-servers.net . ns >root.hints.new 2> errors - - case `cat root.hints.new` in - *NOERROR*) - # It worked - :;; - *) - echo "Subject: The root.hints file update has FAILED." - echo - echo "The root.hints update has failed" - echo "This is the dig output reported:" - echo - cat root.hints.new errors - exit 1 - ;; - esac - - echo "Subject: The root.hints file has been updated" - echo - echo "The root.hints file has been updated to contain the following - information:" - echo - cat root.hints.new - - chown root.root root.hints.new - chmod 444 root.hints.new - rm -f root.hints.old errors - mv root.hints root.hints.old - mv root.hints.new root.hints - rndc restart - echo - echo "The nameserver has been restarted to ensure that the update is complete." - echo "The previous root.hints file is now called - /var/named/root.hints.old." - ) 2>&1 | /usr/lib/sendmail -t - exit 0 - ______________________________________________________________________ - - - - Some of you might have picked up that the root.hints file is also - available by ftp from Internic. Please don't use ftp to update - root.hints, the above method is much more friendly to the net, and - Internic. - - - 9. 
Migrating to BIND 9 - - The BIND 9 distribution, and the prepackaged versions too, contains a - document called migration that contains notes about how to migrate - from BIND 8 to BIND 9. The document is very straight forward. If you - installed binary packages it's likely stored in /usr/share/doc/bind* - or /usr/doc/bind* somewhere. - - - If you're running BIND 4, you may find a document called - migration-4to9 in the same place. - - - 10. Questions and Answers - - Please read this section before mailing me. - - - 1. My named wants a named.boot file - - - You are reading the wrong HOWTO. Please see the old version of - this HOWTO, which covers BIND 4, at - - - 2. How do use DNS from inside a firewall? - - - A hint: forward only;. You might also need - - - ___________________________________________________________________ - query-source port 53; - - ___________________________________________________________________ - - - - inside the ``options'' part of the named.conf file as suggested in the - example ``caching'' section. - - - 3. How do I make DNS rotate through the available addresses for a - service, say www.busy.site to obtain a load balancing effect, or - similar? - - - Make several A records for www.busy.site and use BIND 4.9.3 or - later. Then BIND will round-robin the answers. It will not work - with earlier versions of BIND. - - - 4. I want to set up DNS on a (closed) intranet. What do I do? - - - You drop the root.hints file and just do zone files. That also - means you don't have to get new hint files all the time. - - - 5. How do I set up a secondary (slave) name server? 
- - - If the primary/master server has address 127.0.0.1 you put a line - like this in the named.conf file of your secondary: - - - ___________________________________________________________________ - zone "linux.bogus" { - type slave; - file "sz/linux.bogus"; - masters { 127.0.0.1; }; - }; - - ___________________________________________________________________ - - - - You may list several alternate master servers the zone can be copied - from inside the masters list, separated by ';' (semicolon). - - - 6. I want BIND running when I'm disconnected from the net. - - - There are four items regarding this: - - - ˇ Specific to BIND 8/9, Adam L Rice has sent me this e-mail, about - how to run DNS painlessly on a dialup machine: - - - - I have discovered with newer versions of BIND that this - [ where - he explains his way of doing this: - - - - I run named on my 'Masquerading' machine here. I have - two root.hints files, one called root.hints.real which contains - the real root server names and the other called root.hints.fake - which contains... - - ---- - ; root.hints.fake - ; this file contains no information - ---- - - When I go off line I copy the root.hints.fake file to root.hints and - restart named. - - When I go online I copy root.hints.real to root.hints and restart - named. - - This is done from ip-down & ip-up respectively. - - The first time I do a query off line on a domain name named doesn't - have details for it puts an entry like this in messages.. - - Jan 28 20:10:11 hazchem named[10147]: No root nameserver for class IN - - which I can live with. - - It certainly seems to work for me. I can use the nameserver for - local machines while off the 'net without the timeout delay for - external domain names and I while on the 'net queries for external - domains work normally - - - - Peter Denison thought that Ian does not go far enough though. 
He - writes: - - - - When connected) serve all cached (and local network) entries immediately - for non-cached entries, forward to my ISPs nameserver - When off-line) serve local network queries immediately - fail all other queries **immediately** - - The combination of changing the root cache file and forwarding queries - doesn't work. - - So, I've set up (with some discussion of this on the local LUG) two nameds - as follows: - - named-online: forwards to ISPs nameserver - master for localnet zone - master for localnet reverse zone (1.168.192.in-addr.arpa) - master for 0.0.127.in-addr.arpa - listens on port 60053 - - named-offline: no forwarding - "fake" root cache file - slave for 3 local zones (master is 127.0.0.1:60053) - listens on port 61053 - - And combined this with port forwarding, to send port 53 to 61053 when - off-line, and to port 60053 when online. (I'm using the new netfilter - package under 2.3.18, but the old (ipchains) mechanism should work.) - - Note that this won't quite work out-of-the-box, as there's a slight bug in - BIND 8.2, which I have logged wth the developers, preventing a slave - having a master on the same IP address (even if a different port). It's a - trivial patch, and should go in soon I hope. - - - - ˇ I have also received information about how BIND interacts with NFS - and the portmapper on a mostly offline machine from Karl-Max - Wanger: - - - - I use to run my own named on all my machines which are only - occasionally connected to the Internet by modem. The nameserver only - acts as a cache, it has no area of authority and asks back for - everything at the name servers in the root.cache file. As is usual - with Slackware, it is started before nfsd and mountd. - - With one of my machines (a Libretto 30 notebook) I had the problem - that sometimes I could mount it from another system connected to my - local LAN, but most of the time it didn't work. 
I had the same effect - regardless of using PLIP, a PCMCIA ethernet card or PPP over a serial - interface. - - After some time of guessing and experimenting I found out that - apparently named messed with the process of registration nfsd and - mountd have to carry out with the portmapper upon startup (I start - these daemons at boot time as usual). Starting named after nfsd and - mountd eliminated this problem completely. - - As there are no disadvantages to expect from such a modified boot - sequence I'd advise everybody to do it that way to prevent potential - trouble. - - - - 7. Where does the caching name server store its cache? Is there any - way I can control the size of the cache? - - - The cache is completely stored in memory, it is not written to disk - at any time. Every time you kill named the cache is lost. The - cache is not controllable in any way. named manages it according - to some simple rules and that is it. You cannot control the cache - or the cache size in any way for any reason. If you want to you can - ``fix'' this by hacking named. This is however not recommended. - - - 8. Does named save the cache between restarts? Can I make it save it? - - - No, named does not save the cache when it dies. That means that - the cache must be built anew each time you kill and restart named. - There is no way to make named save the cache in a file. If you - want you can ``fix'' this by hacking named. This is however not - recommended. - - - 9. How can I get a domain? I want to set up my own domain called (for - example) linux-rules.net. How can I get the domain I want assigned - to me? - - - Please contact your network service provider. They will be able to - help you with this. Please note that in most parts of the world - you need to pay money to get a domain. - - - 10. - How can I secure my DNS server? How do I set up split DNS? - - - Both of these are advanced topics. They are both covered in - . I will not explain - the topics further here. - 11. 
How to become a bigger time DNS admin. - - Documentation and tools. - - - Real Documentation exists. Online and in print. The reading of - several of these is required to make the step from small time DNS - admin to a big time one. - - - I have written The Concise Guide to DNS and BIND (by Nicolai - Langfeldt, me), published by Que (ISBN 0-7897-2273-9). The book is - much like this HOWTO, just more details, and a lot more of everything. - It has also been translated to Polish and published as DNS i BIND by - Helion ( , ISBN 83-7197-446-9). - Now in 4th edition is DNS and BIND by Cricket Liu and P. Albitz from - O'Reilly & Associates (ISBN 0-937175-82-X, affectionately known as the - Cricket book). Another book is Linux DNS Server Administration, by - Craig Hunt, published by Sybex (ISBN 0782127363), I have not read it - yet. Another must for good DNS administration (or good anything for - that matter) is Zen and the Art of Motorcycle Maintenance by Robert M. - Pirsig. - - - Online you will find my book, along with tons of other books, - available electronically as a subscription service at - . There is stuff on - (DNS Resources Directory), - ; A FAQ, a reference manual (the ARM - should be enclosed in the BIND distribution as well) as well as papers - and protocol definitions and DNS hacks (these, and most, if not all, - of the RFCs mentioned below, are also contained in the BIND - distribution). I have not read most of these. The newsgroup - is about DNS. In addition there - are a number of RFCs about DNS, the most important are probably the - ones listed here. Those that have BCP (Best Current Practice) numbers - are highly recommended. - - - - RFC 2671 - P. Vixie, Extension Mechanisms for DNS (EDNS0) August 1999. - - - RFC 2317 - BCP 20, H. Eidnes et. al. Classless IN-ADDR.ARPA delegation, - March 1998. This is about CIDR, or classless subnet reverse - lookups. - - - RFC 2308 - M. Andrews, Negative Caching of DNS Queries, March 1998. 
About negative caching and the $TTL zone file directive. - - - RFC 2219 - BCP 17, M. Hamilton and R. Wright, Use of DNS Aliases for Network Services, October 1997. About CNAME usage. - - - RFC 2182 - BCP 16, R. Elz et al., Selection and Operation of Secondary DNS Servers, July 1997. - - - - RFC 2052 - A. Gulbrandsen, P. Vixie, A DNS RR for specifying the location of services (DNS SRV), October 1996. - - - RFC 1918 - Y. Rekhter, R. Moskowitz, D. Karrenberg, G. de Groot, E. Lear, Address Allocation for Private Internets, 02/29/1996. - - - RFC 1912 - D. Barr, Common DNS Operational and Configuration Errors, 02/28/1996. - - - RFC 1912 Errors - D. Barr, Errors in RFC 1912. Only available at - - - - RFC 1713 - A. Romao, Tools for DNS debugging, 11/03/1994. - - - RFC 1712 - C. Farrell, M. Schulze, S. Pleitner, D. Baldoni, DNS Encoding of Geographical Location, 11/01/1994. - - - RFC 1183 - R. Ullmann, P. Mockapetris, L. Mamakos, C. Everhart, New DNS RR Definitions, 10/08/1990. - - - RFC 1035 - P. Mockapetris, Domain names - implementation and specification, 11/01/1987. - - - RFC 1034 - P. Mockapetris, Domain names - concepts and facilities, 11/01/1987. - - - RFC 1033 - M. Lottor, Domain administrators operations guide, 11/01/1987. - - - RFC 1032 - M. Stahl, Domain administrators guide, 11/01/1987. - - - RFC 974 - C. Partridge, Mail routing and the domain system, 01/01/1986. - - diff --git a/LDP/guide/docbook/Linux-Networking/Database.xml b/LDP/guide/docbook/Linux-Networking/Database.xml deleted file mode 100644 index cc707d1c..00000000 --- a/LDP/guide/docbook/Linux-Networking/Database.xml +++ /dev/null @@ -1,20 +0,0 @@ - - -Database - - -Most databases are supported under Linux, including Oracle, DB2, Sybase, Informix, MySQL, PostgreSQL, InterBase and Paradox. Databases, and the Structured Query Language they work with, are complex, and this chapter has neither the space nor the depth to deal with them.
Read the next section on PHP to learn how to set up a dynamically generated Web portal in about five minutes. - -We'll be using MySQL because it's extremely fast, capable of handling large databases (200G databases aren't unheard of), and has recently been made open source. It also works well with PHP. While currently lacking transaction support (due to speed concerns), a future version of MySQL will have this option. - - -* Connecting to MS SQL 6.x+ via Openlink/PHP/ODBC mini-HOWTO - -* Sybase Adaptive Server Anywhere for Linux HOWTO - - diff --git a/LDP/guide/docbook/Linux-Networking/Email-Hosting.xml b/LDP/guide/docbook/Linux-Networking/Email-Hosting.xml deleted file mode 100644 index 936b9b11..00000000 --- a/LDP/guide/docbook/Linux-Networking/Email-Hosting.xml +++ /dev/null @@ -1,17 +0,0 @@ - - -Email - - -Alongside the Web, mail is the top reason for the popularity of the Internet. Email is an inexpensive and fast method of time-shifted messaging which, much like the Web, is actually based around sending and receiving plain text files. The protocol used is called the Simple Mail Transfer Protocol (SMTP). The server programs that implement SMTP to move mail from one server to another are called Mail Transfer Agents (MTAs). - - - -In times gone by, users would Telnet into the SMTP server itself and use a command-line program like elm or pine to check their mail. These days, users run email clients like Netscape, Evolution, KMail or Outlook on their desktop to check their email from a local SMTP server. Additional protocols like POP3 and IMAP4 are used between the SMTP server and the desktop mail client to allow clients to manipulate files on, and download from, their local mail server. The programs that implement POP3 and IMAP4 are called Mail Delivery Agents (MDAs). They are generally separate from MTAs.
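Because a mail message is itself just plain text, you can see exactly what an MTA relays by writing one by hand. The sketch below builds a minimal message (the addresses are made up); handing this file to an MTA, for instance with sendmail -t as the root.hints script earlier in this guide does, would actually submit it:

```shell
# A complete Internet mail message is nothing but text: header lines,
# exactly one blank line, then the body.  Addresses here are illustrative.
cat > msg.txt <<'EOF'
From: alice@example.org
To: bob@example.org
Subject: Hello

Just a plain-text body, exactly as an MTA relays it.
EOF

# Show the wire format.  To really send it you could pipe it to an MTA,
# e.g.:  /usr/lib/sendmail -t < msg.txt
cat msg.txt
```

POP3 and IMAP4 then move this same text in the other direction, from the server's mailbox store down to the desktop client.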
- - -* Linux Mail-Queue mini-HOWTO - -* The Linux Mail User HOWTO - - diff --git a/LDP/guide/docbook/Linux-Networking/FTP.xml b/LDP/guide/docbook/Linux-Networking/FTP.xml deleted file mode 100644 index 90fc6694..00000000 --- a/LDP/guide/docbook/Linux-Networking/FTP.xml +++ /dev/null @@ -1,34 +0,0 @@ - - -FTP - - -File Transfer Protocol (FTP) is an efficient way to transfer files between machines across networks, and clients and servers exist for almost all platforms, making FTP the most convenient (and therefore popular) method of transferring files. FTP was first developed by the University of California, Berkeley for inclusion in 4.2BSD (Berkeley Unix). The RFC (Request for Comments) document for the protocol is RFC 959 and is available at ftp://nic.merit.edu/documents/rfc/rfc0959.txt. - - - -There are two typical modes of running an FTP server - anonymous or account-based. Anonymous FTP servers are by far the most popular; they allow any machine to access the FTP server and the files stored on it with the same permissions. No usernames or passwords are transmitted down the wire. Account-based FTP allows users to log in with real usernames and passwords. While it provides greater access control than anonymous FTP, transmitting real usernames and passwords unencrypted over the Internet is generally avoided for security reasons. - - - -An FTP client is the userland application that provides access to FTP servers. There are many FTP clients available. Some are graphical, and some are text-based. - - -* FTP HOWTO - - diff --git a/LDP/guide/docbook/Linux-Networking/LDAP.xml b/LDP/guide/docbook/Linux-Networking/LDAP.xml deleted file mode 100644 index 720d594e..00000000 --- a/LDP/guide/docbook/Linux-Networking/LDAP.xml +++ /dev/null @@ -1,4397 +0,0 @@ - - -LDAP - -Information about installing, configuring, running and maintaining an LDAP (Lightweight Directory Access Protocol) Server on a Linux machine is presented in this section.
This section also presents details about how to create LDAP databases, and how to add, update and delete information in the directory. This section is mostly based on the University of Michigan LDAP information pages and on the OpenLDAP Administrator's Guide. - ------------------------------------------------------------------------------ -Chapter 1. Introduction - -The main purpose of this document is to help you set up and use an LDAP Directory Server on your Linux machine. You will learn how to install, configure, run and maintain the LDAP server. Afterwards, you will learn how to store, retrieve and update information in your Directory using the LDAP clients and utilities. The daemon for the LDAP directory server is called slapd and it runs on many different UNIX platforms. - -There is another daemon that handles replication between LDAP servers. It's called slurpd and for the moment you don't need to worry about it. In this document you will run slapd, which provides directory service for your local domain only, without replication, so without slurpd. Complete information about replication is available at: OpenLDAP Administrator's Guide - -The local domain setup is a simple choice for configuring your server, good for starting out and easy to upgrade to another configuration later if you want. The information presented in this document serves as a good introduction to using the LDAP server. Possibly after reading this document you will feel encouraged to expand the capabilities of your server and even write your own clients, using the already available C, C++ and Java Development Kits. ------------------------------------------------------------------------------ - -1.1. What's LDAP ? - -LDAP stands for Lightweight Directory Access Protocol. As the name suggests, it is a lightweight client-server protocol for accessing directory services, specifically X.500-based directory services.
LDAP runs over TCP/IP or other -connection oriented transfer services. LDAP is defined in [ftp://ftp.isi.edu/ -in-notes/rfc2251.txt] RFC2251 "The Lightweight Directory Access Protocol -(v3). - -A directory is similar to a database, but tends to contain more descriptive, -attribute-based information. The information in a directory is generally read -much more often than it is written. Directories are tuned to give -quick-response to high-volume lookup or search operations. They may have the -ability to replicate information widely in order to increase availability and -reliability, while reducing response time. When directory information is -replicated, temporary inconsistencies between the replicas may be OK, as long -as they get in sync eventually. - -There are many different ways to provide a directory service. Different -methods allow different kinds of information to be stored in the directory, -place different requirements on how that information can be referenced, -queried and updated, how it is protected from unauthorized access, etc. Some -directory services are local, providing service to a restricted context -(e.g., the finger service on a single machine). Other services are global, -providing service to a much broader context. ------------------------------------------------------------------------------ - -1.2. How does LDAP work ? - -LDAP directory service is based on a client-server model. One or more LDAP -servers contain the data making up the LDAP directory tree or LDAP backend -database. An LDAP client connects to an LDAP server and asks it a question. -The server responds with the answer, or with a pointer to where the client -can get more information (typically, another LDAP server). No matter what -LDAP server a client connects to, it sees the same view of the directory; a -name presented to one LDAP server references the same entry it would at -another LDAP server. This is an important feature of a global directory -service, like LDAP. 
------------------------------------------------------------------------------ - -1.3. LDAP backends, objects and attributes - -The LDAP server daemon is called Slapd. Slapd supports a variety of different -database backends which you can use. - -They include the primary choice BDB, a high-performance transactional -database backend; LDBM, a lightweight DBM based backend; SHELL, a backend -interface to arbitrary shell scripts and PASSWD, a simple backend interface -to the passwd(5) file. - -BDB utilizes [http://www.sleepycat.com/] Sleepycat Berkeley DB 4. LDBM -utilizes either [http://www.sleepycat.com/] Berkeley DB or [http:// -www.gnu.org/software/gdbm/] GDBM. - -BDB transactional backend is suited for multi-user read/write database -access, with any mix of read and write operations. BDB is used in -applications that require: - -  * Transactions, including making multiple changes to the database - atomically and rolling back uncommitted changes when necessary. - -  * Ability to recover from systems crashes and hardware failures without - losing any committed transactions. - - -In this document I assume that you choose the BDB database. - -To import and export directory information between LDAP-based directory -servers, or to describe a set of changes which are to be applied to a -directory, the file format known as LDIF, for LDAP Data Interchange Format, -is typically used. A LDIF file stores information in object-oriented -hierarchies of entries. The LDAP software package you're going to get comes -with an utility to convert LDIF files to the BDB format - -A common LDIF file looks like this: -dn: o=TUDelft, c=NL -o: TUDelft -objectclass: organization -dn: cn=Luiz Malere, o=TUDelft, c=NL -cn: Luiz Malere -sn: Malere -mail: malere@yahoo.com -objectclass: person - -As you can see each entry is uniquely identified by a distinguished name, or -DN. 
The DN consists of the name of the entry plus a path of names tracing the -entry back to the top of the directory hierarchy (just like a tree). - -In LDAP, an object class defines the collection of attributes that can be -used to define an entry. The LDAP standard provides these basic types of -object classes: - -  * Groups in the directory, including unordered lists of individual objects - or groups of objects. - -  * Locations, such as the country name and description. - -  * Organizations in the directory. - -  * People in the directory. - - -An entry can belong to more than one object class. For example, the entry for -a person is defined by the person object class, but may also be defined by -attributes in the inetOrgPerson, groupOfNames, and organization -objectclasses. The server's object class structure (it's schema) determines -the total list of required and allowed attributes for a particular entry. - -Directory data is represented as attribute-value pairs. Any specific piece of -information is associated with a descriptive attribute. - -For instance, the commonName, or cn, attribute is used to store a person's -name . A person named Jonas Salk can be represented in the directory as -cn: Jonas Salk - -Each person entered in the directory is defined by the collection of -attributes in the person object class. Other attributes used to define this -entry could include: -givenname: Jonas -surname: Salk -mail: jonass@airius.com - -Required attributes include the attributes that must be present in entries -using the object class. All entries require the objectClass attribute, which -lists the object classes to which an entry belongs. - -Allowed attributes include the attributes that may be present in entries -using the object class. For example, in the person object class, the cn and -sn attributes are required. The description, telephoneNumber, seeAlso, and -userpassword attributes are allowed but are not required. 
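The required/allowed split described above can be illustrated with a quick check over an LDIF entry. This is a simplified sketch, not a real schema check (slapd validates entries against its schema files itself); it reuses the Jonas Salk example, and the organization in the DN is made up:

```shell
# Write a sample LDIF entry, in the same shape as the examples above.
# The DN's organization (o=Airius) is invented for illustration.
cat > entry.ldif <<'EOF'
dn: cn=Jonas Salk, o=Airius, c=US
cn: Jonas Salk
sn: Salk
mail: jonass@airius.com
objectclass: person
EOF

# The person object class requires cn and sn; objectclass itself is
# always required.  Report any required attribute that is absent.
for attr in objectclass cn sn; do
    grep -iq "^$attr:" entry.ldif || echo "missing required attribute: $attr"
done
echo "check done"
```

Allowed-but-optional attributes such as description or telephoneNumber would simply pass through a check like this; only the required set is enforced.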
- -Each attribute has a corresponding syntax definition. The syntax definition describes the type of information provided by the attribute, for instance: - -  * bin binary. - -  * ces case exact string (case must match during comparisons). - -  * cis case ignore string (case is ignored during comparisons). - -  * tel telephone number string (like cis but blanks and dashes `- ' are ignored during comparisons). - -  * dn distinguished name. - - -Note: Usually objectclass and attribute definitions reside in schema files, in the subdirectory schema under the OpenLDAP installation home. ------------------------------------------------------------------------------ - -1.4. New versions of this document - -This document may receive corrections and updates based on the feedback received from readers. You should look at: - -[http://www.tldp.org/HOWTO/LDAP-HOWTO.html] http://www.tldp.org/HOWTO/LDAP-HOWTO.html - -for new versions of this HOWTO. ------------------------------------------------------------------------------ - -1.5. Opinions and Suggestions - -If you have any doubts about information available in this document, please contact me at the following email address: [malere@yahoo.com] malere@yahoo.com - -If you have comments and/or suggestions, please let me know too! ------------------------------------------------------------------------------ - -1.6. Acknowledgments - -This HOWTO is the result of an internship I did at TUDelft University, in the Netherlands. I would like to thank the people who encouraged me to write this document: Rene van Leuken and Wim Tiwon. Thank you very much. They are also Linux fans, just like me. - -I would also like to thank Thomas Bendler, author of the German Ldap-Howto, for his contributions to my document, and Joshua Go, a great volunteer on the LDP project. - -Karl Lattimer deserves a prize for his great contribution on SASL-related issues. - -And thanks my Lord!
------------------------------------------------------------------------------ - -1.7. Copyright and Disclaimer - -Copyright (c) 1999 Luiz Ernesto Pinheiro Malčre. Permission is granted to -copy, distribute and/or modify this document under the terms of the GNU Free -Documentation License, Version 1.1 or any later version published by the Free -Software Foundation; with no Invariant Sections, with no Front-Cover Texts -and with no Back-Cover Texts. A copy of the license is included in the -section entitled "GNU Free Documentation License". - -If you have questions, please visit the following url: [http://www.gnu.org/ -licenses/fdl.txt] http://www.gnu.org/licenses/fdl.txt and contact the Linux -HOWTO coordinator, at: [guylhem@metalab.unc.edu] guylhem@metalab.unc.edu ------------------------------------------------------------------------------ - -Chapter 2. Installing the LDAP Server - -Five steps are necessary to install the server: - -  * Install the pre-required packages (if not already installed). - -  * Download the server. - -  * Unpack the software. - -  * Configure the Makefiles. - -  * Build the server. - - ------------------------------------------------------------------------------ -2.1. Pre-Requirements - -To be fully LDAPv3 compliant, OpenLDAP clients and servers require -installation of some additional packages. For writing this document, I've -used a Mandrake 9.0 box with a 2.4.20 Kernel, manually installing the -Berkeley BDB package and SASL libraries. - -OpenSSL TLS Libraries - -The OpenSSL TLS libraries are normally part of the base system or compose an -optional software component. The official OpenSSL url is: [http:// -www.openssl.org] http://www.openssl.org - -Kerberos Authentication Services - -OpenLDAP clients and servers support Kerberos-based authentication services. -In particular, OpenLDAP supports SASL/GSSAPI authentication mechanism using -either Heimdal or MIT Kerberos V packages. 
If you desire to use
Kerberos-based SASL/GSSAPI authentication, you should install either Heimdal
or MIT Kerberos V. Heimdal Kerberos is available from [http://www.pdc.kth.se/
heimdal] http://www.pdc.kth.se/heimdal and MIT Kerberos is available from
[http://web.mit.edu/kerberos/www] http://web.mit.edu/kerberos/www

The use of strong authentication services, such as those provided by
Kerberos, is highly recommended.

Cyrus's Simple Authentication and Security Layer Libraries

Cyrus's SASL libraries are normally part of the base system or available as
an optional software component. Cyrus SASL is available from [http://
asg.web.cmu.edu/sasl/sasl-library.html] http://asg.web.cmu.edu/sasl/
sasl-library.html. Cyrus SASL will make use of the OpenSSL and Kerberos/
GSSAPI libraries if they are preinstalled. At the time of this writing, I
used Cyrus SASL 2.1.17.

Database Software

Slapd's primary database backend, BDB, requires [http://www.sleepycat.com]
Sleepycat Software's Berkeley DB, version 4. If it is not available at
configure time, you will not be able to build slapd with its primary
database backend.

Your operating system may provide Berkeley DB, version 4, in the base system
or as an optional software component. If not, there are several versions
available at [http://www.sleepycat.com/download.html] Sleepycat. At the time
of this writing, the latest release, version 4.2.52, is recommended.
OpenLDAP's slapd LDBM backend supports a variety of database managers, such
as Berkeley DB (version 3) and GDBM. GDBM is available from [http://
www.fsf.org/] FSF's download site [ftp://ftp.gnu.org/pub/gnu/gdbm/]
ftp://ftp.gnu.org/pub/gnu/gdbm/.

Threads

Thread support is almost guaranteed to be part of your base Linux system.
OpenLDAP is designed to take advantage of threads. OpenLDAP supports POSIX
pthreads, Mach CThreads, and a number of other varieties. The configure
script will complain if it cannot find a suitable thread subsystem.
If this
occurs, please consult the Software - Installation - Platform Hints section
of the OpenLDAP FAQ: [http://www.openldap.org/faq/] http://www.openldap.org/
faq/.

TCP Wrappers

Slapd supports TCP wrappers (IP-level access control filters) if they are
preinstalled. The use of TCP wrappers or other IP-level access filters (such
as those provided by an IP-level firewall) is recommended for servers
containing non-public information.
-----------------------------------------------------------------------------

2.2. Downloading the Package

There are two freely distributed LDAP servers: the University of Michigan
LDAP server and the OpenLDAP server. There's also the Netscape Directory
Server, which is free only under some conditions (educational institutions
get it free, for example). The OpenLDAP server is based on the latest
version of the University of Michigan server, and there are mailing lists
and additional documentation available for it. This document assumes that
you are using the OpenLDAP server.

Its latest tar-gzipped version is available at the following address:

[http://www.openldap.org] http://www.openldap.org

If you want to get the latest version of the University of Michigan server,
go to this address:

[ftp://terminator.rs.itd.umich.edu/ldap] ftp://terminator.rs.itd.umich.edu/
ldap

To write this document, I used version 2.2.5 of the OpenLDAP package. My
operating system is Mandrake Linux 9.0 with kernel 2.4.20.

On the OpenLDAP site you can always find the latest development and stable
versions of the OpenLDAP server. By the time this document was updated, the
latest stable version was openldap-stable-20031217.tgz (version 2.1.25) and
the latest development version was openldap-2.2.5.tgz.
-----------------------------------------------------------------------------

2.3. Unpacking the Software

Now that you have the tar-gzipped package on your local machine, you can
unpack it.
First copy the package to a desirable directory, for example /usr/local.
Next use the following command:

tar xvzf openldap-2.2.5.tgz

You can use this command as well:

gunzip -c openldap-2.2.5.tgz | tar xvf -
-----------------------------------------------------------------------------

2.4. Configuring the Software

The OpenLDAP server sources are distributed with a configuration script for
setting options like installation directories, compiler and linker flags.
Type the following command in the directory where you unpacked the software:

./configure --help

This will print all the options that you can customize with the configure
script before you build the software. Some useful options are --prefix=pref,
--exec-prefix=eprefix and --bindir=dir, for setting installation
directories. Normally, if you run configure without options, it will
auto-detect the appropriate settings and prepare to build things in the
default common location. So just type:

./configure

and watch the output to see if all went well.

Tip: Sometimes you need to pass specific options to your configure script,
for example --with-tls (to enable slapd to use a secure channel: LDAPS://).
In this case, you might have your SSL/TLS libraries residing in a
non-standard directory of your system. You can make the configure script
aware of the libraries' location by changing your environment with the env
command. Example: suppose you've installed the openssl package under /usr/
local/openssl. The following command will build slapd with SSL/TLS support:

env CPPFLAGS=-I/usr/local/openssl/include \
    LDFLAGS=-L/usr/local/openssl/lib \
    ./configure --with-tls ...

You can specify the following environment variables with the env command
before running the configure script:

  * CC: Specify an alternative C compiler.

  * CFLAGS: Specify additional compiler flags.

  * CPPFLAGS: Specify C preprocessor flags.

  * LDFLAGS: Specify linker flags.
  * LIBS: Specify additional libraries.


-----------------------------------------------------------------------------
2.5. Building the Server

After configuring the software you can start building it. First build the
dependencies, using the command:

make depend

After that, build the server using the command:

make

If all goes well, the server will build as configured. If not, return to the
previous step to review the configuration settings. You should read the
INSTALL and README files located in the directory where you unpacked the
software. Also check the configure-specific hints; they are located in
doc/install/configure under the directory where you unpacked the software.

To ensure a correct build, you should run the test suite (it only takes a
few minutes):

make test

The tests which apply to your configuration will run, and they should all
pass. Some tests, such as the replication test, may be skipped.

Now install the binaries and man pages. You may need to be the superuser to
do this (depending on where you are installing things):

su root -c 'make install'

That's all; you now have the server binary and the binaries of several
other utilities. Go to Chapter 3 to see how to configure the operation of
your LDAP server.
-----------------------------------------------------------------------------

Chapter 3. Configuring the LDAP Server

Once the software has been installed and built, you are ready to configure
it for use at your site. All slapd runtime configuration is accomplished
through the slapd.conf file, installed in the prefix directory you specified
in the configuration script, or by default in /usr/local/etc/openldap.

This section details the commonly used configuration directives in
slapd.conf. For a complete list, see the slapd.conf(5) manual page. The
configuration file directives are separated into global, backend-specific
and database-specific ones.
Here you will find descriptions of directives, together
with their default values (if any) and examples of use.
-----------------------------------------------------------------------------

3.1. Configuration File Format

The slapd.conf file consists of three types of configuration information:
global, backend specific, and database specific. Global information is
specified first, followed by information associated with a particular
backend type, which is then followed by information associated with a
particular database instance.

Global directives can be overridden by backend and/or database directives,
and backend directives can be overridden by database directives.

Blank lines and comment lines beginning with a '#' character are ignored. If
a line begins with white space, it is considered a continuation of the
previous line (even if the previous line is a comment). The general format
of slapd.conf is as follows:

# global configuration directives
<global config directives>

# backend definition
backend <typeA>
<backend-specific directives>

# first database definition & config directives
database <typeA>
<database-specific directives>

# second database definition & config directives
database <typeB>
<database-specific directives>

# second "typeA" database definition & config directives
database <typeA>
<database-specific directives>

# subsequent backend & database definitions & config directives
...

A configuration directive may take arguments. If so, they are separated by
white space. If an argument contains white space, the argument should be
enclosed in double quotes "like this". If an argument contains a double
quote or a backslash character `\', the character should be preceded by a
backslash character `\'.

The distribution contains an example configuration file that will be
installed in the /usr/local/etc/openldap directory. A number of files
containing schema definitions (attribute types and object classes) are also
provided in the /usr/local/etc/openldap/schema directory.
-----------------------------------------------------------------------------

3.2. Global Directives

Directives described in this section apply to all backends and databases
unless specifically overridden in a backend or database definition.
Arguments that should be replaced by actual text are shown in brackets <>.

access to <what> [ by <who> <access> <control> ]+

This directive grants access (specified by <access>) to a set of entries
and/or attributes (specified by <what>) by one or more requesters (specified
by <who>). See the Section 3.7 examples for more details.

Important: If no access directives are specified, the default access control
policy, access to * by * read, allows both authenticated and anonymous users
read access.

attributetype <RFC2252 Attribute Type Description>

This directive defines an attribute type. Check the following URL for more
details: [http://www.openldap.org/doc/admin22/schema.html] Schema
Specification

idletimeout <integer>

Specify the number of seconds to wait before forcibly closing an idle client
connection. An idletimeout of 0, the default, disables this feature.

include <filename>

This directive specifies that slapd should read additional configuration
information from the given file before continuing with the next line of the
current file. The included file should follow the normal slapd config file
format. This directive is commonly used to include files containing schema
specifications.

Note: You should be careful when using this directive - there is no small
limit on the number of nested include directives, and no loop detection is
done.

loglevel <integer>

This directive specifies the level at which debugging statements and
operation statistics should be syslogged (currently logged to the syslogd(8)
LOCAL4 facility). You must have configured OpenLDAP with --enable-debug (the
default) for this to work (except for the two statistics levels, which are
always enabled). Log levels are additive. To display what numbers correspond
to what kind of debugging, invoke slapd with -? or consult the table below.
The possible values for <integer> are:

Table 3-1. Debugging Levels
+-----+-----------------------------------------+
|Level|Description                              |
+-----+-----------------------------------------+
|-1   |enable all debugging                     |
+-----+-----------------------------------------+
|0    |no debugging                             |
+-----+-----------------------------------------+
|1    |trace function calls                     |
+-----+-----------------------------------------+
|2    |debug packet handling                    |
+-----+-----------------------------------------+
|4    |heavy trace debugging                    |
+-----+-----------------------------------------+
|8    |connection management                    |
+-----+-----------------------------------------+
|16   |print out packets sent and received      |
+-----+-----------------------------------------+
|32   |search filter processing                 |
+-----+-----------------------------------------+
|64   |configuration file processing            |
+-----+-----------------------------------------+
|128  |access control list processing           |
+-----+-----------------------------------------+
|256  |stats log connections/operations/results |
+-----+-----------------------------------------+
|512  |stats log entries sent                   |
+-----+-----------------------------------------+
|1024 |print communication with shell backends  |
+-----+-----------------------------------------+
|2048 |print entry parsing debugging            |
+-----+-----------------------------------------+

Example:

loglevel 255 or loglevel -1

This will cause lots and lots of debugging information to be syslogged.

Default:

loglevel 256

objectclass <RFC2252 Object Class Description>

This directive defines an object class. Check the following URL for more
details: [http://www.openldap.org/doc/admin22/schema.html] Schema
Specification

referral <url>

This directive specifies the referral to pass back when slapd cannot find a
local database to handle a request.

Example:

referral ldap://root.openldap.org

This will refer non-local queries to the global root LDAP server at the
OpenLDAP Project.
Smart LDAP clients can re-ask their query at that server,
but note that most of these clients are only going to know how to handle
simple LDAP URLs that contain a host part and optionally a distinguished
name part.

sizelimit <integer>

This directive specifies the maximum number of entries to return from a
search operation.

Default:

sizelimit 500

timelimit <integer>

This directive specifies the maximum number of seconds (in real time) slapd
will spend answering a search request. If a request is not finished in this
time, a result indicating an exceeded timelimit will be returned.

Default:

timelimit 3600
-----------------------------------------------------------------------------

3.3. General Backend Directives

Directives in this section apply only to the backend in which they are
defined. They are supported by every type of backend. Backend directives
apply to all database instances of the same type and, depending on the
directive, may be overridden by database directives.

backend <type>

This directive marks the beginning of a backend definition. <type> should be
bdb or one of the other supported backend types listed below:


Table 3-2. Database Backends
+-------+------------------------------------------------------+
|Type   |Description                                           |
+-------+------------------------------------------------------+
|bdb    |Berkeley DB transactional backend                     |
+-------+------------------------------------------------------+
|dnssrv |DNS SRV backend                                       |
+-------+------------------------------------------------------+
|ldbm   |Lightweight DBM backend                               |
+-------+------------------------------------------------------+
|ldap   |Lightweight Directory Access Protocol (Proxy) backend |
+-------+------------------------------------------------------+
|meta   |Meta Directory backend                                |
+-------+------------------------------------------------------+
|monitor|Monitor backend                                       |
+-------+------------------------------------------------------+
|passwd |Provides read-only access to passwd(5)                |
+-------+------------------------------------------------------+
|perl   |Perl programmable backend                             |
+-------+------------------------------------------------------+
|shell  |Shell (external program) backend                      |
+-------+------------------------------------------------------+
|sql    |SQL programmable backend                              |
+-------+------------------------------------------------------+

Example:

backend bdb

This marks the beginning of a new BDB backend definition.
-----------------------------------------------------------------------------

3.4. General Database Directives

Directives in this section apply only to the database in which they are
defined. They are supported by every type of database.

database <type>

This directive marks the beginning of a new database instance definition.
<type> should be one of the backend types listed in the table above.

Example:

database bdb

This marks the beginning of a new BDB backend database instance definition.

readonly { on | off }

This directive puts the database into "read-only" mode. Any attempts to
modify the database will return an "unwilling to perform" error.
Default:

readonly off

replica uri=ldap[s]://<hostname>[:<port>] | host=<hostname>[:<port>]
        [bindmethod={simple|kerberos|sasl}]
        ["binddn=<DN>"]
        [saslmech=<mech>]
        [authcid=<identity>]
        [authzid=<identity>]
        [credentials=<password>]
        [srvtab=<filename>]

This directive specifies a replication site for this database. The uri=
parameter specifies a scheme, a host and optionally a port where the slave
slapd instance can be found. Either a domain name or an IP address may be
used for <hostname>. If <port> is not given, the standard LDAP port number
(389 or 636) is used.

Note: host is deprecated in favor of the uri parameter.

uri allows the replica LDAP server to be specified as an LDAP URI such as
ldap://slave.example.com:389 or ldaps://slave.example.com:636

The binddn= parameter gives the DN to bind as for updates to the slave
slapd. It should be a DN which has read/write access to the slave slapd's
database. It must also match the updatedn directive in the slave slapd's
config file. Generally, this DN should not be the same as the rootdn of the
master database. Since DNs are likely to contain embedded spaces, the entire
"binddn=<DN>" string should be enclosed in double quotes.

The bindmethod is simple, kerberos or sasl, depending on whether simple
password-based authentication, Kerberos authentication or SASL
authentication is to be used when connecting to the slave slapd.

Simple authentication should not be used unless adequate integrity and
privacy protections are in place (e.g. TLS or IPSEC). Simple authentication
requires specification of the binddn and credentials parameters.

Kerberos authentication is deprecated in favor of SASL authentication
mechanisms, in particular the KERBEROS_V4 and GSSAPI mechanisms. Kerberos
authentication requires the binddn and srvtab parameters.

SASL authentication is generally recommended. SASL authentication requires
specification of a mechanism using the saslmech parameter.
Depending on the
mechanism, an authentication identity and/or credentials can be specified
using authcid and credentials respectively. The authzid parameter may be
used to specify an authorization identity.

Check this URL for additional details: [http://www.openldap.org/doc/admin22/
replication.html] Replication with Slurpd.

replogfile <filename>

This directive specifies the name of the replication log file to which slapd
will log changes. The replication log is typically written by slapd and read
by slurpd. Normally, this directive is only used if slurpd is being used to
replicate the database. However, you can also use it to generate a
transaction log, if slurpd is not running. In this case, you will need to
truncate the file periodically, since it will otherwise grow indefinitely.

Check this URL for additional details: [http://www.openldap.org/doc/admin22/
replication.html] Replication with Slurpd.

rootdn <dn>

This directive specifies the DN that is not subject to access control or
administrative limit restrictions for operations on this database. The DN
need not refer to an entry in the directory. The DN may refer to a SASL
identity.

Entry-based Example:

rootdn "cn=Manager, dc=example, dc=com"

SASL-based Example:

rootdn "uid=root,cn=example.com,cn=digest-md5,cn=auth"

rootpw <password>

This directive can be used to specify a password for the rootdn (when the
rootdn is set to a DN within the database).

Example:

rootpw secret

It is also permissible to provide a hash of the password in RFC 2307 form.
slappasswd may be used to generate the password hash.

Example:

rootpw {SSHA}ZKKuqbEKJfKSXhUbHG3fG8MDn9j1v4QN

The hash was generated using the command slappasswd -s secret.

suffix <dn suffix>

This directive specifies the DN suffix of queries that will be passed to
this backend database. Multiple suffix lines can be given, and at least one
is required for each database definition.
Example:

suffix "dc=example, dc=com"

Queries with a DN ending in "dc=example, dc=com" will be passed to this
backend.

Note: When the backend to pass a query to is selected, slapd looks at the
suffix line(s) in each database definition in the order they appear in the
file. Thus, if one database suffix is a prefix of another, it must appear
after it in the config file.

syncrepl

This directive is used to keep a replicated database synchronized with the
master database, so that the replicated database content will be kept up to
date with the master content.

This document doesn't cover this directive in detail, because we're
configuring a single LDAP server. For more information about this directive,
please visit: [http://www.openldap.org/doc/admin22/syncrepl.html] LDAP Sync
Replication.

updatedn <dn>

This directive is only applicable in a slave slapd. It specifies the DN
allowed to make changes to the replica. This may be the DN slurpd binds as
when making changes to the replica or the DN associated with a SASL
identity.

Entry-based Example:

updatedn "cn=Update Daemon, dc=example, dc=com"

SASL-based Example:

updatedn "uid=slurpd,cn=example.com,cn=digest-md5,cn=auth"

Check this URL for additional details: [http://www.openldap.org/doc/admin22/
replication.html] Replication with Slurpd.

updateref <url>

This directive is only applicable in a slave slapd. It specifies the URL to
return to clients which submit update requests upon the replica. If
specified multiple times, each URL is provided.

Example:

updateref ldap://master.example.net
-----------------------------------------------------------------------------

3.5. BDB Database Directives

Directives in this category apply only to a BDB database. That is, they must
follow a "database bdb" line and come before any subsequent "backend" or
"database" line. For a complete reference of BDB configuration directives,
see the slapd-bdb manpages (man slapd-bdb).
directory <directory>

This directive specifies the directory where the BDB files containing the
database and associated indexes reside.

Default:

directory /usr/local/var/openldap-data

sessionlog <sid> <limit>

This directive specifies a session log store in the syncrepl replication
provider server which contains information on the entries that have been
scoped out of the replication content identified by <sid>. The first
syncrepl search request having the same <sid> value in the cookie
establishes the session log store in the provider server. The number of
entries in the session log store is limited by <limit>. Excessive entries
are removed from the store in FIFO order. Both <sid> and <limit> are
non-negative integers. <sid> has no more than three decimal digits.

The LDAP Content Synchronization operation that falls into a pre-existing
session can use the session log store in order to reduce the amount of
synchronization traffic. If the replica is not so outdated that it can be
made up-to-date by the information in the session store, the provider slapd
will send the consumer slapd the identities of the scoped-out entries
together with the in-scope entries added to or modified within the
replication content. If the replica status is too outdated and beyond the
coverage of the history store, then the provider slapd will send the
identities of the unchanged in-scope entries along with the changed in-scope
entries. The consumer slapd will then remove those entries in the replica
which are not identified as present in the provider content.

For more information about syncrepl, please visit: [http://www.openldap.org
/doc/admin22/syncrepl.html] LDAP Sync Replication.
-----------------------------------------------------------------------------

3.6. LDBM Database Directives

Directives in this category apply only to the LDBM backend database. That
is, they must follow a "database ldbm" line and come before any other
"database" or "backend" line.
For a complete reference of LDBM configuration
directives, see the slapd-ldbm manpages (man slapd-ldbm).

cachesize <integer>

This directive specifies the size in entries of the in-memory cache
maintained by the LDBM backend database instance.

Default:

cachesize 1000

dbcachesize <integer>

This directive specifies the size in bytes of the in-memory cache associated
with each open index file. If not supported by the underlying database
method, this directive is ignored without comment. Increasing this number
uses more memory but can cause a dramatic performance increase, especially
during modifies or when building indexes.

Default:

dbcachesize 100000

dbnolocking

This option, if present, disables database locking. Enabling this option may
improve performance at the expense of data security.

dbnosync

This option causes on-disk database contents not to be immediately
synchronized with in-memory changes upon change. Enabling this option may
improve performance at the expense of data security.

directory <directory>

This directive specifies the directory where the LDBM files containing the
database and associated indexes live.

Default:

directory /usr/local/var/openldap-data

index {<attrlist> | default} [pres,eq,approx,sub,none]

This directive specifies the indexes to maintain for the given attribute. If
only an <attrlist> is given, the default indexes are maintained.

Example:

index default pres,eq
index uid
index cn,sn pres,eq,sub
index objectClass eq

The first line sets the default set of indexes to maintain to present and
equality. The second line causes the default (pres,eq) set of indexes to be
maintained for the uid attribute type. The third line causes present,
equality and substring indexes to be maintained for the cn and sn attribute
types. The fourth line causes an equality index for the objectClass
attribute type.

By default, no indexes are maintained. It is generally advised that at a
minimum an equality index upon objectClass be maintained.
index objectClass eq

mode <integer>

This directive specifies the file protection mode that newly created
database index files should have.

Default:

mode 0600
-----------------------------------------------------------------------------

3.7. Access Control Examples

The access control facility provided by the access directive is quite
powerful. This section shows some examples of its use. First, some simple
examples:

access to * by * read

This access directive grants read access to everyone.

The following example shows the use of a regular expression to select the
entries by DN in two access directives where ordering is significant.

access to dn=".*, o=U of M, c=US"
    by * search
access to dn=".*, c=US"
    by * read

Read access is granted to entries under the c=US subtree, except for those
entries under the "o=U of M, c=US" subtree, to which search access is
granted. No access is granted to c=US, as neither access directive matches
this DN. If the order of these access directives was reversed, the
U-M-specific directive would never be matched, since all U-M entries are
also c=US entries.

Another way to implement the same access controls is:

access to dn.children="dc=example,dc=com"
    by * search
access to dn.children="dc=com"
    by * read

Read access is granted to entries under the dc=com subtree, except for those
entries under the dc=example,dc=com subtree, to which search access is
granted. No access is granted to dc=com, as neither access directive matches
this DN. If the order of these access directives was reversed, the trailing
directive would never be reached, since all entries under dc=example,dc=com
are also under dc=com.

Note: Also note that if no access to directive or no "by <who>" clause
matches, access is denied. That is, every access to directive ends with an
implicit by * none clause and every access list ends with an implicit access
to * by * none directive.
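To make the implicit rules concrete, here is a small slapd.conf sketch (the
dc=example,dc=com suffix is only a placeholder); it behaves identically
whether or not the final clauses are written out, since they are implied
anyway:

```
# Sketch only: dc=example,dc=com is a placeholder suffix.
access to dn.subtree="dc=example,dc=com"
    by self write
    by users read
    by * none            # implicit trailing clause, shown explicitly
access to * by * none    # implicit trailing directive, shown explicitly
```

Writing the implicit clauses out can make a configuration easier to audit,
at the cost of some verbosity.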
The next example again shows the importance of ordering, both of the access
directives and the "by <who>" clauses. It also shows the use of an attribute
selector to grant access to a specific attribute and various <who>
selectors.

access to dn.subtree="dc=example,dc=com" attr=homePhone
    by self write
    by dn.children="dc=example,dc=com" search
    by peername=IP:10\..+ read
access to dn.subtree="dc=example,dc=com"
    by self write
    by dn.children="dc=example,dc=com" search
    by anonymous auth

This example applies to entries in the "dc=example,dc=com" subtree. To all
attributes except homePhone, an entry can write to itself, entries under
example.com can search by them, and anybody else has no access (implicit by
* none) except for authentication/authorization (which is always done
anonymously). The homePhone attribute is writable by the entry, searchable
by entries under example.com, readable by clients connecting from network
10, and otherwise not readable (implicit by * none). All other access is
denied by the implicit access to * by * none.

Sometimes it is useful to permit a particular DN to add or remove itself
from an attribute. For example, if you would like to create a group and
allow people to add and remove only their own DN from the member attribute,
you could accomplish it with an access directive like this:

access to attr=member,entry
    by dnattr=member selfwrite

The dnattr=member selector says that the access applies to entries listed in
the member attribute. The selfwrite access selector says that such members
can only add or delete their own DN from the attribute, not other values.
The addition of the entry attribute is required because access to the entry
is required to access any of the entry's attributes.

There's plenty of information about access control in the OpenLDAP
Administrator's Guide.
Take a look at: [http://www.openldap.org/doc/admin22/ -slapdconfig.html#Access Control] Access Control for more information about -this subject. ------------------------------------------------------------------------------ - -3.8. Configuration File Example - -The following is an example configuration file, interspersed with explanatory -text. It defines two databases to handle different parts of the X.500 tree; -both are BDB database instances. The line numbers shown are provided for -reference only and are not included in the actual file. First, the global -configuration section: -1. # example config file - global configuration section -2. include /usr/local/etc/schema/core.schema -3. referral ldap://root.openldap.org -4. access to * by * read - -Line 1 is a comment. Line 2 includes another config file which contains core -schema definitions. The referral directive on line 3 means that queries not -local to one of the databases defined below will be referred to the LDAP -server running on the standard port (389) at the host root.openldap.org. - -Line 4 is a global access control. It applies to all entries (after any -applicable database-specific access controls). - -The next section of the configuration file defines a BDB backend that will -handle queries for things in the "dc=example,dc=com" portion of the tree. The -database is to be replicated to two slave slapds, one on truelies, the other -on judgmentday. Indexes are to be maintained for several attributes, and the -userPassword attribute is to be protected from unauthorized access. -5. # BDB definition for the example.com -6. database bdb -7. suffix "dc=example,dc=com" -8. directory /usr/local/var/openldap-data -9. rootdn "cn=Manager,dc=example,dc=com" -10. rootpw secret -11. # replication directives -12. replogfile /usr/local/var/openldap/slapd.replog -13. replica uri=ldap://slave1.example.com:389 -14. binddn="cn=Replicator,dc=example,dc=com" -15. bindmethod=simple credentials=secret -16. 
replica uri=ldaps://slave2.example.com:636
17. binddn="cn=Replicator,dc=example,dc=com"
18. bindmethod=simple credentials=secret
19. # indexed attribute definitions
20. index uid pres,eq
21. index cn,sn,uid pres,eq,sub
22. index objectClass eq
23. # database access control definitions
24. access to attr=userPassword
25.     by self write
26.     by anonymous auth
27.     by dn.base="cn=Admin,dc=example,dc=com" write
28.     by * none
29. access to *
30.     by self write
31.     by dn.base="cn=Admin,dc=example,dc=com" write
32.     by * read

Line 5 is a comment. The start of the database definition is marked by the
database keyword on line 6. Line 7 specifies the DN suffix for queries to
pass to this database. Line 8 specifies the directory in which the database
files will live.

Lines 9 and 10 identify the database "super user" entry and associated
password. This entry is not subject to access control or size or time limit
restrictions. Please remember to encrypt the rootpw using slappasswd.

Example: rootpw {SSHA}Jq4xhhkGa7weT/0xKmaecT4HEXsdqiYA

Lines 11 through 18 are for replication. See the [http://www.openldap.org/doc
/admin22/replication.html] Replication link for more information on these
directives.

Lines 20 through 22 indicate the indexes to maintain for various attributes.

Lines 24 through 32 specify access control for entries in this database.
As this is the first database, the controls also apply to entries not held in
any database (such as the Root DSE). For all applicable entries, the
userPassword attribute is writable by the entry itself and by the "admin"
entry. It may be used for authentication/authorization purposes, but is
otherwise not readable. All other attributes are writable by the entry and
the "admin" entry, but may be read by all users (authenticated or not).

The next section of the example configuration file defines another BDB
database. 
This one handles queries involving the dc=example,dc=net subtree
but is managed by the same entity as the first database. Note that without
line 39, the read access would be allowed due to the global access rule at
line 4.
33. # BDB definition for example.net
34. database bdb
35. suffix "dc=example,dc=net"
36. directory /usr/local/var/openldap-data-net
37. rootdn "cn=Manager,dc=example,dc=com"
38. index objectClass eq
39. access to * by users read
-----------------------------------------------------------------------------

Chapter 4. Running the LDAP Server

The LDAP daemon slapd is designed to be run as a stand-alone server. This
allows the server to take advantage of caching, manage concurrency issues
with underlying databases, and conserve system resources. Running from inetd
(8) is not an option.
-----------------------------------------------------------------------------

4.1. Command Line Options

Slapd supports a number of command-line options as detailed in the manual
page. This section details a few commonly used options:
-f <filename>

This option specifies an alternate configuration file for slapd. The default
is normally /usr/local/etc/openldap/slapd.conf.
-h <URLs>

This option specifies alternative listener configurations. The default is
ldap:/// which implies LDAP over TCP on all interfaces on the default LDAP
port 389. You can specify specific host-port pairs or other protocol schemes
(such as ldaps:// or ldapi://). For example, -h "ldaps:// ldap://127.0.0.1:667"
will create two listeners: one for LDAP over SSL on all interfaces on
the default LDAP/SSL port 636, and one for LDAP over TCP on the localhost
(loopback) interface on port 667. Hosts may be specified using IPv4
dotted-decimal form or using host names. Port values must be numeric.
-n <service-name>

This option specifies the service name used for logging and other purposes.
The default service name is slapd. 
-l <syslog-local-user>

This option specifies the local user for the syslog(8) facility. Values can
be LOCAL0, LOCAL1, LOCAL2, ..., and LOCAL7. The default is LOCAL4. This
option may not be supported on all systems. See Section 6.5 for more
details.
-u user -g group

These options specify the user and group, respectively, to run slapd as. user
can be either a user name or uid. group can be either a group name or gid.
-r directory

This option specifies a run-time directory. slapd will chroot(2) to this
directory after opening listeners but before reading any configuration files
or initializing any backends.
-d <level> | ?

This option sets the slapd debug level to <level>. When level is a `?'
character, the various debugging levels are printed and slapd exits,
regardless of any other options you give it. Current debugging levels are:

Table 4-1. Debugging Levels
+-----+-----------------------------------------+
|Level|Description                              |
+-----+-----------------------------------------+
|-1   |enable all debugging                     |
+-----+-----------------------------------------+
|0    |no debugging                             |
+-----+-----------------------------------------+
|1    |trace function calls                     |
+-----+-----------------------------------------+
|2    |debug packet handling                    |
+-----+-----------------------------------------+
|4    |heavy trace debugging                    |
+-----+-----------------------------------------+
|8    |connection management                    |
+-----+-----------------------------------------+
|16   |print out packets sent and received      |
+-----+-----------------------------------------+
|32   |search filter processing                 |
+-----+-----------------------------------------+
|64   |configuration file processing            |
+-----+-----------------------------------------+
|128  |access control list processing           |
+-----+-----------------------------------------+
|256  |stats log connections/operations/results |
+-----+-----------------------------------------+
|512  |stats log entries sent                   |
+-----+-----------------------------------------+
|1024 |print communication with shell backends  |
+-----+-----------------------------------------+
|2048 |print entry parsing debugging            |
+-----+-----------------------------------------+

You may enable multiple levels by specifying the debug option once for each
desired level. Or, since debugging levels are additive, you can do the math
yourself. That is, if you want to trace function calls and watch the config
file being processed, you could set level to the sum of those two levels (in
this case, -d 65). Or, you can let slapd do the math (e.g. -d 1 -d 64).
Consult the slapd(8) manual page for more details.

Note: slapd must have been compiled with -DLDAP_DEBUG defined for any
debugging information beyond the two stats levels to be available.
-----------------------------------------------------------------------------

4.2. Starting the LDAP Server

In general, slapd is run like this:
/usr/local/etc/libexec/slapd [ diff --git a/LDP/guide/docbook/Linux-Networking/NFS.xml b/LDP/guide/docbook/Linux-Networking/NFS.xml deleted file mode 100644 index 2dcf6561..00000000 --- a/LDP/guide/docbook/Linux-Networking/NFS.xml +++ /dev/null @@ -1,2558 +0,0 @@

NFS

NFS (Network File System)

The TCP/IP suite's equivalent of file sharing. This protocol operates at the Process/Application
layer of the DOD model, similar to the application layer of the OSI model.

SLIP (Serial Line Internet Protocol) and PPP (Point-to-Point Protocol)

Two protocols commonly used for dial-up access to the Internet. They are typically used with
TCP/IP; while SLIP works only with TCP/IP, PPP can be used with other protocols.

SLIP was the first protocol for dial-up Internet access. It operates at the physical layer of the
OSI model, and provides a simple interface to a UNIX or other dial-up host for Internet access.
SLIP does not provide security, so authentication is handled through prompts before initiating
the SLIP connection.
PPP is a more recent development. It operates at the physical and data link layers of the OSI
model. In addition to the features of SLIP, PPP supports data compression, security (authentication),
and error control. PPP can also dynamically assign network addresses.

Since PPP provides easier authentication and better security, it should be used for dial-up connections
whenever possible. However, you may need to use SLIP to communicate with dial-up servers (particularly
older UNIX machines and dedicated hardware servers) that don't support PPP.

> Start Config-HOWTO

2.15. Automount Points

If you don't like the mounting/unmounting thing, consider using autofs(5). You tell the autofs daemon what to automount and where, starting with the file /etc/auto.master. Its structure is simple:


/misc   /etc/auto.misc
/mnt    /etc/auto.mnt

In this example you tell autofs to automount media in /misc and /mnt, while the mount points are specified in /etc/auto.misc and /etc/auto.mnt. An example of /etc/auto.misc:


# an NFS export
server  -ro                     my.buddy.net:/pub/export
# removable media
cdrom   -fstype=iso9660,ro      :/dev/hdb
floppy  -fstype=auto            :/dev/fd0

Start the automounter. From now on, whenever you try to access the nonexistent mount point /misc/cdrom, it will be created and the CD-ROM will be mounted.

> End Config-HOWTO

5.4. Unix Environment

The preferred way to share files in a Unix networking environment is
through NFS. NFS stands for Network File System and it is a protocol
originally developed by Sun Microsystems. It is a way to share files
between machines as if they were local. A client "mounts" a filesystem
"exported" by an NFS server. The mounted filesystem will appear to the
client machine as if it were part of the local filesystem.

It is possible to mount the root filesystem at startup time, thus
allowing diskless clients to boot up and access all files from a
server.
In other words, it is possible to have a fully functional
computer without a hard disk.

Coda is a network filesystem (like NFS) that supports disconnected
operation and persistent caching, among other goodies. It is included in
2.2.x kernels. Really handy for slow or unreliable networks and
laptops.

NFS-related documents:

  * http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root.html

  * http://metalab.unc.edu/mdw/HOWTO/Diskless-HOWTO.html

  * http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root-Client-mini-HOWTO/index.html

  * http://www.redhat.com/support/docs/rhl/NFS-Tips/NFS-Tips.html

  * http://metalab.unc.edu/mdw/HOWTO/NFS-HOWTO.html

CODA can be found at: http://www.coda.cs.cmu.edu/

Samba is the Linux implementation of SMB. NFS is the Unix equivalent - a way to import and
export local files to and from remote machines. Like SMB, NFS sends information, including user
passwords, unencrypted, so it is best to limit its use to within your local network.

As you know, all storage in Linux is visible within a single tree structure, and new hard disks,
CD-ROMs, Zip drives and other spaces are mounted on a particular directory. NFS shares are also
attached to the system in this manner. NFS is included in most Linux kernels, and the tools
necessary to be an NFS server and client come with most distributions.

However, users of Linux kernel 2.2 hoping to use NFS may wish to upgrade to
kernel 2.4; while the earlier version of Linux NFS did work well, it was far slower than
most other Unix implementations of this protocol.

> Linux NFS-HOWTO
> NFS-Root mini-HOWTO
> NFS-Root-Client Mini-HOWTO
> The Linux NIS(YP)/NYS/NIS+ HOWTO


Linux NFS-HOWTO

Tavis Barr

         tavis dot barr at liu dot edu

Nicolai Langfeldt

         janl at linpro dot no

Seth Vidal

        skvidal at phy dot duke dot edu

Tom McNeal

        trmcneal at attbi dot com

2002-08-25
Revision History
Revision v3.1 2002-08-25 Revised by: tavis
Typo in firewalling section in 3.0
Revision v3.0 2002-07-16 Revised by: tavis
Updates plus additions to performance, security
-----------------------------------------------------------------------------

Table of Contents
1. Preamble
    1.1. Legal stuff
    1.2. Disclaimer
    1.3. Feedback
    1.4. Translation
    1.5. Dedication


2. Introduction
    2.1. What is NFS?
    2.2. What is this HOWTO and what is it not?
    2.3. Knowledge Pre-Requisites
    2.4. Software Pre-Requisites: Kernel Version and nfs-utils
    2.5. Where to get help and further information


3. Setting Up an NFS Server
    3.1. Introduction to the server setup
    3.2. Setting up the Configuration Files
    3.3. Getting the services started
    3.4. Verifying that NFS is running
    3.5. Making changes to /etc/exports later on


4. Setting up an NFS Client
    4.1. Mounting remote directories
    4.2. Getting NFS File Systems to Be Mounted at Boot Time
    4.3. Mount options


5. Optimizing NFS Performance
    5.1. Setting Block Size to Optimize Transfer Speeds
    5.2. Packet Size and Network Drivers
    5.3. Overflow of Fragmented Packets
    5.4. NFS over TCP
    5.5. Timeout and Retransmission Values
    5.6. Number of Instances of the NFSD Server Daemon
    5.7. Memory Limits on the Input Queue
    5.8. Turning Off Autonegotiation of NICs and Hubs
    5.9. Synchronous vs. Asynchronous Behavior in NFS
    5.10. 
Non-NFS-Related Means of Enhancing Server Performance - - -6. Security and NFS - 6.1. The portmapper - 6.2. Server security: nfsd and mountd - 6.3. Client Security - 6.4. NFS and firewalls (ipchains and netfilter) - 6.5. Tunneling NFS through SSH - 6.6. Summary - - -7. Troubleshooting - 7.1. Unable to See Files on a Mounted File System - 7.2. File requests hang or timeout waiting for access to the file. - 7.3. Unable to mount a file system - 7.4. I do not have permission to access files on the mounted volume. - 7.5. When I transfer really big files, NFS takes over all the CPU cycles - on the server and it screeches to a halt. - 7.6. Strange error or log messages - 7.7. Real permissions don't match what's in /etc/exports. - 7.8. Flaky and unreliable behavior - 7.9. nfsd won't start - 7.10. File Corruption When Using Multiple Clients - - -8. Using Linux NFS with Other OSes - 8.1. AIX - 8.2. BSD - 8.3. Tru64 Unix - 8.4. HP-UX - 8.5. IRIX - 8.6. Solaris - 8.7. SunOS - - - -1. Preamble - -1.1. Legal stuff - -Copyright (c) <2002> by Tavis Barr, Nicolai Langfeldt, Seth Vidal, and Tom -McNeal. This material may be distributed only subject to the terms and -conditions set forth in the Open Publication License, v1.0 or later (the -latest version is presently available at [http://www.opencontent.org/openpub -/] http://www.opencontent.org/openpub/). ------------------------------------------------------------------------------ - -1.2. Disclaimer - -This document is provided without any guarantees, including merchantability -or fitness for a particular use. The maintainers cannot be responsible if -following instructions in this document leads to damaged equipment or data, -angry neighbors, strange habits, divorce, or any other calamity. ------------------------------------------------------------------------------ - -1.3. Feedback - -This will never be a finished document; we welcome feedback about how it can -be improved. 
As of February 2002, the Linux NFS home page is being hosted at -[http://nfs.sourceforge.net] http://nfs.sourceforge.net. Check there for -mailing lists, bug fixes, and updates, and also to verify who currently -maintains this document. ------------------------------------------------------------------------------ - -1.4. Translation - -If you are able to translate this document into another language, we would be -grateful and we will also do our best to assist you. Please notify the -maintainers. ------------------------------------------------------------------------------ - -1.5. Dedication - -NFS on Linux was made possible by a collaborative effort of many people, but -a few stand out for special recognition. The original version was developed -by Olaf Kirch and Alan Cox. The version 3 server code was solidified by Neil -Brown, based on work from Saadia Khan, James Yarbrough, Allen Morris, H.J. -Lu, and others (including himself). The client code was written by Olaf Kirch -and updated by Trond Myklebust. The version 4 lock manager was developed by -Saadia Khan. Dave Higgen and H.J. Lu both have undertaken the thankless job -of extensive maintenance and bug fixes to get the code to actually work the -way it was supposed to. H.J. has also done extensive development of the -nfs-utils package. Of course this dedication is leaving many people out. - -The original version of this document was developed by Nicolai Langfeldt. It -was heavily rewritten in 2000 by Tavis Barr and Seth Vidal to reflect -substantial changes in the workings of NFS for Linux developed between the -2.0 and 2.4 kernels. It was edited again in February 2002, when Tom McNeal -made substantial additions to the performance section. Thomas Emmel, Neil -Brown, Trond Myklebust, Erez Zadok, and Ion Badulescu also provided valuable -comments and contributions. ------------------------------------------------------------------------------ - -2. Introduction - -2.1. What is NFS? 
- -The Network File System (NFS) was developed to allow machines to mount a disk -partition on a remote machine as if it were on a local hard drive. This -allows for fast, seamless sharing of files across a network. - -It also gives the potential for unwanted people to access your hard drive -over the network (and thereby possibly read your email and delete all your -files as well as break into your system) if you set it up incorrectly. So -please read the Security section of this document carefully if you intend to -implement an NFS setup. - -There are other systems that provide similar functionality to NFS. Samba -([http://www.samba.org] http://www.samba.org) provides file services to -Windows clients. The Andrew File System from IBM ([http://www.transarc.com/ -Product/EFS/AFS/index.html] http://www.transarc.com/Product/EFS/AFS/ -index.html), recently open-sourced, provides a file sharing mechanism with -some additional security and performance features. The Coda File System -([http://www.coda.cs.cmu.edu/] http://www.coda.cs.cmu.edu/) is still in -development as of this writing but is designed to work well with disconnected -clients. Many of the features of the Andrew and Coda file systems are slated -for inclusion in the next version of NFS (Version 4) ([http://www.nfsv4.org] -http://www.nfsv4.org). The advantage of NFS today is that it is mature, -standard, well understood, and supported robustly across a variety of -platforms. ------------------------------------------------------------------------------ - -2.2. What is this HOWTO and what is it not? - -This HOWTO is intended as a complete, step-by-step guide to setting up NFS -correctly and effectively. Setting up NFS involves two steps, namely -configuring the server and then configuring the client. Each of these steps -is dealt with in order. The document then offers some tips for people with -particular needs and hardware setups, as well as security and troubleshooting -advice. 
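As a miniature preview of the client half of those two steps, a single export from a hypothetical server "master.foo.com" would be attached either by hand with mount(8) or from /etc/fstab for mounting at boot (Section 4 covers both in detail; the server name and paths here are illustrative only):

```
# by hand:  mount -t nfs master.foo.com:/home /mnt/home
# or the equivalent /etc/fstab entry:
# device               mountpoint  fs-type  options       dump  fsckorder
master.foo.com:/home   /mnt/home   nfs      rw,hard,intr  0     0
```

The hard and intr options are discussed under mount options later in the HOWTO.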
This HOWTO is not a description of the guts and underlying structure of NFS.
For that you may wish to read Linux NFS and Automounter Administration by
Erez Zadok (Sybex, 2001). The classic NFS book, updated and still quite
useful, is Managing NFS and NIS by Hal Stern, published by O'Reilly &
Associates, Inc. A much more advanced technical description of NFS is
available in NFS Illustrated by Brent Callaghan.

This document is also not intended as a complete reference manual, and does
not contain an exhaustive list of the features of Linux NFS. For that, you
can look at the man pages for nfs(5), exports(5), mount(8), fstab(5), nfsd(8)
, lockd(8), statd(8), rquotad(8), and mountd(8).

It will also not cover PC-NFS, which is considered obsolete (users are
encouraged to use Samba to share files with Windows machines), or NFS Version
4, which is still in development.
-----------------------------------------------------------------------------

2.3. Knowledge Pre-Requisites

You should know some basic things about TCP/IP networking before reading this
HOWTO; if you are in doubt, read the Networking-Overview-HOWTO.
-----------------------------------------------------------------------------

2.4. Software Pre-Requisites: Kernel Version and nfs-utils

The difference between Version 2 NFS and Version 3 NFS will be explained
later on; for now, you might simply take the suggestion that you will need
NFS Version 3 if you are installing a dedicated or high-volume file server.
NFS Version 2 should be fine for casual use.

NFS Version 2 has been around for quite some time now (at least since the 1.2
kernel series); however, you will need a kernel version of at least 2.2.18 if
you wish to do any of the following:

  * Mix Linux NFS with other operating systems' NFS

  * Use file locking reliably over NFS

  * Use NFS Version 3.


There are also patches available for kernel versions above 2.2.14 that
provide the above functionality. 
Some of them can be downloaded from the
Linux NFS homepage. If your kernel version is 2.2.14-2.2.17 and you have the
source code on hand, you can tell if these patches have been added because
NFS Version 3 server support will be a configuration option. However, unless
you have some particular reason to use an older kernel, you should upgrade
because many bugs have been fixed along the way. Kernel 2.2.19 contains some
additional locking improvements over 2.2.18.

Version 3 functionality will also require the nfs-utils package of at least
version 0.1.6, and mount version 2.10m or newer. However, because nfs-utils
and mount are fully backwards compatible, and because newer versions have
lots of security and bug fixes, there is no good reason not to install the
newest nfs-utils and mount packages if you are beginning an NFS setup.

All 2.4 and higher kernels have full NFS Version 3 functionality.

In all cases, if you are building your own kernel, you will need to select
NFS and NFS Version 3 support at compile time. Most (but not all) standard
distributions come with kernels that support NFS Version 3.

Handling files larger than 2 GB will require a 2.4.x kernel and a 2.2.x
version of glibc.

All kernels after 2.2.18 support NFS over TCP on the client side. As of this
writing, server-side NFS over TCP only exists in a buggy form as an
experimental option in the post-2.2.18 series; patches for 2.4 and 2.5
kernels have been introduced starting with 2.4.17 and 2.5.6. The patches are
believed to be stable, though as of this writing they are relatively new and
have not seen widespread use or integration into the mainstream 2.4 kernel.

Because so many of the above functionalities were introduced in kernel
version 2.2.18, this document was written to be consistent with kernels above
this version (including 2.4.x). If you have an older kernel, this document
may not describe your NFS system correctly. 
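Since so many of these features hinge on the 2.2.18 cutoff, a quick check of the running kernel saves guesswork. The following is just a sketch using standard tools, not a procedure from the HOWTO itself:

```shell
#!/bin/sh
# Compare two dot-separated version strings numerically:
# succeeds (exit 0) if $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | \
        sort -t . -k1,1n -k2,2n -k3,3n | head -n 1)" = "$2" ]
}

# Strip any local suffix such as "-smp" before comparing.
kernel=$(uname -r | cut -d- -f1)
if version_ge "$kernel" 2.2.18; then
    echo "kernel $kernel: mainline NFSv3 and reliable locking expected"
else
    echo "kernel $kernel: pre-2.2.18, consider upgrading or patching"
fi
```

The same version_ge helper can check mount (2.10m aside, its version is numeric-prefixed) or nfs-utils against the 0.1.6 floor mentioned above.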
As we write this document, NFS version 4 has only recently been finalized as
a protocol, and no implementations are considered production-ready. It will
not be dealt with here.
-----------------------------------------------------------------------------

2.5. Where to get help and further information

As of November 2000, the Linux NFS homepage is at [http://
nfs.sourceforge.net] http://nfs.sourceforge.net. Please check there for NFS
related mailing lists as well as the latest version of nfs-utils, NFS kernel
patches, and other NFS related packages.

When you encounter a problem or have a question not covered in this manual,
the FAQ, or the man pages, you should send a message to the nfs mailing list
(). To best help the developers and other users
help you assess your problem, you should include:

  * the version of nfs-utils you are using

  * the version of the kernel and any non-stock kernel patches applied

  * the distribution of Linux you are using

  * the version(s) of other operating systems involved.


It is also useful to know the networking configuration connecting the hosts.

If your problem involves the inability to mount or export shares, please also
include:

  * a copy of your /etc/exports file

  * the output of rpcinfo -p localhost run on the server

  * the output of rpcinfo -p servername run on the client


Sending all of this information with a specific question, after reading all
the documentation, is the best way to ensure a helpful response from the
list.

You may also wish to look at the man pages for nfs(5), exports(5), mount(8),
fstab(5), nfsd(8), lockd(8), statd(8), rquotad(8), and mountd(8).
-----------------------------------------------------------------------------

3. Setting Up an NFS Server

3.1. Introduction to the server setup

It is assumed that you will be setting up both a server and a client. 
If you
are just setting up a client to work off of somebody else's server (say in
your department), you can skip to Section 4. However, every client that is
set up requires modifications on the server to authorize that client (unless
the server setup is done in a very insecure way), so even if you are not
setting up a server you may wish to read this section to get an idea what
kinds of authorization problems to look out for.

Setting up the server will be done in two steps: setting up the configuration
files for NFS, and then starting the NFS services.
-----------------------------------------------------------------------------

3.2. Setting up the Configuration Files

There are three main configuration files you will need to edit to set up an
NFS server: /etc/exports, /etc/hosts.allow, and /etc/hosts.deny. Strictly
speaking, you only need to edit /etc/exports to get NFS to work, but you
would be left with an extremely insecure setup. You may also need to edit
your startup scripts; see Section 3.3.3 for more on that.
-----------------------------------------------------------------------------

3.2.1. /etc/exports

This file contains a list of entries; each entry indicates a volume that is
shared and how it is shared. Check the man pages (man exports) for a complete
description of all the setup options for the file, although the description
here will probably satisfy most people's needs.

An entry in /etc/exports will typically look like this:
 directory machine1(option11,option12) machine2(option21,option22)

where

directory
    the directory that you want to share. It may be an entire volume though
    it need not be. If you share a directory, then all directories under it
    within the same file system will be shared as well.

machine1 and machine2
    client machines that will have access to the directory. The machines may
    be listed by their DNS address or their IP address (e.g.,
    machine.company.com or 192.168.0.8). 
Using IP addresses is more reliable
    and more secure. If you need to use DNS addresses, and they do not seem
    to be resolving to the right machine, see Section 7.3.

optionxx
    the option listing for each machine will describe what kind of access
    that machine will have. Important options are:

    + ro: The directory is shared read only; the client machine will not be
      able to write to it. This is the default.

    + rw: The client machine will have read and write access to the
      directory.

    + no_root_squash: By default, any file request made by user root on the
      client machine is treated as if it is made by user nobody on the
      server. (Exactly which UID the request is mapped to depends on the
      UID of user "nobody" on the server, not the client.) If
      no_root_squash is selected, then root on the client machine will have
      the same level of access to the files on the system as root on the
      server. This can have serious security implications, although it may
      be necessary if you want to perform any administrative work on the
      client machine that involves the exported directories. You should not
      specify this option without a good reason.

    + no_subtree_check: If only part of a volume is exported, a routine
      called subtree checking verifies that a file that is requested from
      the client is in the appropriate part of the volume. If the entire
      volume is exported, disabling this check will speed up transfers.

    + sync: By default, all but the most recent version (version 1.11) of
      the exportfs command will use async behavior, telling a client
      machine that a file write is complete - that is, has been written to
      stable storage - when NFS has finished handing the write over to the
      filesystem. This behavior may cause data corruption if the server
      reboots, and the sync option prevents this. See Section 5.9 for a
      complete discussion of sync and async behavior. 
- - - -Suppose we have two client machines, slave1 and slave2, that have IP -addresses 192.168.0.1 and 192.168.0.2, respectively. We wish to share our -software binaries and home directories with these machines. A typical setup -for /etc/exports might look like this: -+---------------------------------------------------------------------------+ -| /usr/local 192.168.0.1(ro) 192.168.0.2(ro) | -| /home 192.168.0.1(rw) 192.168.0.2(rw) | -| | -+---------------------------------------------------------------------------+ - -Here we are sharing /usr/local read-only to slave1 and slave2, because it -probably contains our software and there may not be benefits to allowing -slave1 and slave2 to write to it that outweigh security concerns. On the -other hand, home directories need to be exported read-write if users are to -save work on them. - -If you have a large installation, you may find that you have a bunch of -computers all on the same local network that require access to your server. -There are a few ways of simplifying references to large numbers of machines. -First, you can give access to a range of machines at once by specifying a -network and a netmask. For example, if you wanted to allow access to all the -machines with IP addresses between 192.168.0.0 and 192.168.0.255 then you -could have the entries: -+---------------------------------------------------------------------------+ -| /usr/local 192.168.0.0/255.255.255.0(ro) | -| /home 192.168.0.0/255.255.255.0(rw) | -| | -+---------------------------------------------------------------------------+ - -See the [http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html] -Networking-Overview HOWTO for further information about how netmasks work, -and you may also wish to look at the man pages for init and hosts.allow. - -Second, you can use NIS netgroups in your entry. To specify a netgroup in -your exports file, simply prepend the name of the netgroup with an "@". 
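For instance, assuming a NIS netgroup named trusted-hosts has already been defined (the name is hypothetical), the netmask entries above could be rewritten as:

```
# /etc/exports using a hypothetical NIS netgroup "trusted-hosts"
/usr/local   @trusted-hosts(ro)
/home        @trusted-hosts(rw)
```

Membership changes are then made once in the netgroup map rather than in every exports line.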
See -the [http://www.linuxdoc.org/HOWTO/NIS-HOWTO.html] NIS HOWTO for details on -how netgroups work. - -Third, you can use wildcards such as *.foo.com or 192.168. instead of -hostnames. There were problems with wildcard implementation in the 2.2 kernel -series that were fixed in kernel 2.2.19. - -However, you should keep in mind that any of these simplifications could -cause a security risk if there are machines in your netgroup or local network -that you do not trust completely. - -A few cautions are in order about what cannot (or should not) be exported. -First, if a directory is exported, its parent and child directories cannot be -exported if they are in the same filesystem. However, exporting both should -not be necessary because listing the parent directory in the /etc/exports -file will cause all underlying directories within that file system to be -exported. - -Second, it is a poor idea to export a FAT or VFAT (i.e., MS-DOS or Windows 95 -/98) filesystem with NFS. FAT is not designed for use on a multi-user -machine, and as a result, operations that depend on permissions will not work -well. Moreover, some of the underlying filesystem design is reported to work -poorly with NFS's expectations. - -Third, device or other special files may not export correctly to non-Linux -clients. See Section 8 for details on particular operating systems. ------------------------------------------------------------------------------ - -3.2.2. /etc/hosts.allow and /etc/hosts.deny - -These two files specify which computers on the network can use services on -your machine. Each line of the file contains a single entry listing a service -and a set of machines. When the server gets a request from a machine, it does -the following: - -  * It first checks hosts.allow to see if the machine matches a description - listed in there. If it does, then the machine is allowed access. 
- -  * If the machine does not match an entry in hosts.allow, the server then - checks hosts.deny to see if the client matches a listing in there. If it - does then the machine is denied access. - -  * If the client matches no listings in either file, then it is allowed - access. - - -In addition to controlling access to services handled by inetd (such as -telnet and FTP), this file can also control access to NFS by restricting -connections to the daemons that provide NFS services. Restrictions are done -on a per-service basis. - -The first daemon to restrict access to is the portmapper. This daemon -essentially just tells requesting clients how to find all the NFS services on -the system. Restricting access to the portmapper is the best defense against -someone breaking into your system through NFS because completely unauthorized -clients won't know where to find the NFS daemons. However, there are two -things to watch out for. First, restricting portmapper isn't enough if the -intruder already knows for some reason how to find those daemons. And second, -if you are running NIS, restricting portmapper will also restrict requests to -NIS. That should usually be harmless since you usually want to restrict NFS -and NIS in a similar way, but just be cautioned. (Running NIS is generally a -good idea if you are running NFS, because the client machines need a way of -knowing who owns what files on the exported volumes. Of course there are -other ways of doing this such as syncing password files. See the [http:// -www.linuxdoc.org/HOWTO/NIS-HOWTO.html] NIS HOWTO for information on setting -up NIS.) - -In general it is a good idea with NFS (as with most internet services) to -explicitly deny access to IP addresses that you don't need to allow access -to. 
-
-The first step in doing this is to add the following entry to /etc/hosts.deny:
-
-+---------------------------------------------------------------------------+
-| portmap:ALL |
-| |
-+---------------------------------------------------------------------------+
-
-Starting with nfs-utils 0.2.0, you can be a bit more careful by controlling
-access to individual daemons. It's a good precaution since an intruder will
-often be able to weasel around the portmapper. If you have a newer version of
-nfs-utils, add entries for each of the NFS daemons (see the next section to
-find out what these daemons are; for now just put entries for them in
-hosts.deny):
-
-+---------------------------------------------------------------------------+
-| lockd:ALL |
-| mountd:ALL |
-| rquotad:ALL |
-| statd:ALL |
-| |
-+---------------------------------------------------------------------------+
-
-Even if you have an older version of nfs-utils, adding these entries is at
-worst harmless (since they will just be ignored) and at best will save you
-some trouble when you upgrade. Some sys admins choose to put the entry
-ALL:ALL in the file /etc/hosts.deny, which causes any service that looks at
-these files to deny access to all hosts unless it is explicitly allowed.
-While this is more secure behavior, it may also get you in trouble later,
-when you install new services, forget that you put the entry there, and
-can't for the life of you figure out why they won't work.
-
-Next, we need to add an entry to hosts.allow to grant access to the hosts
-that should have it. (If we just leave the above lines in hosts.deny then
-nobody will have access to NFS.)
Entries in hosts.allow follow the format
-
-
-+---------------------------------------------------------------------------+
-| service: host [or network/netmask] , host [or network/netmask] |
-| |
-+---------------------------------------------------------------------------+
-
-Here, host is the IP address of a potential client; it may be possible in
-some versions to use the DNS name of the host, but it is strongly
-discouraged.
-
-Suppose we have the setup above and we just want to allow access to
-slave1.foo.com and slave2.foo.com, and suppose that the IP addresses of these
-machines are 192.168.0.1 and 192.168.0.2, respectively. We could add the
-following entry to /etc/hosts.allow:
-
-
-+---------------------------------------------------------------------------+
-| portmap: 192.168.0.1 , 192.168.0.2 |
-| |
-+---------------------------------------------------------------------------+
-
-For recent nfs-utils versions, we would also add the following (again, these
-entries are harmless even if they are not supported):
-
-
-+---------------------------------------------------------------------------+
-| lockd: 192.168.0.1 , 192.168.0.2 |
-| rquotad: 192.168.0.1 , 192.168.0.2 |
-| mountd: 192.168.0.1 , 192.168.0.2 |
-| statd: 192.168.0.1 , 192.168.0.2 |
-| |
-+---------------------------------------------------------------------------+
-
-If you intend to run NFS on a large number of machines in a local network,
-/etc/hosts.allow also allows for network/netmask style entries in the same
-manner as /etc/exports above.
------------------------------------------------------------------------------
-
-3.3. Getting the services started
-
-3.3.1. Pre-requisites
-
-The NFS server should now be configured and we can start it running. First,
-you will need to have the appropriate packages installed. This consists
-mainly of a new enough kernel and a new enough version of the nfs-utils
-package. See Section 2.4 if you are in doubt.
- -Next, before you can start NFS, you will need to have TCP/IP networking -functioning correctly on your machine. If you can use telnet, FTP, and so on, -then chances are your TCP networking is fine. - -That said, with most recent Linux distributions you may be able to get NFS up -and running simply by rebooting your machine, and the startup scripts should -detect that you have set up your /etc/exports file and will start up NFS -correctly. If you try this, see Section 3.4 Verifying that NFS is running. If -this does not work, or if you are not in a position to reboot your machine, -then the following section will tell you which daemons need to be started in -order to run NFS services. If for some reason nfsd was already running when -you edited your configuration files above, you will have to flush your -configuration; see Section 3.5 for details. ------------------------------------------------------------------------------ - -3.3.2. Starting the Portmapper - -NFS depends on the portmapper daemon, either called portmap or rpc.portmap. -It will need to be started first. It should be located in /sbin but is -sometimes in /usr/sbin. Most recent Linux distributions start this daemon in -the boot scripts, but it is worth making sure that it is running before you -begin working with NFS (just type ps aux | grep portmap). ------------------------------------------------------------------------------ - -3.3.3. The Daemons - -NFS serving is taken care of by five daemons: rpc.nfsd, which does most of -the work; rpc.lockd and rpc.statd, which handle file locking; rpc.mountd, -which handles the initial mount requests, and rpc.rquotad, which handles user -file quotas on exported volumes. Starting with 2.2.18, lockd is called by -nfsd upon demand, so you do not need to worry about starting it yourself. -statd will need to be started separately. Most recent Linux distributions -will have startup scripts for these daemons. 
-
-The daemons are all part of the nfs-utils package, and may be either in the
-/sbin directory or the /usr/sbin directory.
-
-If your distribution does not include them in the startup scripts, then you
-should add them, configured to start in the following order:
-
-rpc.portmap
-rpc.mountd, rpc.nfsd
-rpc.statd, rpc.lockd (if necessary), and rpc.rquotad
-
-The nfs-utils package has sample startup scripts for RedHat and Debian. If
-you are using a different distribution, in general you can just copy the
-RedHat script, but you will probably have to take out the line that says:
-+---------------------------------------------------------------------------+
-| . ../init.d/functions |
-| |
-+---------------------------------------------------------------------------+
-to avoid getting error messages.
------------------------------------------------------------------------------
-
-3.4. Verifying that NFS is running
-
-To do this, query the portmapper with the command rpcinfo -p to find out what
-services it is providing.
You should get something like this: -+---------------------------------------------------------------------------+ -| program vers proto port | -| 100000 2 tcp 111 portmapper | -| 100000 2 udp 111 portmapper | -| 100011 1 udp 749 rquotad | -| 100011 2 udp 749 rquotad | -| 100005 1 udp 759 mountd | -| 100005 1 tcp 761 mountd | -| 100005 2 udp 764 mountd | -| 100005 2 tcp 766 mountd | -| 100005 3 udp 769 mountd | -| 100005 3 tcp 771 mountd | -| 100003 2 udp 2049 nfs | -| 100003 3 udp 2049 nfs | -| 300019 1 tcp 830 amd | -| 300019 1 udp 831 amd | -| 100024 1 udp 944 status | -| 100024 1 tcp 946 status | -| 100021 1 udp 1042 nlockmgr | -| 100021 3 udp 1042 nlockmgr | -| 100021 4 udp 1042 nlockmgr | -| 100021 1 tcp 1629 nlockmgr | -| 100021 3 tcp 1629 nlockmgr | -| 100021 4 tcp 1629 nlockmgr | -| | -+---------------------------------------------------------------------------+ - -This says that we have NFS versions 2 and 3, rpc.statd version 1, network -lock manager (the service name for rpc.lockd) versions 1, 3, and 4. There are -also different service listings depending on whether NFS is travelling over -TCP or UDP. Linux systems use UDP by default unless TCP is explicitly -requested; however other OSes such as Solaris default to TCP. - -If you do not at least see a line that says portmapper, a line that says nfs, -and a line that says mountd then you will need to backtrack and try again to -start up the daemons (see Section 7, Troubleshooting, if this still doesn't -work). - -If you do see these services listed, then you should be ready to set up NFS -clients to access files from your server. ------------------------------------------------------------------------------ - -3.5. Making changes to /etc/exports later on - -If you come back and change your /etc/exports file, the changes you make may -not take effect immediately. You should run the command exportfs -ra to force -nfsd to re-read the /etc/exports   file. 
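For example, after editing /etc/exports (both commands must be run as root; exportfs -v with no other arguments lists the active export table, so you can confirm the change took effect):

```
# exportfs -ra    # flush the old table and re-export everything in /etc/exports
# exportfs -v     # list the currently active exports and their options
```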
If you can't find the exportfs
-command, then you can kill nfsd with the -HUP flag (see the man pages for
-kill for details).
-
-If that still doesn't work, don't forget to check hosts.allow to make sure
-you haven't forgotten to list any new client machines there. Also check the
-host listings on any firewalls you may have set up (see Section 7 and Section
-6 for more details on firewalls and NFS).
------------------------------------------------------------------------------
-
-4. Setting up an NFS Client
-
-4.1. Mounting remote directories
-
-Before beginning, you should double-check to make sure your mount program is
-new enough (version 2.10m if you want to use Version 3 NFS), and that the
-client machine supports NFS mounting, though most standard distributions do.
-If you are using a 2.2 or later kernel with the /proc filesystem you can
-check the latter by reading the file /proc/filesystems and making sure there
-is a line containing nfs. If not, typing insmod nfs may make it magically
-appear if NFS has been compiled as a module; otherwise, you will need to
-build (or download) a kernel that has NFS support built in. In general,
-kernels that do not have NFS compiled in will give a very specific error when
-the mount command below is run.
-
-To begin using a machine as an NFS client, you will need the portmapper
-running on that machine, and to use NFS file locking, you will also need
-rpc.statd and rpc.lockd running on both the client and the server. Most
-recent distributions start those services by default at boot time; if yours
-doesn't, see Section 3.2 for information on how to start them up.
-
-With portmap, lockd, and statd running, you should now be able to mount the
-remote directory from your server just the way you mount a local hard drive,
-with the mount command. Continuing our example from the previous section,
-suppose our server above is called master.foo.com, and we want to mount the
-/home directory on slave1.foo.com.
Then, all we have to do, from the root
-prompt on slave1.foo.com, is type:
-+---------------------------------------------------------------------------+
-| # mount master.foo.com:/home /mnt/home |
-| |
-+---------------------------------------------------------------------------+
-and the directory /home on master will appear as the directory /mnt/home on
-slave1. (Note that this assumes we have created the directory /mnt/home as an
-empty mount point beforehand.)
-
-If this does not work, see the Troubleshooting section (Section 7).
-
-You can unmount the file system by typing
-+---------------------------------------------------------------------------+
-| # umount /mnt/home |
-| |
-+---------------------------------------------------------------------------+
-just like you would for a local file system.
------------------------------------------------------------------------------
-
-4.2. Getting NFS File Systems to Be Mounted at Boot Time
-
-NFS file systems can be added to your /etc/fstab file the same way local file
-systems can, so that they mount when your system starts up. The only
-difference is that the file system type will be set to nfs and the dump and
-fsck order (the last two entries) will have to be set to zero. So for our
-example above, the entry in /etc/fstab would look like:
- # device mountpoint fs-type options dump fsckorder
- ...
- master.foo.com:/home /mnt/home nfs rw 0 0
- ...
-
-
-See the man pages for fstab if you are unfamiliar with the syntax of this
-file. If you are using an automounter such as amd or autofs, the options in
-the corresponding fields of your mount listings should look very similar if
-not identical.
-
-At this point you should have NFS working, though a few tweaks may still be
-necessary to get it to work well. You should also read Section 6 to be sure
-your setup is reasonably secure.
------------------------------------------------------------------------------
-
-4.3. Mount options
-
-4.3.1. Soft vs.
Hard Mounting
-
-There are some options you should consider adding at once. They govern the
-way the NFS client handles a server crash or network outage. One of the cool
-things about NFS is that it can handle this gracefully, if you set up the
-clients right. There are two distinct failure modes:
-
-soft
- If a file request fails, the NFS client will report an error to the
- process on the client machine requesting the file access. Some programs
- can handle this with composure; most won't. We do not recommend using
- this setting; it is a recipe for corrupted files and lost data. You
- should especially not use this for mail disks --- if you value your mail,
- that is.
-
-hard
- The program accessing a file on an NFS mounted file system will hang when
- the server crashes. The process cannot be interrupted or killed (except
- by a "sure kill") unless you also specify intr. When the NFS server is
- back online the program will continue undisturbed from where it was. We
- recommend using hard,intr on all NFS mounted file systems.
-
-
-Picking up from the previous example, the fstab entry would now look like:
- # device mountpoint fs-type options dump fsckorder
- ...
- master.foo.com:/home /mnt/home nfs rw,hard,intr 0 0
- ...
-
------------------------------------------------------------------------------
-
-4.3.2. Setting Block Size to Optimize Transfer Speeds
-
-The rsize and wsize mount options specify the size of the chunks of data that
-the client and server pass back and forth to each other.
-
-The defaults may be too big or too small; there is no size that works well on
-all or most setups. On the one hand, some combinations of Linux kernels and
-network cards (largely on older machines) cannot handle blocks that large. On
-the other hand, if they can handle larger blocks, a bigger size might be
-faster.
-
-Getting the block size right is an important factor in performance and is a
-must if you are planning to use the NFS server in a production environment.
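For example, an /etc/fstab entry specifying 8K read and write buffers would look like this (8192 is only a common starting point, not a recommendation; tune it for your own hardware):

```
# device mountpoint fs-type options dump fsckorder
master.foo.com:/home /mnt/home nfs rw,hard,intr,rsize=8192,wsize=8192 0 0
```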
-See Section 5 for details.
------------------------------------------------------------------------------
-
-5. Optimizing NFS Performance
-
-Careful analysis of your environment, both from the client and from the
-server point of view, is the first step necessary for optimal NFS
-performance. The first sections will address issues that are generally
-important to the client. Later (Section 5.3 and beyond), server side issues
-will be discussed. In both cases, these issues will not be limited
-exclusively to one side or the other, but it is useful to separate the two in
-order to get a clearer picture of cause and effect.
-
-Aside from the general network configuration - appropriate network capacity,
-faster NICs, full duplex settings in order to reduce collisions, agreement in
-network speed among the switches and hubs, etc. - one of the most important
-client optimization settings is the size of the NFS data transfer buffers,
-specified by the mount command options rsize and wsize.
------------------------------------------------------------------------------
-
-5.1. Setting Block Size to Optimize Transfer Speeds
-
-The mount command options rsize and wsize specify the size of the chunks of
-data that the client and server pass back and forth to each other. If no
-rsize and wsize options are specified, the default varies by which version of
-NFS we are using. The most common default is 4K (4096 bytes), although for
-TCP-based mounts in 2.2 kernels, and for all mounts beginning with 2.4
-kernels, the server specifies the default block size.
-
-The theoretical limit for the NFS V2 protocol is 8K. For the V3 protocol, the
-limit is specific to the server. On the Linux server, the maximum block size
-is defined by the value of the kernel constant NFSSVC_MAXBLKSIZE, found in
-the Linux kernel source file ./include/linux/nfsd/const.h.
The current -maximum block size for the kernel, as of 2.4.17, is 8K (8192 bytes), but the -patch set implementing NFS over TCP/IP transport in the 2.4 series, as of -this writing, uses a value of 32K (defined in the patch as 32*1024) for the -maximum block size. - -All 2.4 clients currently support up to 32K block transfer sizes, allowing -the standard 32K block transfers across NFS mounts from other servers, such -as Solaris, without client modification. - -The defaults may be too big or too small, depending on the specific -combination of hardware and kernels. On the one hand, some combinations of -Linux kernels and network cards (largely on older machines) cannot handle -blocks that large. On the other hand, if they can handle larger blocks, a -bigger size might be faster. - -You will want to experiment and find an rsize and wsize that works and is as -fast as possible. You can test the speed of your options with some simple -commands, if your network environment is not heavily used. Note that your -results may vary widely unless you resort to using more complex benchmarks, -such as Bonnie, Bonnie++, or IOzone. - -The first of these commands transfers 16384 blocks of 16k each from the -special file /dev/zero (which if you read it just spits out zeros really -fast) to the mounted partition. We will time it to see how long it takes. So, -from the client machine, type: - # time dd if=/dev/zero of=/mnt/home/testfile bs=16k count=16384 - -This creates a 256Mb file of zeroed bytes. In general, you should create a -file that's at least twice as large as the system RAM on the server, but make -sure you have enough disk space! Then read back the file into the great black -hole on the client machine (/dev/null) by typing the following: - # time dd if=/mnt/home/testfile of=/dev/null bs=16k - -Repeat this a few times and average how long it takes. 
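The write and read passes above can be wrapped in a small sketch script for repeated runs. Everything here is an assumption to adjust: MOUNT defaults to /tmp purely so the script runs anywhere, and the file is sized down to 16Mb; point MOUNT at your NFS mount point and raise COUNT (16384 gives the 256Mb file used above) for a real measurement.

```shell
#!/bin/sh
# Time one write pass and one read pass through a test file.
# GNU dd prints its own transfer statistics on stderr when each pass finishes.
MOUNT=${MOUNT:-/tmp}     # stand-in target; use your NFS mount point instead
BS=${BS:-16k}            # block size under test
COUNT=${COUNT:-1024}     # 1024 x 16k = 16Mb; scale up for a real test

echo "write test (bs=$BS, count=$COUNT):"
dd if=/dev/zero of="$MOUNT/testfile" bs="$BS" count="$COUNT"

echo "read test (bs=$BS):"
dd if="$MOUNT/testfile" of=/dev/null bs="$BS"

rm -f "$MOUNT/testfile"
```

Run it several times per block size and average the dd timings, remounting between runs to clear the caches.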
Be sure to unmount and
-remount the filesystem each time (both on the client and, if you are zealous,
-locally on the server as well), which should clear out any caches.
-
-Then unmount, and mount again with a larger and smaller block size. They
-should be multiples of 1024, and not larger than the maximum block size
-allowed by your system. Note that NFS Version 2 is limited to a maximum of
-8K, regardless of the maximum block size defined by NFSSVC_MAXBLKSIZE;
-Version 3 will support up to 64K, if permitted. The block size should be a
-power of two since most of the parameters that would constrain it (such as
-file system block sizes and network packet size) are also powers of two.
-However, some users have reported better success with block sizes that are
-not powers of two but are still multiples of the file system block size and
-the network packet size.
-
-Directly after mounting with a larger size, cd into the mounted file system
-and do things like ls, explore the filesystem a bit to make sure everything
-is as it should be. If the rsize/wsize is too large the symptoms are very odd
-and not 100% obvious. Typical symptoms are incomplete file lists when doing
-ls without any error messages, or files that mysteriously fail to read, again
-without error messages. After establishing that the given rsize/wsize works
-you can do the speed tests again. Different server platforms are likely to
-have different optimal sizes.
-
-Remember to edit /etc/fstab to reflect the rsize/wsize you found to be the
-most desirable.
-
-If your results seem inconsistent, or doubtful, you may need to analyze your
-network more extensively while varying the rsize and wsize values.
In that
-case, here are several pointers to benchmarks that may prove useful:
-
-  * Bonnie [http://www.textuality.com/bonnie/]
- http://www.textuality.com/bonnie/
-
-  * Bonnie++ [http://www.coker.com.au/bonnie++/]
- http://www.coker.com.au/bonnie++/
-
-  * IOzone file system benchmark [http://www.iozone.org/]
- http://www.iozone.org/
-
-  * The official NFS benchmark, SPECsfs97 [http://www.spec.org/osg/sfs97/]
- http://www.spec.org/osg/sfs97/
-
-
-The easiest benchmark with the widest coverage, including an extensive spread
-of file sizes, and of IO types - reads & writes, rereads & rewrites, random
-access, etc. - seems to be IOzone. A recommended invocation of IOzone (for
-which you must have root privileges) includes unmounting and remounting the
-directory under test, in order to clear out the caches between tests, and
-including the file close time in the measurements. Assuming you've already
-exported /tmp to everyone from the server foo, and that you've installed
-IOzone in the local directory, this should work:
- # echo "foo:/tmp /mnt/foo nfs rw,hard,intr,rsize=8192,wsize=8192 0 0"
- >> /etc/fstab
- # mkdir /mnt/foo
- # mount /mnt/foo
- # ./iozone -a -R -c -U /mnt/foo -f /mnt/foo/testfile > logfile
-
-The benchmark should take 2-3 hours at most, but of course you will need to
-run it for each value of rsize and wsize that is of interest.
The web site -gives full documentation of the parameters, but the specific options used -above are: - -  * -a Full automatic mode, which tests file sizes of 64K to 512M, using - record sizes of 4K to 16M - -  * -R Generate report in excel spreadsheet form (The "surface plot" option - for graphs is best) - -  * -c Include the file close time in the tests, which will pick up the NFS - version 3 commit time - -  * -U Use the given mount point to unmount and remount between tests; it - clears out caches - -  * -f When using unmount, you have to locate the test file in the mounted - file system - - ------------------------------------------------------------------------------ -5.2. Packet Size and Network Drivers - -While many Linux network card drivers are excellent, some are quite shoddy, -including a few drivers for some fairly standard cards. It is worth -experimenting with your network card directly to find out how it can best -handle traffic. - -Try pinging back and forth between the two machines with large packets using -the -f and -s options with ping (see ping(8) for more details) and see if a -lot of packets get dropped, or if they take a long time for a reply. If so, -you may have a problem with the performance of your network card. - -For a more extensive analysis of NFS behavior in particular, use the nfsstat -command to look at nfs transactions, client and server statistics, network -statistics, and so forth. The "-o net" option will show you the number of -dropped packets in relation to the total number of transactions. In UDP -transactions, the most important statistic is the number of retransmissions, -due to dropped packets, socket buffer overflows, general server congestion, -timeouts, etc. This will have a tremendously important effect on NFS -performance, and should be carefully monitored. 
Note that nfsstat does not -yet implement the -z option, which would zero out all counters, so you must -look at the current nfsstat counter values prior to running the benchmarks. - -To correct network problems, you may wish to reconfigure the packet size that -your network card uses. Very often there is a constraint somewhere else in -the network (such as a router) that causes a smaller maximum packet size -between two machines than what the network cards on the machines are actually -capable of. TCP should autodiscover the appropriate packet size for a -network, but UDP will simply stay at a default value. So determining the -appropriate packet size is especially important if you are using NFS over -UDP. - -You can test for the network packet size using the tracepath command: From -the client machine, just type tracepath server 2049 and the path MTU should -be reported at the bottom. You can then set the MTU on your network card -equal to the path MTU, by using the MTU option to ifconfig, and see if fewer -packets get dropped. See the ifconfig man pages for details on how to reset -the MTU. - -In addition, netstat -s will give the statistics collected for traffic across -all supported protocols. You may also look at /proc/net/snmp for information -about current network behavior; see the next section for more details. ------------------------------------------------------------------------------ - -5.3. Overflow of Fragmented Packets - -Using an rsize or wsize larger than your network's MTU (often set to 1500, in -many networks) will cause IP packet fragmentation when using NFS over UDP. IP -packet fragmentation and reassembly require a significant amount of CPU -resource at both ends of a network connection. In addition, packet -fragmentation also exposes your network traffic to greater unreliability, -since a complete RPC request must be retransmitted if a UDP packet fragment -is dropped for any reason. 
Any increase of RPC retransmissions, along with
-the possibility of increased timeouts, is the single worst impediment to
-performance for NFS over UDP.
-
-Packets may be dropped for many reasons. If your network topology is
-complex, fragment routes may differ, and may not all arrive at the Server for
-reassembly. NFS Server capacity may also be an issue, since the kernel has a
-limit of how many fragments it can buffer before it starts throwing away
-packets. With kernels that support the /proc filesystem, you can monitor the
-files /proc/sys/net/ipv4/ipfrag_high_thresh and
-/proc/sys/net/ipv4/ipfrag_low_thresh. Once the number of unprocessed,
-fragmented packets reaches the number specified by ipfrag_high_thresh (in
-bytes), the kernel will simply start throwing away fragmented packets until
-the number of incomplete packets reaches the number specified by
-ipfrag_low_thresh.
-
-Another counter to monitor is IP: ReasmFails in the file /proc/net/snmp; this
-is the number of fragment reassembly failures. If it goes up too quickly
-during heavy file activity, you may have a problem.
------------------------------------------------------------------------------
-
-5.4. NFS over TCP
-
-A new feature, available for both 2.4 and 2.5 kernels but not yet integrated
-into the mainstream kernel at the time of this writing, is NFS over TCP.
-Using TCP has a distinct advantage and a distinct disadvantage over UDP. The
-advantage is that it works far better than UDP on lossy networks. When using
-TCP, a single dropped packet can be retransmitted, without the retransmission
-of the entire RPC request, resulting in better performance on lossy networks.
-In addition, TCP will handle network speed differences better than UDP, due
-to the underlying flow control at the network level.
-
-The disadvantage of using TCP is that it is not a stateless protocol like
-UDP.
If your server crashes in the middle of a packet transmission, the
-client will hang and any shares will need to be unmounted and remounted.
-
-The overhead incurred by the TCP protocol will result in somewhat slower
-performance than UDP under ideal network conditions, but the cost is not
-severe, and is often not noticeable without careful measurement. If you are
-using gigabit ethernet from end to end, you might also investigate the usage
-of jumbo frames, since the high speed network may allow the larger frame
-sizes without encountering increased collision rates, particularly if you
-have set the network to full duplex.
------------------------------------------------------------------------------
-
-5.5. Timeout and Retransmission Values
-
-Two mount command options, timeo and retrans, control the behavior of UDP
-requests when encountering client timeouts due to dropped packets, network
-congestion, and so forth. The -o timeo option allows designation of the
-length of time, in tenths of seconds, that the client will wait until it
-decides it will not get a reply from the server, and must try to send the
-request again. The default value is 7 tenths of a second. The -o retrans
-option allows designation of the number of timeouts allowed before the client
-gives up, and displays the Server not responding message. The default value
-is 3 attempts. Once the client displays this message, it will continue to try
-to send the request, but only once before displaying the error message if
-another timeout occurs. When the client reestablishes contact, it will fall
-back to using the correct retrans value, and will display the Server OK
-message.
-
-If you are already encountering excessive retransmissions (see the output of
-the nfsstat command), or want to increase the block transfer size without
-encountering timeouts and retransmissions, you may want to adjust these
-values.
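For instance, to double the timeout and allow a couple of extra retries, the mount invocation might look like this (the numbers are illustrative only, not tuned recommendations):

```
# mount -o rw,hard,intr,timeo=14,retrans=5 master.foo.com:/home /mnt/home
```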
The specific adjustment will depend upon your environment, and in -most cases, the current defaults are appropriate. ------------------------------------------------------------------------------ - -5.6. Number of Instances of the NFSD Server Daemon - -Most startup scripts, Linux and otherwise, start 8 instances of nfsd. In the -early days of NFS, Sun decided on this number as a rule of thumb, and -everyone else copied. There are no good measures of how many instances are -optimal, but a more heavily-trafficked server may require more. You should -use at the very least one daemon per processor, but four to eight per -processor may be a better rule of thumb. If you are using a 2.4 or higher -kernel and you want to see how heavily each nfsd thread is being used, you -can look at the file /proc/net/rpc/nfsd. The last ten numbers on the th line -in that file indicate the number of seconds that the thread usage was at that -percentage of the maximum allowable. If you have a large number in the top -three deciles, you may wish to increase the number of nfsd instances. This is -done upon starting nfsd using the number of instances as the command line -option, and is specified in the NFS startup script (/etc/rc.d/init.d/nfs on -Red Hat) as RPCNFSDCOUNT. See the nfsd(8) man page for more information. ------------------------------------------------------------------------------ - -5.7. Memory Limits on the Input Queue - -On 2.2 and 2.4 kernels, the socket input queue, where requests sit while they -are currently being processed, has a small default size limit (rmem_default) -of 64k. This queue is important for clients with heavy read loads, and -servers with heavy write loads. As an example, if you are running 8 instances -of nfsd on the server, each will only have 8k to store write requests while -it processes them. 
In addition, the socket output queue - important for -clients with heavy write loads and servers with heavy read loads - also has a -small default size (wmem_default). - -Several published runs of the NFS benchmark [http://www.spec.org/osg/sfs97/] -SPECsfs specify usage of a much higher value for both the read and write -value sets, [rw]mem_default and [rw]mem_max. You might consider increasing -these values to at least 256k. The read and write limits are set in the proc -file system using (for example) the files /proc/sys/net/core/rmem_default and -/proc/sys/net/core/rmem_max. The rmem_default value can be increased in three -steps; the following method is a bit of a hack but should work and should not -cause any problems: - -  * Increase the size listed in the file: - # echo 262144 > /proc/sys/net/core/rmem_default - # echo 262144 > /proc/sys/net/core/rmem_max - -  * Restart NFS. For example, on Red Hat systems, - # /etc/rc.d/init.d/nfs restart - -  * You might return the size limits to their normal size in case other - kernel systems depend on it: - # echo 65536 > /proc/sys/net/core/rmem_default - # echo 65536 > /proc/sys/net/core/rmem_max - - -This last step may be necessary because machines have been reported to crash -if these values are left changed for long periods of time. ------------------------------------------------------------------------------ - -5.8. Turning Off Autonegotiation of NICs and Hubs - -If network cards auto-negotiate badly with hubs and switches, and ports run -at different speeds, or with different duplex configurations, performance -will be severely impacted due to excessive collisions, dropped packets, etc. -If you see excessive numbers of dropped packets in the nfsstat output, or -poor network performance in general, try playing around with the network -speed and duplex settings. 
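One way to inspect the current link settings is sketched below. This is an editorial example, not from the original text: it assumes an interface named eth0, and that either mii-tool (from net-tools, common on 2.2/2.4-era systems) or the newer ethtool is installed. The forcing commands are shown only as comments because they require root and will disrupt the link.

```shell
#!/bin/sh
# Hedged sketch: inspect (and optionally force) NIC speed/duplex settings.
# The interface name "eth0" is an assumption; adjust it for your hardware.
IF="${1:-eth0}"
if command -v mii-tool >/dev/null 2>&1; then
    # Show the negotiated link mode.  Forcing 100BaseT full duplex
    # (as root) would be:  mii-tool -F 100baseTx-FD "$IF"
    mii-tool "$IF" || echo "could not query $IF"
elif command -v ethtool >/dev/null 2>&1; then
    # The ethtool equivalent of forcing would be:
    #   ethtool -s "$IF" speed 100 duplex full autoneg off
    ethtool "$IF" || echo "could not query $IF"
else
    echo "install mii-tool (net-tools) or ethtool to inspect $IF"
fi
```

Whichever tool you use, verify that the card and the switch port agree on both speed and duplex before blaming NFS itself.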
If possible, concentrate on establishing a -100BaseT full duplex subnet; the virtual elimination of collisions in full -duplex will remove the most severe performance inhibitor for NFS over UDP. Be -careful when turning off autonegotiation on a card: The hub or switch that -the card is attached to will then resort to other mechanisms (such as -parallel detection) to determine the duplex settings, and some cards default -to half duplex because it is more likely to be supported by an old hub. The -best solution, if the driver supports it, is to force the card to negotiate -100BaseT full duplex. ------------------------------------------------------------------------------ - -5.9. Synchronous vs. Asynchronous Behavior in NFS - -The default export behavior for both NFS Version 2 and Version 3 protocols, -used by exportfs in nfs-utils versions prior to Version 1.11 (the latter is -in the CVS tree, but not yet released in a package, as of January, 2002) is -"asynchronous". This default permits the server to reply to client requests -as soon as it has processed the request and handed it off to the local file -system, without waiting for the data to be written to stable storage. This is -indicated by the async option denoted in the server's export list. It yields -better performance at the cost of possible data corruption if the server -reboots while still holding unwritten data and/or metadata in its caches. -This possible data corruption is not detectable at the time of occurrence, -since the async option instructs the server to lie to the client, telling the -client that all data has indeed been written to the stable storage, -regardless of the protocol used. - -In order to conform with "synchronous" behavior, used as the default for most -proprietary systems supporting NFS (Solaris, HP-UX, RS/6000, etc.), and now -used as the default in the latest version of exportfs, the Linux Server's -file system must be exported with the sync option. 
Note that specifying
-synchronous exports will result in no option being seen in the server's
-export list:
-
-  * Export a couple of file systems to everyone, using slightly different
-    options:
-
-    # /usr/sbin/exportfs -o rw,sync *:/usr/local
-    # /usr/sbin/exportfs -o rw *:/tmp
-
-  * Now we can see what the exported file system parameters look like:
-
-    # /usr/sbin/exportfs -v
-    /usr/local *(rw)
-    /tmp *(rw,async)
-
-
-If your kernel is compiled with the /proc filesystem, then the file /proc/fs/
-nfs/exports will also show the full list of export options.
-
-When synchronous behavior is specified, the server will not complete (that
-is, reply to the client) an NFS version 2 protocol request until the local
-file system has written all data/metadata to the disk. The server will
-complete a synchronous NFS version 3 request without this delay, and will
-return the status of the data in order to inform the client as to what data
-should be maintained in its caches, and what data is safe to discard. There
-are 3 possible status values, defined in an enumerated type, nfs3_stable_how,
-in include/linux/nfs.h. The values, along with the subsequent actions taken
-due to these results, are as follows:
-
-  * NFS_UNSTABLE - Data/Metadata was not committed to stable storage on the
-    server, and must be cached on the client until a subsequent client commit
-    request assures that the server does send data to stable storage.
-
-  * NFS_DATA_SYNC - Metadata was not sent to stable storage, and must be
-    cached on the client. A subsequent commit is necessary, as is required
-    above.
-
-  * NFS_FILE_SYNC - No data/metadata need be cached, and a subsequent commit
-    need not be sent for the range covered by this request.
-
-
-In addition to the above definition of synchronous behavior, the client may
-explicitly insist on total synchronous behavior, regardless of the protocol,
-by opening all files with the O_SYNC option.
In this case, all replies to -client requests will wait until the data has hit the server's disk, -regardless of the protocol used (meaning that, in NFS version 3, all requests -will be NFS_FILE_SYNC requests, and will require that the Server returns this -status). In that case, the performance of NFS Version 2 and NFS Version 3 -will be virtually identical. - -If, however, the old default async behavior is used, the O_SYNC option has no -effect at all in either version of NFS, since the server will reply to the -client without waiting for the write to complete. In that case the -performance differences between versions will also disappear. - -Finally, note that, for NFS version 3 protocol requests, a subsequent commit -request from the NFS client at file close time, or at fsync() time, will -force the server to write any previously unwritten data/metadata to the disk, -and the server will not reply to the client until this has been completed, as -long as sync behavior is followed. If async is used, the commit is -essentially a no-op, since the server once again lies to the client, telling -the client that the data has been sent to stable storage. This again exposes -the client and server to data corruption, since cached data may be discarded -on the client due to its belief that the server now has the data maintained -in stable storage. ------------------------------------------------------------------------------ - -5.10. Non-NFS-Related Means of Enhancing Server Performance - -In general, server performance and server disk access speed will have an -important effect on NFS performance. Offering general guidelines for setting -up a well-functioning file server is outside the scope of this document, but -a few hints may be worth mentioning: - -  * If you have access to RAID arrays, use RAID 1/0 for both write speed and - redundancy; RAID 5 gives you good read speeds but lousy write speeds. 
- -  * A journalling filesystem will drastically reduce your reboot time in the - event of a system crash. Currently, [ftp://ftp.uk.linux.org/pub/linux/sct - /fs/jfs/] ext3 will work correctly with NFS version 3. In addition, - Reiserfs version 3.6 will work with NFS version 3 on 2.4.7 or later - kernels (patches are available for previous kernels). Earlier versions of - Reiserfs did not include room for generation numbers in the inode, - exposing the possibility of undetected data corruption during a server - reboot. - -  * Additionally, journalled file systems can be configured to maximize - performance by taking advantage of the fact that journal updates are all - that is necessary for data protection. One example is using ext3 with - data=journal so that all updates go first to the journal, and later to - the main file system. Once the journal has been updated, the NFS server - can safely issue the reply to the clients, and the main file system - update can occur at the server's leisure. - - The journal in a journalling file system may also reside on a separate - device such as a flash memory card so that journal updates normally - require no seeking. With only rotational delay imposing a cost, this - gives reasonably good synchronous IO performance. Note that ext3 - currently supports journal relocation, and ReiserFS will (officially) - support it soon. The Reiserfs tool package found at [ftp:// - ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz] ftp:// - ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz contains - the reiserfstune tool, which will allow journal relocation. It does, - however, require a kernel patch which has not yet been officially - released as of January, 2002. - -  * Using an automounter (such as autofs or amd) may prevent hangs if you - cross-mount files on your machines (whether on purpose or by oversight) - and one of those machines goes down. 
See the [http://www.linuxdoc.org/
-HOWTO/mini/Automount.html] Automount Mini-HOWTO for details.
-
-  * Some manufacturers (Network Appliance, Hewlett Packard, and others)
-    provide NFS accelerators in the form of Non-Volatile RAM. NVRAM will
-    boost access speed to stable storage up to the equivalent of async
-    access.
-
-
------------------------------------------------------------------------------
-6. Security and NFS
-
-This list of security tips and explanations will not make your site
-completely secure. NOTHING will make your site completely secure. Reading
-this section may help you get an idea of the security problems with NFS. This
-is not a comprehensive guide and it will always be undergoing changes. If you
-have any tips or hints to give us please send them to the HOWTO maintainer.
-
-If you are on a network with no access to the outside world (not even a
-modem) and you trust all the internal machines and all your users then this
-section will be of no use to you. However, it's our belief that there are
-relatively few networks in this situation so we would suggest reading this
-section thoroughly for anyone setting up NFS.
-
-With NFS, there are two steps required for a client to gain access to a file
-contained in a remote directory on the server. The first step is mount
-access. Mount access is achieved by the client machine attempting to attach
-to the server. The security for this is provided by the /etc/exports file.
-This file lists the names or IP addresses for machines that are allowed to
-access a share point. If the client's IP address matches one of the entries
-in the access list then it will be allowed to mount. This is not terribly
-secure. If someone is capable of spoofing or taking over a trusted address
-then they can access your mount points.
To give a real-world example of this
-type of "authentication": This is equivalent to someone introducing
-themselves to you and you believing they are who they claim to be because
-they are wearing a sticker that says "Hello, My Name is ...." Once the
-machine has mounted a volume, its operating system will have access to all
-files on the volume (with the possible exception of those owned by root; see
-below) and write access to those files as well, if the volume was exported
-with the rw option.
-
-The second step is file access. This is a function of normal file system
-access controls on the client and not a specialized function of NFS. Once the
-drive is mounted the user and group permissions on the files determine access
-control.
-
-An example: bob on the server maps to the UserID 9999. Bob makes a file on
-the server that is only accessible by the user (the equivalent to typing
-chmod 600 filename). A client is allowed to mount the drive where the file is
-stored. On the client mary maps to UserID 9999. This means that the client
-user mary can access bob's file that is marked as only accessible by him. It
-gets worse: If someone has become superuser on the client machine they can su
-- username and become any user. NFS will be none the wiser.
-
-It's not all terrible. There are a few measures you can take on the server to
-offset the danger of the clients. We will cover those shortly.
-
-If you don't think the security measures apply to you, you're probably wrong.
-In Section 6.1 we'll cover securing the portmapper, server and client
-security in Section 6.2 and Section 6.3 respectively. Finally, in Section 6.4
-we'll briefly talk about proper firewalling for your nfs server.
-
-Finally, it is critical that all of your nfs daemons and client programs are
-current. If you think that a flaw is too recently announced for it to be a
-problem for you, then you've probably already been compromised.
-
-A good way to keep up to date on security alerts is to subscribe to the
-bugtraq mailing lists. You can read up on how to subscribe and various other
-information about bugtraq here: [http://www.securityfocus.com/forums/bugtraq/
-faq.html] http://www.securityfocus.com/forums/bugtraq/faq.html
-
-Additionally searching for NFS at [http://www.securityfocus.com]
-securityfocus.com's search engine will show you all security reports
-pertaining to NFS.
-
-You should also regularly check CERT advisories. See the CERT web page at
-[http://www.cert.org] www.cert.org.
------------------------------------------------------------------------------
-
-6.1. The portmapper
-
-The portmapper keeps a list of what services are running on what ports. This
-list is used by a connecting machine to see which ports it needs to talk to
-in order to access certain services.
-
-The portmapper is not in as bad a shape as a few years ago but it is still a
-point of worry for many sys admins. The portmapper, like NFS and NIS, should
-not really have connections made to it outside of a trusted local area
-network. If you have to expose them to the outside world - be careful and
-keep up diligent monitoring of those systems.
-
-Not all Linux distributions were created equal. Some seemingly up-to-date
-distributions do not include a securable portmapper. The easy way to check if
-your portmapper is good or not is to run strings(1) and see if it reads the
-relevant files, /etc/hosts.deny and /etc/hosts.allow. Assuming your
-portmapper is /sbin/portmap you can check it with this command:
- strings /sbin/portmap | grep hosts.
- - -On a securable machine it comes up something like this: -+---------------------------------------------------------------------------+ -| /etc/hosts.allow | -| /etc/hosts.deny | -| @(#) hosts_ctl.c 1.4 94/12/28 17:42:27 | -| @(#) hosts_access.c 1.21 97/02/12 02:13:22 | -| | -+---------------------------------------------------------------------------+ - -First we edit /etc/hosts.deny. It should contain the line - -+---------------------------------------------------------------------------+ -| portmap: ALL | -| | -+---------------------------------------------------------------------------+ - -which will deny access to everyone. While it is closed run: -+---------------------------------------------------------------------------+ -| rpcinfo -p | -| | -+---------------------------------------------------------------------------+ -just to check that your portmapper really reads and obeys this file. Rpcinfo -should give no output, or possibly an error message. The files /etc/ -hosts.allow and /etc/hosts.deny take effect immediately after you save them. -No daemon needs to be restarted. - -Closing the portmapper for everyone is a bit drastic, so we open it again by -editing /etc/hosts.allow. But first we need to figure out what to put in it. -It should basically list all machines that should have access to your -portmapper. On a run of the mill Linux system there are very few machines -that need any access for any reason. The portmapper administers nfsd, mountd, -ypbind/ypserv, rquotad, lockd (which shows up as nlockmgr), statd (which -shows up as status) and 'r' services like ruptime and rusers. Of these only -nfsd, mountd, ypbind/ypserv and perhaps rquotad,lockd and statd are of any -consequence. All machines that need to access services on your machine should -be allowed to do that. 
Let's say that your machine's address is 192.168.0.254
-and that it lives on the subnet 192.168.0.0, and that all machines on the
-subnet should have access to it (for an overview of those terms see the
-[http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html]
-Networking-Overview-HOWTO). Then we write:
-+---------------------------------------------------------------------------+
-| portmap: 192.168.0.0/255.255.255.0                                        |
-|                                                                           |
-+---------------------------------------------------------------------------+
-in /etc/hosts.allow. If you are not sure what your network or netmask are,
-you can use the ifconfig command to determine the netmask and the netstat
-command to determine the network. For example, for the device eth0 on the
-above machine ifconfig should show:
-
-+---------------------------------------------------------------------------+
-| ...                                                                       |
-| eth0 Link encap:Ethernet HWaddr 00:60:8C:96:D5:56                         |
-|      inet addr:192.168.0.254 Bcast:192.168.0.255 Mask:255.255.255.0       |
-|      UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1                     |
-|      RX packets:360315 errors:0 dropped:0 overruns:0                      |
-|      TX packets:179274 errors:0 dropped:0 overruns:0                      |
-|      Interrupt:10 Base address:0x320                                      |
-| ...                                                                       |
-|                                                                           |
-+---------------------------------------------------------------------------+
-and netstat -rn should show:
-+---------------------------------------------------------------------------------+
-| Kernel routing table                                                            |
-| Destination     Gateway     Genmask         Flags  Metric  Ref  Use    Iface    |
-| ...                                                                             |
-| 192.168.0.0     0.0.0.0     255.255.255.0   U      0       0    174412 eth0     |
-| ...                                                                             |
-|                                                                                 |
-+---------------------------------------------------------------------------------+
-(The network address is in the first column).
-
-The /etc/hosts.deny and /etc/hosts.allow files are described in the manual
-pages of the same names.
-
-IMPORTANT: Do not put anything but IP NUMBERS in the portmap lines of these
-files.
Host name lookups can indirectly cause portmap activity which will
-trigger host name lookups which can indirectly cause portmap activity which
-will trigger...
-
-Versions 0.2.0 and higher of the nfs-utils package also use the hosts.allow
-and hosts.deny files, so you should put in entries for lockd, statd, mountd,
-and rquotad in these files too. For a complete example, see Section 3.2.2.
-
-The above things should make your server tighter. The only remaining problem
-is if someone gains administrative access to one of your trusted client
-machines and is able to send bogus NFS requests. The next section deals with
-safeguards against this problem.
------------------------------------------------------------------------------
-
-6.2. Server security: nfsd and mountd
-
-On the server we can decide that we don't want to trust any requests made as
-root on the client. We can do that by using the root_squash option in /etc/
-exports:
- /home slave1(rw,root_squash)
-
-
-This is, in fact, the default. It should always be turned on unless you have
-a very good reason to turn it off. To turn it off use the no_root_squash
-option.
-
-Now, if a user with UID 0 (i.e., root's user ID number) on the client
-attempts to access (read, write, delete) the file system, the server
-substitutes the UID of the server's 'nobody' account, which means that the
-root user on the client can't access or change files that only root on the
-server can access or change. That's good, and you should probably use
-root_squash on all the file systems you export. "But the root user on the
-client can still use su to become any other user and access and change that
-user's files!" say you. To which the answer is: Yes, and that's the way it
-is, and has to be with Unix and NFS. This has one important implication: All
-important binaries and files should be owned by root, and not bin or other
-non-root account, since the only account the client's root user cannot access
-is the server's root account.
In the exports(5) man page there are several
-other squash options listed so that you can decide to mistrust whomever you
-(don't) like on the clients.
-
-The TCP ports 1-1024 are reserved for root's use (and therefore sometimes
-referred to as "secure ports"). A non-root user cannot bind these ports.
-Adding the secure option to an /etc/exports entry means that the server will
-only listen to requests coming from ports 1-1024 on the client, so that a
-malicious non-root user on the client cannot come along and open up a spoofed
-NFS dialogue on a non-reserved port. This option is set by default.
------------------------------------------------------------------------------
-
-6.3. Client Security
-
-6.3.1. The nosuid mount option
-
-On the client we can decide that we don't want to trust the server too much,
-using a couple of options to mount. For example we can forbid suid programs
-to work off the NFS file system with the nosuid option. Some unix programs,
-such as passwd, are called "suid" programs: They set the id of the person
-running them to whomever is the owner of the file. If a file is owned by root
-and is suid, then the program will execute as root, so that they can perform
-operations (such as writing to the password file) that only root is allowed
-to do. Using the nosuid option is a good idea and you should consider using
-this with all NFS mounted disks. It means that the server's root user cannot
-make a suid-root program on the file system, log in to the client as a normal
-user and then use the suid-root program to become root on the client too. One
-could also forbid execution of files on the mounted file system altogether
-with the noexec option. But this is more likely to be impractical than nosuid
-since a file system is likely to at least contain some scripts or programs
-that need to be executed.
------------------------------------------------------------------------------
-
-6.3.2.
The broken_suid mount option
-
-Some older programs (xterm being one of them) used to rely on the idea that
-root can write everywhere. This will break under new kernels on NFS
-mounts. The security implications are that programs that do this type of suid
-action can potentially be used to change your apparent uid on nfs servers
-doing uid mapping. So the default has been to disable this broken_suid in the
-linux kernel.
-
-The long and short of it is this: If you're using an old linux distribution,
-some sort of old suid program or an older unix of some type you might have to
-mount from your clients with the broken_suid option to mount. However, most
-recent unixes and linux distros have xterm and such programs as normal
-executables with no suid status; they call other programs to do their setuid
-work.
-
-You enter the above options in the options column, with the rsize and wsize,
-separated by commas.
------------------------------------------------------------------------------
-
-6.3.3. Securing portmapper, rpc.statd, and rpc.lockd on the client
-
-In the current (2.2.18+) implementation of NFS, full file locking is
-supported. This means that rpc.statd and rpc.lockd must be running on the
-client in order for locks to function correctly. These services require the
-portmapper to be running. So, most of the problems you will find with nfs on
-the server you may also be plagued with on the client. Read through the
-portmapper section above for information on securing the portmapper.
------------------------------------------------------------------------------
-
-6.4. NFS and firewalls (ipchains and netfilter)
-
-IPchains (under the 2.2.X kernels) and netfilter (under the 2.4.x kernels)
-allow a good level of security - instead of relying on the daemon (or perhaps
-its TCP wrapper) to determine which machines can connect, the connection
-attempt is allowed or disallowed at a lower level.
In this case, you can stop
-the connection much earlier and more globally, which can protect you from all
-sorts of attacks.
-
-Describing how to set up a Linux firewall is well beyond the scope of this
-document. Interested readers may wish to read the [http://www.linuxdoc.org/
-HOWTO/Firewall-HOWTO.html] Firewall-HOWTO or the [http://www.linuxdoc.org/
-HOWTO/IPCHAINS-HOWTO.HTML] IPCHAINS-HOWTO. For users of kernel 2.4 and above
-you might want to visit the netfilter webpage at: [http://
-netfilter.filewatcher.org] http://netfilter.filewatcher.org. If you are
-already familiar with the workings of ipchains or netfilter this section will
-give you a few tips on how to better set up your NFS daemons to more easily
-firewall and protect them.
-
-A good rule to follow for your firewall configuration is to deny all, and
-allow only some - this helps to keep you from accidentally allowing more than
-you intended.
-
-In order to understand how to firewall the NFS daemons, it will help to
-briefly review how they bind to ports.
-
-When a daemon starts up, it requests a free port from the portmapper. The
-portmapper gets the port for the daemon and keeps track of the port currently
-used by that daemon. When other hosts or processes need to communicate with
-the daemon, they request the port number from the portmapper in order to find
-the daemon. So the ports will perpetually float because different ports may
-be free at different times and so the portmapper will allocate them
-differently each time. This is a pain for setting up a firewall. If you never
-know where the daemons are going to be then you don't know precisely which
-ports to allow access to. This might not be a big deal for many people
-running on a protected or isolated LAN. For those people on a public network,
-though, this is horrible.
-
-In kernels 2.4.13 and later with nfs-utils 0.3.3 or later you no longer have
-to worry about the floating of ports in the portmapper.
Now all of the
-daemons pertaining to nfs can be "pinned" to a port. Most of them nicely take
-a -p option when they are started; those daemons that are started by the
-kernel take some kernel arguments or module options. They are described
-below.
-
-Some of the daemons involved in sharing data via nfs are already bound to a
-port. portmap is always on port 111 tcp and udp. nfsd is always on port 2049
-TCP and UDP (however, as of kernel 2.4.17, NFS over TCP is considered
-experimental and is not for use on production machines).
-
-The other daemons, statd, mountd, lockd, and rquotad, will normally move
-around to the first available port they are informed of by the portmapper.
-
-To force statd to bind to a particular port, use the -p portnum option. To
-force statd to respond on a particular port, additionally use the -o portnum
-option when starting it.
-
-To force mountd to bind to a particular port use the -p portnum option.
-
-For example, to have statd bind to port 32765 and respond on port 32766,
-and mountd bind to port 32767, you would type:
-# statd -p 32765 -o 32766
-# mountd -p 32767
-
-lockd is started by the kernel when it is needed. Therefore you need to pass
-module options (if you have it built as a module) or kernel options to force
-lockd to listen and respond only on certain ports.
-
-If you are using loadable modules and you would like to specify these options
-in your /etc/modules.conf file add a line like this to the file:
-options lockd nlm_udpport=32768 nlm_tcpport=32768
-
-The above line would specify the udp and tcp port for lockd to be 32768.
-
-If you are not using loadable modules or if you have compiled lockd into the
-kernel instead of building it as a module then you will need to pass it an
-option on the kernel boot line.
-
-It should look something like this:
- vmlinuz 3 root=/dev/hda1 lockd.udpport=32768 lockd.tcpport=32768
-
-The port numbers do not have to match but it would simply add unnecessary
-confusion if they didn't.
-
-If you are using quotas and using rpc.rquotad to make these quotas viewable
-over nfs you will need to also take it into account when setting up your
-firewall. There are two rpc.rquotad source trees. One of those is maintained
-in the nfs-utils tree. The other in the quota-tools tree. They do not operate
-identically. The one provided with nfs-utils supports binding the daemon to a
-port with the -p directive. The one in quota-tools does not. Consult your
-distribution's documentation to determine if yours does.
-
-For the sake of this discussion let's describe a network and set up a
-firewall to protect our nfs server. Our nfs server is 192.168.0.42 and our
-client is 192.168.0.45 only. As in the example above, statd has been started
-so that it only binds to port 32765 for incoming requests and it must answer
-on port 32766. mountd is forced to bind to port 32767. lockd's module
-parameters have been set to bind to 32768. nfsd is, of course, on port 2049
-and the portmapper is on port 111.
-
-We are not using quotas.
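The port-pinning choices for this scenario can be collected in one place. The sketch below is an editorial summary (not part of the original text): it only prints the commands and the modules.conf line described above, since actually starting the daemons requires root and a configured NFS server.

```shell
#!/bin/sh
# Hedged sketch: print the pinned-port configuration used in the
# firewall example.  Port numbers match the text; nothing is executed.
STATD_IN=32765   # statd -p: port to bind for incoming requests
STATD_OUT=32766  # statd -o: port to answer from
MOUNTD=32767     # mountd -p
LOCKD=32768      # lockd udp/tcp ports, set via module options

echo "statd -p $STATD_IN -o $STATD_OUT"
echo "mountd -p $MOUNTD"
# For a modular lockd, this line belongs in /etc/modules.conf:
echo "options lockd nlm_udpport=$LOCKD nlm_tcpport=$LOCKD"
```

Keeping the four port numbers in one contiguous range (32765-32768) is what lets the firewall below allow them with a single port-range rule.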
-
-Using IPCHAINS, a simple firewall might look something like this:
-ipchains -A input -f -j ACCEPT -s 192.168.0.45
-ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 6 -j ACCEPT
-ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 17 -j ACCEPT
-ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 17 -j ACCEPT
-ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 6 -j ACCEPT
-ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 6 -j ACCEPT
-ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 17 -j ACCEPT
-ipchains -A input -s 0/0 -d 0/0 -p 6 -j DENY -y -l
-ipchains -A input -s 0/0 -d 0/0 -p 17 -j DENY -l
-
-The equivalent set of commands in netfilter is (note that iptables matches
-destination ports with --dport, has no DENY target, and does its logging
-with a separate LOG rule):
-iptables -A INPUT -f -s 192.168.0.45 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p tcp --dport 32765:32768 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p udp --dport 32765:32768 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p tcp --dport 2049 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p udp --dport 2049 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p tcp --dport 111 -j ACCEPT
-iptables -A INPUT -s 192.168.0.45 -p udp --dport 111 -j ACCEPT
-iptables -A INPUT -p tcp --syn -j LOG
-iptables -A INPUT -p tcp --syn -j DROP
-iptables -A INPUT -p udp -j LOG
-iptables -A INPUT -p udp -j DROP
-
-The first line says to accept all packet fragments (except the first packet
-fragment which will be treated as a normal packet). In theory no packet will
-pass through until it is reassembled, and it won't be reassembled unless the
-first packet fragment is passed. Of course there are attacks that can be
-generated by overloading a machine with packet fragments. But NFS won't work
-correctly unless you let fragments through. See Section 7.8 for details.
-
-The other lines allow specific connections from any port on our client host
-to the specific ports we have made available on our server. This means that
-if, say, 192.168.0.46 attempts to contact the NFS server it will not be able
-to mount or see what mounts are available.
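The rule set can be checked from the allowed client with the standard RPC query tools. This is an editorial sketch, not from the original text; it assumes the example server address 192.168.0.42 and that rpcinfo and showmount are installed on the client.

```shell
#!/bin/sh
# Hedged sketch: verify the firewall from the allowed client
# (192.168.0.45 in the example above).
SERVER="${1:-192.168.0.42}"

# List the RPC services registered with the server's portmapper;
# this only works if port 111 is open to this client.
rpcinfo -p "$SERVER" || echo "portmapper unreachable - check port 111"

# Ask mountd for the export list; this exercises the pinned mountd port.
showmount -e "$SERVER" || echo "mountd unreachable - check port 32767"
```

Run the same two commands from a host that is not in the allow rules (such as 192.168.0.46): both should fail, confirming that the final DROP rules are doing their job.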
- -With the new port pinning capabilities it is obviously much easier to control -what hosts are allowed to mount your NFS shares. It is worth mentioning that -NFS is not an encrypted protocol and anyone on the same physical network -could sniff the traffic and reassemble the information being passed back and -forth. ------------------------------------------------------------------------------ - -6.5. Tunneling NFS through SSH - -One method of encrypting NFS traffic over a network is to use the -port-forwarding capabilities of ssh. However, as we shall see, doing so has a -serious drawback if you do not utterly and completely trust the local users -on your server. - -The first step will be to export files to the localhost. For example, to -export the /home partition, enter the following into /etc/exports: -/home 127.0.0.1(rw) - -The next step is to use ssh to forward ports. For example, ssh can tell the -server to forward to any port on any machine from a port on the client. Let -us assume, as in the previous section, that our server is 192.168.0.42, and -that we have pinned mountd to port 32767 using the argument -p 32767. Then, -on the client, we'll type: - # ssh root@192.168.0.42 -L 250:localhost:2049 -f sleep 60m - # ssh root@192.168.0.42 -L 251:localhost:32767 -f sleep 60m - -The above command causes ssh on the client to take any request directed at -the client's port 250 and forward it, first through sshd on the server, and -then on to the server's port 2049. The second line causes a similar type of -forwarding between requests to port 251 on the client and port 32767 on the -server. The localhost is relative to the server; that is, the forwarding will -be done to the server itself. The port could otherwise have been made to -forward to any other machine, and the requests would look to the outside -world as if they were coming from the server. Thus, the requests will appear -to NFSD on the server as if they are coming from the server itself. 
Note that -in order to bind to a port below 1024 on the client, we have to run this -command as root on the client. Doing this will be necessary if we have -exported our filesystem with the default secure option. - -Finally, we are pulling a little trick with the last option, -f sleep 60m. -Normally, when we use ssh, even with the -L option, we will open up a shell -on the remote machine. But instead, we just want the port forwarding to -execute in the background so that we get our shell on the client back. So, we -tell ssh to execute a command in the background on the server to sleep for 60 -minutes. This will cause the port to be forwarded for 60 minutes until it -gets a connection; at that point, the port will continue to be forwarded -until the connection dies or until the 60 minutes are up, whichever happens -later. The above command could be put in our startup scripts on the client, -right after the network is started. - -Next, we have to mount the filesystem on the client. To do this, we tell the -client to mount a filesystem on the localhost, but at a different port from -the usual 2049. Specifically, an entry in /etc/fstab would look like: - localhost:/home /mnt/home nfs rw,hard,intr,port=250,mountport=251 0 0 - -Having done this, we can see why the above will be incredibly insecure if we -have any ordinary users who are able to log in to the server locally. If they -can, there is nothing preventing them from doing what we did and using ssh to -forward a privileged port on their own client machine (where they are -legitimately root) to ports 2049 and 32767 on the server. Thus, any ordinary -user on the server can mount our filesystems with the same rights as root on -our client. 
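Before committing the tunnelled mount to /etc/fstab, the same mount can be attempted once from the command line (a sketch using the example server, paths, and ports from this section; run as root on the client):

```shell
# Mount the NFS export through the ssh-forwarded ports (sketch).
# port=250 reaches nfsd via the first tunnel, mountport=251 reaches
# mountd via the second; host, paths and ports are the examples above.
mount -t nfs -o rw,hard,intr,port=250,mountport=251 localhost:/home /mnt/home
```

If this works by hand, the equivalent fstab entry shown above should behave the same way at boot, provided the ssh tunnels are already up.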
-
-If you are using an NFS server that does not have a way for ordinary users to
-log in, and you wish to use this method, there are two additional caveats:
-First, the connection travels from the client to the server via sshd;
-therefore you will have to leave port 22 (where sshd listens) open to your
-client on the firewall. However you do not need to leave the other ports,
-such as 2049 and 32767, open anymore. Second, file locking will no longer
-work. It is not possible to ask statd or the locking manager to make requests
-to a particular port for a particular mount; therefore, any locking requests
-will cause statd to connect to statd on localhost, i.e., itself, and it will
-fail with an error. Any attempt to correct this would require a major rewrite
-of NFS.
-
-It may also be possible to use IPSec to encrypt network traffic between your
-client and your server, without compromising any local security on the
-server; this will not be taken up here. See the [http://www.freeswan.org/]
-FreeS/WAN home page for details on using IPSec under Linux.
-----------------------------------------------------------------------------
-
-6.6. Summary
-
-If you use the hosts.allow, hosts.deny, root_squash, nosuid and privileged
-port features in the portmapper/NFS software, you avoid many of the presently
-known bugs in NFS and can almost feel secure about that at least. But still,
-after all that: When an intruder has access to your network, s/he can make
-strange commands appear in your .forward or read your mail when /home or
-/var/mail is NFS exported. For the same reason, you should never access your
-PGP private key over NFS. Or at least you should know the risk involved. And
-now you know a bit of it.
-
-NFS and the portmapper make up a complex subsystem and therefore it is not
-totally unlikely that new bugs will be discovered, either in the basic design
-or the implementation we use. There might even be holes known now, which
-someone is abusing. But that's life.
-----------------------------------------------------------------------------
-
-7. Troubleshooting
-
-
-  This is intended as a step-by-step guide to what to do when things go
-  wrong using NFS. Usually trouble first rears its head on the client end,
-  so this diagnostic will begin there.
-
-
-----------------------------------------------------------------------------
-7.1. Unable to See Files on a Mounted File System
-
-First, check to see if the file system is actually mounted. There are several
-ways of doing this. The most reliable way is to look at the file
-/proc/mounts, which will list all mounted filesystems and give details about
-them. If this doesn't work (for example if you don't have the /proc
-filesystem compiled into your kernel), you can type mount, although you get
-less information.
-
-If the file system appears to be mounted, then you may have mounted another
-file system on top of it (in which case you should unmount and remount both
-volumes), or you may have exported the file system on the server before you
-mounted it there, in which case NFS is exporting the underlying mount point
-(if so then you need to restart NFS on the server).
-
-If the file system is not mounted, then attempt to mount it. If this does not
-work, see Symptom 3.
-----------------------------------------------------------------------------
-
-7.2. File requests hang or time out waiting for access to the file.
-
-This usually means that the client is unable to communicate with the server.
-See Symptom 3 letter b.
-----------------------------------------------------------------------------
-
-7.3. Unable to mount a file system
-
-There are two common errors that mount produces when it is unable to mount a
-volume. These are:
-
- a. failed, reason given by server: Permission denied
-
-    This means that the server does not recognize that you have access to the
-    volume.
-
-    i.
Check your /etc/exports file and make sure that the volume is
-      exported and that your client has the right kind of access to it. For
-      example, if a client only has read access then you have to mount the
-      volume with the ro option rather than the rw option.
-
-  ii. Make sure that you have told NFS to register any changes you made to
-      /etc/exports since starting nfsd by running the exportfs command. Be
-      sure to type exportfs -ra to be extra certain that the exports are
-      being re-read.
-
- iii. Check the file /proc/fs/nfs/exports and make sure the volume and
-      client are listed correctly. (You can also look at the file
-      /var/lib/nfs/xtab for an unabridged list of how all the active export
-      options are set.) If they are not, then you have not re-exported
-      properly. If they are listed, make sure the server recognizes your
-      client as being the machine you think it is. For example, you may have
-      an old listing for the client in /etc/hosts that is throwing off the
-      server, or you may not have listed the client's complete address and it
-      may be resolving to a machine in a different domain. One trick is to
-      log in to the server from the client via ssh or telnet; if you then
-      type who, one of the listings should be your login session and the name
-      of your client machine as the server sees it. Try using this machine
-      name in your /etc/exports entry. Finally, try to ping the client from
-      the server, and try to ping the server from the client. If this doesn't
-      work, or if there is packet loss, you may have lower-level network
-      problems.
-
-  iv. It is not possible to export both a directory and its child (for
-      example both /usr and /usr/local). You should export the parent
-      directory with the necessary permissions, and all of its
-      subdirectories can then be mounted with those same permissions.
-
-
- b. RPC: Program Not Registered: (or another "RPC" error):
-
-    This means that the client does not detect NFS running on the server.
- This could be for several reasons. - - i. First, check that NFS actually is running on the server by typing - rpcinfo -p on the server. You should see something like this: - +------------------------------------------------------------+ - | program vers proto port | - | 100000 2 tcp 111 portmapper | - | 100000 2 udp 111 portmapper | - | 100011 1 udp 749 rquotad | - | 100011 2 udp 749 rquotad | - | 100005 1 udp 759 mountd | - | 100005 1 tcp 761 mountd | - | 100005 2 udp 764 mountd | - | 100005 2 tcp 766 mountd | - | 100005 3 udp 769 mountd | - | 100005 3 tcp 771 mountd | - | 100003 2 udp 2049 nfs | - | 100003 3 udp 2049 nfs | - | 300019 1 tcp 830 amd | - | 300019 1 udp 831 amd | - | 100024 1 udp 944 status | - | 100024 1 tcp 946 status | - | 100021 1 udp 1042 nlockmgr | - | 100021 3 udp 1042 nlockmgr | - | 100021 4 udp 1042 nlockmgr | - | 100021 1 tcp 1629 nlockmgr | - | 100021 3 tcp 1629 nlockmgr | - | 100021 4 tcp 1629 nlockmgr | - | | - +------------------------------------------------------------+ - This says that we have NFS versions 2 and 3, rpc.statd version 1, - network lock manager (the service name for rpc.lockd) versions 1, 3, - and 4. There are also different service listings depending on whether - NFS is travelling over TCP or UDP. UDP is usually (but not always) - the default unless TCP is explicitly requested. - - If you do not see at least portmapper, nfs, and mountd, then you need - to restart NFS. If you are not able to restart successfully, proceed - to Symptom 9. - - ii. Now check to make sure you can see it from the client. On the client, - type rpcinfo -p server where server is the DNS name or IP address of - your server. - - If you get a listing, then make sure that the type of mount you are - trying to perform is supported. For example, if you are trying to - mount using Version 3 NFS, make sure Version 3 is listed; if you are - trying to mount using NFS over TCP, make sure that is registered. - (Some non-Linux clients default to TCP). 
Type man rpcinfo for more - details on how to read the output. If the type of mount you are - trying to perform is not listed, try a different type of mount. - - If you get the error No Remote Programs Registered, then you need to - check your /etc/hosts.allow and /etc/hosts.deny files on the server - and make sure your client actually is allowed access. Again, if the - entries appear correct, check /etc/hosts (or your DNS server) and - make sure that the machine is listed correctly, and make sure you can - ping the server from the client. Also check the error logs on the - system for helpful messages: Authentication errors from bad /etc/ - hosts.allow entries will usually appear in /var/log/messages, but may - appear somewhere else depending on how your system logs are set up. - The man pages for syslog can help you figure out how your logs are - set up. Finally, some older operating systems may behave badly when - routes between the two machines are asymmetric. Try typing tracepath - [server] from the client and see if the word "asymmetric" shows up - anywhere in the output. If it does then this may be causing packet - loss. However asymmetric routes are not usually a problem on recent - linux distributions. - - If you get the error Remote system error - No route to host, but you - can ping the server correctly, then you are the victim of an - overzealous firewall. Check any firewalls that may be set up, either - on the server or on any routers in between the client and the server. - Look at the man pages for ipchains, netfilter, and ipfwadm, as well - as the [http://www.linuxdoc.org/HOWTO/IPCHAINS-HOWTO.html] - IPChains-HOWTO and the [http://www.linuxdoc.org/HOWTO/ - Firewall-HOWTO.html] Firewall-HOWTO for help. - - - ------------------------------------------------------------------------------ -7.4. I do not have permission to access files on the mounted volume. - -This could be one of two problems. 
- -If it is a write permission problem, check the export options on the server -by looking at /proc/fs/nfs/exports and make sure the filesystem is not -exported read-only. If it is you will need to re-export it read/write (don't -forget to run exportfs -ra after editing /etc/exports). Also, check /proc/ -mounts and make sure the volume is mounted read/write (although if it is -mounted read-only you ought to get a more specific error message). If not -then you need to re-mount with the rw option. - -The second problem has to do with username mappings, and is different -depending on whether you are trying to do this as root or as a non-root user. - -If you are not root, then usernames may not be in sync on the client and the -server. Type id [user] on both the client and the server and make sure they -give the same UID number. If they don't then you are having problems with -NIS, NIS+, rsync, or whatever system you use to sync usernames. Check group -names to make sure that they match as well. Also, make sure you are not -exporting with the all_squash option. If the user names match then the user -has a more general permissions problem unrelated to NFS. - -If you are root, then you are probably not exporting with the no_root_squash -option; check /proc/fs/nfs/exports or /var/lib/nfs/xtab on the server and -make sure the option is listed. In general, being able to write to the NFS -server as root is a bad idea unless you have an urgent need -- which is why -Linux NFS prevents it by default. See Section 6 for details. - -If you have root squashing, you want to keep it, and you're only trying to -get root to have the same permissions on the file that the user nobody should -have, then remember that it is the server that determines which uid root gets -mapped to. By default, the server uses the UID and GID of nobody in the /etc/ -passwd file, but this can also be overridden with the anonuid and anongid -options in the /etc/exports file. 
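As an illustration of those options, a hypothetical /etc/exports line that squashes root but maps it to an explicitly chosen uid/gid might look like this (the host is the example client from earlier sections, and 65534 is an assumed value; use the uid/gid of nobody from your server's passwd file):

```shell
# /etc/exports sketch: squash root, but map it to uid/gid 65534 explicitly
/home 192.168.0.45(rw,root_squash,anonuid=65534,anongid=65534)
```

Remember to run exportfs -ra after changing the file so the new options take effect.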
Make sure that the client and the server
-agree about which UID nobody gets mapped to.
-----------------------------------------------------------------------------
-
-7.5. When I transfer really big files, NFS takes over all the CPU cycles on
-the server and it screeches to a halt.
-
-This is a problem with the fsync() function in 2.2 kernels that causes all
-sync-to-disk requests to be cumulative, resulting in a write time that is
-quadratic in the file size. If you can, upgrading to a 2.4 kernel should
-solve the problem. Also, exporting with the no_wdelay option forces the
-program to use O_SYNC instead, which may prove faster.
-----------------------------------------------------------------------------
-
-7.6. Strange error or log messages
-
- a. Messages of the following format:
-
-    +-------------------------------------------------------------------------------------------+
-    | Jan 7 09:15:29 server kernel: fh_verify: mail/guest permission failure, acc=4, error=13   |
-    | Jan 7 09:23:51 server kernel: fh_verify: ekonomi/test permission failure, acc=4, error=13 |
-    |                                                                                           |
-    +-------------------------------------------------------------------------------------------+
-
-    These happen when an NFS setattr operation is attempted on a file you
-    don't have write access to. The messages are harmless.
-
- b.
The following messages frequently appear in the logs: - - +---------------------------------------------------------------------+ - | kernel: nfs: server server.domain.name not responding, still trying | - | kernel: nfs: task 10754 can't get a request slot | - | kernel: nfs: server server.domain.name OK | - | | - +---------------------------------------------------------------------+ - - The "can't get a request slot" message means that the client-side RPC - code has detected a lot of timeouts (perhaps due to network congestion, - perhaps due to an overloaded server), and is throttling back the number - of concurrent outstanding requests in an attempt to lighten the load. The - cause of these messages is basically sluggish performance. See Section 5 - for details. - - c. After mounting, the following message appears on the client: - - +---------------------------------------------------------------+ - |nfs warning: mount version older than kernel | - | | - +---------------------------------------------------------------+ - - It means what it says: You should upgrade your mount package and/or - am-utils. (If for some reason upgrading is a problem, you may be able to - get away with just recompiling them so that the newer kernel features are - recognized at compile time). - - d. Errors in startup/shutdown log for lockd - - You may see a message of the following kind in your boot log: - +---------------------------------------------------------------+ - |nfslock: rpc.lockd startup failed | - | | - +---------------------------------------------------------------+ - - They are harmless. Older versions of rpc.lockd needed to be started up - manually, but newer versions are started automatically by nfsd. Many of - the default startup scripts still try to start up lockd by hand, in case - it is necessary. You can alter your startup scripts if you want the - messages to go away. - - e. 
The following message appears in the logs:
-
-    +---------------------------------------------------------------+
-    |kmem_create: forcing size word alignment - nfs_fh              |
-    |                                                               |
-    +---------------------------------------------------------------+
-
-    This results from the file handle being 16 bits instead of a multiple of
-    32 bits, which makes the kernel grimace. It is harmless.
-
-
-----------------------------------------------------------------------------
-7.7. Real permissions don't match what's in /etc/exports.
-
-/etc/exports is very sensitive to whitespace - so the following statements
-are not the same:
-/export/dir hostname(rw,no_root_squash)
-/export/dir hostname (rw,no_root_squash)
-
-The first will grant hostname rw access to /export/dir without squashing root
-privileges. The second will grant hostname rw privileges with root squash and
-it will grant everyone else read/write access, without squashing root
-privileges. Nice huh?
-----------------------------------------------------------------------------
-
-7.8. Flaky and unreliable behavior
-
-Simple commands such as ls work, but anything that transfers a large amount
-of information causes the mount point to lock.
-
-This could be one of two problems:
-
-  i. It will happen if you have ipchains on at the server and/or the client
-     and you are not allowing fragmented packets through the chains. Allow
-     fragments from the remote host and you'll be able to function again. See
-     Section 6.4 for details on how to do this.
-
- ii. You may be using a larger rsize and wsize in your mount options than the
-     server supports. Try reducing rsize and wsize to 1024 and seeing if the
-     problem goes away. If it does, then increase them slowly to a more
-     reasonable value.
-
-
-----------------------------------------------------------------------------
-7.9. nfsd won't start
-
-Check the file /etc/exports and make sure root has read permission. Check the
-binaries and make sure they are executable.
Make sure your kernel was -compiled with NFS server support. You may need to reinstall your binaries if -none of these ideas helps. ------------------------------------------------------------------------------ - -7.10. File Corruption When Using Multiple Clients - -If a file has been modified within one second of its previous modification -and left the same size, it will continue to generate the same inode number. -Because of this, constant reads and writes to a file by multiple clients may -cause file corruption. Fixing this bug requires changes deep within the -filesystem layer, and therefore it is a 2.5 item. ------------------------------------------------------------------------------ - -8. Using Linux NFS with Other OSes - -Every operating system, Linux included, has quirks and deviations in the -behavior of its NFS implementation -- sometimes because the protocols are -vague, sometimes because they leave gaping security holes. Linux will work -properly with all major vendors' NFS implementations, as far as we know. -However, there may be extra steps involved to make sure the two OSes are -communicating clearly with one another. This section details those steps. - -In general, it is highly ill-advised to attempt to use a Linux machine with a -kernel before 2.2.18 as an NFS server for non-Linux clients. Implementations -with older kernels may work fine as clients; however if you are using one of -these kernels and get stuck, the first piece of advice we would give is to -upgrade your kernel and see if the problems go away. The user-space NFS -implementations also do not work well with non-Linux clients. - -Following is a list of known issues for using Linux together with major -operating systems. ------------------------------------------------------------------------------ - -8.1. AIX - -8.1.1. 
Linux Clients and AIX Servers - -The format for the /etc/exports file for our example in Section 3 is: - /usr slave1.foo.com:slave2.foo.com,access=slave1.foo.com:slave2.foo.com - /home slave1.foo.com:slave2.foo.com,rw=slave1.foo.com:slave2.foo.com - ------------------------------------------------------------------------------ - -8.1.2. AIX clients and Linux Servers - -AIX uses the file /etc/filesystems instead of /etc/fstab. A sample entry, -based on the example in Section 4, looks like this: -/mnt/home: - dev = "/home" - vfs = nfs - nodename = master.foo.com - mount = true - options = bg,hard,intr,rsize=1024,wsize=1024,vers=2,proto=udp - account = false - - - i. Version 4.3.2 of AIX, and possibly earlier versions as well, requires - that file systems be exported with the insecure option, which causes NFS - to listen to requests from insecure ports (i.e., ports above 1024, to - which non-root users can bind). Older versions of AIX do not seem to - require this. - -ii. AIX clients will default to mounting version 3 NFS over TCP. If your - Linux server does not support this, then you may need to specify vers=2 - and/or proto=udp in your mount options. - -iii. Using netmasks in /etc/exports seems to sometimes cause clients to lose - mounts when another client is reset. This can be fixed by listing out - hosts explicitly. - -iv. Apparently automount in AIX 4.3.2 is rather broken. - - ------------------------------------------------------------------------------ -8.2. BSD - -8.2.1. BSD servers and Linux clients - -BSD kernels tend to work better with larger block sizes. ------------------------------------------------------------------------------ - -8.2.2. Linux servers and BSD clients - -Some versions of BSD may make requests to the server from insecure ports, in -which case you will need to export your volumes with the insecure option. See -the man page for exports(5) for more details. 
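A hypothetical /etc/exports entry with the insecure option for such a BSD client might look like this (the hostname is made up for the example):

```shell
# /etc/exports sketch: accept requests from source ports above 1024,
# for clients (some BSDs, per the text above) that mount from
# non-privileged ports
/export/data bsdclient.foo.com(rw,insecure)
```

As always, run exportfs -ra after editing the file.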
-----------------------------------------------------------------------------
-
-8.3. Tru64 Unix
-
-8.3.1. Tru64 Unix Servers and Linux Clients
-
-In general, Tru64 Unix servers work quite smoothly with Linux clients. The
-format for the /etc/exports file for our example in Section 3 is:
-
-/usr slave1.foo.com:slave2.foo.com \
- -access=slave1.foo.com:slave2.foo.com \
-/home slave1.foo.com:slave2.foo.com \
- -rw=slave1.foo.com:slave2.foo.com \
- -root=slave1.foo.com:slave2.foo.com
-
-
-(The root option is listed in the last entry for informational purposes only;
-its use is not recommended unless necessary.)
-
-Tru64 checks the /etc/exports file every time there is a mount request so you
-do not need to run the exportfs command; in fact on many versions of Tru64
-Unix the command does not exist.
-----------------------------------------------------------------------------
-
-8.3.2. Linux Servers and Tru64 Unix Clients
-
-There are two issues to watch out for here. First, Tru64 Unix mounts using
-Version 3 NFS by default. You will see mount errors if your Linux server does
-not support Version 3 NFS. Second, in Tru64 Unix 4.x, NFS locking requests
-are made by daemon. You will therefore need to specify the insecure_locks
-option on all volumes you export to a Tru64 Unix 4.x client; see the exports
-man pages for details.
-----------------------------------------------------------------------------
-
-8.4. HP-UX
-
-8.4.1. HP-UX Servers and Linux Clients
-
-A sample /etc/exports entry on HP-UX looks like this:
-/usr -ro,access=slave1.foo.com:slave2.foo.com
-/home -rw=slave1.foo.com:slave2.foo.com:root=slave1.foo.com:slave2.foo.com
-
-(The root option is listed in the last entry for informational purposes only;
-its use is not recommended unless necessary.)
-----------------------------------------------------------------------------
-
-8.4.2.
Linux Servers and HP-UX Clients
-
-HP-UX diskless clients will require at least a kernel version 2.2.19 (or
-patched 2.2.18) for device files to export correctly. Also, any exports to an
-HP-UX client will need to be exported with the insecure_locks option.
-----------------------------------------------------------------------------
-
-8.5. IRIX
-
-8.5.1. IRIX Servers and Linux Clients
-
-A sample /etc/exports entry on IRIX looks like this:
-/usr -ro,access=slave1.foo.com:slave2.foo.com
-/home -rw=slave1.foo.com:slave2.foo.com:root=slave1.foo.com:slave2.foo.com
-
-(The root option is listed in the last entry for informational purposes only;
-its use is not recommended unless necessary.)
-
-There are reportedly problems when using the nohide option on exports to
-Linux 2.2-based systems. This problem is fixed in the 2.4 kernel. As a
-workaround, you can export and mount lower-down file systems separately.
-
-As of kernel 2.4.17, there continue to be several minor interoperability
-issues that may require a kernel upgrade. In particular:
-
-  * Make sure that Trond Myklebust's seekdir (or dir) kernel patch is
-    applied. The latest version (for 2.4.17) is located at:
-
-    [http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif]
-    http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif
-
-  * IRIX servers do not always use the same fsid attribute field across
-    reboots, which results in inode number mismatch errors on a Linux client
-    if the mounted IRIX server reboots. A patch is available from:
-
-    [http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/]
-    http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/
-
-  * Linux kernels v2.4.9 and above have problems reading large directories
-    (hundreds of files) from exported IRIX XFS file systems that were made
-    with naming version=1.
The reason for the problem can be found at:
-
-    [http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/]
-    http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/
-
-    The naming version can be found by using (on the IRIX server):
-      xfs_growfs -n mount_point
-
-    The workaround is to export these file systems using the -32bitclients
-    option in the /etc/exports file. The fix is to convert the file system to
-    'naming version=2'. Unfortunately the only way to do this is by a
-    backup/mkfs/restore.
-
-    mkfs_xfs on IRIX 6.5.14 (and above) creates naming version=2 XFS file
-    systems by default. On IRIX 6.5.5 to 6.5.13, use:
-      mkfs_xfs -n version=2 device
-
-    Versions of IRIX prior to 6.5.5 do not support naming version=2 XFS file
-    systems.
-
-
-----------------------------------------------------------------------------
-8.5.2. IRIX clients and Linux servers
-
-IRIX versions up to 6.5.12 have problems mounting file systems exported from
-Linux boxes - the mount point "gets lost," e.g.,
-  # mount linux:/disk1 /mnt
-  # cd /mnt/xyz/abc
-  # pwd
-  /xyz/abc
-
-This is a known IRIX bug (SGI bug 815265 - IRIX not liking file handles of
-less than 32 bytes), which is fixed in IRIX 6.5.13. If it is not possible to
-upgrade to IRIX 6.5.13, then the unofficial workaround is to force the Linux
-nfsd to always use 32 byte file handles.
-
-A number of patches exist - see:
-
-  * [http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/]
-    http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/
-
-  * [http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html]
-    http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html
-
-
-----------------------------------------------------------------------------
-8.6. Solaris
-
-8.6.1. Solaris Servers
-
-Solaris has a slightly different format on the server end from other
-operating systems. Instead of /etc/exports, the configuration file is
-/etc/dfs/dfstab.
Entries are of the form of a share command, where the syntax for -the example in Section 3 would look like -share -o rw=slave1,slave2 -d "Master Usr" /usr - -and instead of running exportfs after editing, you run shareall. - -Solaris servers are especially sensitive to packet size. If you are using a -Linux client with a Solaris server, be sure to set rsize and wsize to 32768 -at mount time. - -Finally, there is an issue with root squashing on Solaris: root gets mapped -to the user noone, which is not the same as the user nobody. If you are -having trouble with file permissions as root on the client machine, be sure -to check that the mapping works as you expect. ------------------------------------------------------------------------------ - -8.6.2. Solaris Clients - -Solaris clients will regularly produce the following message: -+---------------------------------------------------------------------------+ -|svc: unknown program 100227 (me 100003) | -| | -+---------------------------------------------------------------------------+ - -This happens because Solaris clients, when they mount, try to obtain ACL -information - which Linux obviously does not have. The messages can safely be -ignored. - -There are two known issues with diskless Solaris clients: First, a kernel -version of at least 2.2.19 is needed to get /dev/null to export correctly. -Second, the packet size may need to be set extremely small (i.e., 1024) on -diskless sparc clients because the clients do not know how to assemble -packets in reverse order. This can be done from /etc/bootparams on the -clients. ------------------------------------------------------------------------------ - -8.7. SunOS - -SunOS only has NFS Version 2 over UDP. ------------------------------------------------------------------------------ - -8.7.1. SunOS Servers - -On the server end, SunOS uses the most traditional format for its /etc/ -exports file. 
The example in Section 3 would look like:
-/usr -access=slave1.foo.com,slave2.foo.com
-/home -rw=slave1.foo.com,slave2.foo.com, root=slave1.foo.com,slave2.foo.com
-
-
-Again, the root option is listed for informational purposes and is not
-recommended unless necessary.
------------------------------------------------------------------------------
-
-8.7.2. SunOS Clients
-
-Be advised that SunOS makes all NFS locking requests as daemon, and therefore
-you will need to add the insecure_locks option to any volumes you export to a
-SunOS machine. See the exports man page for details.
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/NTP.xml b/LDP/guide/docbook/Linux-Networking/NTP.xml
deleted file mode 100644
index c6ccd356..00000000
--- a/LDP/guide/docbook/Linux-Networking/NTP.xml
+++ /dev/null
@@ -1,416 +0,0 @@
-
-
-NTP
-
-
-Time synchronisation is generally considered important in the computing
-environment. There are a number of reasons why this is important: it makes
-sure your scheduled cron tasks on your various servers run well together,
-it allows better use of log files between various machines to help
-troubleshoot problems, and synchronised, correct logs are also useful if
-your servers are ever attacked by crackers (either to report the attempt
-to organisations such as AusCERT or to use in court against the bad guys).
-Users who have overclocked their machine might also use time synchronisation
-techniques to bring the time on their machines back to an accurate figure
-at regular intervals, say every 20 minutes or so. This section contains an
-overview of time keeping under Linux and some information about NTP, a
-protocol which can be used to accurately reset the time across a computer
-network.
-
-
-2. How Linux Keeps Track of Time
-
-2.1.
Basic Strategies
-
-
-A Linux system actually has two clocks: One is the battery powered
-"Real Time Clock" (also known as the "RTC", "CMOS clock", or "Hardware
-clock") which keeps track of time when the system is turned off but is
-not used when the system is running. The other is the "system clock"
-(sometimes called the "kernel clock" or "software clock") which is a
-software counter based on the timer interrupt. It does not exist when
-the system is not running, so it has to be initialized from the RTC
-(or some other time source) at boot time. References to "the clock" in
-the ntpd documentation refer to the system clock, not the RTC.
-
-
-
-The two clocks will drift at different rates, so they will gradually
-drift apart from each other, and also away from the "real" time. The
-simplest way to keep them on time is to measure their drift rates and
-apply correction factors in software. Since the RTC is only used when
-the system is not running, the correction factor is applied when the
-clock is read at boot time, using clock(8) or hwclock(8). The system
-clock is corrected by adjusting the rate at which the system time is
-advanced with each timer interrupt, using adjtimex(8).
-
-
-
-A crude alternative to adjtimex(8) is to have cron run clock(8) or
-hwclock(8) periodically to sync the system time to the (corrected)
-RTC. This was recommended in the clock(8) man page, and it works if
-you do it often enough that you don't cause large "jumps" in the
-system time, but adjtimex(8) is a more elegant solution. Some
-applications may complain if the time jumps backwards.
-
-
-
-The next step up in accuracy is to use a program like ntpd to read the
-time periodically from a network time server or radio clock, and
-continuously adjust the rate of the system clock so that the times
-always match, without causing sudden "jumps" in the system time.
If -you always have a network connection at boot time, you can ignore the -RTC completely and use ntpdate (which comes with the ntpd package) to -initialize the system clock from a time server-- either a local server -on a LAN, or a remote server on the internet. But if you sometimes -don't have a network connection, or if you need the time to be -accurate during the boot sequence before the network is active, then -you need to maintain the time in the RTC as well. - - -2.2. Potential Conflicts - - -It might seem obvious that if you're using a program like ntpd, you -would want to sync the RTC to the (corrected) system clock. But this -turns out to be a bad idea if the system is going to stay shut down -longer than a few minutes, because it interferes with the programs -that apply the correction factor to the RTC at boot time. - - - -If the system runs 24/7 and is always rebooted immediately whenever -it's shut down, then you can just set the RTC from the system clock -right before you reboot. The RTC won't drift enough to make a -difference in the time it takes to reboot, so you don't need to know -its drift rate. - - - -Of course the system may go down unexpectedly, so some versions of the -kernel sync the RTC to the system clock every 11 minutes if the system -clock has been adjusted by another program. The RTC won't drift enough -in 11 minutes to make any difference, but if the system is down long -enough for the RTC to drift significantly, then you have a problem: -the programs that apply the drift correction to the RTC need to know -*exactly* when it was last reset, and the kernel doesn't record that -information anywhere. - - - -Some unix "traditionalists" might wonder why anyone would run a linux -system less than 24/7, but some of us run dual-boot systems with -another OS running some of the time, or run Linux on laptops that have -to be shut down to conserve battery power when they're not being used. 
-Other people just don't like to leave machines running unattended for -long periods of time (even though we've heard all the arguments in -favor of it). So the "every 11 minutes" feature becomes a bug. - - - -This "feature/bug" appears to behave differently in different versions -of the kernel (and possibly in different versions of xntpd and ntpd as -well), so if you're running both ntpd and hwclock you may need to test -your system to see what it actually does. If you can't keep the kernel -from resetting the RTC, you might have to run without a correction -factor on the RTC. - - - -The part of the kernel that controls this can be found in -/usr/src/linux-2.0.34/arch/i386/kernel/time.c (where the version -number in the path will be the version of the kernel you're running). -If the variable time_status is set to TIME_OK then the kernel will -write the system time to the RTC every 11 minutes, otherwise it leaves -the RTC alone. Calls to adjtimex(2) (as used by ntpd and timed, for -example) may turn this on. Calls to settimeofday(2) will set -time_status to TIME_UNSYNC, which tells the kernel not to adjust the -RTC. I have not found any real documentation on this. - - - -I've heard reports that some versions of the kernel may have problems -with "sleep modes" that shut down the CPU to save energy. The best -solution is to keep your kernel up to date, and refer any problems to -the people who maintain the kernel. - - - -If you get bizarre results from the RTC you may have a hardware -problem. Some RTC chips include a lithium battery that can run down, -and some motherboards have an option for an external battery (be sure -the jumper is set correctly). The same battery maintains the CMOS RAM, -but the clock takes more power and is likely to fail first. Bizarre -results from the system clock may mean there is a problem with -interrupts. - - -2.3. Should the RTC use Local Time or UTC, and What About DST? 
- - -The Linux "system clock" actually just counts the number of seconds -past Jan 1, 1970, and is always in UTC (or GMT, which is technically -different but close enough that casual users tend to use both terms -interchangeably). UTC does not change as DST comes and goes-- what -changes is the conversion between UTC and local time. The translation -to local time is done by library functions that are linked into the -application programs. - - - -This has two consequences: First, any application that needs to know -the local time also needs to know what time zone you're in, and -whether DST is in effect or not (see the next section for more on time -zones). Second, there is no provision in the kernel to change either -the system clock or the RTC as DST comes and goes, because UTC doesn't -change. Therefore, machines that only run Linux should have the RTC -set to UTC, not local time. - - - -However, many people run dual-boot systems with other OS's that expect -the RTC to contain the local time, so hwclock needs to know whether -your RTC is in local time or UTC, which it then converts to seconds -past Jan 1, 1970 (UTC). This still does not provide for seasonal -changes to the RTC, so the change must be made by the other OS (this -is the one exception to the rule against letting more than one program -change the time in the RTC). - - - -Unfortunately, there are no flags in the RTC or the CMOS RAM to -indicate standard time vs DST, so each OS stores this information -someplace where the other OS's can't find it. This means that hwclock -must assume that the RTC always contains the correct local time, even -if the other OS has not been run since the most recent seasonal time -change. - - - -If Linux is running when the seasonal time change occurs, the system -clock is unaffected and applications will make the correct conversion. 
-But if linux has to be rebooted for any reason, the system clock will -be set to the time in the RTC, which will be off by one hour until the -other OS (usually Windows) has a chance to run. - - - -There is no way around this, but Linux doesn't crash very often, so -the most likely reason to reboot on a dual-boot system is to run the -other OS anyway. But beware if you're one of those people who shuts -down Linux whenever you won't be using it for a while-- if you haven't -had a chance to run the other OS since the last time change, the RTC -will be off by an hour until you do. - - - -Some other documents have stated that setting the RTC to UTC allows -Linux to take care of DST properly. This is not really wrong, but it -doesn't tell the whole story-- as long as you don't reboot, it does -not matter which time is in the RTC (or even if the RTC's battery -dies). Linux will maintain the correct time either way, until the next -reboot. In theory, if you only reboot once a year (which is not -unreasonable for Linux), DST could come and go and you'd never notice -that the RTC had been wrong for several months, because the system -clock would have stayed correct all along. But since you can't predict -when you'll want to reboot, it's better to have the RTC set to UTC if -you're not running another OS that requires local time. - - - -The Dallas Semiconductor RTC chip (which is a drop-in replacement for -the Motorola chip used in the IBM AT and clones) actually has the -ability to do the DST conversion by itself, but this feature is not -used because the changeover dates are hard-wired into the chip and -can't be changed. Current versions change on the first Sunday in April -and the last Sunday in October, but earlier versions used different -dates (and obviously this doesn't work in countries that use other -dates). Also, the RTC is often integrated into the motherboard's -"chipset" (rather than being a separate chip) and I don't know if they -all have this ability. 
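Since the system clock is just a count of UTC seconds and local time is only a library conversion, the distinction is easy to demonstrate from a shell. This is an illustrative sketch; the zone name is just an example, and any entry under /usr/share/zoneinfo will do:

```shell
# The kernel's "now" is seconds since 1970-01-01 UTC; the TZ variable only
# changes how library routines render that count as a wall-clock time.
utc=$(TZ=UTC date +%s)
melb=$(TZ=Australia/Melbourne date +%s)
echo "seconds since the epoch are TZ-independent: $utc vs $melb"
TZ=UTC date                   # the same instant rendered as UTC
TZ=Australia/Melbourne date   # the same instant rendered as local time
```

This is also why, while Linux is running, the DST changeover never touches the clock itself: every process derives local time from the same UTC count.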
-
-
-2.4. How Linux keeps Track of Time Zones
-
-
-You probably set your time zone correctly when you installed Linux.
-But if you have to change it for some reason, or if the local laws
-regarding DST have changed (as they do frequently in some countries),
-then you'll need to know how to change it. If your system time is off
-by some exact number of hours, you may have a time zone problem (or a
-DST problem).
-
-
-
-Time zone and DST information is stored in /usr/share/zoneinfo (or
-/usr/lib/zoneinfo on older systems). The local time zone is
-determined by a symbolic link from /etc/localtime to one of these
-files. The way to change your timezone is to change the link. If
-your local DST dates have changed, you'll have to edit the file.
-
-
-
-You can also use the TZ environment variable to change the current
-time zone, which is handy if you're logged in remotely to a machine in
-another time zone. Also see the man pages for tzset and tzfile.
-This is nicely summarized at
-
-
-
-2.5. The Bottom Line
-
-
-If you don't need sub-second accuracy, hwclock(8) and adjtimex(8) may
-be all you need. It's easy to get enthused about time servers and
-radio clocks and so on, but I ran the old clock(8) program for years
-with excellent results. On the other hand, if you have several
-machines on a LAN it can be handy (and sometimes essential) to have
-them automatically sync their clocks to each other. And the other
-stuff can be fun to play with even if you don't really need it.
-
-
-
-On machines that only run Linux, set the RTC to UTC (or GMT). On
-dual-boot systems that require local time in the RTC, be aware that if
-you have to reboot Linux after the seasonal time change, the clock may
-be temporarily off by one hour, until you have a chance to run the
-other OS. If you run more than two OS's, be sure only one of them is
-trying to adjust for DST.
-
-
-
-NTP is a standard method of synchronising time on a client from a remote
-server across the network.
NTP clients are typically installed on servers.
-Most business class ISPs provide NTP servers. Otherwise, there are a
-number of free NTP servers in Australia:
-
-
-
-The University of Melbourne ntp.cs.mu.oz.au
-University of Adelaide ntp.saard.net
-CSIRO Marine Labs, Tasmania ntp.ml.csiro.au
-CSIRO National Measurements Laboratory, Sydney ntp.syd.dms.csiro.au
-
-
-
-Xntpd (NTPv3) has been replaced by ntpd (NTPv4); the earlier version
-is no longer being maintained.
-
-
-
-Ntpd is the standard program for synchronizing clocks across a
-network, and it comes with a list of public time servers you can
-connect to. It can be a little more complicated to set up, but if
-you're interested in this kind of thing I highly recommend that you
-take a look at it.
-
-
-
-The "home base" for information on ntpd is the NTP website at
- which also includes links to all
-kinds of interesting time-related stuff (including software for other
-OS's). Some Linux distributions include ntpd on the CD. There is a
-list of public time servers at
-.
-
-
-
-A relatively new feature in ntpd is a "burst mode" which is designed
-for machines that have only intermittent dial-up access to the
-internet.
-
-
-
-Ntpd includes drivers for quite a few radio clocks (although some
-appear to be better supported than others). Most radio clocks are
-designed for commercial use and cost thousands of dollars, but there
-are some cheaper alternatives (discussed in later sections). In the
-past most were WWV or WWVB receivers, but now most of them seem to be
-GPS receivers. NIST has a PDF file that lists manufacturers of radio
-clocks on their website at
- (near the bottom of
-the page). The NTP website also includes many links to manufacturers
-of radio clocks at and
-. Either list may
-or may not be up to date at any given time :-). The list of drivers
-for ntpd is at
-.
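Tying the above together, a minimal client-side /etc/ntp.conf can be as short as the sketch below. The server names come from the Australian list earlier in this section, and the driftfile path is only a common convention; substitute servers close to you:

```
# /etc/ntp.conf -- minimal client sketch (illustrative)
server ntp.cs.mu.oz.au
server ntp.saard.net

# file in which ntpd records the measured frequency error between runs
driftfile /etc/ntp.drift
```

After editing, restart ntpd and watch `ntpq -p` to confirm that the listed peers are being polled.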
- - - -Ntpd also includes drivers for several dial-up time services. These -are all long-distance (toll) calls, so be sure to calculate the effect -on your phone bill before using them. - - -3.4. Chrony - - -Xntpd was originally written for machines that have a full-time -connection to a network time server or radio clock. In theory it can -also be used with machines that are only connected intermittently, but -Richard Curnow couldn't get it to work the way he wanted it to, so he -wrote "chrony" as an alternative for those of us who only have network -access when we're dialed in to an ISP (this is the same problem that -ntpd's new "burst mode" was designed to solve). The current version -of chrony includes drift correction for the RTC, for machines that are -turned off for long periods of time. - - - -You can get more information from Richard Curnow's website at - or . -There are also two chrony mailing lists, one for announcements and one -for discussion by users. For information send email to chrony-users- -subscribe@egroups.com or chrony-announce-subscribe@egroups.com - - - -Chrony is normally distributed as source code only, but Debian has -been including a binary in their "unstable" collection. The source -file is also available at the usual Linux archive sites. - - -3.5. Clockspeed - - -Another option is the clockspeed program by DJ Bernstein. It gets the -time from a network time server and simply resets the system clock -every three seconds. It can also be used to synchronize several -machines on a LAN. - - - -I've sometimes had trouble reaching his website at -, so if you get a DNS error try again -on another day. I'll try to update this section if I get some better -information. - - - -Note -You must be logged in as "root" to run any program that affects -the RTC or the system time, which includes most of the programs -described here. If you normally use a graphical interface for -everything, you may also need to learn some basic unix shell -commands. 
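For example, the basic RTC operations discussed in this section are all root-only hwclock(8) invocations. A sketch (the --utc flag assumes the RTC is kept in UTC, as recommended earlier for Linux-only machines):

```
# hwclock --show             # read the RTC and print it as local time
# hwclock --systohc --utc    # copy the system clock into an RTC kept in UTC
# hwclock --hctosys          # the reverse: set the system clock from the RTC
```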
-
-
-
-Note
-If you run more than one OS on your machine, you should only let
-one of them set the RTC, so they don't confuse each other. The
-exception is the twice-a-year adjustment for Daylight Saving(s)
-Time.
-
-
-
-If you run a dual-boot system that spends a lot of time running
-Windows, you may want to check out some of the clock software
-available for that OS instead. Follow the links on the NTP website at
-.
-
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/Protocols-and-Standards.xml b/LDP/guide/docbook/Linux-Networking/Protocols-and-Standards.xml
index 93f9baf1..abd5ede1 100644
--- a/LDP/guide/docbook/Linux-Networking/Protocols-and-Standards.xml
+++ b/LDP/guide/docbook/Linux-Networking/Protocols-and-Standards.xml
@@ -3310,3 +3310,85 @@ and up-to-date IPv6 implementation.
 -----------------------------------------------------------------------------
+
+
+
+STRIP
+
+
+STRIP (Starmode Radio IP) is a protocol designed specifically for
+a range of Metricom radio modems for a research project being
+conducted by Stanford University called the MosquitoNet Project.
+There is a lot of interesting reading here, even if you aren't
+directly interested in the project.
+
+
+
+The Metricom radios connect to a serial port, employ spread spectrum
+technology and are typically capable of about 100kbps. Information on
+the Metricom radios is available from the: Metricom Web Server.
+
+
+
+At present the standard network tools and utilities do not support the
+STRIP driver, so you will have to download some customized tools from
+the MosquitoNet web server. Details of the software you need are
+available at the: MosquitoNet STRIP Page.
+ + + +A summary of configuration is that you use a modified slattach program +to set the line discipline of a serial tty device to STRIP and then +configure the resulting `st[0-9]' device as you would for ethernet +with one important exception, for technical reasons STRIP does not +support the ARP protocol, so you must manually configure the ARP +entries for each of the hosts on your subnet. This shouldn't prove too +onerous. STRIP device names are `st0', `st1', etc.... The relevant +kernel compilation options are given below. + + + + + Kernel Compile Options: + + Network device support ---> + [*] Network device support + .... + [*] Radio network interfaces + < > STRIP (Metricom starmode radio IP) + + + + + + + +WaveLAN + + +The WaveLAN card is a spread spectrum wireless lan card. The card +looks very like an ethernet card in practice and is configured in much +the same way. + + + +You can get information on the Wavelan card from wavelan.com. + + + +Wavelan device names are `eth0', `eth1', etc. + + + + Kernel Compile Options: + + Network device support ---> + [*] Network device support + .... + [*] Radio network interfaces + .... + <*> WaveLAN support + + + + diff --git a/LDP/guide/docbook/Linux-Networking/Proxy-Caching.xml b/LDP/guide/docbook/Linux-Networking/Proxy-Caching.xml deleted file mode 100644 index c6a65f48..00000000 --- a/LDP/guide/docbook/Linux-Networking/Proxy-Caching.xml +++ /dev/null @@ -1,2223 +0,0 @@ - - - 8.11. Proxy Server - - The term proxy means "to do something on behalf of someone else." In - networking terms, a proxy server computer can act on the behalf of - several clients. An HTTP proxy is a machine that receives requests for - web pages from another machine (Machine A). The proxy gets the page - requested and returns the result to Machine A. The proxy may have a - cache with the requested pages, so if another machine asks for the - same page the copy in the cache will be returned instead. 
This allows
-  efficient use of bandwidth resources and shorter response times. As a
-  side effect, as client machines are not directly connected to the outside
-  world this is a way of securing the internal network. A well-configured
-  proxy can be as effective as a good firewall.
-
-  Several proxy servers exist for Linux. One popular solution is the
-  Apache proxy module. A more complete and robust implementation of an
-  HTTP proxy is SQUID.
-  * Apache
-
-  * Squid
-
-Proxy-Caching
-
-
-When a web browser retrieves information from the Internet, it stores a copy of that information
-in a cache on the local machine. When a user requests that information in future, the browser will
-check to see if the original source has been updated; if not, the browser will simply use the cached version
-rather than fetch the data again.
-
-By doing this, there is less information that needs to be downloaded, which makes the connection seem more
-responsive to users and reduces bandwidth costs.
-
-But if there are many browsers accessing the Internet through the same connection, it makes better sense to have
-a single, centralised cache so that once a single machine has requested some information, the next
-machine to try and download that information can also access it more quickly. This is the
-theory behind the proxy cache. Squid is by far the most popular cache used on the Web, and can also be used
-to accelerate Web serving.
-
-Although Squid is useful for an ISP, large businesses or even a small office can afford to use Squid to
-speed up transfers and save money, and it can easily be used to the same effect in a home with a few
-flatmates sharing a cable or ADSL connection.
-
-
-Traffic Control HOWTO
-
-Version 1.0.1
-
-Martin A. Brown
-
- [http://www.securepipe.com/] SecurePipe, Inc.
-Network Administration
-
-
-
-"Nov 2003"
-Revision History
-Revision 1.0.1 2003-11-17 Revised by: MAB
-Added link to Leonardo Balliache's documentation
-Revision 1.0 2003-09-24 Revised by: MAB
-reviewed and approved by TLDP
-Revision 0.7 2003-09-14 Revised by: MAB
-incremental revisions, proofreading, ready for TLDP
-Revision 0.6 2003-09-09 Revised by: MAB
-minor editing, corrections from Stef Coene
-Revision 0.5 2003-09-01 Revised by: MAB
-HTB section mostly complete, more diagrams, LARTC pre-release
-Revision 0.4 2003-08-30 Revised by: MAB
-added diagram
-Revision 0.3 2003-08-29 Revised by: MAB
-substantial completion of classless, software, rules, elements and components
-sections
-Revision 0.2 2003-08-23 Revised by: MAB
-major work on overview, elements, components and software sections
-Revision 0.1 2003-08-15 Revised by: MAB
-initial revision (outline complete)
-
-
- Traffic control encompasses the sets of mechanisms and operations by which
-packets are queued for transmission/reception on a network interface. The
-operations include enqueuing, policing, classifying, scheduling, shaping and
-dropping. This HOWTO provides an introduction and overview of the
-capabilities and implementation of traffic control under Linux.
-
-© 2003, Martin A. Brown
-
-
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.1 or any
- later version published by the Free Software Foundation; with no
- invariant sections, with no Front-Cover Texts, with no Back-Cover Text. A
- copy of the license is located at [http://www.gnu.org/licenses/fdl.html]
- http://www.gnu.org/licenses/fdl.html.
-
------------------------------------------------------------------------------
-Table of Contents
-1. Introduction to Linux Traffic Control
- 1.1. Target audience and assumptions about the reader
- 1.2. Conventions
- 1.3. Recommended approach
- 1.4. Missing content, corrections and feedback
-
-
-2.
Overview of Concepts
- 2.1. What is it?
- 2.2. Why use it?
- 2.3. Advantages
- 2.4. Disadvantages
- 2.5. Queues
- 2.6. Flows
- 2.7. Tokens and buckets
- 2.8. Packets and frames
-
-
-3. Traditional Elements of Traffic Control
- 3.1. Shaping
- 3.2. Scheduling
- 3.3. Classifying
- 3.4. Policing
- 3.5. Dropping
- 3.6. Marking
-
-
-4. Components of Linux Traffic Control
- 4.1. qdisc
- 4.2. class
- 4.3. filter
- 4.4. classifier
- 4.5. policer
- 4.6. drop
- 4.7. handle
-
-
-5. Software and Tools
- 5.1. Kernel requirements
- 5.2. iproute2 tools (tc)
- 5.3. tcng, Traffic Control Next Generation
- 5.4. IMQ, Intermediate Queuing device
-
-
-6. Classless Queuing Disciplines (qdiscs)
- 6.1. FIFO, First-In First-Out (pfifo and bfifo)
- 6.2. pfifo_fast, the default Linux qdisc
- 6.3. SFQ, Stochastic Fair Queuing
- 6.4. ESFQ, Extended Stochastic Fair Queuing
- 6.5. GRED, Generic Random Early Drop
- 6.6. TBF, Token Bucket Filter
-
-
-7. Classful Queuing Disciplines (qdiscs)
- 7.1. HTB, Hierarchical Token Bucket
- 7.2. PRIO, priority scheduler
- 7.3. CBQ, Class Based Queuing
-
-
-8. Rules, Guidelines and Approaches
- 8.1. General Rules of Linux Traffic Control
- 8.2. Handling a link with a known bandwidth
- 8.3. Handling a link with a variable (or unknown) bandwidth
- 8.4. Sharing/splitting bandwidth based on flows
- 8.5. Sharing/splitting bandwidth based on IP
-
-
-9. Scripts for use with QoS/Traffic Control
- 9.1. wondershaper
- 9.2. ADSL Bandwidth HOWTO script (myshaper)
- 9.3. htb.init
- 9.4. tcng.init
- 9.5. cbq.init
-
-
-10. Diagram
- 10.1. General diagram
-
-
-11. Annotated Traffic Control Links
-
-1. Introduction to Linux Traffic Control
-
- Linux offers a very rich set of tools for managing and manipulating the
-transmission of packets. The larger Linux community is very familiar with the
-tools available under Linux for packet mangling and firewalling (netfilter,
-and before that, ipchains) as well as hundreds of network services which can
-run on the operating system.
Few inside the community and fewer outside the -Linux community are aware of the tremendous power of the traffic control -subsystem which has grown and matured under kernels 2.2 and 2.4. - - This HOWTO purports to introduce the concepts of traffic control, the -traditional elements (in general), the components of the Linux traffic -control implementation and provide some guidelines . This HOWTO represents -the collection, amalgamation and synthesis of the [http://lartc.org/howto/] -LARTC HOWTO, documentation from individual projects and importantly the LARTC -mailing list over a period of study. - - The impatient soul, who simply wishes to experiment right now, is -recommended to the [http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/] -Traffic Control using tcng and HTB HOWTO and [http://lartc.org/howto/] LARTC -HOWTO for immediate satisfaction. - - ------------------------------------------------------------------------------ - -1.1. Target audience and assumptions about the reader - - The target audience for this HOWTO is the network administrator or savvy -home user who desires an introduction to the field of traffic control and an -overview of the tools available under Linux for implementing traffic control. - - I assume that the reader is comfortable with UNIX concepts and the command -line and has a basic knowledge of IP networking. Users who wish to implement -traffic control may require the ability to patch, compile and install a -kernel or software package [1]. For users with newer kernels (2.4.20+, see -also Section 5.1), however, the ability to install and use software may be -all that is required. - - Broadly speaking, this HOWTO was written with a sophisticated user in mind, -perhaps one who has already had experience with traffic control under Linux. -I assume that the reader may have no prior traffic control experience. ------------------------------------------------------------------------------ - -1.2. 
Conventions
-
- This text was written in [http://www.docbook.org/] DocBook ([http://
-www.docbook.org/xml/4.2/index.html] version 4.2) with vim. All formatting has
-been applied by [http://xmlsoft.org/XSLT/] xsltproc based on DocBook XSL and
-LDP XSL stylesheets. Typeface formatting and display conventions are similar
-to most printed and electronically distributed technical documentation.
------------------------------------------------------------------------------
-
-1.3. Recommended approach
-
- I strongly recommend that the eager reader making a first foray into the
-discipline of traffic control become only casually familiar with the tc
-command line utility before concentrating on tcng. The tcng software package
-defines an entire language for describing traffic control structures. At
-first, this language may seem daunting, but mastery of these basics will
-quickly provide the user with a much wider ability to employ (and deploy)
-traffic control configurations than the direct use of tc would afford.
-
- Where possible, I'll try to prefer describing the behaviour of the Linux
-traffic control system in an abstract manner, although in many cases I'll
-need to supply the syntax of one or the other common systems for defining
-these structures. I may not supply examples in both the tcng language and the
-tc command line, so the wise user will have some familiarity with both.
-
-
------------------------------------------------------------------------------
-
-1.4. Missing content, corrections and feedback
-
- There is content yet missing from this HOWTO. In particular, the following
-items will be added at some point to this documentation.
-
-  *  A description and diagram of GRED, WRR, PRIO and CBQ.
-
-  *  A section of examples.
-
-  *  A section detailing the classifiers.
-
-  *  A section discussing the techniques for measuring traffic.
-
-  *  A section covering meters.
-
-  *  More details on tcng.
- - - I welcome suggestions, corrections and feedback at . All errors and omissions are strictly my fault. Although I have made every -effort to verify the factual correctness of the content presented herein, I -cannot accept any responsibility for actions taken under the influence of -this documentation. - - ------------------------------------------------------------------------------ - -2. Overview of Concepts - - This section will introduce traffic control and examine reasons for it, -identify a few advantages and disadvantages and introduce key concepts used -in traffic control. ------------------------------------------------------------------------------ - -2.1. What is it? - - Traffic control is the name given to the sets of queuing systems and -mechanisms by which packets are received and transmitted on a router. This -includes deciding which (and whether) packets to accept at what rate on the -input of an interface and determining which packets to transmit in what order -at what rate on the output of an interface. - - In the overwhelming majority of situations, traffic control consists of a -single queue which collects entering packets and dequeues them as quickly as -the hardware (or underlying device) can accept them. This sort of queue is a -FIFO. - -Note The default qdisc under Linux is the pfifo_fast, which is slightly more - complex than the FIFO. - - There are examples of queues in all sorts of software. The queue is a way -of organizing the pending tasks or data (see also Section 2.5). Because -network links typically carry data in a serialized fashion, a queue is -required to manage the outbound data packets. - - In the case of a desktop machine and an efficient webserver sharing the -same uplink to the Internet, the following contention for bandwidth may -occur. 
The web server may be able to fill up the output queue on the router -faster than the data can be transmitted across the link, at which point the -router starts to drop packets (its buffer is full!). Now, the desktop machine -(with an interactive application user) may be faced with packet loss and high -latency. Note that high latency sometimes leads to screaming users! By -separating the internal queues used to service these two different classes of -application, there can be better sharing of the network resource between the -two applications. - - Traffic control is the set of tools which allows the user to have granular -control over these queues and the queuing mechanisms of a networked device. -The power to rearrange traffic flows and packets with these tools is -tremendous and can be complicated, but is no substitute for adequate -bandwidth. - - The term Quality of Service (QoS) is often used as a synonym for traffic -control. ------------------------------------------------------------------------------ - -2.2. Why use it? - - Packet-switched networks differ from circuit based networks in one very -important regard. A packet-switched network itself is stateless. A -circuit-based network (such as a telephone network) must hold state within -the network. IP networks are stateless and packet-switched networks by -design; in fact, this statelessness is one of the fundamental strengths of -IP. - - The weakness of this statelessness is the lack of differentiation between -types of flows. In simplest terms, traffic control allows an administrator to -queue packets differently based on attributes of the packet. It can even be -used to simulate the behaviour of a circuit-based network. This introduces -statefulness into the stateless network. - - There are many practical reasons to consider traffic control, and many -scenarios in which using traffic control makes sense. 
Below are some examples
-of common problems which can be solved or at least ameliorated with these
-tools.
-
-   The list below is not an exhaustive list of the sorts of solutions
-available to users of traffic control, but introduces the types of problems
-that can be solved by using traffic control to maximize the usability of a
-network connection.
-
-Common traffic control solutions
-
-  *  Limit total bandwidth to a known rate; TBF, HTB with child class(es).
-
-  *  Limit the bandwidth of a particular user, service or client; HTB
-     classes and classifying with a filter.
-
-  *  Maximize TCP throughput on an asymmetric link; prioritize transmission
-     of ACK packets, wondershaper.
-
-  *  Reserve bandwidth for a particular application or user; HTB with
-     child classes and classifying.
-
-  *  Prefer latency sensitive traffic; PRIO inside an HTB class.
-
-  *  Manage oversubscribed bandwidth; HTB with borrowing.
-
-  *  Allow equitable distribution of unreserved bandwidth; HTB with
-     borrowing.
-
-  *  Ensure that a particular type of traffic is dropped; policer attached
-     to a filter with a drop action.
-
-
-   Remember, too, that sometimes it is simply better to purchase more
-bandwidth. Traffic control does not solve all problems!
-
-
------------------------------------------------------------------------------
-
-2.3. Advantages
-
-   When properly employed, traffic control should lead to more predictable
-usage of network resources and less volatile contention for these resources.
-The network then meets the goals of the traffic control configuration. Bulk
-download traffic can be allocated a reasonable amount of bandwidth even as
-higher priority interactive traffic is simultaneously serviced. Even low
-priority data transfer such as mail can be allocated bandwidth without
-tremendously affecting the other classes of traffic.
-
-
-   In a larger picture, if the traffic control configuration represents policy
-which has been communicated to the users, then users (and, by extension,
-applications) know what to expect from the network.
-
-
------------------------------------------------------------------------------
-
-2.4. Disadvantages
-
-
-
-   Complexity is easily one of the most significant disadvantages of using
-traffic control. There are ways to become familiar with traffic control tools
-which ease the learning curve about traffic control and its mechanisms, but
-identifying a traffic control misconfiguration can be quite a challenge.
-
-   Traffic control, when used appropriately, can lead to more equitable
-distribution of network resources. It can just as easily be installed in an
-inappropriate manner, leading to further and more divisive contention for
-resources.
-
-   The computing resources required on a router to support a traffic control
-scenario need to be capable of handling the increased cost of maintaining the
-traffic control structures. Fortunately, this is a small incremental cost,
-but can become more significant as the configuration grows in size and
-complexity.
-
-   For personal use, there's no training cost associated with the use of
-traffic control, but a company may find that purchasing more bandwidth is a
-simpler solution than employing traffic control. Training employees and
-ensuring depth of knowledge may be more costly than investing in more
-bandwidth.
-
-   Although traffic control on packet-switched networks covers a larger
-conceptual area, you can think of traffic control as a way to provide [some
-of] the statefulness of a circuit-based network to a packet-switched network.
------------------------------------------------------------------------------
-
-2.5. Queues
-
-   Queues form the backdrop for all of traffic control and are the integral
-concept behind scheduling.
A queue is a location (or buffer) containing a
-finite number of items waiting for an action or service. In networking, a
-queue is the place where packets (our units) wait to be transmitted by the
-hardware (the service). In the simplest model, packets are transmitted on a
-first-come, first-served basis [2]. In the discipline of computer networking
-(and more generally computer science), this sort of a queue is known as a
-FIFO.
-
-   Without any other mechanisms, a queue doesn't offer any promise for traffic
-control. There are only two interesting actions in a queue. Anything entering
-a queue is enqueued into the queue. To remove an item from a queue is to
-dequeue that item.
-
-   A queue becomes much more interesting when coupled with other mechanisms
-which can delay, rearrange, drop and prioritize packets in multiple
-queues. A queue can also use subqueues, which allow for complexity of
-behaviour in a scheduling operation.
-
-   From the perspective of the higher layer software, a packet is simply
-enqueued for transmission, and the manner and order in which the enqueued
-packets are transmitted is immaterial to the higher layer. So, to the higher
-layer, the entire traffic control system may appear as a single queue [3]. It
-is only by examining the internals of this layer that the traffic control
-structures become exposed and available.
------------------------------------------------------------------------------
-
-2.6. Flows
-
-   A flow is a distinct connection or conversation between two hosts. Any
-unique set of packets between two hosts can be regarded as a flow. Under TCP
-the concept of a connection with a source IP and port and destination IP and
-port represents a flow. A UDP flow can be similarly defined.
-
-   Traffic control mechanisms frequently separate traffic into classes of
-flows which can be aggregated and transmitted as an aggregated flow (consider
-DiffServ).
Alternate mechanisms may attempt to divide bandwidth equally based
-on the individual flows.
-
-   Flows become important when attempting to divide bandwidth equally among a
-set of competing flows, especially when some applications deliberately build
-a large number of flows.
------------------------------------------------------------------------------
-
-2.7. Tokens and buckets
-
-   Two of the key underpinnings of shaping mechanisms are the interrelated
-concepts of tokens and buckets.
-
-   In order to control the rate of dequeuing, an implementation can count the
-number of packets or bytes dequeued as each item is dequeued, although this
-requires complex usage of timers and measurements to limit accurately.
-Instead of calculating the current usage and time, one method, used widely in
-traffic control, is to generate tokens at a desired rate, and only dequeue
-packets or bytes if a token is available.
-
-   Consider the analogy of an amusement park ride with a queue of people
-waiting to experience the ride. Let's imagine carts traversing a fixed
-track. The carts arrive at the head of the queue at a fixed rate. In
-order to enjoy the ride, each person must wait for an available cart. The
-cart is analogous to a token and the person is analogous to a packet. Again,
-this mechanism is a rate-limiting or shaping mechanism. Only a certain number
-of people can experience the ride in a particular period.
-
-   To extend the analogy, imagine an empty line for the amusement park ride
-and a large number of carts sitting on the track ready to carry people. If a
-large number of people entered the line together, many (maybe all) of them
-could experience the ride because of the carts available and waiting. The
-number of carts available is a concept analogous to the bucket. A bucket
-contains a number of tokens and can use all of the tokens in the bucket
-without regard for passage of time.
- - And to complete the analogy, the carts on the amusement park ride (our -tokens) arrive at a fixed rate and are only kept available up to the size of -the bucket. So, the bucket is filled with tokens according to the rate, and -if the tokens are not used, the bucket can fill up. If tokens are used the -bucket will not fill up. Buckets are a key concept in supporting bursty -traffic such as HTTP. - - The TBF qdisc is a classical example of a shaper (the section on TBF -includes a diagram which may help to visualize the token and bucket -concepts). The TBF generates rate tokens and only transmits packets when a -token is available. Tokens are a generic shaping concept. - - In the case that a queue does not need tokens immediately, the tokens can -be collected until they are needed. To collect tokens indefinitely would -negate any benefit of shaping so tokens are collected until a certain number -of tokens has been reached. Now, the queue has tokens available for a large -number of packets or bytes which need to be dequeued. These intangible tokens -are stored in an intangible bucket, and the number of tokens that can be -stored depends on the size of the bucket. - - This also means that a bucket full of tokens may be available at any -instant. Very predictable regular traffic can be handled by small buckets. -Larger buckets may be required for burstier traffic, unless one of the -desired goals is to reduce the burstiness of the flows. - - In summary, tokens are generated at rate, and a maximum of a bucket's worth -of tokens may be collected. This allows bursty traffic to be handled, while -smoothing and shaping the transmitted traffic. - - The concepts of tokens and buckets are closely interrelated and are used in -both TBF (one of the classless qdiscs) and HTB (one of the classful qdiscs). -Within the tcng language, the use of two- and three-color meters is -indubitably a token and bucket concept. 
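The rate-and-bucket behaviour summarized above can be sketched in a few lines of Python. This is an illustrative model only (the names TokenBucket and conforms are invented here); it is not the kernel's TBF implementation:

```python
import time

class TokenBucket:
    """Sketch of the token and bucket mechanism described above.

    Tokens accrue at `rate` per second, capped at `bucket` (the burst
    size); a packet of `size` units may be dequeued only when enough
    tokens are available. Illustrative only -- not kernel code."""

    def __init__(self, rate, bucket, now=None):
        self.rate = rate          # tokens generated per second
        self.bucket = bucket      # maximum tokens the bucket can hold
        self.tokens = bucket      # an idle bucket starts full
        self.last = time.monotonic() if now is None else now

    def conforms(self, size, now=None):
        """Spend `size` tokens if available; otherwise the packet waits."""
        if now is None:
            now = time.monotonic()
        # Tokens accumulate with the passage of time, up to the bucket size.
        self.tokens = min(self.bucket,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False
```

A full bucket lets a burst through at once; afterwards, packets conform only as fast as tokens are generated, which is the smoothing and shaping behaviour described in this section.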
------------------------------------------------------------------------------
-
-2.8. Packets and frames
-
-   The term for data sent across a network changes depending on the layer the
-user is examining. This document will rather impolitely (and incorrectly)
-gloss over the technical distinction between packets and frames, although
-they are outlined here.
-
-   The word frame is typically used to describe a layer 2 (data link) unit of
-data to be forwarded to the next recipient. Ethernet interfaces, PPP
-interfaces, and T1 interfaces all name their layer 2 data unit a frame. The
-frame is actually the unit on which traffic control is performed.
-
-   A packet, on the other hand, is a higher layer concept, representing layer
-3 (network) units. The term packet is preferred in this documentation,
-although it is slightly inaccurate.
------------------------------------------------------------------------------
-
-3. Traditional Elements of Traffic Control
-
-
------------------------------------------------------------------------------
-
-3.1. Shaping
-
-   Shapers delay packets to meet a desired rate.
-
-   Shaping is the mechanism by which packets are delayed before transmission
-in an output queue to meet a desired output rate. This is one of the most
-common desires of users seeking bandwidth control solutions. The act of
-delaying a packet as part of a traffic control solution makes every shaping
-mechanism into a non-work-conserving mechanism, meaning roughly: "Work is
-required in order to delay packets."
-
-   Viewed in reverse, a non-work-conserving queuing mechanism is performing a
-shaping function. A work-conserving queuing mechanism (see PRIO) would not be
-capable of delaying a packet.
-
-   Shapers attempt to limit or ration traffic to meet but not exceed a
-configured rate (frequently measured in packets per second or bits/bytes per
-second). As a side effect, shapers can smooth out bursty traffic [4].
One of -the advantages of shaping bandwidth is the ability to control latency of -packets. The underlying mechanism for shaping to a rate is typically a token -and bucket mechanism. See also Section 2.7 for further detail on tokens and -buckets. ------------------------------------------------------------------------------ - -3.2. Scheduling - - Schedulers arrange and/or rearrange packets for output. - - Scheduling is the mechanism by which packets are arranged (or rearranged) -between input and output of a particular queue. The overwhelmingly most -common scheduler is the FIFO (first-in first-out) scheduler. From a larger -perspective, any set of traffic control mechanisms on an output queue can be -regarded as a scheduler, because packets are arranged for output. - - Other generic scheduling mechanisms attempt to compensate for various -networking conditions. A fair queuing algorithm (see SFQ) attempts to prevent -any single client or flow from dominating the network usage. A round-robin -algorithm (see WRR) gives each flow or client a turn to dequeue packets. -Other sophisticated scheduling algorithms attempt to prevent backbone -overload (see GRED) or refine other scheduling mechanisms (see ESFQ). ------------------------------------------------------------------------------ - -3.3. Classifying - - Classifiers sort or separate traffic into queues. - - Classifying is the mechanism by which packets are separated for different -treatment, possibly different output queues. During the process of accepting, -routing and transmitting a packet, a networking device can classify the -packet a number of different ways. Classification can include marking the -packet, which usually happens on the boundary of a network under a single -administrative control or classification can occur on each hop individually. 
-
-
-   The Linux model (see Section 4.3) allows for a packet to cascade across a
-series of classifiers in a traffic control structure and to be classified in
-conjunction with policers (see also Section 4.5).
------------------------------------------------------------------------------
-
-3.4. Policing
-
-   Policers measure and limit traffic in a particular queue.
-
-   Policing, as an element of traffic control, is simply a mechanism by which
-traffic can be limited. Policing is most frequently used on the network
-border to ensure that a peer is not consuming more than its allocated
-bandwidth. A policer will accept traffic to a certain rate, and then perform
-an action on traffic exceeding this rate. A rather harsh solution is to drop
-the traffic, although the traffic could be reclassified instead of being
-dropped.
-
-   A policer is a yes/no question about the rate at which traffic is entering
-a queue. If the packet is about to enter a queue below a given rate, take one
-action (allow the enqueuing). If the packet is about to enter a queue above a
-given rate, take another action. Although the policer uses a token bucket
-mechanism internally, it does not have the capability to delay a packet as a
-shaping mechanism does.
------------------------------------------------------------------------------
-
-3.5. Dropping
-
-   Dropping discards an entire packet, flow or classification.
-
-   Dropping a packet is a mechanism by which a packet is discarded.
-
-
------------------------------------------------------------------------------
-
-3.6. Marking
-
-   Marking is a mechanism by which the packet is altered.
-
-Note This is not fwmark. The iptables target MARK and the ipchains --mark
-     are used to modify packet metadata, not the packet itself.
-
-   Traffic control marking mechanisms install a DSCP on the packet itself,
-which is then used and respected by other routers inside an administrative
-domain (usually for DiffServ).
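As a concrete note on what such a mark looks like on the wire: the DSCP occupies the upper six bits of the IPv4 TOS octet (the low two bits belong to ECN), so a per-hop behaviour such as Expedited Forwarding (DSCP 46) appears as the byte 0xb8. A minimal sketch; the helper name is invented for illustration:

```python
def dscp_to_tos(dscp):
    """Map a six-bit DSCP codepoint to the full TOS/DiffServ byte.

    The DSCP sits in the upper six bits of the IPv4 TOS octet; the
    remaining two bits are used by ECN. Illustrative helper only,
    not part of any traffic control tool."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a six-bit field")
    return dscp << 2

# A couple of standard per-hop behaviours (RFC 2474 / RFC 2598 values):
EF = 46      # Expedited Forwarding
AF11 = 10    # Assured Forwarding class 1, low drop precedence
```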
------------------------------------------------------------------------------ - -4. Components of Linux Traffic Control - - - - - - - - -Table 1. Correlation between traffic control elements and Linux components -+-------------------+-------------------------------------------------------+ -|traditional element|Linux component | -+-------------------+-------------------------------------------------------+ -|shaping |The class offers shaping capabilities. | -+-------------------+-------------------------------------------------------+ -|scheduling |A qdisc is a scheduler. Schedulers can be simple such | -| |as the FIFO or complex, containing classes and other | -| |qdiscs, such as HTB. | -+-------------------+-------------------------------------------------------+ -|classifying |The filter object performs the classification through | -| |the agency of a classifier object. Strictly speaking, | -| |Linux classifiers cannot exist outside of a filter. | -+-------------------+-------------------------------------------------------+ -|policing |A policer exists in the Linux traffic control | -| |implementation only as part of a filter. | -+-------------------+-------------------------------------------------------+ -|dropping |To drop traffic requires a filter with a policer which | -| |uses "drop" as an action. | -+-------------------+-------------------------------------------------------+ -|marking |The dsmark qdisc is used for marking. | -+-------------------+-------------------------------------------------------+ ------------------------------------------------------------------------------ - -4.1. qdisc - - Simply put, a qdisc is a scheduler (Section 3.2). Every output interface -needs a scheduler of some kind, and the default scheduler is a FIFO. Other -qdiscs available under Linux will rearrange the packets entering the -scheduler's queue in accordance with that scheduler's rules. 
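The behaviour of such a default FIFO scheduler with a size limit can be modelled in a few lines. This is an illustrative sketch (the class and method names are invented here), not the kernel's pfifo implementation:

```python
from collections import deque

class PacketFifo:
    """Sketch of a limited FIFO queue: packets leave in arrival order,
    and enqueues beyond `limit` packets are tail-dropped. Illustrative
    model only -- not kernel code."""

    def __init__(self, limit):
        self.limit = limit
        self.queue = deque()

    def enqueue(self, packet):
        if len(self.queue) >= self.limit:
            return False              # queue full: tail-drop the packet
        self.queue.append(packet)
        return True

    def dequeue(self):
        # First-in, first-out: the oldest packet is transmitted first.
        return self.queue.popleft() if self.queue else None
```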
-
-
-   The qdisc is the major building block on which all of Linux traffic control
-is built, and is also called a queuing discipline.
-
-   The classful qdiscs can contain classes, and provide a handle to which to
-attach filters. There is no prohibition on using a classful qdisc without
-child classes, although this will usually consume cycles and other system
-resources for no benefit.
-
-   The classless qdiscs can contain no classes, nor is it possible to attach
-a filter to a classless qdisc. Because a classless qdisc contains no children
-of any kind, there is no utility to classifying. This means that no filter
-can be attached to a classless qdisc.
-
-   A source of terminology confusion is the usage of the terms root qdisc and
-ingress qdisc. These are not really queuing disciplines, but rather locations
-onto which traffic control structures can be attached for egress (outbound
-traffic) and ingress (inbound traffic).
-
-   Each interface contains both. The primary and more common is the egress
-qdisc, known as the root qdisc. It can contain any of the queuing disciplines
-(qdiscs) with potential classes and class structures. The overwhelming
-majority of documentation applies to the root qdisc and its children. Traffic
-transmitted on an interface traverses the egress or root qdisc.
-
-   For traffic accepted on an interface, the ingress qdisc is traversed. With
-its limited utility, it allows no child class to be created, and only exists
-as an object onto which a filter can be attached. For practical purposes, the
-ingress qdisc is merely a convenient object onto which to attach a policer to
-limit the amount of traffic accepted on a network interface.
-
-   In short, you can do much more with an egress qdisc because it contains a
-real qdisc and the full power of the traffic control system. An ingress qdisc
-can only support a policer.
The remainder of the documentation will concern
-itself with traffic control structures attached to the root qdisc unless
-otherwise specified.
------------------------------------------------------------------------------
-
-4.2. class
-
-   Classes only exist inside a classful qdisc (e.g., HTB and CBQ). Classes are
-immensely flexible and can always contain either multiple child classes or
-a single child qdisc [5]. There is no prohibition against a class containing
-a classful qdisc itself, which facilitates tremendously complex traffic
-control scenarios.
-
-   Any class can also have an arbitrary number of filters attached to it,
-which allows the selection of a child class or the use of a filter to
-reclassify or drop traffic entering a particular class.
-
-   A leaf class is a terminal class in a qdisc. It contains a qdisc (default
-FIFO) and will never contain a child class. Any class which contains a child
-class is an inner class (or root class) and not a leaf class.
------------------------------------------------------------------------------
-
-4.3. filter
-
-   The filter is the most complex component in the Linux traffic control
-system. The filter provides a convenient mechanism for gluing together
-several of the key elements of traffic control. The simplest and most obvious
-role of the filter is to classify (see Section 3.3) packets. Linux filters
-allow the user to classify packets into an output queue with either several
-different filters or a single filter.
-
-  *  A filter must contain a classifier phrase.
-
-  *  A filter may contain a policer phrase.
-
-
-   Filters can be attached either to classful qdiscs or to classes; however,
-the enqueued packet always enters the root qdisc first. After the filter
-attached to the root qdisc has been traversed, the packet may be directed to
-any subclasses (which can have their own filters) where the packet may
-undergo further classification.
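The classify-then-direct role of filters can be modelled abstractly. The sketch below (invented names; not the kernel's filter API) walks a list of filters in priority order and returns the flowid of the first filter whose matches all succeed, loosely mirroring the match/flowid phrases seen in tc filter commands:

```python
def classify(packet, filters, default="1:0"):
    """Return the target class (flowid) of the first matching filter.

    `filters` is a list of (prio, matches, flowid) tuples; lower prio is
    consulted first. `packet` is a dict of attributes. Illustrative model
    only -- real classification happens inside the kernel."""
    for prio, matches, flowid in sorted(filters, key=lambda f: f[0]):
        if all(packet.get(key) == value for key, value in matches.items()):
            return flowid
    return default            # unmatched packets keep the default class

filters = [
    # prio, attribute matches,           target class
    (5, {"dport": 22, "tos": 0x10}, "1:6"),   # interactive ssh
    (7, {"dport": 80},              "1:7"),   # bulk web traffic
]
```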
-
-
------------------------------------------------------------------------------
-
-4.4. classifier
-
-   Filter objects, which can be manipulated using tc, can use several
-different classifying mechanisms, the most common of which is the u32
-classifier. The u32 classifier allows the user to select packets based on
-attributes of the packet.
-
-   The classifiers are tools which can be used as part of a filter to identify
-characteristics of a packet or a packet's metadata. The Linux classifier
-object is a direct analogue to the basic operation and elemental mechanism of
-traffic control classifying.
------------------------------------------------------------------------------
-
-4.5. policer
-
-   This elemental mechanism is only used in Linux traffic control as part of a
-filter. A policer calls one action above and another action below the
-specified rate. Clever use of policers can simulate a three-color meter. See
-also Section 10.
-
-   Although both policing and shaping are basic elements of traffic control
-for limiting bandwidth usage, a policer will never delay traffic. It can only
-perform an action based on specified criteria. See also Example 5.
-
-
-
-
------------------------------------------------------------------------------
-
-4.6. drop
-
-   This basic traffic control mechanism is only used in Linux traffic control
-as part of a policer. Any policer attached to any filter could have a drop
-action.
-
-Note The only place in the Linux traffic control system where a packet can be
-     explicitly dropped is a policer. A policer can limit packets enqueued at
-     a specific rate, or it can be configured to drop all traffic matching a
-     particular pattern [6].
-
-   There are, however, places within the traffic control system where a packet
-may be dropped as a side effect. For example, a packet will be dropped if the
-scheduler employed uses this method to control flows as the GRED does.
- - Also, a shaper or scheduler which runs out of its allocated buffer space -may have to drop a packet during a particularly bursty or overloaded period. - - ------------------------------------------------------------------------------ - -4.7. handle - - Every class and classful qdisc (see also Section 7) requires a unique -identifier within the traffic control structure. This unique identifier is -known as a handle and has two constituent members, a major number and a minor -number. These numbers can be assigned arbitrarily by the user in accordance -with the following rules [7]. - - - -The numbering of handles for classes and qdiscs - -major - This parameter is completely free of meaning to the kernel. The user - may use an arbitrary numbering scheme, however all objects in the traffic - control structure with the same parent must share a major handle number. - Conventional numbering schemes start at 1 for objects attached directly - to the root qdisc. - -minor - This parameter unambiguously identifies the object as a qdisc if minor - is 0. Any other value identifies the object as a class. All classes - sharing a parent must have unique minor numbers. - - - The special handle ffff:0 is reserved for the ingress qdisc. - - The handle is used as the target in classid and flowid phrases of tc filter -statements. These handles are external identifiers for the objects, usable by -userland applications. The kernel maintains internal identifiers for each -object. ------------------------------------------------------------------------------ - -5. Software and Tools - - ------------------------------------------------------------------------------ - -5.1. Kernel requirements - - Many distributions provide kernels with modular or monolithic support for -traffic control (Quality of Service). Custom kernels may not already provide -support (modular or not) for the required features. If not, this is a very -brief listing of the required kernel options. 
-
-
-   The user who has little or no experience compiling a kernel is referred
-to the Kernel HOWTO. Experienced kernel compilers should be able to determine
-which of the below options apply to the desired configuration, after reading
-a bit more about traffic control and planning.
-
-
-Example 1. Kernel compilation options [8]
-#
-# QoS and/or fair queueing
-#
-CONFIG_NET_SCHED=y
-CONFIG_NET_SCH_CBQ=m
-CONFIG_NET_SCH_HTB=m
-CONFIG_NET_SCH_CSZ=m
-CONFIG_NET_SCH_PRIO=m
-CONFIG_NET_SCH_RED=m
-CONFIG_NET_SCH_SFQ=m
-CONFIG_NET_SCH_TEQL=m
-CONFIG_NET_SCH_TBF=m
-CONFIG_NET_SCH_GRED=m
-CONFIG_NET_SCH_DSMARK=m
-CONFIG_NET_SCH_INGRESS=m
-CONFIG_NET_QOS=y
-CONFIG_NET_ESTIMATOR=y
-CONFIG_NET_CLS=y
-CONFIG_NET_CLS_TCINDEX=m
-CONFIG_NET_CLS_ROUTE4=m
-CONFIG_NET_CLS_ROUTE=y
-CONFIG_NET_CLS_FW=m
-CONFIG_NET_CLS_U32=m
-CONFIG_NET_CLS_RSVP=m
-CONFIG_NET_CLS_RSVP6=m
-CONFIG_NET_CLS_POLICE=y
-
-
-   A kernel compiled with the above set of options will provide modular
-support for almost everything discussed in this documentation. The user may
-need to modprobe a module before using a given feature. Again, the confused
-user is referred to the Kernel HOWTO, as this document cannot adequately
-address questions about the use of the Linux kernel.
------------------------------------------------------------------------------
-
-5.2. iproute2 tools (tc)
-
-   iproute2 is a suite of command line utilities which manipulate kernel
-structures for IP networking configuration on a machine. For technical
-documentation on these tools, see the iproute2 documentation and for a more
-expository discussion, the documentation at [http://linux-ip.net/]
-linux-ip.net. Of the tools in the iproute2 package, the binary tc is the only
-one used for traffic control. This HOWTO will ignore the other tools in the
-suite.
- - - Because it interacts with the kernel to direct the creation, deletion and -modification of traffic control structures, the tc binary needs to be -compiled with support for all of the qdiscs you wish to use. In particular, -the HTB qdisc is not supported yet in the upstream iproute2 package. See -Section 7.1 for more information. - - The tc tool performs all of the configuration of the kernel structures -required to support traffic control. As a result of its many uses, the -command syntax can be described (at best) as arcane. The utility takes as its -first non-option argument one of three Linux traffic control components, -qdisc, class or filter. - - -Example 2. tc command usage -[root@leander]# tc -Usage: tc [ OPTIONS ] OBJECT { COMMAND | help } -where OBJECT := { qdisc | class | filter } - OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] } - - - Each object accepts further and different options, and will be incompletely -described and documented below. The hints in the examples below are designed -to introduce the vagaries of tc command line syntax. For more examples, -consult the [http://lartc.org/howto/] LARTC HOWTO. For even better -understanding, consult the kernel and iproute2 code. - - -Example 3. tc qdisc -[root@leander]# tc qdisc add \ (1) -> dev eth0 \ (2) -> root \ (3) -> handle 1:0 \ (4) -> htb (5) - - -(1) Add a queuing discipline. The verb could also be del. -(2) Specify the device onto which we are attaching the new queuing - discipline. -(3) This means "egress" to tc. The word root must be used, however. Another - qdisc with limited functionality, the ingress qdisc can be attached to - the same device. -(4) The handle is a user-specified number of the form major:minor. The - minor number for any queueing discipline handle must always be zero (0). - An acceptable shorthand for a qdisc handle is the syntax "1:", where the - minor number is assumed to be zero (0) if not specified. 
-(5) This is the queuing discipline to attach, HTB in this example. Queuing
-    discipline specific parameters will follow this. In the example here, we
-    add no qdisc-specific parameters.
-
-   Above was the simplest use of the tc utility for adding a queuing
-discipline to a device. Here's an example of the use of tc to add a class to
-an existing parent class.
-
-
-Example 4. tc class
-[root@leander]# tc class add \ (1)
-> dev eth0 \ (2)
-> parent 1:1 \ (3)
-> classid 1:6 \ (4)
-> htb \ (5)
-> rate 256kbit \ (6)
-> ceil 512kbit (7)
-
-
-(1) Add a class. The verb could also be del.
-(2) Specify the device onto which we are attaching the new class.
-(3) Specify the parent handle to which we are attaching the new class.
-(4) This is a unique handle (major:minor) identifying this class. The minor
-    number must be a non-zero number.
-(5) Both of the classful qdiscs require that any child classes be
-    classes of the same type as the parent. Thus an HTB qdisc will contain
-    HTB classes.
-(6) (7)
-    These are class-specific parameters. Consult Section 7.1 for more detail
-    on these parameters.
-
-
-
-
-Example 5. tc filter
-[root@leander]# tc filter add \ (1)
-> dev eth0 \ (2)
-> parent 1:0 \ (3)
-> protocol ip \ (4)
-> prio 5 \ (5)
-> u32 \ (6)
-> match ip port 22 0xffff \ (7)
-> match ip tos 0x10 0xff \ (8)
-> flowid 1:6 \ (9)
-> police \ (10)
-> rate 32000bps \ (11)
-> burst 10240 \ (12)
-> mpu 0 \ (13)
-> action drop/continue (14)
-
-
-(1) Add a filter. The verb could also be del.
-(2) Specify the device onto which we are attaching the new filter.
-(3) Specify the parent handle to which we are attaching the new filter.
-(4) This parameter is required. Its use should be obvious, although I
-    don't know more.
-(5) The prio parameter allows a given filter to be preferred above another.
-    The pref is a synonym.
-(6) This is a classifier, and is a required phrase in every tc filter
-    command.
-(7) (8)
-    These are parameters to the classifier.
In this case, packets with a
-    type of service flag (indicating interactive usage) and matching port 22
-    will be selected by this statement.
-(9) The flowid specifies the handle of the target class (or qdisc) to which
-    a matching filter should send its selected packets.
-(10)
-    This is the policer, and is an optional phrase in every tc filter
-    command.
-(11) The policer will perform one action above this rate, and another action
-    below (see action parameter).
-(12) The burst is an exact analog to burst in HTB (burst is a bucket
-    concept).
-(13) The minimum policed unit. To count all traffic, use an mpu of zero (0).
-(14) The action indicates what should be done depending on whether the rate
-    of the policer has been exceeded. The first word specifies the action to
-    take if the policer has been exceeded. The second word specifies the
-    action to take otherwise.
-
-  As evidenced above, the tc command line utility has an arcane and complex
-syntax, even for simple operations such as these examples show. It should
-come as no surprise to the reader that there exists an easier way to
-configure Linux traffic control. See the next section, Section 5.3.
------------------------------------------------------------------------------
-
-5.3. tcng, Traffic Control Next Generation
-
-  FIXME; sing the praises of tcng. See also [http://tldp.org/HOWTO/
-Traffic-Control-tcng-HTB-HOWTO/] Traffic Control using tcng and HTB HOWTO
-and tcng documentation.
-
-  Traffic control next generation (hereafter, tcng) provides all of the power
-of traffic control under Linux with twenty percent of the headache.
-
-
------------------------------------------------------------------------------
-
-5.4. IMQ, Intermediate Queuing device
-
-
-
-  FIXME; must discuss IMQ. See also Patrick McHardy's website on [http://
-trash.net/~kaber/imq/] IMQ.
-
-
------------------------------------------------------------------------------
-
-6.
Classless Queuing Disciplines (qdiscs)
-
-  Each of these queuing disciplines can be used as the primary qdisc on an
-interface, or can be used inside a leaf class of a classful qdisc. These are
-the fundamental schedulers used under Linux. Note that the default scheduler
-is the pfifo_fast.
-
-
------------------------------------------------------------------------------
-
-6.1. FIFO, First-In First-Out (pfifo and bfifo)
-
-Note This is not the default qdisc on Linux interfaces. Be certain to see
-     Section 6.2 for the full details on the default (pfifo_fast) qdisc.
-
-  The FIFO algorithm forms the basis for the default qdisc on all Linux
-network interfaces (pfifo_fast). It performs no shaping or rearranging of
-packets. It simply transmits packets as soon as it can after receiving and
-queuing them. This is also the qdisc used inside all newly created classes
-until another qdisc or a class replaces the FIFO.
-
-[fifo-qdisc]
-
-  A real FIFO qdisc must, however, have a size limit (a buffer size) to
-prevent it from overflowing in case it is unable to dequeue packets as
-quickly as it receives them. Linux implements two basic FIFO qdiscs, one
-based on bytes, and one on packets. Regardless of the type of FIFO used, the
-size of the queue is defined by the parameter limit. For a pfifo the unit is
-understood to be packets and for a bfifo the unit is understood to be bytes.
-
-
-Example 6.
Specifying a limit for a packet or byte FIFO -[root@leander]# cat bfifo.tcc -/* - * make a FIFO on eth0 with 10kbyte queue size - * - */ - -dev eth0 { - egress { - fifo (limit 10kB ); - } -} -[root@leander]# tcc < bfifo.tcc -# ================================ Device eth0 ================================ - -tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 -tc qdisc add dev eth0 handle 2:0 parent 1:0 bfifo limit 10240 -[root@leander]# cat pfifo.tcc -/* - * make a FIFO on eth0 with 30 packet queue size - * - */ - -dev eth0 { - egress { - fifo (limit 30p ); - } -} -[root@leander]# tcc < pfifo.tcc -# ================================ Device eth0 ================================ - -tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 -tc qdisc add dev eth0 handle 2:0 parent 1:0 pfifo limit 30 - ------------------------------------------------------------------------------ - -6.2. pfifo_fast, the default Linux qdisc - - The pfifo_fast qdisc is the default qdisc for all interfaces under Linux. -Based on a conventional FIFO qdisc, this qdisc also provides some -prioritization. It provides three different bands (individual FIFOs) for -separating traffic. The highest priority traffic (interactive flows) are -placed into band 0 and are always serviced first. Similarly, band 1 is always -emptied of pending packets before band 2 is dequeued. - -[pfifo_fast-qdisc] - - There is nothing configurable to the end user about the pfifo_fast qdisc. -For exact details on the priomap and use of the ToS bits, see the pfifo-fast -section of the LARTC HOWTO. ------------------------------------------------------------------------------ - -6.3. SFQ, Stochastic Fair Queuing - - The SFQ qdisc attempts to fairly distribute opportunity to transmit data to -the network among an arbitrary number of flows. 
It accomplishes this by using
-a hash function to separate the traffic into separate (internally maintained)
-FIFOs which are dequeued in a round-robin fashion. Because there is the
-possibility for unfairness to manifest in the choice of hash function, this
-function is altered periodically. Perturbation (the parameter perturb) sets
-this periodicity.
-
-[sfq-qdisc]
-
-
-Example 7. Creating an SFQ
-[root@leander]# cat sfq.tcc
-/*
- * make an SFQ on eth0 with a 10 second perturbation
- *
- */
-
-dev eth0 {
-        egress {
-                sfq( perturb 10s );
-        }
-}
-[root@leander]# tcc < sfq.tcc
-# ================================ Device eth0 ================================
-
-tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
-tc qdisc add dev eth0 handle 2:0 parent 1:0 sfq perturb 10
-
-
-  Unfortunately, some clever software (e.g. Kazaa and eMule among others)
-obliterates the benefit of this attempt at fair queuing by opening as many
-TCP sessions (flows) as can be sustained. In many networks, with well-behaved
-users, SFQ can adequately distribute the network resources to the contending
-flows, but other measures may be called for when obnoxious applications have
-invaded the network.
-
-  See also Section 6.4 for an SFQ qdisc with more exposed parameters for the
-user to manipulate.
------------------------------------------------------------------------------
-
-6.4. ESFQ, Extended Stochastic Fair Queuing
-
-  Conceptually, this qdisc is no different than SFQ although it allows the
-user to control more parameters than its simpler cousin. This qdisc was
-conceived to overcome the shortcoming of SFQ identified above. By allowing
-the user to control which hashing algorithm is used for distributing access
-to network bandwidth, it is possible for the user to reach a fairer real
-distribution of bandwidth.
-
-
-Example 8. ESFQ usage
-Usage: ...
esfq [ perturb SECS ] [ quantum BYTES ] [ depth FLOWS ] - [ divisor HASHBITS ] [ limit PKTS ] [ hash HASHTYPE] - -Where: -HASHTYPE := { classic | src | dst } - - - FIXME; need practical experience and/or attestation here. ------------------------------------------------------------------------------ - -6.5. GRED, Generic Random Early Drop - - FIXME; I have never used this. Need practical experience or attestation. - - Theory declares that a RED algorithm is useful on a backbone or core -network, but not as useful near the end-user. See the section on flows to see -a general discussion of the thirstiness of TCP. ------------------------------------------------------------------------------ - -6.6. TBF, Token Bucket Filter - - This qdisc is built on tokens and buckets. It simply shapes traffic -transmitted on an interface. To limit the speed at which packets will be -dequeued from a particular interface, the TBF qdisc is the perfect solution. -It simply slows down transmitted traffic to the specified rate. - - Packets are only transmitted if there are sufficient tokens available. -Otherwise, packets are deferred. Delaying packets in this fashion will -introduce an artificial latency into the packet's round trip time. - -[tbf-qdisc] - - -Example 9. Creating a 256kbit/s TBF -[root@leander]# cat tbf.tcc -/* - * make a 256kbit/s TBF on eth0 - * - */ - -dev eth0 { - egress { - tbf( rate 256 kbps, burst 20 kB, limit 20 kB, mtu 1514 B ); - } -} -[root@leander]# tcc < tbf.tcc -# ================================ Device eth0 ================================ - -tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 -tc qdisc add dev eth0 handle 2:0 parent 1:0 tbf burst 20480 limit 20480 mtu 1514 rate 32000bps - - - ------------------------------------------------------------------------------ - -7. Classful Queuing Disciplines (qdiscs) - - The flexibility and control of Linux traffic control can be unleashed -through the agency of the classful qdiscs. 
Remember that the classful queuing
-disciplines can have filters attached to them, allowing packets to be
-directed to particular classes and subqueues.
-
-  There are several common terms to describe classes directly attached to the
-root qdisc and terminal classes. Classes attached to the root qdisc are
-known as root classes, and more generically inner classes. Any terminal class
-in a particular queuing discipline is known as a leaf class by analogy to the
-tree structure of the classes. Besides the use of figurative language
-depicting the structure as a tree, the language of family relationships is
-also quite common.
------------------------------------------------------------------------------
-
-7.1. HTB, Hierarchical Token Bucket
-
-  HTB uses the concepts of tokens and buckets along with the class-based
-system and filters to allow for complex and granular control over traffic.
-With a complex borrowing model, HTB can perform a variety of sophisticated
-traffic control techniques. One of the easiest ways to use HTB immediately is
-that of shaping.
-
-  By understanding tokens and buckets or by grasping the function of TBF, HTB
-should be merely a logical step. This queuing discipline allows the user to
-define the characteristics of the tokens and bucket used and allows the user
-to nest these buckets in an arbitrary fashion. When coupled with a
-classifying scheme, traffic can be controlled in a very granular fashion.
-
-
-
-  Below is example output of the syntax for HTB on the command line with the
-tc tool. Although the syntax for tcng is a language of its own, the rules for
-HTB are the same.
-
-
-Example 10. tc usage for HTB
-Usage: ... qdisc add ... htb [default N] [r2q N]
- default  minor id of class to which unclassified packets are sent {0}
- r2q      DRR quantums are computed as rate in Bps/r2q {10}
- debug    string of 16 numbers each 0-3 {0}
-
-... class add ...
htb rate R1 burst B1 [prio P] [slot S] [pslot PS] - [ceil R2] [cburst B2] [mtu MTU] [quantum Q] - rate rate allocated to this class (class can still borrow) - burst max bytes burst which can be accumulated during idle period {computed} - ceil definite upper class rate (no borrows) {rate} - cburst burst but for ceil {computed} - mtu max packet size we create rate map for {1600} - prio priority of leaf; lower are served first {0} - quantum how much bytes to serve from leaf at once {use r2q} - -TC HTB version 3.3 - - - ------------------------------------------------------------------------------ - -7.1.1. Software requirements - - Unlike almost all of the other software discussed, HTB is a newer queuing -discipline and your distribution may not have all of the tools and capability -you need to use HTB. The kernel must support HTB; kernel version 2.4.20 and -later support it in the stock distribution, although earlier kernel versions -require patching. To enable userland support for HTB, see [http:// -luxik.cdi.cz/~devik/qos/htb/] HTB for an iproute2 patch to tc. ------------------------------------------------------------------------------ - -7.1.2. Shaping - - One of the most common applications of HTB involves shaping transmitted -traffic to a specific rate. - - All shaping occurs in leaf classes. No shaping occurs in inner or root -classes as they only exist to suggest how the borrowing model should -distribute available tokens. - - - - ------------------------------------------------------------------------------ - -7.1.3. Borrowing - - A fundamental part of the HTB qdisc is the borrowing mechanism. Children -classes borrow tokens from their parents once they have exceeded rate. A -child class will continue to attempt to borrow until it reaches ceil, at -which point it will begin to queue packets for transmission until more tokens -/ctokens are available. 
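The parent/child relationship of rate and ceil that drives borrowing can be sketched with tc. The device name, rates, and class ids below are illustrative assumptions, not values from the text:

```shell
# Hypothetical two-child HTB hierarchy on eth0 (requires root to apply).
# The parent class holds 100kbit; each child is guaranteed 40kbit (rate)
# and may borrow from the parent up to 100kbit (ceil) while its sibling
# is idle.
tc qdisc add dev eth0 root handle 1: htb
tc class add dev eth0 parent 1:  classid 1:1  htb rate 100kbit ceil 100kbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 40kbit  ceil 100kbit
tc class add dev eth0 parent 1:1 classid 1:11 htb rate 40kbit  ceil 100kbit
```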
As there are only two primary types of classes which -can be created with HTB the following table and diagram identify the various -possible states and the behaviour of the borrowing mechanisms. - - - - -Table 2. HTB class states and potential actions taken -+------+-----+--------------+-----------------------------------------------+ -|type |class|HTB internal |action taken | -|of |state|state | | -|class | | | | -+------+-----+--------------+-----------------------------------------------+ -|leaf |< |HTB_CAN_SEND |Leaf class will dequeue queued bytes up to | -| |rate | |available tokens (no more than burst packets) | -+------+-----+--------------+-----------------------------------------------+ -|leaf |> |HTB_MAY_BORROW|Leaf class will attempt to borrow tokens/ | -| |rate,| |ctokens from parent class. If tokens are | -| |< | |available, they will be lent in quantum | -| |ceil | |increments and the leaf class will dequeue up | -| | | |to cburst bytes | -+------+-----+--------------+-----------------------------------------------+ -|leaf |> |HTB_CANT_SEND |No packets will be dequeued. This will cause | -| |ceil | |packet delay and will increase latency to meet | -| | | |the desired rate. | -+------+-----+--------------+-----------------------------------------------+ -|inner,|< |HTB_CAN_SEND |Inner class will lend tokens to children. | -|root |rate | | | -+------+-----+--------------+-----------------------------------------------+ -|inner,|> |HTB_MAY_BORROW|Inner class will attempt to borrow tokens/ | -|root |rate,| |ctokens from parent class, lending them to | -| |< | |competing children in quantum increments per | -| |ceil | |request. | -+------+-----+--------------+-----------------------------------------------+ -|inner,|> |HTB_CANT_SEND |Inner class will not attempt to borrow from its| -|root |ceil | |parent and will not lend tokens/ctokens to | -| | | |children classes. 
|
-+------+-----+--------------+-----------------------------------------------+
-
-  This diagram identifies the flow of borrowed tokens and the manner in which
-tokens are charged to parent classes. In order for the borrowing model to
-work, each class must have an accurate count of the number of tokens used by
-itself and all of its children. For this reason, any token used in a child or
-leaf class is charged to each parent class until the root class is reached.
-
-  Any child class which wishes to borrow a token will request a token from
-its parent class, which if it is also over its rate will request to borrow
-from its parent class until either a token is located or the root class is
-reached. So the borrowing of tokens flows toward the leaf classes and the
-charging of the usage of tokens flows toward the root class.
-
-[htb-borrow]
-
-  Note in this diagram that there are several HTB root classes. Each of these
-root classes can simulate a virtual circuit.
------------------------------------------------------------------------------
-
-7.1.4. HTB class parameters
-
-
-
-default
-    An optional parameter with every HTB qdisc object, the default default
-    is 0, which causes any unclassified traffic to be dequeued at hardware
-    speed, completely bypassing any of the classes attached to the root
-    qdisc.
-
-rate
-    Used to set the minimum desired speed to which to limit transmitted
-    traffic. This can be considered the equivalent of a committed information
-    rate (CIR), or the guaranteed bandwidth for a given leaf class.
-
-ceil
-    Used to set the maximum desired speed to which to limit the transmitted
-    traffic. The borrowing model should illustrate how this parameter is
-    used. This can be considered the equivalent of "burstable bandwidth".
-
-burst
-    This is the size of the rate bucket (see Tokens and buckets). HTB will
-    dequeue burst bytes before awaiting the arrival of more tokens.
-
-cburst
-    This is the size of the ceil bucket (see Tokens and buckets).
HTB will
-    dequeue cburst bytes before awaiting the arrival of more ctokens.
-
-quantum
-    This is a key parameter used by HTB to control borrowing. Normally, the
-    correct quantum is calculated by HTB, not specified by the user. Tweaking
-    this parameter can have tremendous effects on borrowing and shaping under
-    contention, because it is used both to split traffic between children
-    classes over rate (but below ceil) and to transmit packets from these
-    same classes.
-
-r2q
-    Also, usually calculated for the user, r2q is a hint to HTB to help
-    determine the optimal quantum for a particular class.
-
-mtu
-
-
-prio
-
-
-
-
------------------------------------------------------------------------------
-
-7.1.5. Rules
-
-  Below are some general guidelines to using HTB culled from [http://
-docum.org/] http://docum.org/ and the LARTC mailing list. These rules are
-simply a recommendation for beginners to maximize the benefit of HTB until
-gaining a better understanding of the practical application of HTB.
-
-
-
-  *  Shaping with HTB occurs only in leaf classes. See also Section 7.1.2.
-
-  *  Because HTB does not shape in any class except the leaf class, the sum
-    of the rates of leaf classes should not exceed the ceil of a parent
-    class. Ideally, the sum of the rates of the children classes would match
-    the rate of the parent class, allowing the parent class to distribute
-    leftover bandwidth (ceil - rate) among the children classes.
-
-    This key concept in employing HTB bears repeating. Only leaf classes
-    actually shape packets; packets are only delayed in these leaf classes.
-    The inner classes (all the way up to the root class) exist to define how
-    borrowing/lending occurs (see also Section 7.1.3).
-
-  *  The quantum is only used when a class is over rate but below ceil.
-
-  *  The quantum should be set at MTU or higher. HTB will dequeue a single
-    packet at least per service opportunity even if quantum is too small.
In - such a case, it will not be able to calculate accurately the real - bandwidth consumed [9]. - -  *  Parent classes lend tokens to children in increments of quantum, so for - maximum granularity and most instantaneously evenly distributed - bandwidth, quantum should be as low as possible while still no less than - MTU. - -  *  A distinction between tokens and ctokens is only meaningful in a leaf - class, because non-leaf classes only lend tokens to child classes. - -  *  HTB borrowing could more accurately be described as "using". - - - ------------------------------------------------------------------------------ - -7.2. PRIO, priority scheduler - - The PRIO classful qdisc works on a very simple precept. When it is ready to -dequeue a packet, the first class is checked for a packet. If there's a -packet, it gets dequeued. If there's no packet, then the next class is -checked, until the queuing mechanism has no more classes to check. - - This section will be completed at a later date. ------------------------------------------------------------------------------ - -7.3. CBQ, Class Based Queuing - - CBQ is the classic implementation (also called venerable) of a traffic -control system. This section will be completed at a later date. - - ------------------------------------------------------------------------------ - -8. Rules, Guidelines and Approaches - - ------------------------------------------------------------------------------ - -8.1. General Rules of Linux Traffic Control - - There are a few general rules which ease the study of Linux traffic -control. Traffic control structures under Linux are the same whether the -initial configuration has been done with tcng or with tc. - -  *  Any router performing a shaping function should be the bottleneck on - the link, and should be shaping slightly below the maximum available link - bandwidth. 
This prevents queues from forming in other routers, affording - maximum control of packet latency/deferral to the shaping device. - -  *  A device can only shape traffic it transmits [10]. Because the traffic - has already been received on an input interface, the traffic cannot be - shaped. A traditional solution to this problem is an ingress policer. - -  *  Every interface must have a qdisc. The default qdisc (the pfifo_fast - qdisc) is used when another qdisc is not explicitly attached to the - interface. - -  *  One of the classful qdiscs added to an interface with no children - classes typically only consumes CPU for no benefit. - -  *  Any newly created class contains a FIFO. This qdisc can be replaced - explicitly with any other qdisc. The FIFO qdisc will be removed - implicitly if a child class is attached to this class. - -  *  Classes directly attached to the root qdisc can be used to simulate - virtual circuits. - -  *  A filter can be attached to classes or one of the classful qdiscs. - - - - - - - - - ------------------------------------------------------------------------------ - -8.2. Handling a link with a known bandwidth - - HTB is an ideal qdisc to use on a link with a known bandwidth, because the -innermost (root-most) class can be set to the maximum bandwidth available on -a given link. Flows can be further subdivided into children classes, allowing -either guaranteed bandwidth to particular classes of traffic or allowing -preference to specific kinds of traffic. - - - - ------------------------------------------------------------------------------ - -8.3. Handling a link with a variable (or unknown) bandwidth - - In theory, the PRIO scheduler is an ideal match for links with variable -bandwidth, because it is a work-conserving qdisc (which means that it -provides no shaping). 
In the case of a link with an unknown or fluctuating
-bandwidth, the PRIO scheduler simply prefers to dequeue any available packet
-in the highest priority band first, then falling to the lower priority
-queues.
-
-
-
-
------------------------------------------------------------------------------
-
-8.4. Sharing/splitting bandwidth based on flows
-
-  Of the many types of contention for network bandwidth, this is one of the
-easier types of contention to address in general. By using the SFQ qdisc,
-traffic in a particular queue can be separated into flows, each of which will
-be serviced fairly (inside that queue). Well-behaved applications (and users)
-will find that using SFQ and ESFQ are sufficient for most sharing needs.
-
-  The Achilles heel of these fair queuing algorithms is a misbehaving user or
-application which opens many connections simultaneously (e.g., eMule,
-eDonkey, Kazaa). By creating a large number of individual flows, the
-application can dominate slots in the fair queuing algorithm. Restated, the
-fair queuing algorithm has no idea that a single application is generating
-the majority of the flows, and cannot penalize the user. Other methods are
-called for.
-
-
------------------------------------------------------------------------------
-
-8.5. Sharing/splitting bandwidth based on IP
-
-  For many administrators this is the ideal method of dividing bandwidth
-amongst their users. Unfortunately, there is no easy solution, and it becomes
-increasingly complex with the number of machines sharing a network link.
-
-  To divide bandwidth equitably between N IP addresses, there must be N
-classes.
-
-
------------------------------------------------------------------------------
-
-9. Scripts for use with QoS/Traffic Control
-
-
-
-
-
------------------------------------------------------------------------------
-
-9.1. wondershaper
-
-  More to come, see [http://lartc.org/wondershaper/] wondershaper.
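In the meantime, a minimal wondershaper-style shaping script can be sketched as follows. Everything here (device, rates, class layout) is an assumption for illustration; the script collects the tc commands rather than executing them, so it can be inspected before being applied as root:

```shell
#!/bin/sh
# Minimal HTB shaping sketch in the spirit of wondershaper.
# DEV and UPLINK are assumptions; set UPLINK slightly below the real
# link rate so this box, not the upstream router, is the bottleneck.
DEV=eth0
UPLINK=512   # kbit/s

# Collect the tc arguments (dry run); to apply for real as root:
#   echo "$CMDS" | sed 's/^/tc /' | sh
CMDS="qdisc add dev $DEV root handle 1: htb default 20
class add dev $DEV parent 1: classid 1:1 htb rate ${UPLINK}kbit
class add dev $DEV parent 1:1 classid 1:10 htb rate $((UPLINK*3/4))kbit ceil ${UPLINK}kbit prio 1
class add dev $DEV parent 1:1 classid 1:20 htb rate $((UPLINK/4))kbit ceil ${UPLINK}kbit prio 2
qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10
qdisc add dev $DEV parent 1:20 handle 20: sfq perturb 10"

echo "$CMDS"
```

The structure follows the rules above: only the two leaf classes shape, their rates sum to the parent's rate, and each may borrow up to the full link via ceil.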
-----------------------------------------------------------------------------
-
-9.2. ADSL Bandwidth HOWTO script (myshaper)
-
-  More to come, see [http://www.tldp.org/HOWTO/
-ADSL-Bandwidth-Management-HOWTO/implementation.html] myshaper.
------------------------------------------------------------------------------
-
-9.3. htb.init
-
-  More to come, see htb.init.
------------------------------------------------------------------------------
-
-9.4. tcng.init
-
-  More to come, see tcng.init.
------------------------------------------------------------------------------
-
-9.5. cbq.init
-
-  More to come, see cbq.init.
------------------------------------------------------------------------------
-
-10. Diagram
-
-
-
------------------------------------------------------------------------------
-
-10.1. General diagram
-
-  Below is a general diagram of the relationships of the components of a
-classful queuing discipline (HTB pictured). A larger version of the diagram
-is [http://linux-ip.net/traffic-control/htb-class.png] available.
-
-
-
-
-Example 11. An example HTB tcng configuration
-/*
- *
- *  possible mock up of diagram shown at
- *  http://linux-ip.net/traffic-control/htb-class.png
- *
- */
-
-$m_web = trTCM (
-    cir 512  kbps,  /* committed information rate */
-    cbs 10   kB,    /* burst for CIR */
-    pir 1024 kbps,  /* peak information rate */
-    pbs 10   kB     /* burst for PIR */
-  ) ;
-
-dev eth0 {
-  egress {
-
-    class ( <$web> )  if tcp_dport == PORT_HTTP && __trTCM_green( $m_web );
-    class ( <$bulk> ) if tcp_dport == PORT_HTTP && __trTCM_yellow( $m_web );
-    drop              if __trTCM_red( $m_web );
-    class ( <$bulk> ) if tcp_dport == PORT_SSH ;
-
-    htb () {  /* root qdisc */
-
-      class ( rate 1544kbps, ceil 1544kbps ) {  /* root class */
-
-        $web  = class ( rate 512kbps, ceil  512kbps ) { sfq ; } ;
-        $bulk = class ( rate 512kbps, ceil 1544kbps ) { sfq ; } ;
-
-      }
-    }
-  }
-}
-
-
-[htb-class]
-
-
------------------------------------------------------------------------------
-
-11.
Annotated Traffic Control Links
-
-  This section identifies a number of links to documentation about traffic
-control and Linux traffic control software. Each link will be listed with a
-brief description of the content at that site.
-
-  *  HTB site, HTB user guide and HTB theory (Martin "devik" Devera)
-
-    Hierarchical Token Bucket, HTB, is a classful queuing discipline.
-    Widely used and supported, it is also fairly well documented in the user
-    guide and at [http://www.docum.org/] Stef Coene's site (see below).
-
-  *  General Quality of Service docs (Leonardo Balliache)
-
-
-    There is a good deal of understandable and introductory documentation on
-    his site, and his site in particular has some excellent overview
-    material. See in particular the detailed [http://opalsoft.net/qos/DS.htm]
-    Linux QoS document among others.
-  *  tcng (Traffic Control Next Generation) and tcng manual (Werner
-    Almesberger)
-
-    The tcng software includes a language and a set of tools for creating
-    and testing traffic control structures. In addition to generating tc
-    commands as output, it is also capable of providing output for non-Linux
-    applications. A key piece of the tcng suite which is ignored in this
-    documentation is the tcsim traffic control simulator.
-
-    The user manual provided with the tcng software has been converted to
-    HTML with latex2html. The distribution comes with the TeX documentation.
-
-  *  iproute2 and iproute2 manual (Alexey Kuznetsov)
-
-    This is the source code for the iproute2 suite, which includes the
-    essential tc binary. Note, that as of
-    iproute2-2.4.7-now-ss020116-try.tar.gz, the package did not support HTB,
-    so a patch available from the [http://luxik.cdi.cz/~devik/qos/htb/] HTB
-    site will be required.
-
-    The manual documents the entire suite of tools, although the tc utility
-    is not adequately documented here. The ambitious reader is recommended to
-    the LARTC HOWTO after consuming this introduction.
-
-  *  Documentation, graphs, scripts and guidelines to traffic control under
-    Linux (Stef Coene)
-
-    Stef Coene has been gathering statistics and test results, scripts and
-    tips for the use of QoS under Linux. There are some particularly useful
-    graphs and guidelines available for implementing traffic control at
-    Stef's site.
-
-  *  [http://lartc.org/howto/] LARTC HOWTO (bert hubert, et al.)
-
-    The Linux Advanced Routing and Traffic Control HOWTO is one of the key
-    sources of data about the sophisticated techniques which are available
-    for use under Linux. The Traffic Control Introduction HOWTO should
-    provide the reader with enough background in the language and concepts of
-    traffic control. The LARTC HOWTO is the next place the reader should look
-    for general traffic control information.
-
-  *  Guide to IP Networking with Linux (Martin A. Brown)
-
-    Not directly related to traffic control, this site includes articles
-    and general documentation on the behaviour of the Linux IP layer.
-
-  *  Werner Almesberger's Papers
-
-    Werner Almesberger is one of the main developers and champions of
-    traffic control under Linux (he's also the author of tcng, above). One of
-    the key documents describing the entire traffic control architecture of
-    the Linux kernel is his Linux Traffic Control - Implementation Overview
-    which is available in [http://www.almesberger.net/cv/papers/tcio8.pdf]
-    PDF or [http://www.almesberger.net/cv/papers/tcio8.ps.gz] PS format.
-
-  *  Linux DiffServ project
-
-    Mercilessly snipped from the main page of the DiffServ site...
-
-        Differentiated Services (short: Diffserv) is an architecture for
-        providing different types or levels of service for network traffic.
-        One key characteristic of Diffserv is that flows are aggregated in
-        the network, so that core routers only need to distinguish a
-        comparably small number of aggregated flows, even if those flows
-        contain thousands or millions of individual flows.
-
-
-Notes
-
-[1] See Section 5 for more details on the use or installation of a
-    particular traffic control mechanism, kernel or command line utility.
-[2] This queueing model has long been used in civilized countries to
-    distribute scant food or provisions equitably. William Faulkner is
-    reputed to have walked to the front of the line to fetch his share
-    of ice, proving that not everybody likes the FIFO model, and providing
-    us a model for considering priority queuing.
-[3] Similarly, the entire traffic control system appears as a queue or
-    scheduler to the higher layer which is enqueuing packets into this
-    layer.
-[4] This smoothing effect is not always desirable, hence the HTB parameters
-    burst and cburst.
-[5] A classful qdisc can only have children classes of its type. For
-    example, an HTB qdisc can only have HTB classes as children. A CBQ qdisc
-    cannot have HTB classes as children.
-[6] In this case, you'll have a filter which uses a classifier to select the
-    packets you wish to drop. Then you'll use a policer with a drop action
-    like this police rate 1bps burst 1 action drop/drop.
-[7] I do not know the range nor base of these numbers. I believe they are
-    u32 hexadecimal, but need to confirm this.
-[8] The options listed in this example are taken from a 2.4.20 kernel source
-    tree. The exact options may differ slightly from kernel release to
-    kernel release depending on patches and new schedulers and classifiers.
-[9] HTB will report bandwidth usage in this scenario incorrectly. It will
-    calculate the bandwidth used by quantum instead of the real dequeued
-    packet size. This can skew results quickly.
-[10] In fact, the Intermediate Queuing Device (IMQ) simulates an output
-    device onto which traffic control structures can be attached. This
-    clever solution allows a networking device to shape ingress traffic in
-    the same fashion as egress traffic.
Despite the apparent contradiction - of the rule, IMQ appears as a device to the kernel. Thus, there has been - no violation of the rule, but rather a sneaky reinterpretation of that - rule. - -ProxyARP Subnetting HOWTO - -Bob Edwards - - Robert.Edwards@anu.edu.au - - v2.0, 27 August 2000 - - This HOWTO discusses using Proxy Address Resolution Protocol (ARP) - with subnetting in order to make a small network of machines visible - on another Internet Protocol (IP) subnet (I call it sub-subnetting). - This makes all the machines on the local network (network 0 from now - on) appear as if they are connected to the main network (network 1). - - This is only relevent if all machines are connected by Ethernet or - ether devices (ie. it won't work for SLIP/PPP/CSLIP etc.) - _________________________________________________________________ - - Table of Contents - 1. [1]Acknowledgements - 2. [2]Why use Proxy ARP with subnetting? - 3. [3]How Proxy ARP with subnetting works - 4. [4]Setting up Proxy ARP with subnetting - 5. [5]Other alternatives to Proxy ARP with subnetting - 6. [6]Other Applications of Proxy ARP with subnetting - 7. [7]Copying conditions - -1. Acknowledgements - - This document, and my Proxy ARP implementation could not have been - made possible without the help of: - - * Andrew Tridgell, who implemented the subnetting options for arp in - Linux, and who personally assisted me in getting it working - * the Proxy-ARP mini-HOWTO, by Al Longyear - * the Multiple-Ethernet mini-HOWTO, by Don Becker - * the arp(8) source code and man page by Fred N. van Kempen and - Bernd Eckenfels - _________________________________________________________________ - -2. Why use Proxy ARP with subnetting? - - The applications for using Proxy ARP with subnetting are fairly - specific. - - In my case, I had a wireless Ethernet card that plugs into an 8-bit - ISA slot. I wanted to use this card to provide connectivity for a - number of machines at once. 
Being an ISA card, I could use it on a - Linux machine, after I had written an appropriate device driver for it - - this is the subject of another document. From here, it was only - necessary to add a second Ethernet interface to the Linux machine and - then use some mechanism to join the two networks together. - - For the purposes of discussion, let network 0 be the local Ethernet - connected to the Linux box via an NE-2000 clone Ethernet interface on - eth0. Network 1 is the main network connected via the wireless - Ethernet card on eth1. Machine A is the Linux box with both - interfaces. Machine B is any TCP/IP machine on network 0 and machine C - is likewise on network 1. - - Normally, to provide the connectivity, I would have done one of the - following: - - * Used the IP-Bridge software (see the Bridge mini-HOWTO) to bridge - the traffic between the two network interfaces. Unfortunately, the - wireless Ethernet interface cannot be put into "Promiscuous" mode - (ie. it can't see all packets on network 1). This is mainly due to - the lower bandwidth of the wireless Ethernet (2MBit/sec) meaning - that we don't want to carry any traffic not specifically destined - to another wireless Ethernet machine - in our case machine A - or - broadcasts. Also, bridging is rather CPU intensive! - * Alternatively, use subnets and an IP-router to pass packets - between the two networks (see the IP-Subnetworking mini-HOWTO). - This is a protocol specific solution, where the Linux kernel can - handle the Internet Protocol (IP) packets, but other protocols - (such as AppleTalk) need extra software to route. This also - requires the allocation of a new IP subnet (network) number, which - is not always an option. - - In my case, getting a new subnet (network) number was not an option, - so I wanted a solution that allowed all the machines on network 0 to - appear as if they were on network 1. This is where Proxy ARP comes in. 
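The sub-subnetting idea above is pure mask arithmetic: a longer netmask carves a small block of one network's addresses out for the other. As a hedged illustration (the mask and offsets below are example values, not taken from the author's network), the range covered by a 255.255.255.240 sub-subnet starting at .64 can be computed in shell:

```shell
# Illustrative only: carve a 16-address sub-subnet out of a /24.
# A last-octet mask of 240 (255.255.255.240) leaves a 4-bit host part.
MASK=240   # last octet of the sub-subnet mask
SUB=64     # last octet of the sub-subnet's network number

hosts=$(( 256 - MASK ))              # addresses in the sub-subnet
network=$(( SUB & MASK ))            # network address (all-zeros host part)
broadcast=$(( SUB | (255 ^ MASK) ))  # broadcast address (all-ones host part)

echo "addresses:  $hosts"       # 16
echo "network:    .$network"    # .64
echo "broadcast:  .$broadcast"  # .79
```

Two of the sixteen addresses (the network and broadcast addresses) are unusable by hosts, leaving fourteen for machines.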
-
-   Other solutions are used to connect other (non-IP) protocols, such as
-   netatalk to provide AppleTalk routing.
-     _________________________________________________________________
-
-3. How Proxy ARP with subnetting works
-
-   The Proxy ARP is actually only used to get packets from network 1 to
-   network 0. To get packets back the other way, the normal IP routing
-   functionality is employed.
-
-   In my case, network 1 has an 8-bit subnet mask (255.255.255.0). I have
-   chosen the subnet mask for network 0 to be 4-bit (255.255.255.240),
-   allowing 14 IP nodes on network 0 (2 ^ 4 = 16, less two for the all
-   zeros and all ones cases). Note that any size of subnet mask up to,
-   but not including, the size of the mask of the other network is
-   allowable here (eg. 2, 3, 4, 5, 6 or 7 bits in this case - for one
-   bit, just use normal Proxy ARP!)
-
-   All the IP numbers for network 0 (16 in total) appear in network 1 as
-   a subset. Note that it is very important, in this case, not to allow
-   any machine connected directly to network 1 to have an IP number in
-   this range! In my case, I have "reserved" the IP numbers of network 1
-   ending in 64 .. 79 for network 0. In this case, the IP numbers ending
-   in 64 and 79 can't actually be used by nodes - 79 is the broadcast
-   address for network 0.
-
-   Machine A is allocated two IP numbers, one within the network 0 range
-   for its real Ethernet interface (eth0) and the other within the
-   network 1 range, but outside of the network 0 range, for the wireless
-   Ethernet interface (eth1).
-
-   Say machine C (on network 1) wants to send a packet to machine B (on
-   network 0). Because the IP number of machine B makes it look to
-   machine C as though it is on the same physical network, machine C will
-   use the Address Resolution Protocol (ARP) to send a broadcast message
-   on network 1 requesting the machine with the IP number of machine B to
-   respond with its hardware (Ethernet or MAC layer) address. Machine B
-   won't see this request, as it isn't actually on network 1, but machine
-   A, on both networks, will see it.
-
-   The first bit of magic now happens as the Linux kernel arp code on
-   machine A, with a properly configured Proxy ARP with subnetting entry,
-   determines that the ARP request has come in on the network 1 interface
-   (eth1) and that the IP number being ARP'd for is in the subnet range
-   for network 0. Machine A then sends its own hardware (Ethernet)
-   address back to machine C as an ARP response packet.
-
-   Machine C then updates its ARP cache with an entry for machine B, but
-   with the hardware (Ethernet) address of machine A (in this case, the
-   wireless Ethernet interface). Machine C can now send the packet for
-   machine B to this hardware (Ethernet) address, and machine A receives
-   it.
-
-   Machine A notices that the destination IP number in the packet is that
-   of machine B, not itself. Machine A's Linux kernel IP routing code
-   attempts to forward the packet to machine B by looking at its routing
-   tables to determine which interface contains the network number for
-   machine B. However, the IP number for machine B is valid for both the
-   network 0 interface (eth0), and for the network 1 interface (eth1).
-
-   At this point, something else clever happens. Because the subnet mask
-   for the network 0 interface has more 1 bits (it is more specific) than
-   the subnet mask for the network 1 interface, the Linux kernel routing
-   code will match the IP number for machine B to the network 0
-   interface, and not keep looking for the potential match with the
-   network 1 interface (the one the packet came in on).
-
-   Now machine A needs to find out the "real" hardware (Ethernet) address
-   for machine B (assuming that it doesn't already have it in the ARP
-   cache). Machine A uses an ARP request, but this time the Linux kernel
-   arp code notes that the request isn't coming from the network 1
-   interface (eth1), and so doesn't respond with the Proxy address of
-   eth1. Instead, it sends the ARP request on the network 0 interface
-   (eth0), where machine B will see it and respond with its own (real)
-   hardware (Ethernet) address. Now machine A can send the packet (from
-   machine C) onto machine B.
-
-   Machine B gets the packet from machine C (via machine A) and then
-   wants to send back a response. This time, machine B notices that
-   machine C is on a different subnet (machine B's subnet mask of
-   255.255.255.240 excludes all machines not in the network 0 IP address
-   range). Machine B is set up with a "default" route to machine A's
-   network 0 IP number and sends the packet to machine A. This time,
-   machine A's Linux kernel routing code determines the destination IP
-   number (of machine C) as being on network 1 and sends the packet onto
-   machine C via Ethernet interface eth1.
-
-   Similar (less complicated) things occur for packets originating from
-   and destined to machine A from other machines on either of the two
-   networks.
-
-   Similarly, it should be obvious that if another machine (D) on network
-   0 ARP's for machine B, machine A will receive the ARP request on its
-   network 0 interface (eth0) and won't respond to the request as it is
-   set up to only Proxy on its network 1 interface (eth1).
-
-   Note also that machines B and C (and D) are not required to do
-   anything unusual, IP-wise. In my case, there is a mixture of Suns,
-   Macs and PC/Windoze 95 machines on network 0 all connecting through
-   Linux machine A to the rest of the world.
-
-   Finally, note that once the hardware (Ethernet) addresses are
-   discovered by each of machines A, B, C (and D), they are placed in the
-   ARP cache and subsequent packet transfers occur without the ARP
-   overhead. The ARP caches normally expire entries after 5 minutes of
-   non-activity.
-     _________________________________________________________________
-
-4. Setting up Proxy ARP with subnetting
-
-   I set up Proxy ARP with subnetting on a Linux kernel version 2.0.30
-   machine, but I am told that the code works right back to some kernel
-   version in the 1.2.x era.
-
-   The first thing to note is that the ARP code is in two parts: the part
-   inside the kernel that sends and receives ARP requests and responses
-   and updates the ARP cache etc.; the other part is the arp(8) command
-   that allows the super user to modify the ARP cache manually and anyone
-   to examine it.
-
-   The first problem I had was that the arp(8) command that came with my
-   Slackware 3.1 distribution was ancient (1994 era!!!) and didn't
-   communicate with the kernel arp code correctly at all (mainly
-   evidenced by the strange output that it gave for "arp -a").
-
-   The arp(8) command in "net-tools-1.33a" available from a variety of
-   places, including (from the README file that came with it)
-   [8]ftp.linux.org.uk:/pub/linux/Networking/base/ works properly and
-   includes new man pages that explain stuff a lot better than the older
-   arp(8) man page.
-
-   Armed with a decent arp(8) command, all the changes I made were in the
-   /etc/rc.d/rc.inet1 script (for Slackware - probably different for
-   other flavours).
First of all, we need to change the broadcast - address, network number and netmask of eth0: - -NETMASK=255.255.255.240 # for a 4-bit host part -NETWORK=x.y.z.64 # our new network number (replace x.y.z with your net) -BROADCAST=x.y.z.79 # in my case - - Then a line needs to be added to configure the second Ethernet port - (after any module loading that might be required to load the driver - code): - -/sbin/ifconfig eth1 (name on net 1) broadcast (x.y.z.255) netmask 255.255.255.0 - - Then we add a route for the new interface: - -/sbin/route add -net (x.y.z.0) netmask 255.255.255.0 - - And you will probably need to change the default gateway to the one - for network 1. - - At this point, it is appropriate to add the Proxy ARP entry: - -/sbin/arp -i eth1 -Ds ${NETWORK} eth1 netmask ${NETMASK} pub - - This tells ARP to add a static entry (the s) to the cache for network - ${NETWORK}. The -D tells ARP to use the same hardware address as - interface eth1 (the second eth1), thus saving us from having to look - up the hardware address for eth1 and hardcoding it in. The netmask - option tells ARP that we want to use subnetting (ie. Proxy for all (IP - number) & ${NETMASK} == ${NETWORK} & ${NETMASK}). The pub option tells - ARP to publish this ARP entry, ie. it is a Proxy entry, so respond on - behalf of these IP numbers. The -i eth1 option tells ARP to only - respond to requests that come in on interface eth1. - - Hopefully, at this point, when the machine is rebooted, all the - machines on network 0 will appear to be on network 1. You can check - that the Proxy ARP with subnetting entry has been correctly installed - on machine A. 
On my machine (names changed to protect the innocent) it - is: - -bash$ /sbin/arp -an -Address HWtype HWaddress Flags Mask Iface -x.y.z.1 ether 00:00:0C:13:6F:17 C * eth1 -x.y.z.65 ether 00:40:05:49:77:01 C * eth0 -x.y.z.67 ether 08:00:20:0B:79:47 C * eth0 -x.y.z.5 ether 00:00:3B:80:18:E5 C * eth1 -x.y.z.64 ether 00:40:96:20:CD:D2 CMP 255.255.255.240 eth1 - - Alternatively, you can examine the /proc/net/arp file with eg. cat(1). - - The last line is the proxy entry for the subnet. The CMP flags - indicate that it is a static (Manually entered) entry and that it is - to be Published. The entry is only going to reply to ARP requests on - eth1 where the requested IP number, once masked, matches the network - number, also masked. Note that arp(8) has automatically determined the - hardware address of eth1 and inserted this for the address to use (the - -Ds option). - - Likewise, it is probably prudent to check that the routing table has - been set up correctly. Here is mine (again, the names are changed to - protect the innocent): - -#/bin/netstat -rn -Kernel routing table -Destination Gateway Genmask Flags Metric Ref Use Iface -x.y.z.64 0.0.0.0 255.255.255.240 U 0 0 71 eth0 -x.y.z.0 0.0.0.0 255.255.255.0 U 0 0 389 eth1 -127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 7 lo -0.0.0.0 x.y.z.1 0.0.0.0 UG 1 0 573 eth1 - - Alternatively, you can examine the /proc/net/route file with eg. - cat(1). - - Note that the first entry is a proper subset of the second, but the - routing table has ranked them in netmask order, so the eth0 entry will - be checked before the eth1 entry. - _________________________________________________________________ - -5. 
Other alternatives to Proxy ARP with subnetting
-
-   There are several other alternatives to using Proxy ARP with
-   subnetting in this situation, apart from the ones mentioned above
-   (bridging and straight routing):
-
-     * IP-Masquerading (see the IP-Masquerade mini-HOWTO), in which
-       network 0 is "hidden" behind machine A from the rest of the
-       Internet. As machines on network 0 attempt to connect outside
-       through machine A, it re-addresses the source address and port
-       number of the packets and makes them look like they are coming
-       from itself, rather than from the machine on the hidden network 0.
-       This is an elegant solution, although it prevents any machine on
-       network 1 from initiating a connection to any machine on network
-       0, as the machines on network 0 effectively don't exist outside of
-       network 0. This effectively increases security of the machines on
-       network 0, but it also means that servers on network 1 cannot
-       check the identity of clients on network 0 using IP numbers (eg.
-       NFS servers use IP hostnames for access to mountable file
-       systems).
-     * Another option is IP in IP tunneling, which isn't supported on all
-       platforms (such as Macs and Windoze machines) so I opted not to go
-       this way.
-     * Use Proxy ARP without subnetting. This is certainly possible; it
-       just means that a separate entry needs to be created for each
-       machine on network 0, instead of a single entry for all machines
-       (current and future) on network 0.
-     * Possibly IP Aliasing might also be useful here, but I haven't
-       looked into this at all.
-     _________________________________________________________________
-
-6. Other Applications of Proxy ARP with subnetting
-
-   There is only one other application that I know about that uses Proxy
-   ARP with subnetting, also here at the Australian National University.
-   It is the one that Andrew Tridgell originally wrote the subnetting
-   extensions to Proxy ARP for.
However, Andrew reliably informs me that - there are, in fact, several other sites around the world using it as - well (I don't have any details). - - The other A.N.U. application involves a teaching lab set up to teach - students how to configure machines to use TCP/IP, including setting up - the gateway. The network used is a Class C network, and Andrew needed - to "subnet" it for security, traffic control and the educational - reason mentioned above. He did this using Proxy ARP, and then decided - that a single entry in the ARP cache for the whole subnet would be - faster and cleaner than one for each host on the subnet. Voila...Proxy - ARP with subnetting! - _________________________________________________________________ - -7. Copying conditions - - Copyright 1997 by Bob Edwards <[9]Robert.Edwards@anu.edu.au> - - Voice: (+61) 2 6249 4090 - - Unless otherwise stated, Linux HOWTO documents are copyrighted by - their respective authors. Linux HOWTO documents may be reproduced and - distributed in whole or in part, in any medium physical or electronic, - as long as this copyright notice is retained on all copies. Commercial - redistribution is allowed and encouraged; however, the author would - like to be notified of any such distributions. All translations, - derivative works, or aggregate works incorporating any Linux HOWTO - documents must be covered under this copyright notice. That is, you - may not produce a derivative work from a HOWTO and impose additional - restrictions on its distribution. Exceptions to these rules may be - granted under certain conditions; please contact the Linux HOWTO - coordinator at the address given below. In short, we wish to promote - dissemination of this information through as many channels as - possible. However, we do wish to retain copyright on the HOWTO - documents, and would like to be notified of any plans to redistribute - the HOWTOs. 
If you have questions, please contact the Linux HOWTO
-   coordinator, at <[10]linux-howto@metalab.unc.edu> via email.
-
-References
-
-   1. Proxy-ARP-Subnet.html#INTRO
-   2. Proxy-ARP-Subnet.html#WHY
-   3. Proxy-ARP-Subnet.html#HOW
-   4. Proxy-ARP-Subnet.html#SETUP
-   5. Proxy-ARP-Subnet.html#ALTERNATIVES
-   6. Proxy-ARP-Subnet.html#APPLICATIONS
-   7. Proxy-ARP-Subnet.html#COPYING
-   8. ftp://ftp.linux.org.uk/pub/linux/Networking/base/
-   9. mailto:Robert.Edwards@anu.edu.au
-  10. mailto:linux-howto@metalab.unc.edu
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/SSH.xml b/LDP/guide/docbook/Linux-Networking/SSH.xml
deleted file mode 100644
index aa76b0ae..00000000
--- a/LDP/guide/docbook/Linux-Networking/SSH.xml
+++ /dev/null
@@ -1,45 +0,0 @@
-
-
-SSH
-
-
-The Secure Shell, or SSH, provides a way of running command line and
-graphical applications, and transferring files, over an encrypted
-connection. SSH uses up to 2,048-bit encryption with a variety of
-cryptographic schemes to make sure that if a cracker intercepts your
-connection, all they can see is useless gibberish. It is both a
-protocol and a suite of small command line applications which can be
-used for various functions.
-
-
-
-SSH replaces the old Telnet application, and can be used for secure
-remote administration of machines across the Internet. However, it
-has more features.
-
-
-
-SSH increases the ease of running applications remotely by setting up
-permissions automatically. If you can log into a machine, it allows you
-to run a graphical application on it, unlike Telnet, which requires users
-to type lots of geeky xhost and xauth commands. SSH also has built-in
-compression, which allows your graphical applications to run much faster
-over the network.
-
-
-
-SCP (Secure Copy) and SFTP (Secure FTP) allow transfer of files over the
-remote link, either via SSH's own command line utilities or graphical tools
-like Gnome's GFTP. Like Telnet, SSH is cross-platform. You can find SSH
-servers and clients for Linux, Unix, all flavours of Windows, BeOS, PalmOS,
-Java and Embedded OSes used in routers.
-
-
-
-Encrypted remote shell sessions are available through SSH
-(http://www.ssh.fi/sshprotocols2/index.html) thus effectively
-allowing secure remote administration.
-
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/STRIP.xml b/LDP/guide/docbook/Linux-Networking/STRIP.xml
deleted file mode 100644
index 9b56fd34..00000000
--- a/LDP/guide/docbook/Linux-Networking/STRIP.xml
+++ /dev/null
@@ -1,49 +0,0 @@
-
-
-STRIP
-
-
-STRIP (Starnode Radio IP) is a protocol designed specifically for
-a range of Metricom radio modems for a research project being
-conducted by Stanford University called the MosquitoNet Project.
-There is a lot of interesting reading here, even if you aren't
-directly interested in the project.
-
-
-
-The Metricom radios connect to a serial port, employ spread spectrum
-technology and are typically capable of about 100kbps. Information on
-the Metricom radios is available from the: Metricom Web Server.
-
-
-
-At present the standard network tools and utilities do not support the
-STRIP driver, so you will have to download some customized tools from
-the MosquitoNet web server. Details on what software you need are
-available at the: MosquitoNet STRIP Page.
-
-
-
-A summary of configuration is that you use a modified slattach program
-to set the line discipline of a serial tty device to STRIP and then
-configure the resulting `st[0-9]' device as you would for ethernet,
-with one important exception: for technical reasons STRIP does not
-support the ARP protocol, so you must manually configure the ARP
-entries for each of the hosts on your subnet. This shouldn't prove too
-onerous. STRIP device names are `st0', `st1', etc.... The relevant
-kernel compilation options are given below.
-
-
-
-
-        Kernel Compile Options:
-
-        Network device support  --->
-            [*] Network device support
-            ....
- [*] Radio network interfaces
-            < > STRIP (Metricom starmode radio IP)
-
-
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/Samba.xml b/LDP/guide/docbook/Linux-Networking/Samba.xml
deleted file mode 100644
index 2e71a944..00000000
--- a/LDP/guide/docbook/Linux-Networking/Samba.xml
+++ /dev/null
@@ -1,76 +0,0 @@
-
-
-  8.11. SAMBA - `NetBEUI', `NetBios', `CIFS' support.
-
-  SAMBA is an implementation of the Server Message Block (SMB) protocol.
-  Samba allows Microsoft and other systems to mount and use your disks
-  and printers.
-
-  SAMBA and its configuration are covered in detail in the SMB-HOWTO.
-
-  5.2. Windows Environment
-
-  Samba is a suite of applications that allow most Unices (and in
-  particular Linux) to integrate into a Microsoft network both as a
-  client and a server. Acting as a server it allows Windows 95, Windows
-  for Workgroups, DOS and Windows NT clients to access Linux files and
-  printing services. It can completely replace Windows NT for file and
-  printing services, including the automatic downloading of printer
-  drivers to clients. Acting as a client allows the Linux workstation to
-  mount locally exported Windows file shares.
-
-  According to the SAMBA Meta-FAQ:
-
-  "Many users report that compared to other SMB implementations Samba is more stable,
-  faster, and compatible with more clients. Administrators of some large installations say
-  that Samba is the only SMB server available which will scale to many tens of thousands
-  of users without crashing"
-
-  * Samba project home page
-
-  * SMB HOWTO
-
-  * Printing HOWTO
-
-
-
-samba
-
-
-
-A LanManager like file and printer server for Unix. The Samba software suite is a collection of programs that implements the SMB protocol for unix systems, allowing you to serve files and printers to Windows, NT, OS/2 and DOS clients. This protocol is sometimes also referred to as the LanManager or NetBIOS protocol.
This package contains all the components necessary to turn your Debian GNU/Linux box into a powerful file and printer server. Currently, the Samba Debian packages consist of the following: samba - A LanManager like file and printer server for Unix. samba-common - Samba common files used by both the server and the client. smbclient - A LanManager like simple client for Unix. swat - Samba Web Administration Tool samba-doc - Samba documentation. smbfs - Mount and umount commands for the smbfs (kernels 2.0.x and above). libpam-smbpass - pluggable authentication module for SMB password database libsmbclient - Shared library that allows applications to talk to SMB servers libsmbclient-dev - libsmbclient shared libraries winbind: Service to resolve user and group information from Windows NT servers It is possible to install a subset of these packages depending on your particular needs. For example, to access other SMB servers you should only need the smbclient and samba-common packages. From Debian 3.0r0 APT -http://www.tldp.org/LDP/Linux-Dictionary/html/index.html - - - - - - -Samba - - - -A lot of emphasis has been placed on peaceful coexistence between UNIX and Windows. Unfortunately, the two systems come from very different cultures and they have difficulty getting along without mediation. ...and that, of course, is Samba's job. Samba <http://samba.org/> runs on UNIX platforms, but speaks to Windows clients like a native. It allows a UNIX system to move into a Windows ``Network Neighborhood'' without causing a stir. Windows users can happily access file and print services without knowing or caring that those services are being offered by a UNIX host. All of this is managed through a protocol suite which is currently known as the ``Common Internet File System,'' or CIFS <http://www.cifs.com>. This name was introduced by Microsoft, and provides some insight into their hopes for the future. 
At the heart of CIFS is the latest incarnation of the Server Message Block (SMB) protocol, which has a long and tedious history. Samba is an open source CIFS implementation, and is available for free from the http://samba.org/ mirror sites. Samba and Windows are not the only ones to provide CIFS networking. OS/2 supports SMB file and print sharing, and there are commercial CIFS products for Macintosh and other platforms (including several others for UNIX). Samba has been ported to a variety of non-UNIX operating systems, including VMS, AmigaOS, and NetWare. CIFS is also supported on dedicated file server platforms from a variety of vendors. In other words, this stuff is all over the place. From Rute-Users-Guide
-http://www.tldp.org/LDP/Linux-Dictionary/html/index.html
-
-
-
-
-
-Samba
-
-
-
-Samba adds Windows-networking support to UNIX. Whereas NFS is the most popular protocol for sharing files among UNIX machines, SMB is the most popular protocol for sharing files among Windows machines. The Samba package adds the ability for UNIX systems to interact with Windows systems. Key point: The Samba package comprises the following: smbd The Samba service allowing other machines (often Windows) to read files from a UNIX machine. nmbd Provides support for NetBIOS. Logically, the SMB protocol is layered on top of NetBIOS, which is in turn layered on top of TCP/IP. smbmount An extension to the mount program that allows a UNIX machine to connect to another machine implicitly. Files can be accessed as if they were located on the local machine. smbclient Allows files to be accessed through SMB in an explicit manner. This is a command-line tool much like the FTP tool that allows files to be copied. Unlike smbmount, files cannot be accessed as if they were local. smb.conf The configuration file for Samba.
From Hacking-Lexicon
-http://www.tldp.org/LDP/Linux-Dictionary/html/index.html
-
-
-
-
-    Samba Authenticated Gateway HOWTO
-    Ricardo Alexandre Mattar
-    v1.2, 2004-05-21
-
-
diff --git a/LDP/guide/docbook/Linux-Networking/Services.xml b/LDP/guide/docbook/Linux-Networking/Services.xml
new file mode 100644
index 00000000..a859c538
--- /dev/null
+++ b/LDP/guide/docbook/Linux-Networking/Services.xml
@@ -0,0 +1,5940 @@
+
+
+Services
+
+
+
+
+
+
+
+
+
+Database
+
+
+Most databases are supported under Linux, including Oracle, DB2, Sybase, Informix, MySQL, PostgreSQL,
+InterBase and Paradox. Databases, and the Structured Query Language they work with, are complex, and this
+chapter has neither the space nor the depth to deal with them. Read the next section on PHP to learn how to set
+up a dynamically generated Web portal in about five minutes.
+
+We'll be using MySQL because it's extremely fast, capable of handling large databases (200G databases aren't
+unheard of), and has recently been made open source. It also works well with PHP. While currently
+lacking transaction support (due to speed concerns), a future version of MySQL will have this option.
+
+
+* Connecting to MS SQL 6.x+ via Openlink/PHP/ODBC mini-HOWTO
+
+* Sybase Adaptive Server Anywhere for Linux HOWTO
+
+
+
+
+
+DHCP
+
+
+Endeavouring to maintain static IP addressing
+information, such as IP addresses, subnet masks, DNS names and other
+information on client machines can be difficult. Documentation becomes lost or
+out-of-date, and network reconfigurations require details to be modified
+manually on every machine.
+
+
+
+DHCP (Dynamic Host Configuration Protocol) solves this problem by providing
+arbitrary information (including IP addressing) to clients upon request.
+Almost all client OSes support it and it is standard in most large networks.
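As a concrete taste of how a server hands out that "arbitrary information", here is a minimal ISC dhcpd-style configuration sketch. This is an assumption for illustration only: the subnet, address range, gateway and lease time are invented, and a real server would read the file from its own configuration path rather than the local file written here.

```shell
# Write a minimal, illustrative dhcpd-style configuration to a local file.
# (Invented values; not taken from this document.)
cat > dhcpd.conf.example <<'EOF'
subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.100 192.168.1.200;       # pool of addresses handed to clients
    option routers 192.168.1.1;              # default gateway for clients
    option domain-name-servers 192.168.1.1;  # DNS server(s) for clients
    default-lease-time 86400;                # lease length in seconds (one day)
}
EOF
grep -c 'range' dhcpd.conf.example   # prints 1
```

Everything a client would otherwise have configured by hand (address, mask, gateway, DNS) is expressed once, centrally, which is exactly the administrative win described above.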
+
+
+
+Its most prevalent impact is that it eases network administration,
+especially in large networks or networks which have lots of mobile users.
+
+
+2. DHCP protocol
+
+   DHCP (Dynamic Host Configuration Protocol) is used to control
+   vital networking parameters of hosts (running clients) with the help
+   of a server. DHCP is backward compatible with BOOTP. For more
+   information see RFC 2131 (old RFC 1541) and others. (See Internet
+   Resources section at the end of the document). You can also read
+   [32]http://web.syr.edu/~jmwobus/comfaqs/dhcp.faq.html.
+
+4.5. Other interesting documents
+
+   Linux Magazine has a pretty good article in their April issue called
+   [62]Network Nirvana: How to make Network Configuration as easy as DHCP
+   that discusses the setup of DHCP.
+
+References
+
+   1. DHCP.html#AEN17
+   2. DHCP.html#AEN19
+   3. DHCP.html#AEN24
+   4. DHCP.html#AEN41
+   5. DHCP.html#AEN45
+   6. DHCP.html#AEN64
+   7. DHCP.html#AEN69
+   8. DHCP.html#AEN74
+   9. DHCP.html#AEN77
+  10. DHCP.html#SLACKWARE
+  11. DHCP.html#REDHAT6
+  12. DHCP.html#AEN166
+  13. DHCP.html#AEN183
+  14. DHCP.html#DEBIAN
+  15. DHCP.html#AEN230
+  16. DHCP.html#NAMESERVER
+  17. DHCP.html#AEN293
+  18. DHCP.html#TROUBLESHOOTING
+  19. DHCP.html#AEN355
+  20. DHCP.html#AEN369
+  21. DHCP.html#DHCPSERVER
+  22. DHCP.html#AEN382
+  23. DHCP.html#AEN403
+  24. DHCP.html#AEN422
+  25. DHCP.html#AEN440
+  26. http://www.oswg.org/oswg-nightly/DHCP.html
+  27. http://www.linux.org.tw/CLDP/mini/DHCP.html
+  28. http://www.linux.or.jp/JF/JFdocs/DHCP.html
+  29. ftp://cuates.pue.upaep.mx/pub/linux/LuCAS/DHCP-mini-Como/
+  30. mailto:vuksan-feedback@veus.hr
+  31. http://www.opencontent.org/opl.shtml
+  32. http://web.syr.edu/~jmwobus/comfaqs/dhcp.faq.html
+  33. mailto:sergei@phystech.com
+  34. ftp://ftp.phystech.com/pub/
+  35. http://www.cps.msu.edu/~dunham/out/
+  36. ftp://metalab.unc.edu/pub/Linux/system/network/daemons
+  37. ftp://ftp.phystech.com/pub/
+  38. DHCP.html#NAMESERVER
+  39. DHCP.html#LINUXPPC-RH6
+  40. mailto:alexander.stevenson@home.com
+  41. DHCP.html#NAMESERVER
+  42. ftp://ftp.redhat.com/pub/redhat/redhat-4.2/i386/RedHat/RPMS/dhcpcd-0.6-2.i386.rpm
+  43. DHCP.html#SLACKWARE
+  44. mailto:nothing@cc.gatech.edu
+  45. DHCP.html#NAMESERVER
+  46. http://ftp.debian.org/debian/dists/slink/main/binary-i386/net/
+  47. DHCP.html#SLACKWARE
+  48. mailto:heiko@os.inf.tu-dresden.de
+  49. DHCP.html#NAMESERVER
+  50. DHCP.html#REDHAT6
+  51. ftp://ftp.linuxppc.org/
+  52. ftp://ftp.phystech.com/pub/dhcpcd-1.3.17-pl9.tar.gz
+  53. DHCP.html#TROUBLESHOOTING
+  54. mailto:nothing@cc.gatech.edu
+  55. DHCP.html#ERROR3
+  56. ftp://vanbuer.ddns.org/pub/
+  57. DHCP.html#DHCPSERVER
+  58. mailto:mellon@isc.org
+  59. ftp://ftp.isc.org/isc/dhcp/
+  60. http://www.kde.org/
+  61. ftp://ftp.us.kde.org/pub/kde/unstable/apps/network/
+  62. http://www.linux-mag.com/2000-04/networknirvana_01.html
+
+
+
+
+
+DNS
+
+
+    Setting Up Your New Domain Mini-HOWTO.
+
+
+
+
+
+FTP
+
+
+File Transfer Protocol (FTP) is an efficient way to transfer files between
+machines across networks, and clients and servers exist for almost all platforms,
+making FTP the most convenient (and therefore popular) method of transferring
+files. FTP was first developed by the University of California, Berkeley for
+inclusion in 4.2BSD (Berkeley Unix). The RFC (Request for Comments)
+document for the protocol is now known as RFC 959 and is available at
+ftp://nic.merit.edu/documents/rfc/rfc0959.txt.
+
+
+
+There are two typical modes of running an FTP server - either anonymously or
+account-based. Anonymous FTP servers are by far the most popular; they allow
+any machine to access the FTP server and the files stored on it with the same
+permissions. No usernames or passwords are transmitted down the wire.
+Account-based FTP allows users to log in with real usernames and passwords.
+While it provides greater access control than anonymous FTP, transmitting real
+usernames and passwords unencrypted over the Internet is generally avoided for
+security reasons.
+
+
+
+An FTP client is the userland application that provides access to FTP
+servers. There are many FTP clients available. Some are graphical, and
+some are text-based.
+
+
+* FTP HOWTO
+
+
+
+
+
+LDAP
+
+Information about installing, configuring, running and maintaining an LDAP
+(Lightweight Directory Access Protocol) server on a Linux machine is
+presented in this section. This section also presents details about how to
+create LDAP databases, and how to add, update and delete
+information in the directory. It is mostly based on the University of
+Michigan LDAP information pages and on the OpenLDAP Administrator's Guide.
+
+
+
+
+
+NFS
+
+NFS (Network File System)
+
+The TCP/IP suite's equivalent of file sharing. This protocol operates at the Process/Application
+layer of the DOD model, similar to the application layer of the OSI model.
+
+SLIP (Serial Line Internet Protocol) and PPP (Point-to-Point Protocol)
+
+Two protocols commonly used for dial-up access to the Internet. They are typically used with
+TCP/IP; while SLIP works only with TCP/IP, PPP can be used with other protocols.
+
+SLIP was the first protocol for dial-up Internet access. It operates at the physical layer of the
+OSI model, and provides a simple interface to a UNIX or other dial-up host for Internet access.
+SLIP does not provide security, so authentication is handled through prompts before initiating
+the SLIP connection.
+
+PPP is a more recent development. It operates at the physical and data link layers of the OSI
+model. In addition to the features of SLIP, PPP supports data compression, security (authentication),
+and error control. PPP can also dynamically assign network addresses. 
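As a concrete illustration of a PPP dial-up setup, a minimal pppd options file might look like the sketch below. The device name and speed are assumptions for illustration only; see pppd(8) for the authoritative option list.

```
# /etc/ppp/options - illustrative sketch only; adapt to your modem and ISP
/dev/ttyS1 115200   # serial port and speed (assumed; yours may differ)
crtscts             # use hardware flow control on the serial line
lock                # lock the serial device while the link is up
defaultroute        # route Internet traffic over the PPP link
noipdefault         # let the peer assign our IP address dynamically
```

This illustrates the dynamic address assignment and authentication support mentioned above; authentication details (PAP/CHAP secrets) live in separate files and are not shown.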
+
+Since PPP provides easier authentication and better security, it should be used for dial-up connections
+whenever possible. However, you may need to use SLIP to communicate with dial-up servers (particularly
+older UNIX machines and dedicated hardware servers) that don't support PPP.
+
+> Start Config-HOWTO
+
+2.15. Automount Points
+
+If you don't like the mounting/unmounting thing, consider using autofs(5). You tell the autofs daemon what to automount and where starting with a file, /etc/auto.master. Its structure is simple:
+
+
+/misc   /etc/auto.misc
+/mnt    /etc/auto.mnt
+
+In this example you tell autofs to automount media in /misc and /mnt, while the mount points are specified in /etc/auto.misc and /etc/auto.mnt. An example of /etc/auto.misc:
+
+
+# an NFS export
+server  -ro                     my.buddy.net:/pub/export
+# removable media
+cdrom   -fstype=iso9660,ro      :/dev/hdb
+floppy  -fstype=auto            :/dev/fd0
+
+Start the automounter. From now on, whenever you try to access the nonexistent mount point /misc/cdrom, it will be created and the CD-ROM will be mounted.
+
+>End Config-HOWTO
+
+ 5.4. Unix Environment
+
+ The preferred way to share files in a Unix networking environment is
+ through NFS. NFS stands for Network File System; it is a protocol
+ originally developed by Sun Microsystems. It is a way to share files
+ between machines as if they were local. A client "mounts" a filesystem
+ "exported" by an NFS server. The mounted filesystem will appear to the
+ client machine as if it was part of the local filesystem.
+
+ It is possible to mount the root filesystem at startup time, thus
+ allowing diskless clients to boot up and access all files from a
+ server. In other words, it is possible to have a fully functional
+ computer without a hard disk.
+
+ Coda is a network filesystem (like NFS) that supports disconnected
+ operation and persistent caching, among other goodies. It's included in
+ 2.2.x kernels. Really handy for slow or unreliable networks and
+ laptops. 
+
+ NFS-related documents:
+
+  * http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root.html
+
+  * http://metalab.unc.edu/mdw/HOWTO/Diskless-HOWTO.html
+
+  * http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root-Client-mini-HOWTO/index.html
+
+  * http://www.redhat.com/support/docs/rhl/NFS-Tips/NFS-Tips.html
+
+  * http://metalab.unc.edu/mdw/HOWTO/NFS-HOWTO.html
+
+ Coda can be found at: http://www.coda.cs.cmu.edu/
+
+Samba is the Linux implementation of SMB. NFS is the Unix equivalent - a way to import and
+export local files to and from remote machines. Like SMB, NFS sends information, including user
+passwords, unencrypted, so it is best to limit it to within your local network. 
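To make the comparison with SMB concrete, importing an NFS share looks just like mounting a local filesystem. The server name and paths below are made-up examples, not hosts from this guide:

```
# Mount an NFS export by hand (as root):
#     mount -t nfs server.example.com:/pub/export /mnt/export
#
# Or have it mounted at every boot with a line in /etc/fstab:
# device                         mountpoint   fs-type  options  dump  fsckorder
server.example.com:/pub/export   /mnt/export  nfs      ro,hard  0     0
```

Once mounted, /mnt/export appears in the single tree structure like any local directory.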
+
+As you know, all storage in Linux is visible within a single tree structure, and new hard disks,
+CD-ROMs, Zip drives and other spaces are mounted on a particular directory. NFS shares are also
+attached to the system in this manner. NFS is included in most Linux kernels, and the tools
+necessary to be an NFS server and client come in most distributions.
+
+However, users of Linux kernel 2.2 hoping to use NFS may wish to upgrade to
+kernel 2.4; while the earlier version of Linux NFS did work, it was far slower than
+most other Unix implementations of this protocol.
+
+> Linux NFS-HOWTO
+> NFS-Root mini-HOWTO
+> NFS-Root-Client Mini-HOWTO
+> The Linux NIS(YP)/NYS/NIS+ HOWTO
+
+
+Linux NFS-HOWTO
+
+Tavis Barr
+
+        tavis dot barr at liu dot edu
+
+Nicolai Langfeldt
+
+        janl at linpro dot no
+
+Seth Vidal
+
+        skvidal at phy dot duke dot edu
+
+Tom McNeal
+
+        trmcneal at attbi dot com
+
+2002-08-25
+Revision History
+Revision v3.1 2002-08-25 Revised by: tavis
+Typo in firewalling section in 3.0
+Revision v3.0 2002-07-16 Revised by: tavis
+Updates plus additions to performance, security
+-----------------------------------------------------------------------------
+
+Table of Contents
+1. Preamble
+    1.1. Legal stuff
+    1.2. Disclaimer
+    1.3. Feedback
+    1.4. Translation
+    1.5. Dedication
+
+
+2. Introduction
+    2.1. What is NFS?
+    2.2. What is this HOWTO and what is it not?
+    2.3. Knowledge Pre-Requisites
+    2.4. Software Pre-Requisites: Kernel Version and nfs-utils
+    2.5. Where to get help and further information
+
+
+3. Setting Up an NFS Server
+    3.1. Introduction to the server setup
+    3.2. Setting up the Configuration Files
+    3.3. Getting the services started
+    3.4. Verifying that NFS is running
+    3.5. Making changes to /etc/exports later on
+
+
+4. Setting up an NFS Client
+    4.1. Mounting remote directories
+    4.2. Getting NFS File Systems to Be Mounted at Boot Time
+    4.3. Mount options
+
+
+5. Optimizing NFS Performance
+    5.1. Setting Block Size to Optimize Transfer Speeds
+    5.2. Packet Size and Network Drivers
+    5.3. Overflow of Fragmented Packets
+    5.4. NFS over TCP
+    5.5. Timeout and Retransmission Values
+    5.6. Number of Instances of the NFSD Server Daemon
+    5.7. Memory Limits on the Input Queue
+    5.8. Turning Off Autonegotiation of NICs and Hubs
+    5.9. Synchronous vs. Asynchronous Behavior in NFS
+    5.10. Non-NFS-Related Means of Enhancing Server Performance
+
+
+6. Security and NFS
+    6.1. The portmapper
+    6.2. 
Server security: nfsd and mountd + 6.3. Client Security + 6.4. NFS and firewalls (ipchains and netfilter) + 6.5. Tunneling NFS through SSH + 6.6. Summary + + +7. Troubleshooting + 7.1. Unable to See Files on a Mounted File System + 7.2. File requests hang or timeout waiting for access to the file. + 7.3. Unable to mount a file system + 7.4. I do not have permission to access files on the mounted volume. + 7.5. When I transfer really big files, NFS takes over all the CPU cycles + on the server and it screeches to a halt. + 7.6. Strange error or log messages + 7.7. Real permissions don't match what's in /etc/exports. + 7.8. Flaky and unreliable behavior + 7.9. nfsd won't start + 7.10. File Corruption When Using Multiple Clients + + +8. Using Linux NFS with Other OSes + 8.1. AIX + 8.2. BSD + 8.3. Tru64 Unix + 8.4. HP-UX + 8.5. IRIX + 8.6. Solaris + 8.7. SunOS + + + +1. Preamble + +1.1. Legal stuff + +Copyright (c) <2002> by Tavis Barr, Nicolai Langfeldt, Seth Vidal, and Tom +McNeal. This material may be distributed only subject to the terms and +conditions set forth in the Open Publication License, v1.0 or later (the +latest version is presently available at [http://www.opencontent.org/openpub +/] http://www.opencontent.org/openpub/). +----------------------------------------------------------------------------- + +1.2. Disclaimer + +This document is provided without any guarantees, including merchantability +or fitness for a particular use. The maintainers cannot be responsible if +following instructions in this document leads to damaged equipment or data, +angry neighbors, strange habits, divorce, or any other calamity. +----------------------------------------------------------------------------- + +1.3. Feedback + +This will never be a finished document; we welcome feedback about how it can +be improved. As of February 2002, the Linux NFS home page is being hosted at +[http://nfs.sourceforge.net] http://nfs.sourceforge.net. 
Check there for +mailing lists, bug fixes, and updates, and also to verify who currently +maintains this document. +----------------------------------------------------------------------------- + +1.4. Translation + +If you are able to translate this document into another language, we would be +grateful and we will also do our best to assist you. Please notify the +maintainers. +----------------------------------------------------------------------------- + +1.5. Dedication + +NFS on Linux was made possible by a collaborative effort of many people, but +a few stand out for special recognition. The original version was developed +by Olaf Kirch and Alan Cox. The version 3 server code was solidified by Neil +Brown, based on work from Saadia Khan, James Yarbrough, Allen Morris, H.J. +Lu, and others (including himself). The client code was written by Olaf Kirch +and updated by Trond Myklebust. The version 4 lock manager was developed by +Saadia Khan. Dave Higgen and H.J. Lu both have undertaken the thankless job +of extensive maintenance and bug fixes to get the code to actually work the +way it was supposed to. H.J. has also done extensive development of the +nfs-utils package. Of course this dedication is leaving many people out. + +The original version of this document was developed by Nicolai Langfeldt. It +was heavily rewritten in 2000 by Tavis Barr and Seth Vidal to reflect +substantial changes in the workings of NFS for Linux developed between the +2.0 and 2.4 kernels. It was edited again in February 2002, when Tom McNeal +made substantial additions to the performance section. Thomas Emmel, Neil +Brown, Trond Myklebust, Erez Zadok, and Ion Badulescu also provided valuable +comments and contributions. +----------------------------------------------------------------------------- + +2. Introduction + +2.1. What is NFS? + +The Network File System (NFS) was developed to allow machines to mount a disk +partition on a remote machine as if it were on a local hard drive. 
This +allows for fast, seamless sharing of files across a network. + +It also gives the potential for unwanted people to access your hard drive +over the network (and thereby possibly read your email and delete all your +files as well as break into your system) if you set it up incorrectly. So +please read the Security section of this document carefully if you intend to +implement an NFS setup. + +There are other systems that provide similar functionality to NFS. Samba +([http://www.samba.org] http://www.samba.org) provides file services to +Windows clients. The Andrew File System from IBM ([http://www.transarc.com/ +Product/EFS/AFS/index.html] http://www.transarc.com/Product/EFS/AFS/ +index.html), recently open-sourced, provides a file sharing mechanism with +some additional security and performance features. The Coda File System +([http://www.coda.cs.cmu.edu/] http://www.coda.cs.cmu.edu/) is still in +development as of this writing but is designed to work well with disconnected +clients. Many of the features of the Andrew and Coda file systems are slated +for inclusion in the next version of NFS (Version 4) ([http://www.nfsv4.org] +http://www.nfsv4.org). The advantage of NFS today is that it is mature, +standard, well understood, and supported robustly across a variety of +platforms. +----------------------------------------------------------------------------- + +2.2. What is this HOWTO and what is it not? + +This HOWTO is intended as a complete, step-by-step guide to setting up NFS +correctly and effectively. Setting up NFS involves two steps, namely +configuring the server and then configuring the client. Each of these steps +is dealt with in order. The document then offers some tips for people with +particular needs and hardware setups, as well as security and troubleshooting +advice. + +This HOWTO is not a description of the guts and underlying structure of NFS. 
+For that you may wish to read Linux NFS and Automounter Administration by
+Erez Zadok (Sybex, 2001). The classic NFS book, updated and still quite
+useful, is Managing NFS and NIS by Hal Stern, published by O'Reilly &
+Associates, Inc. A much more advanced technical description of NFS is
+available in NFS Illustrated by Brent Callaghan.
+
+This document is also not intended as a complete reference manual, and does
+not contain an exhaustive list of the features of Linux NFS. For that, you
+can look at the man pages for nfs(5), exports(5), mount(8), fstab(5), nfsd(8),
+lockd(8), statd(8), rquotad(8), and mountd(8).
+
+It will also not cover PC-NFS, which is considered obsolete (users are
+encouraged to use Samba to share files with Windows machines) or NFS Version
+4, which is still in development.
+-----------------------------------------------------------------------------
+
+2.3. Knowledge Pre-Requisites
+
+You should know some basic things about TCP/IP networking before reading this
+HOWTO; if you are in doubt, read the Networking-Overview-HOWTO.
+-----------------------------------------------------------------------------
+
+2.4. Software Pre-Requisites: Kernel Version and nfs-utils
+
+The difference between Version 2 NFS and Version 3 NFS will be explained
+later on; for now, you might simply take the suggestion that you will need
+NFS Version 3 if you are installing a dedicated or high-volume file server.
+NFS Version 2 should be fine for casual use.
+
+NFS Version 2 has been around for quite some time now (at least since the 1.2
+kernel series); however, you will need a kernel version of at least 2.2.18 if
+you wish to do any of the following:
+
+  * Mix Linux NFS with other operating systems' NFS
+
+  * Use file locking reliably over NFS
+
+  * Use NFS Version 3.
+
+
+There are also patches available for kernel versions above 2.2.14 that
+provide the above functionality. Some of them can be downloaded from the
+Linux NFS homepage. 
If your kernel version is 2.2.14-2.2.17 and you have the
+source code on hand, you can tell if these patches have been added because
+NFS Version 3 server support will be a configuration option. However, unless
+you have some particular reason to use an older kernel, you should upgrade
+because many bugs have been fixed along the way. Kernel 2.2.19 contains some
+additional locking improvements over 2.2.18.
+
+Version 3 functionality will also require the nfs-utils package of at least
+version 0.1.6, and mount version 2.10m or newer. However, because nfs-utils
+and mount are fully backwards compatible, and because newer versions have
+lots of security and bug fixes, there is no good reason not to install the
+newest nfs-utils and mount packages if you are beginning an NFS setup.
+
+All 2.4 and higher kernels have full NFS Version 3 functionality.
+
+In all cases, if you are building your own kernel, you will need to select
+NFS and NFS Version 3 support at compile time. Most (but not all) standard
+distributions come with kernels that support NFS Version 3.
+
+Handling files larger than 2 GB will require a 2.4.x kernel and a 2.2.x
+version of glibc.
+
+All kernels after 2.2.18 support NFS over TCP on the client side. As of this
+writing, server-side NFS over TCP only exists in a buggy form as an
+experimental option in the post-2.2.18 series; patches for 2.4 and 2.5
+kernels have been introduced starting with 2.4.17 and 2.5.6. The patches are
+believed to be stable, though as of this writing they are relatively new and
+have not seen widespread use or integration into the mainstream 2.4 kernel.
+
+Because so many of the above functionalities were introduced in kernel
+version 2.2.18, this document was written to be consistent with kernels above
+this version (including 2.4.x). If you have an older kernel, this document
+may not describe your NFS system correctly. 
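Since so much hinges on whether the running kernel is at least 2.2.18, here is a small sketch (not from the HOWTO) of how a shell script might compare the running kernel version against that threshold; the sort(1)-based comparison is our own assumption, not an nfs-utils feature.

```shell
#!/bin/sh
# Check whether the running kernel is at least 2.2.18, the version this
# HOWTO assumes for NFSv3, client-side NFS over TCP and reliable locking.

kernel_at_least() {
    # succeed if dotted version $1 >= $2: sort the two numerically and
    # check that the threshold ($2) sorts first
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -t. -k1,1n -k2,2n -k3,3n | head -n1)" = "$2" ]
}

# strip any "-release" suffix (e.g. 2.4.17-smp) before comparing
if kernel_at_least "$(uname -r | cut -d- -f1)" 2.2.18; then
    echo "kernel is new enough; see Section 2.4"
else
    echo "consider upgrading your kernel; see Section 2.4"
fi
```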
+
+As we write this document, NFS Version 4 has only recently been finalized as
+a protocol, and no implementations are considered production-ready. It will
+not be dealt with here.
+-----------------------------------------------------------------------------
+
+2.5. Where to get help and further information
+
+As of November 2000, the Linux NFS homepage is at [http://
+nfs.sourceforge.net] http://nfs.sourceforge.net. Please check there for NFS
+related mailing lists as well as the latest version of nfs-utils, NFS kernel
+patches, and other NFS related packages.
+
+When you encounter a problem or have a question not covered in this manual,
+the FAQ or the man pages, you should send a message to the NFS mailing list
+(). To best help the developers and other users
+help you assess your problem you should include:
+
+  * the version of nfs-utils you are using
+
+  * the version of the kernel and any non-stock kernel patches applied
+
+  * the distribution of Linux you are using
+
+  * the version(s) of other operating systems involved.
+
+
+It is also useful to know the networking configuration connecting the hosts.
+
+If your problem involves the inability to mount or export shares, please also
+include:
+
+  * a copy of your /etc/exports file
+
+  * the output of rpcinfo -p localhost run on the server
+
+  * the output of rpcinfo -p servername run on the client
+
+
+Sending all of this information with a specific question, after reading all
+the documentation, is the best way to ensure a helpful response from the
+list.
+
+You may also wish to look at the man pages for nfs(5), exports(5), mount(8),
+fstab(5), nfsd(8), lockd(8), statd(8), rquotad(8), and mountd(8).
+-----------------------------------------------------------------------------
+
+3. Setting Up an NFS Server
+
+3.1. Introduction to the server setup
+
+It is assumed that you will be setting up both a server and a client. 
If you
+are just setting up a client to work off of somebody else's server (say in
+your department), you can skip to Section 4. However, every client that is
+set up requires modifications on the server to authorize that client (unless
+the server setup is done in a very insecure way), so even if you are not
+setting up a server you may wish to read this section to get an idea what
+kinds of authorization problems to look out for.
+
+Setting up the server will be done in two steps: setting up the configuration
+files for NFS, and then starting the NFS services.
+-----------------------------------------------------------------------------
+
+3.2. Setting up the Configuration Files
+
+There are three main configuration files you will need to edit to set up an
+NFS server: /etc/exports, /etc/hosts.allow, and /etc/hosts.deny. Strictly
+speaking, you only need to edit /etc/exports to get NFS to work, but you
+would be left with an extremely insecure setup. You may also need to edit
+your startup scripts; see Section 3.3.3 for more on that.
+-----------------------------------------------------------------------------
+
+3.2.1. /etc/exports
+
+This file contains a list of entries; each entry indicates a volume that is
+shared and how it is shared. Check the man pages (man exports) for a complete
+description of all the setup options for the file, although the description
+here will probably satisfy most people's needs.
+
+An entry in /etc/exports will typically look like this:
+    directory machine1(option11,option12) machine2(option21,option22)
+
+where
+
+directory
+    the directory that you want to share. It may be an entire volume though
+    it need not be. If you share a directory, then all directories under it
+    within the same file system will be shared as well.
+
+machine1 and machine2
+    client machines that will have access to the directory. The machines may
+    be listed by their DNS address or their IP address (e.g.,
+    machine.company.com or 192.168.0.8). 
Using IP addresses is more reliable
+    and more secure. If you need to use DNS addresses, and they do not seem
+    to be resolving to the right machine, see Section 7.3.
+
+optionxx
+    the option listing for each machine will describe what kind of access
+    that machine will have. Important options are:
+
+      + ro: The directory is shared read only; the client machine will not be
+        able to write to it. This is the default.
+
+      + rw: The client machine will have read and write access to the
+        directory.
+
+      + no_root_squash: By default, any file request made by user root on the
+        client machine is treated as if it is made by user nobody on the
+        server. (Exactly which UID the request is mapped to depends on the
+        UID of user "nobody" on the server, not the client.) If
+        no_root_squash is selected, then root on the client machine will have
+        the same level of access to the files on the system as root on the
+        server. This can have serious security implications, although it may
+        be necessary if you want to perform any administrative work on the
+        client machine that involves the exported directories. You should not
+        specify this option without a good reason.
+
+      + no_subtree_check: If only part of a volume is exported, a routine
+        called subtree checking verifies that a file that is requested from
+        the client is in the appropriate part of the volume. If the entire
+        volume is exported, disabling this check will speed up transfers.
+
+      + sync: By default, all but the most recent version (version 1.11) of
+        the exportfs command will use async behavior, telling a client
+        machine that a file write is complete - that is, has been written to
+        stable storage - when NFS has finished handing the write over to the
+        filesystem. This behavior may cause data corruption if the server
+        reboots, and the sync option prevents this. See Section 5.9 for a
+        complete discussion of sync and async behavior. 
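Putting the options just described together, an /etc/exports using them might look like the following sketch. The hosts and paths here are illustrative assumptions, not taken from this guide's own example further on:

```
# /etc/exports - illustrative sketch; adjust paths and client addresses
/usr/local    192.168.0.1(ro,sync,no_subtree_check)
/home         192.168.0.1(rw,sync)
# a diskless client's root filesystem needs root access on the server:
/export/root  192.168.0.5(rw,sync,no_root_squash)
```

Note how sync is given explicitly on every line, and no_root_squash appears only where root access from the client is genuinely required.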
+ + + +Suppose we have two client machines, slave1 and slave2, that have IP +addresses 192.168.0.1 and 192.168.0.2, respectively. We wish to share our +software binaries and home directories with these machines. A typical setup +for /etc/exports might look like this: ++---------------------------------------------------------------------------+ +| /usr/local 192.168.0.1(ro) 192.168.0.2(ro) | +| /home 192.168.0.1(rw) 192.168.0.2(rw) | +| | ++---------------------------------------------------------------------------+ + +Here we are sharing /usr/local read-only to slave1 and slave2, because it +probably contains our software and there may not be benefits to allowing +slave1 and slave2 to write to it that outweigh security concerns. On the +other hand, home directories need to be exported read-write if users are to +save work on them. + +If you have a large installation, you may find that you have a bunch of +computers all on the same local network that require access to your server. +There are a few ways of simplifying references to large numbers of machines. +First, you can give access to a range of machines at once by specifying a +network and a netmask. For example, if you wanted to allow access to all the +machines with IP addresses between 192.168.0.0 and 192.168.0.255 then you +could have the entries: ++---------------------------------------------------------------------------+ +| /usr/local 192.168.0.0/255.255.255.0(ro) | +| /home 192.168.0.0/255.255.255.0(rw) | +| | ++---------------------------------------------------------------------------+ + +See the [http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html] +Networking-Overview HOWTO for further information about how netmasks work, +and you may also wish to look at the man pages for init and hosts.allow. + +Second, you can use NIS netgroups in your entry. To specify a netgroup in +your exports file, simply prepend the name of the netgroup with an "@". 
See +the [http://www.linuxdoc.org/HOWTO/NIS-HOWTO.html] NIS HOWTO for details on +how netgroups work. + +Third, you can use wildcards such as *.foo.com or 192.168. instead of +hostnames. There were problems with wildcard implementation in the 2.2 kernel +series that were fixed in kernel 2.2.19. + +However, you should keep in mind that any of these simplifications could +cause a security risk if there are machines in your netgroup or local network +that you do not trust completely. + +A few cautions are in order about what cannot (or should not) be exported. +First, if a directory is exported, its parent and child directories cannot be +exported if they are in the same filesystem. However, exporting both should +not be necessary because listing the parent directory in the /etc/exports +file will cause all underlying directories within that file system to be +exported. + +Second, it is a poor idea to export a FAT or VFAT (i.e., MS-DOS or Windows 95 +/98) filesystem with NFS. FAT is not designed for use on a multi-user +machine, and as a result, operations that depend on permissions will not work +well. Moreover, some of the underlying filesystem design is reported to work +poorly with NFS's expectations. + +Third, device or other special files may not export correctly to non-Linux +clients. See Section 8 for details on particular operating systems. +----------------------------------------------------------------------------- + +3.2.2. /etc/hosts.allow and /etc/hosts.deny + +These two files specify which computers on the network can use services on +your machine. Each line of the file contains a single entry listing a service +and a set of machines. When the server gets a request from a machine, it does +the following: + +  * It first checks hosts.allow to see if the machine matches a description + listed in there. If it does, then the machine is allowed access. 
+
+  * If the machine does not match an entry in hosts.allow, the server then
+    checks hosts.deny to see if the client matches a listing in there. If it
+    does then the machine is denied access.
+
+  * If the client matches no listings in either file, then it is allowed
+    access.
+
+
+In addition to controlling access to services handled by inetd (such as
+telnet and FTP), these files can also control access to NFS by restricting
+connections to the daemons that provide NFS services. Restrictions are done
+on a per-service basis.
+
+The first daemon to restrict access to is the portmapper. This daemon
+essentially just tells requesting clients how to find all the NFS services on
+the system. Restricting access to the portmapper is the best defense against
+someone breaking into your system through NFS because completely unauthorized
+clients won't know where to find the NFS daemons. However, there are two
+things to watch out for. First, restricting portmapper isn't enough if the
+intruder already knows for some reason how to find those daemons. And second,
+if you are running NIS, restricting portmapper will also restrict requests to
+NIS. That should usually be harmless since you usually want to restrict NFS
+and NIS in a similar way, but just be cautioned. (Running NIS is generally a
+good idea if you are running NFS, because the client machines need a way of
+knowing who owns what files on the exported volumes. Of course there are
+other ways of doing this such as syncing password files. See the [http://
+www.linuxdoc.org/HOWTO/NIS-HOWTO.html] NIS HOWTO for information on setting
+up NIS.)
+
+In general it is a good idea with NFS (as with most internet services) to
+explicitly deny access to IP addresses that you don't need to allow access
+to. 
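The three-step check described above can be sketched as a tiny shell function. This is only an illustration of the ordering (hosts.allow first, then hosts.deny, then default-allow) using a simplified "service:host" rule format, not a reimplementation of tcpd's real pattern syntax:

```shell
#!/bin/sh
# Sketch of the tcp_wrappers decision order: a host matched in hosts.allow
# is allowed, otherwise a host matched in hosts.deny is denied, otherwise
# the host is allowed. Rules are simplified to one "service:host" per line.

wrappers_decision() {
    # $1=service $2=host $3=allow rules $4=deny rules
    if printf '%s\n' "$3" | grep -q "^$1:$2\$"; then
        echo allowed        # matched in hosts.allow
    elif printf '%s\n' "$4" | grep -q -e "^$1:$2\$" -e "^$1:ALL\$"; then
        echo denied         # matched in hosts.deny
    else
        echo allowed        # matched in neither file
    fi
}

allow='portmap:192.168.0.1'
deny='portmap:ALL'
wrappers_decision portmap 192.168.0.1 "$allow" "$deny"   # -> allowed
wrappers_decision portmap 10.0.0.99   "$allow" "$deny"   # -> denied
```

The real syntax is documented in hosts_access(5); in particular, real tcpd patterns support network/netmask forms and wildcards beyond ALL.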
+
+The first step in doing this is to add the following entry to /etc/hosts.deny:
+
++---------------------------------------------------------------------------+
+| portmap:ALL                                                               |
+|                                                                           |
++---------------------------------------------------------------------------+
+
+Starting with nfs-utils 0.2.0, you can be a bit more careful by controlling
+access to individual daemons. It's a good precaution since an intruder will
+often be able to weasel around the portmapper. If you have a newer version of
+nfs-utils, add entries for each of the NFS daemons (see the next section to
+find out what these daemons are; for now just put entries for them in
+hosts.deny):
+
++---------------------------------------------------------------------------+
+| lockd:ALL                                                                 |
+| mountd:ALL                                                                |
+| rquotad:ALL                                                               |
+| statd:ALL                                                                 |
+|                                                                           |
++---------------------------------------------------------------------------+
+
+Even if you have an older version of nfs-utils, adding these entries is at
+worst harmless (since they will just be ignored) and at best will save you
+some trouble when you upgrade. Some sysadmins choose to put the entry
+ALL:ALL in the file /etc/hosts.deny, which causes any service that looks at
+these files to deny access to all hosts unless it is explicitly allowed. While
+this is more secure behavior, it may also get you in trouble when you are
+installing new services: you forget you put it there, and you can't figure
+out for the life of you why they won't work.
+
+Next, we need to add an entry to hosts.allow to give any hosts access that we
+want to have access. (If we just leave the above lines in hosts.deny then
+nobody will have access to NFS.) 
Entries in hosts.allow follow the format
+
+
++---------------------------------------------------------------------------+
+| service: host [or network/netmask] , host [or network/netmask] |
+| |
++---------------------------------------------------------------------------+
+
+Here, host is the IP address of a potential client; it may be possible in
+some versions to use the DNS name of the host, but this is strongly
+discouraged.
+
+Suppose we have the setup above and we just want to allow access to
+slave1.foo.com and slave2.foo.com, and suppose that the IP addresses of these
+machines are 192.168.0.1 and 192.168.0.2, respectively. We could add the
+following entry to /etc/hosts.allow:
+
+
++---------------------------------------------------------------------------+
+| portmap: 192.168.0.1 , 192.168.0.2 |
+| |
++---------------------------------------------------------------------------+
+
+For recent nfs-utils versions, we would also add the following (again, these
+entries are harmless even if they are not supported):
+
+
++---------------------------------------------------------------------------+
+| lockd: 192.168.0.1 , 192.168.0.2 |
+| rquotad: 192.168.0.1 , 192.168.0.2 |
+| mountd: 192.168.0.1 , 192.168.0.2 |
+| statd: 192.168.0.1 , 192.168.0.2 |
+| |
++---------------------------------------------------------------------------+
+
+If you intend to run NFS on a large number of machines in a local network,
+/etc/hosts.allow also allows for network/netmask style entries in the same
+manner as /etc/exports above.
+-----------------------------------------------------------------------------
+
+3.3. Getting the services started
+
+3.3.1. Pre-requisites
+
+The NFS server should now be configured and we can start it running. First,
+you will need to have the appropriate packages installed. This consists
+mainly of a new enough kernel and a new enough version of the nfs-utils
+package. See Section 2.4 if you are in doubt.
+ +Next, before you can start NFS, you will need to have TCP/IP networking +functioning correctly on your machine. If you can use telnet, FTP, and so on, +then chances are your TCP networking is fine. + +That said, with most recent Linux distributions you may be able to get NFS up +and running simply by rebooting your machine, and the startup scripts should +detect that you have set up your /etc/exports file and will start up NFS +correctly. If you try this, see Section 3.4 Verifying that NFS is running. If +this does not work, or if you are not in a position to reboot your machine, +then the following section will tell you which daemons need to be started in +order to run NFS services. If for some reason nfsd was already running when +you edited your configuration files above, you will have to flush your +configuration; see Section 3.5 for details. +----------------------------------------------------------------------------- + +3.3.2. Starting the Portmapper + +NFS depends on the portmapper daemon, either called portmap or rpc.portmap. +It will need to be started first. It should be located in /sbin but is +sometimes in /usr/sbin. Most recent Linux distributions start this daemon in +the boot scripts, but it is worth making sure that it is running before you +begin working with NFS (just type ps aux | grep portmap). +----------------------------------------------------------------------------- + +3.3.3. The Daemons + +NFS serving is taken care of by five daemons: rpc.nfsd, which does most of +the work; rpc.lockd and rpc.statd, which handle file locking; rpc.mountd, +which handles the initial mount requests, and rpc.rquotad, which handles user +file quotas on exported volumes. Starting with 2.2.18, lockd is called by +nfsd upon demand, so you do not need to worry about starting it yourself. +statd will need to be started separately. Most recent Linux distributions +will have startup scripts for these daemons. 
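As a sketch, starting the daemons might be wrapped in a small script. This is a dry run: start() only records and prints each command, and the daemon paths are assumptions (on some systems they live in /sbin rather than /usr/sbin):

```shell
#!/bin/sh
# Dry-run sketch of starting the NFS daemons.  start() only records
# and prints the command; replace its body with "$@" (and run as
# root) to actually launch them.  Paths are assumptions -- on some
# systems the daemons live in /sbin instead of /usr/sbin.
started=""
start() {
    echo "would start: $*"
    started="$started $1"
}
start /sbin/portmap              # the portmapper must come first
start /usr/sbin/rpc.mountd       # mount daemon
start /usr/sbin/rpc.nfsd 8       # NFS daemon, 8 server threads
start /usr/sbin/rpc.statd        # status daemon
start /usr/sbin/rpc.rquotad      # quota daemon
```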
+
+The daemons are all part of the nfs-utils package, and may be either in the
+/sbin directory or the /usr/sbin directory.
+
+If your distribution does not include them in the startup scripts, then you
+should add them, configured to start in the following order:
+
+rpc.portmap
+rpc.mountd, rpc.nfsd
+rpc.statd, rpc.lockd (if necessary), and rpc.rquotad
+
+The nfs-utils package has sample startup scripts for RedHat and Debian. If
+you are using a different distribution, in general you can just copy the
+RedHat script, but you will probably have to take out the line that says:
++---------------------------------------------------------------------------+
+| . ../init.d/functions |
+| |
++---------------------------------------------------------------------------+
+to avoid getting error messages.
+-----------------------------------------------------------------------------
+
+3.4. Verifying that NFS is running
+
+To do this, query the portmapper with the command rpcinfo -p to find out what
+services it is providing.
You should get something like this: ++---------------------------------------------------------------------------+ +| program vers proto port | +| 100000 2 tcp 111 portmapper | +| 100000 2 udp 111 portmapper | +| 100011 1 udp 749 rquotad | +| 100011 2 udp 749 rquotad | +| 100005 1 udp 759 mountd | +| 100005 1 tcp 761 mountd | +| 100005 2 udp 764 mountd | +| 100005 2 tcp 766 mountd | +| 100005 3 udp 769 mountd | +| 100005 3 tcp 771 mountd | +| 100003 2 udp 2049 nfs | +| 100003 3 udp 2049 nfs | +| 300019 1 tcp 830 amd | +| 300019 1 udp 831 amd | +| 100024 1 udp 944 status | +| 100024 1 tcp 946 status | +| 100021 1 udp 1042 nlockmgr | +| 100021 3 udp 1042 nlockmgr | +| 100021 4 udp 1042 nlockmgr | +| 100021 1 tcp 1629 nlockmgr | +| 100021 3 tcp 1629 nlockmgr | +| 100021 4 tcp 1629 nlockmgr | +| | ++---------------------------------------------------------------------------+ + +This says that we have NFS versions 2 and 3, rpc.statd version 1, network +lock manager (the service name for rpc.lockd) versions 1, 3, and 4. There are +also different service listings depending on whether NFS is travelling over +TCP or UDP. Linux systems use UDP by default unless TCP is explicitly +requested; however other OSes such as Solaris default to TCP. + +If you do not at least see a line that says portmapper, a line that says nfs, +and a line that says mountd then you will need to backtrack and try again to +start up the daemons (see Section 7, Troubleshooting, if this still doesn't +work). + +If you do see these services listed, then you should be ready to set up NFS +clients to access files from your server. +----------------------------------------------------------------------------- + +3.5. Making changes to /etc/exports later on + +If you come back and change your /etc/exports file, the changes you make may +not take effect immediately. You should run the command exportfs -ra to force +nfsd to re-read the /etc/exports   file. 
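For example, after adding a line for a hypothetical new client (slave3.foo.com is an invented name), the re-read might be scripted like this. RUN=echo keeps it a dry run, and a scratch file stands in for /etc/exports:

```shell
#!/bin/sh
# Dry-run sketch: add an export, then make nfsd re-read the file.
# RUN=echo only prints the exportfs call; drop it (and use the real
# /etc/exports, as root) to do this for real.  slave3.foo.com is a
# made-up client name for illustration.
RUN=echo
exports=$(mktemp)                        # stand-in for /etc/exports
echo '/home slave3.foo.com(rw)' >> "$exports"
$RUN exportfs -ra                        # force nfsd to re-read it
added=$(cat "$exports")
echo "new entry: $added"
```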
If you can't find the exportfs
+command, then you can kill nfsd with the -HUP flag (see the man pages for
+kill for details).
+
+If that still doesn't work, don't forget to check hosts.allow to make sure
+you haven't forgotten to list any new client machines there. Also check the
+host listings on any firewalls you may have set up (see Section 7 and Section
+6 for more details on firewalls and NFS).
+-----------------------------------------------------------------------------
+
+4. Setting up an NFS Client
+
+4.1. Mounting remote directories
+
+Before beginning, you should double-check to make sure your mount program is
+new enough (version 2.10m if you want to use Version 3 NFS), and that the
+client machine supports NFS mounting, though most standard distributions do.
+If you are using a 2.2 or later kernel with the /proc filesystem you can
+check the latter by reading the file /proc/filesystems and making sure there
+is a line containing nfs. If not, typing insmod nfs may make it magically
+appear if NFS has been compiled as a module; otherwise, you will need to
+build (or download) a kernel that has NFS support built in. In general,
+kernels that do not have NFS compiled in will give a very specific error when
+the mount command below is run.
+
+To begin using a machine as an NFS client, you will need the portmapper
+running on that machine, and to use NFS file locking, you will also need
+rpc.statd and rpc.lockd running on both the client and the server. Most
+recent distributions start those services by default at boot time; if yours
+doesn't, see Section 3.2 for information on how to start them up.
+
+With portmap, lockd, and statd running, you should now be able to mount the
+remote directory from your server just the way you mount a local hard drive,
+with the mount command. Continuing our example from the previous section,
+suppose our server above is called master.foo.com, and we want to mount the
+/home directory on slave1.foo.com.
Then, all we have to do, from the root
+prompt on slave1.foo.com, is type:
++---------------------------------------------------------------------------+
+| # mount master.foo.com:/home /mnt/home |
+| |
++---------------------------------------------------------------------------+
+and the directory /home on master will appear as the directory /mnt/home on
+slave1. (Note that this assumes we have created the directory /mnt/home as an
+empty mount point beforehand.)
+
+If this does not work, see the Troubleshooting section (Section 7).
+
+You can get rid of the file system by typing
++---------------------------------------------------------------------------+
+| # umount /mnt/home |
+| |
++---------------------------------------------------------------------------+
+just like you would for a local file system.
+-----------------------------------------------------------------------------
+
+4.2. Getting NFS File Systems to Be Mounted at Boot Time
+
+NFS file systems can be added to your /etc/fstab file the same way local file
+systems can, so that they mount when your system starts up. The only
+difference is that the file system type will be set to nfs and the dump and
+fsck order (the last two entries) will have to be set to zero. So for our
+example above, the entry in /etc/fstab would look like:
+ # device mountpoint fs-type options dump fsckorder
+ ...
+ master.foo.com:/home /mnt/home nfs rw 0 0
+ ...
+
+
+See the man pages for fstab if you are unfamiliar with the syntax of this
+file. If you are using an automounter such as amd or autofs, the options in
+the corresponding fields of your mount listings should look very similar if
+not identical.
+
+At this point you should have NFS working, though a few tweaks may still be
+necessary to get it to work well. You should also read Section 6 to be sure
+your setup is reasonably secure.
+-----------------------------------------------------------------------------
+
+4.3. Mount options
+
+4.3.1. Soft vs.
Hard Mounting
+
+There are some options you should consider adding at once. They govern the
+way the NFS client handles a server crash or network outage. One of the cool
+things about NFS is that it can handle this gracefully, if you set up the
+clients right. There are two distinct failure modes:
+
+soft
+ If a file request fails, the NFS client will report an error to the
+ process on the client machine requesting the file access. Some programs
+ can handle this with composure, but most won't. We do not recommend using
+ this setting; it is a recipe for corrupted files and lost data. You
+ should especially not use this for mail disks --- if you value your mail,
+ that is.
+
+hard
+ The program accessing a file on an NFS mounted file system will hang when
+ the server crashes. The process cannot be interrupted or killed (except
+ by a "sure kill") unless you also specify intr. When the NFS server is
+ back online the program will continue undisturbed from where it was. We
+ recommend using hard,intr on all NFS mounted file systems.
+
+
+Picking up from the previous example, the fstab entry would now look like:
+ # device mountpoint fs-type options dump fsckorder
+ ...
+ master.foo.com:/home /mnt/home nfs rw,hard,intr 0 0
+ ...
+
+-----------------------------------------------------------------------------
+
+4.3.2. Setting Block Size to Optimize Transfer Speeds
+
+The rsize and wsize mount options specify the size of the chunks of data that
+the client and server pass back and forth to each other.
+
+The defaults may be too big or too small; there is no size that works well on
+all or most setups. On the one hand, some combinations of Linux kernels and
+network cards (largely on older machines) cannot handle blocks that large. On
+the other hand, if they can handle larger blocks, a bigger size might be
+faster.
+
+Getting the block size right is an important factor in performance and is a
+must if you are planning to use the NFS server in a production environment.
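For instance, a first experiment might pass explicit sizes on the mount command line. A dry run is shown below; the server and mount point are the ones from the running example, and 8192 is just a starting guess, not a recommendation:

```shell
#!/bin/sh
# Dry run: mount with explicit block sizes.  RUN=echo only prints the
# command; drop it and run as root against a real server to test.
RUN=echo
opts="rw,hard,intr,rsize=8192,wsize=8192"
$RUN mount -o "$opts" master.foo.com:/home /mnt/home
```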
+
+See Section 5 for details.
+-----------------------------------------------------------------------------
+
+5. Optimizing NFS Performance
+
+Careful analysis of your environment, both from the client and from the
+server point of view, is the first step necessary for optimal NFS
+performance. The first sections will address issues that are generally
+important to the client. Later (Section 5.3 and beyond), server side issues
+will be discussed. In both cases, these issues will not be limited
+exclusively to one side or the other, but it is useful to separate the two in
+order to get a clearer picture of cause and effect.
+
+Aside from the general network configuration - appropriate network capacity,
+faster NICs, full duplex settings in order to reduce collisions, agreement in
+network speed among the switches and hubs, etc. - one of the most important
+client optimization settings is the size of the NFS data transfer buffers,
+specified by the mount command options rsize and wsize.
+-----------------------------------------------------------------------------
+
+5.1. Setting Block Size to Optimize Transfer Speeds
+
+The mount command options rsize and wsize specify the size of the chunks of
+data that the client and server pass back and forth to each other. If no
+rsize and wsize options are specified, the default varies by which version of
+NFS we are using. The most common default is 4K (4096 bytes), although for
+TCP-based mounts in 2.2 kernels, and for all mounts beginning with 2.4
+kernels, the server specifies the default block size.
+
+The theoretical limit for the NFS V2 protocol is 8K. For the V3 protocol, the
+limit is specific to the server. On the Linux server, the maximum block size
+is defined by the value of the kernel constant NFSSVC_MAXBLKSIZE, found in
+the Linux kernel source file ./include/linux/nfsd/const.h.
The current +maximum block size for the kernel, as of 2.4.17, is 8K (8192 bytes), but the +patch set implementing NFS over TCP/IP transport in the 2.4 series, as of +this writing, uses a value of 32K (defined in the patch as 32*1024) for the +maximum block size. + +All 2.4 clients currently support up to 32K block transfer sizes, allowing +the standard 32K block transfers across NFS mounts from other servers, such +as Solaris, without client modification. + +The defaults may be too big or too small, depending on the specific +combination of hardware and kernels. On the one hand, some combinations of +Linux kernels and network cards (largely on older machines) cannot handle +blocks that large. On the other hand, if they can handle larger blocks, a +bigger size might be faster. + +You will want to experiment and find an rsize and wsize that works and is as +fast as possible. You can test the speed of your options with some simple +commands, if your network environment is not heavily used. Note that your +results may vary widely unless you resort to using more complex benchmarks, +such as Bonnie, Bonnie++, or IOzone. + +The first of these commands transfers 16384 blocks of 16k each from the +special file /dev/zero (which if you read it just spits out zeros really +fast) to the mounted partition. We will time it to see how long it takes. So, +from the client machine, type: + # time dd if=/dev/zero of=/mnt/home/testfile bs=16k count=16384 + +This creates a 256Mb file of zeroed bytes. In general, you should create a +file that's at least twice as large as the system RAM on the server, but make +sure you have enough disk space! Then read back the file into the great black +hole on the client machine (/dev/null) by typing the following: + # time dd if=/mnt/home/testfile of=/dev/null bs=16k + +Repeat this a few times and average how long it takes. 
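Strung together, the measurement loop might look like the following dry run. RUN=echo only prints each command; the sizes tried and the server name are illustrative:

```shell
#!/bin/sh
# Dry-run sketch of the block-size measurement loop.  RUN=echo only
# prints each command; remove it (and run as root against a real
# export) to take actual timings.
RUN=echo
tried=""
for bs in 1024 2048 4096 8192; do        # powers of two, <= 8K for NFSv2
    $RUN mount -o rsize=$bs,wsize=$bs master.foo.com:/home /mnt/home
    $RUN time dd if=/dev/zero of=/mnt/home/testfile bs=16k count=16384
    $RUN time dd if=/mnt/home/testfile of=/dev/null bs=16k
    $RUN umount /mnt/home                # unmount to flush the caches
    tried="$tried $bs"
done
echo "sizes tried:$tried"
```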
Be sure to unmount and
+remount the filesystem each time (both on the client and, if you are zealous,
+locally on the server as well), which should clear out any caches.
+
+Then unmount, and mount again with a larger and smaller block size. They
+should be multiples of 1024, and not larger than the maximum block size
+allowed by your system. Note that NFS Version 2 is limited to a maximum of
+8K, regardless of the maximum block size defined by NFSSVC_MAXBLKSIZE;
+Version 3 will support up to 64K, if permitted. The block size should be a
+power of two since most of the parameters that would constrain it (such as
+file system block sizes and network packet size) are also powers of two.
+However, some users have reported better success with block sizes that are
+not powers of two but are still multiples of the file system block size and
+the network packet size.
+
+Directly after mounting with a larger size, cd into the mounted file system
+and do things like ls, explore the filesystem a bit to make sure everything
+is as it should be. If the rsize/wsize is too large the symptoms are very odd
+and not 100% obvious. A typical symptom is incomplete file lists when doing
+ls, with no error messages, or files failing to read mysteriously, again with
+no error messages. After establishing that the given rsize/wsize works you
+can do the speed tests again. Different server platforms are likely to have
+different optimal sizes.
+
+Remember to edit /etc/fstab to reflect the rsize/wsize you found to be the
+most desirable.
+
+If your results seem inconsistent, or doubtful, you may need to analyze your
+network more extensively while varying the rsize and wsize values.
In that +case, here are several pointers to benchmarks that may prove useful: + +  * Bonnie [http://www.textuality.com/bonnie/] http://www.textuality.com/ + bonnie/ + +  * Bonnie++ [http://www.coker.com.au/bonnie++/] http://www.coker.com.au/ + bonnie++/ + +  * IOzone file system benchmark [http://www.iozone.org/] http:// + www.iozone.org/ + +  * The official NFS benchmark, SPECsfs97 [http://www.spec.org/osg/sfs97/] + http://www.spec.org/osg/sfs97/ + + +The easiest benchmark with the widest coverage, including an extensive spread +of file sizes, and of IO types - reads, & writes, rereads & rewrites, random +access, etc. - seems to be IOzone. A recommended invocation of IOzone (for +which you must have root privileges) includes unmounting and remounting the +directory under test, in order to clear out the caches between tests, and +including the file close time in the measurements. Assuming you've already +exported /tmp to everyone from the server foo, and that you've installed +IOzone in the local directory, this should work: + # echo "foo:/tmp /mnt/foo nfs rw,hard,intr,rsize=8192,wsize=8192 0 0" + >> /etc/fstab + # mkdir /mnt/foo + # mount /mnt/foo + # ./iozone -a -R -c -U /mnt/foo -f /mnt/foo/testfile > logfile + +The benchmark should take 2-3 hours at most, but of course you will need to +run it for each value of rsize and wsize that is of interest. 
The web site +gives full documentation of the parameters, but the specific options used +above are: + +  * -a Full automatic mode, which tests file sizes of 64K to 512M, using + record sizes of 4K to 16M + +  * -R Generate report in excel spreadsheet form (The "surface plot" option + for graphs is best) + +  * -c Include the file close time in the tests, which will pick up the NFS + version 3 commit time + +  * -U Use the given mount point to unmount and remount between tests; it + clears out caches + +  * -f When using unmount, you have to locate the test file in the mounted + file system + + +----------------------------------------------------------------------------- +5.2. Packet Size and Network Drivers + +While many Linux network card drivers are excellent, some are quite shoddy, +including a few drivers for some fairly standard cards. It is worth +experimenting with your network card directly to find out how it can best +handle traffic. + +Try pinging back and forth between the two machines with large packets using +the -f and -s options with ping (see ping(8) for more details) and see if a +lot of packets get dropped, or if they take a long time for a reply. If so, +you may have a problem with the performance of your network card. + +For a more extensive analysis of NFS behavior in particular, use the nfsstat +command to look at nfs transactions, client and server statistics, network +statistics, and so forth. The "-o net" option will show you the number of +dropped packets in relation to the total number of transactions. In UDP +transactions, the most important statistic is the number of retransmissions, +due to dropped packets, socket buffer overflows, general server congestion, +timeouts, etc. This will have a tremendously important effect on NFS +performance, and should be carefully monitored. 
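Since the counters only ever grow, the retransmission count for one test run is the difference of two snapshots. The numbers below are invented for illustration; in real use, read them from nfsstat's RPC client statistics before and after the run:

```shell
#!/bin/sh
# Difference two snapshots of the client retransmission counter.
# The values are made up; in real use take them from nfsstat's RPC
# client statistics before and after the benchmark run.
before=1433
after=1501
retrans=$((after - before))
echo "retransmissions during the run: $retrans"
```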
Note that nfsstat does not +yet implement the -z option, which would zero out all counters, so you must +look at the current nfsstat counter values prior to running the benchmarks. + +To correct network problems, you may wish to reconfigure the packet size that +your network card uses. Very often there is a constraint somewhere else in +the network (such as a router) that causes a smaller maximum packet size +between two machines than what the network cards on the machines are actually +capable of. TCP should autodiscover the appropriate packet size for a +network, but UDP will simply stay at a default value. So determining the +appropriate packet size is especially important if you are using NFS over +UDP. + +You can test for the network packet size using the tracepath command: From +the client machine, just type tracepath server 2049 and the path MTU should +be reported at the bottom. You can then set the MTU on your network card +equal to the path MTU, by using the MTU option to ifconfig, and see if fewer +packets get dropped. See the ifconfig man pages for details on how to reset +the MTU. + +In addition, netstat -s will give the statistics collected for traffic across +all supported protocols. You may also look at /proc/net/snmp for information +about current network behavior; see the next section for more details. +----------------------------------------------------------------------------- + +5.3. Overflow of Fragmented Packets + +Using an rsize or wsize larger than your network's MTU (often set to 1500, in +many networks) will cause IP packet fragmentation when using NFS over UDP. IP +packet fragmentation and reassembly require a significant amount of CPU +resource at both ends of a network connection. In addition, packet +fragmentation also exposes your network traffic to greater unreliability, +since a complete RPC request must be retransmitted if a UDP packet fragment +is dropped for any reason. 
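The arithmetic behind this is easy to check. Assuming a standard 1500-byte Ethernet MTU and a 20-byte IP header (no IP options), each fragment carries 1480 bytes of the UDP datagram, so an 8K write does not travel as a single packet:

```shell
#!/bin/sh
# Worked example: fragments generated by one UDP write.  Assumes a
# 20-byte IP header (no options); each fragment then carries
# MTU - 20 bytes of the datagram (wsize plus the 8-byte UDP header).
mtu=1500
wsize=8192
per_frag=$((mtu - 20))                       # 1480 bytes per fragment
datagram=$((wsize + 8))                      # 8200 bytes to move
frags=$(( (datagram + per_frag - 1) / per_frag ))
echo "a ${wsize}-byte write becomes $frags IP fragments"
```

Losing any one of those fragments forces the whole RPC request to be resent, which is why large rsize/wsize values hurt on lossy UDP networks.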
Any increase of RPC retransmissions, along with
+the possibility of increased timeouts, is the single worst impediment to
+performance for NFS over UDP.
+
+Packets may be dropped for many reasons. If your network topography is
+complex, fragment routes may differ, and may not all arrive at the server for
+reassembly. NFS server capacity may also be an issue, since the kernel has a
+limit on how many fragments it can buffer before it starts throwing away
+packets. With kernels that support the /proc filesystem, you can monitor the
+files /proc/sys/net/ipv4/ipfrag_high_thresh and /proc/sys/net/ipv4/
+ipfrag_low_thresh. Once the number of unprocessed, fragmented packets reaches
+the number specified by ipfrag_high_thresh (in bytes), the kernel will simply
+start throwing away fragmented packets until the number of incomplete packets
+reaches the number specified by ipfrag_low_thresh.
+
+Another counter to monitor is IP: ReasmFails in the file /proc/net/snmp; this
+is the number of fragment reassembly failures. If it goes up too quickly
+during heavy file activity, you may have a problem.
+-----------------------------------------------------------------------------
+
+5.4. NFS over TCP
+
+A new feature, available for both 2.4 and 2.5 kernels but not yet integrated
+into the mainstream kernel at the time of this writing, is NFS over TCP.
+Using TCP has a distinct advantage and a distinct disadvantage over UDP. The
+advantage is that it works far better than UDP on lossy networks. When using
+TCP, a single dropped packet can be retransmitted, without the retransmission
+of the entire RPC request, resulting in better performance on lossy networks.
+In addition, TCP will handle network speed differences better than UDP, due
+to the underlying flow control at the network level.
+
+The disadvantage of using TCP is that it is not a stateless protocol like
+UDP.
If your server crashes in the middle of a packet transmission, the
+client will hang and any shares will need to be unmounted and remounted.
+
+The overhead incurred by the TCP protocol will result in somewhat slower
+performance than UDP under ideal network conditions, but the cost is not
+severe, and is often not noticeable without careful measurement. If you are
+using gigabit ethernet from end to end, you might also investigate the usage
+of jumbo frames, since the high speed network may allow the larger frame
+sizes without encountering increased collision rates, particularly if you
+have set the network to full duplex.
+-----------------------------------------------------------------------------
+
+5.5. Timeout and Retransmission Values
+
+Two mount command options, timeo and retrans, control the behavior of UDP
+requests when encountering client timeouts due to dropped packets, network
+congestion, and so forth. The -o timeo option allows designation of the
+length of time, in tenths of seconds, that the client will wait until it
+decides it will not get a reply from the server, and must try to send the
+request again. The default value is 7 tenths of a second. The -o retrans
+option allows designation of the number of timeouts allowed before the client
+gives up, and displays the Server not responding message. The default value
+is 3 attempts. Once the client displays this message, it will continue to try
+to send the request, but only once before displaying the error message if
+another timeout occurs. When the client reestablishes contact, it will fall
+back to using the correct retrans value, and will display the Server OK
+message.
+
+If you are already encountering excessive retransmissions (see the output of
+the nfsstat command), or want to increase the block transfer size without
+encountering timeouts and retransmissions, you may want to adjust these
+values.
The specific adjustment will depend upon your environment, and in +most cases, the current defaults are appropriate. +----------------------------------------------------------------------------- + +5.6. Number of Instances of the NFSD Server Daemon + +Most startup scripts, Linux and otherwise, start 8 instances of nfsd. In the +early days of NFS, Sun decided on this number as a rule of thumb, and +everyone else copied. There are no good measures of how many instances are +optimal, but a more heavily-trafficked server may require more. You should +use at the very least one daemon per processor, but four to eight per +processor may be a better rule of thumb. If you are using a 2.4 or higher +kernel and you want to see how heavily each nfsd thread is being used, you +can look at the file /proc/net/rpc/nfsd. The last ten numbers on the th line +in that file indicate the number of seconds that the thread usage was at that +percentage of the maximum allowable. If you have a large number in the top +three deciles, you may wish to increase the number of nfsd instances. This is +done upon starting nfsd using the number of instances as the command line +option, and is specified in the NFS startup script (/etc/rc.d/init.d/nfs on +Red Hat) as RPCNFSDCOUNT. See the nfsd(8) man page for more information. +----------------------------------------------------------------------------- + +5.7. Memory Limits on the Input Queue + +On 2.2 and 2.4 kernels, the socket input queue, where requests sit while they +are currently being processed, has a small default size limit (rmem_default) +of 64k. This queue is important for clients with heavy read loads, and +servers with heavy write loads. As an example, if you are running 8 instances +of nfsd on the server, each will only have 8k to store write requests while +it processes them. 
In addition, the socket output queue - important for +clients with heavy write loads and servers with heavy read loads - also has a +small default size (wmem_default). + +Several published runs of the NFS benchmark [http://www.spec.org/osg/sfs97/] +SPECsfs specify usage of a much higher value for both the read and write +value sets, [rw]mem_default and [rw]mem_max. You might consider increasing +these values to at least 256k. The read and write limits are set in the proc +file system using (for example) the files /proc/sys/net/core/rmem_default and +/proc/sys/net/core/rmem_max. The rmem_default value can be increased in three +steps; the following method is a bit of a hack but should work and should not +cause any problems: + +  * Increase the size listed in the file: + # echo 262144 > /proc/sys/net/core/rmem_default + # echo 262144 > /proc/sys/net/core/rmem_max + +  * Restart NFS. For example, on Red Hat systems, + # /etc/rc.d/init.d/nfs restart + +  * You might return the size limits to their normal size in case other + kernel systems depend on it: + # echo 65536 > /proc/sys/net/core/rmem_default + # echo 65536 > /proc/sys/net/core/rmem_max + + +This last step may be necessary because machines have been reported to crash +if these values are left changed for long periods of time. +----------------------------------------------------------------------------- + +5.8. Turning Off Autonegotiation of NICs and Hubs + +If network cards auto-negotiate badly with hubs and switches, and ports run +at different speeds, or with different duplex configurations, performance +will be severely impacted due to excessive collisions, dropped packets, etc. +If you see excessive numbers of dropped packets in the nfsstat output, or +poor network performance in general, try playing around with the network +speed and duplex settings. 
If possible, concentrate on establishing a +100BaseT full duplex subnet; the virtual elimination of collisions in full +duplex will remove the most severe performance inhibitor for NFS over UDP. Be +careful when turning off autonegotiation on a card: The hub or switch that +the card is attached to will then resort to other mechanisms (such as +parallel detection) to determine the duplex settings, and some cards default +to half duplex because it is more likely to be supported by an old hub. The +best solution, if the driver supports it, is to force the card to negotiate +100BaseT full duplex. +----------------------------------------------------------------------------- + +5.9. Synchronous vs. Asynchronous Behavior in NFS + +The default export behavior for both NFS Version 2 and Version 3 protocols, +used by exportfs in nfs-utils versions prior to Version 1.11 (the latter is +in the CVS tree, but not yet released in a package, as of January, 2002) is +"asynchronous". This default permits the server to reply to client requests +as soon as it has processed the request and handed it off to the local file +system, without waiting for the data to be written to stable storage. This is +indicated by the async option denoted in the server's export list. It yields +better performance at the cost of possible data corruption if the server +reboots while still holding unwritten data and/or metadata in its caches. +This possible data corruption is not detectable at the time of occurrence, +since the async option instructs the server to lie to the client, telling the +client that all data has indeed been written to the stable storage, +regardless of the protocol used. + +In order to conform with "synchronous" behavior, used as the default for most +proprietary systems supporting NFS (Solaris, HP-UX, RS/6000, etc.), and now +used as the default in the latest version of exportfs, the Linux Server's +file system must be exported with the sync option. 
Note that specifying
+synchronous exports will result in no option being seen in the server's
+export list:
+
+  * Export a couple of file systems to everyone, using slightly different
+ options:
+
+ # /usr/sbin/exportfs -o rw,sync *:/usr/local
+ # /usr/sbin/exportfs -o rw *:/tmp
+
+  * Now we can see what the exported file system parameters look like:
+
+ # /usr/sbin/exportfs -v
+ /usr/local *(rw)
+ /tmp *(rw,async)
+
+
+If your kernel is compiled with the /proc filesystem, then the file /proc/fs/
+nfs/exports will also show the full list of export options.
+
+When synchronous behavior is specified, the server will not complete (that
+is, reply to the client) an NFS version 2 protocol request until the local
+file system has written all data/metadata to the disk. The server will
+complete a synchronous NFS version 3 request without this delay, and will
+return the status of the data in order to inform the client as to what data
+should be maintained in its caches, and what data is safe to discard. There
+are 3 possible status values, defined in an enumerated type, nfs3_stable_how,
+in include/linux/nfs.h. The values, along with the subsequent actions taken
+due to these results, are as follows:
+
+  * NFS_UNSTABLE - Data/Metadata was not committed to stable storage on the
+ server, and must be cached on the client until a subsequent client commit
+ request assures that the server does send data to stable storage.
+
+  * NFS_DATA_SYNC - Metadata was not sent to stable storage, and must be
+ cached on the client. A subsequent commit is necessary, as is required
+ above.
+
+  * NFS_FILE_SYNC - No data/metadata need be cached, and a subsequent commit
+ need not be sent for the range covered by this request.
+
+
+In addition to the above definition of synchronous behavior, the client may
+explicitly insist on total synchronous behavior, regardless of the protocol,
+by opening all files with the O_SYNC option.
In this case, all replies to +client requests will wait until the data has hit the server's disk, +regardless of the protocol used (meaning that, in NFS version 3, all requests +will be NFS_FILE_SYNC requests, and will require that the Server returns this +status). In that case, the performance of NFS Version 2 and NFS Version 3 +will be virtually identical. + +If, however, the old default async behavior is used, the O_SYNC option has no +effect at all in either version of NFS, since the server will reply to the +client without waiting for the write to complete. In that case the +performance differences between versions will also disappear. + +Finally, note that, for NFS version 3 protocol requests, a subsequent commit +request from the NFS client at file close time, or at fsync() time, will +force the server to write any previously unwritten data/metadata to the disk, +and the server will not reply to the client until this has been completed, as +long as sync behavior is followed. If async is used, the commit is +essentially a no-op, since the server once again lies to the client, telling +the client that the data has been sent to stable storage. This again exposes +the client and server to data corruption, since cached data may be discarded +on the client due to its belief that the server now has the data maintained +in stable storage. +----------------------------------------------------------------------------- + +5.10. Non-NFS-Related Means of Enhancing Server Performance + +In general, server performance and server disk access speed will have an +important effect on NFS performance. Offering general guidelines for setting +up a well-functioning file server is outside the scope of this document, but +a few hints may be worth mentioning: + +  * If you have access to RAID arrays, use RAID 1/0 for both write speed and + redundancy; RAID 5 gives you good read speeds but lousy write speeds. 
+ +  * A journalling filesystem will drastically reduce your reboot time in the + event of a system crash. Currently, [ftp://ftp.uk.linux.org/pub/linux/sct + /fs/jfs/] ext3 will work correctly with NFS version 3. In addition, + Reiserfs version 3.6 will work with NFS version 3 on 2.4.7 or later + kernels (patches are available for previous kernels). Earlier versions of + Reiserfs did not include room for generation numbers in the inode, + exposing the possibility of undetected data corruption during a server + reboot. + +  * Additionally, journalled file systems can be configured to maximize + performance by taking advantage of the fact that journal updates are all + that is necessary for data protection. One example is using ext3 with + data=journal so that all updates go first to the journal, and later to + the main file system. Once the journal has been updated, the NFS server + can safely issue the reply to the clients, and the main file system + update can occur at the server's leisure. + + The journal in a journalling file system may also reside on a separate + device such as a flash memory card so that journal updates normally + require no seeking. With only rotational delay imposing a cost, this + gives reasonably good synchronous IO performance. Note that ext3 + currently supports journal relocation, and ReiserFS will (officially) + support it soon. The Reiserfs tool package found at [ftp:// + ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz] ftp:// + ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.x.0k.tar.gz contains + the reiserfstune tool, which will allow journal relocation. It does, + however, require a kernel patch which has not yet been officially + released as of January, 2002. + +  * Using an automounter (such as autofs or amd) may prevent hangs if you + cross-mount files on your machines (whether on purpose or by oversight) + and one of those machines goes down. 
See the [http://www.linuxdoc.org/
+ HOWTO/mini/Automount.html] Automount Mini-HOWTO for details.
+
+  * Some manufacturers (Network Appliance, Hewlett Packard, and others)
+ provide NFS accelerators in the form of Non-Volatile RAM. NVRAM will
+ boost access speed to stable storage up to the equivalent of async
+ access.
+
+
+-----------------------------------------------------------------------------
+6. Security and NFS
+
+This list of security tips and explanations will not make your site
+completely secure. NOTHING will make your site completely secure. Reading
+this section may help you get an idea of the security problems with NFS. This
+is not a comprehensive guide and it will always be undergoing changes. If you
+have any tips or hints to give us, please send them to the HOWTO maintainer.
+
+If you are on a network with no access to the outside world (not even a
+modem) and you trust all the internal machines and all your users, then this
+section will be of no use to you. However, it's our belief that there are
+relatively few networks in this situation, so we would suggest reading this
+section thoroughly for anyone setting up NFS.
+
+With NFS, there are two steps required for a client to gain access to a file
+contained in a remote directory on the server. The first step is mount
+access. Mount access is achieved by the client machine attempting to attach
+to the server. The security for this is provided by the /etc/exports file.
+This file lists the names or IP addresses of machines that are allowed to
+access a share point. If the client's IP address matches one of the entries
+in the access list, then it will be allowed to mount. This is not terribly
+secure. If someone is capable of spoofing or taking over a trusted address
+then they can access your mount points.
To give a real-world example of this
+type of "authentication": This is equivalent to someone introducing
+themselves to you and you believing they are who they claim to be because
+they are wearing a sticker that says "Hello, My Name is ...." Once the
+machine has mounted a volume, its operating system will have access to all
+files on the volume (with the possible exception of those owned by root; see
+below) and write access to those files as well, if the volume was exported
+with the rw option.
+
+The second step is file access. This is a function of normal file system
+access controls on the client and not a specialized function of NFS. Once the
+drive is mounted, the user and group permissions on the files determine access
+control.
+
+An example: bob on the server maps to the UserID 9999. Bob makes a file on
+the server that is only accessible by the user (the equivalent of typing chmod
+600 filename). A client is allowed to mount the drive where the file is
+stored. On the client, mary maps to UserID 9999. This means that the client
+user mary can access bob's file that is marked as only accessible by him. It
+gets worse: If someone has become superuser on the client machine they can su
+- username and become any user. NFS will be none the wiser.
+
+It's not all terrible. There are a few measures you can take on the server to
+offset the danger of the clients. We will cover those shortly.
+
+If you don't think the security measures apply to you, you're probably wrong.
+In Section 6.1 we'll cover securing the portmapper, server and client
+security in Section 6.2 and Section 6.3 respectively. Finally, in Section 6.4
+we'll briefly talk about proper firewalling for your NFS server.
+
+Finally, it is critical that all of your NFS daemons and client programs are
+current. If you think that a flaw is too recently announced for it to be a
+problem for you, then you've probably already been compromised.
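The bob/mary example above boils down to a plain numeric-UID comparison. The following sketch (the passwd entries, names, and UIDs are invented purely for illustration) shows the only check that effectively stands between a client user and the file:

```shell
# Hypothetical passwd entries on the server and the client; NFS never
# compares user names, only the numeric UID in the third field.
server_entry='bob:x:9999:9999::/home/bob:/bin/sh'
client_entry='mary:x:9999:9999::/home/mary:/bin/sh'

server_uid=$(echo "$server_entry" | cut -d: -f3)
client_uid=$(echo "$client_entry" | cut -d: -f3)

# The same UID on both sides means mary on the client passes the
# permission check on bob's mode-600 files on the server.
if [ "$server_uid" = "$client_uid" ]; then
    echo "UID $client_uid matches on both sides"
fi
```

This is why consistent UID management across server and clients matters so much on NFS networks.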
+
+A good way to keep up to date on security alerts is to subscribe to the
+bugtraq mailing lists. You can read up on how to subscribe and various other
+information about bugtraq here: [http://www.securityfocus.com/forums/bugtraq/
+faq.html] http://www.securityfocus.com/forums/bugtraq/faq.html
+
+Additionally, searching for NFS at [http://www.securityfocus.com]
+securityfocus.com's search engine will show you all security reports
+pertaining to NFS.
+
+You should also regularly check CERT advisories. See the CERT web page at
+[http://www.cert.org] www.cert.org.
+-----------------------------------------------------------------------------
+
+6.1. The portmapper
+
+The portmapper keeps a list of what services are running on what ports. This
+list is used by a connecting machine to find out which ports it needs to talk
+to in order to access certain services.
+
+The portmapper is not in as bad a shape as it was a few years ago, but it is
+still a point of worry for many sys admins. The portmapper, like NFS and NIS,
+should not really have connections made to it outside of a trusted local area
+network. If you have to expose them to the outside world - be careful and
+keep up diligent monitoring of those systems.
+
+Not all Linux distributions were created equal. Some seemingly up-to-date
+distributions do not include a securable portmapper. The easy way to check if
+your portmapper is good or not is to run strings(1) and see if it reads the
+relevant files, /etc/hosts.deny and /etc/hosts.allow. Assuming your
+portmapper is /sbin/portmap, you can check it with this command:
+ strings /sbin/portmap | grep hosts.
+ + +On a securable machine it comes up something like this: ++---------------------------------------------------------------------------+ +| /etc/hosts.allow | +| /etc/hosts.deny | +| @(#) hosts_ctl.c 1.4 94/12/28 17:42:27 | +| @(#) hosts_access.c 1.21 97/02/12 02:13:22 | +| | ++---------------------------------------------------------------------------+ + +First we edit /etc/hosts.deny. It should contain the line + ++---------------------------------------------------------------------------+ +| portmap: ALL | +| | ++---------------------------------------------------------------------------+ + +which will deny access to everyone. While it is closed run: ++---------------------------------------------------------------------------+ +| rpcinfo -p | +| | ++---------------------------------------------------------------------------+ +just to check that your portmapper really reads and obeys this file. Rpcinfo +should give no output, or possibly an error message. The files /etc/ +hosts.allow and /etc/hosts.deny take effect immediately after you save them. +No daemon needs to be restarted. + +Closing the portmapper for everyone is a bit drastic, so we open it again by +editing /etc/hosts.allow. But first we need to figure out what to put in it. +It should basically list all machines that should have access to your +portmapper. On a run of the mill Linux system there are very few machines +that need any access for any reason. The portmapper administers nfsd, mountd, +ypbind/ypserv, rquotad, lockd (which shows up as nlockmgr), statd (which +shows up as status) and 'r' services like ruptime and rusers. Of these only +nfsd, mountd, ypbind/ypserv and perhaps rquotad,lockd and statd are of any +consequence. All machines that need to access services on your machine should +be allowed to do that. 
Let's say that your machine's address is 192.168.0.254
+and that it lives on the subnet 192.168.0.0, and that all machines on the
+subnet should have access to it (for an overview of those terms see the
+[http://www.linuxdoc.org/HOWTO/Networking-Overview-HOWTO.html]
+Networking-Overview-HOWTO). Then we write:
++---------------------------------------------------------------------------+
+| portmap: 192.168.0.0/255.255.255.0 |
+| |
++---------------------------------------------------------------------------+
+in /etc/hosts.allow. If you are not sure what your network or netmask are,
+you can use the ifconfig command to determine the netmask and the netstat
+command to determine the network. For example, for the device eth0 on the
+above machine, ifconfig should show:
+
++---------------------------------------------------------------------------+
+| ... |
+| eth0 Link encap:Ethernet HWaddr 00:60:8C:96:D5:56 |
+| inet addr:192.168.0.254 Bcast:192.168.0.255 Mask:255.255.255.0 |
+| UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 |
+| RX packets:360315 errors:0 dropped:0 overruns:0 |
+| TX packets:179274 errors:0 dropped:0 overruns:0 |
+| Interrupt:10 Base address:0x320 |
+| ... |
+| |
++---------------------------------------------------------------------------+
+and netstat -rn should show:
++---------------------------------------------------------------------------------+
+| Kernel routing table |
+| Destination Gateway Genmask Flags Metric Ref Use Iface |
+| ... |
+| 192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 174412 eth0 |
+| ... |
+| |
++---------------------------------------------------------------------------------+
+(The network address is in the first column).
+
+The /etc/hosts.deny and /etc/hosts.allow files are described in the manual
+pages of the same names.
+
+IMPORTANT: Do not put anything but IP NUMBERS in the portmap lines of these
+files.
Host name lookups can indirectly cause portmap activity which will
+trigger host name lookups which can indirectly cause portmap activity which
+will trigger...
+
+Versions 0.2.0 and higher of the nfs-utils package also use the hosts.allow
+and hosts.deny files, so you should put in entries for lockd, statd, mountd,
+and rquotad in these files too. For a complete example, see Section 3.2.2.
+
+The above things should make your server tighter. The only remaining problem
+is if someone gains administrative access to one of your trusted client
+machines and is able to send bogus NFS requests. The next section deals with
+safeguards against this problem.
+-----------------------------------------------------------------------------
+
+6.2. Server security: nfsd and mountd
+
+On the server we can decide that we don't want to trust any requests made as
+root on the client. We can do that by using the root_squash option in /etc/
+exports:
+ /home slave1(rw,root_squash)
+
+
+This is, in fact, the default. It should always be turned on unless you have
+a very good reason to turn it off. To turn it off, use the no_root_squash
+option.
+
+Now, if a user with UID 0 (i.e., root's user ID number) on the client
+attempts to access (read, write, delete) the file system, the server
+substitutes the UID of the server's 'nobody' account. This means that the
+root user on the client can't access or change files that only root on the
+server can access or change. That's good, and you should probably use
+root_squash on all the file systems you export. "But the root user on the
+client can still use su to become any other user and access and change that
+user's files!" say you. To which the answer is: Yes, and that's the way it is,
+and has to be with Unix and NFS. This has one important implication: All
+important binaries and files should be owned by root, and not bin or other
+non-root accounts, since the only account the client's root user cannot access
+is the server's root account.
In the exports(5) man page there are several
+other squash options listed so that you can decide to mistrust whomever you
+(don't) like on the clients.
+
+The TCP ports 1-1024 are reserved for root's use (and therefore sometimes
+referred to as "secure ports"). A non-root user cannot bind to these ports.
+Adding the secure option to an /etc/exports entry means that the server will
+only listen to requests coming from ports 1-1024 on the client, so that a
+malicious non-root user on the client cannot come along and open up a spoofed
+NFS dialogue on a non-reserved port. This option is set by default.
+-----------------------------------------------------------------------------
+
+6.3. Client Security
+
+6.3.1. The nosuid mount option
+
+On the client we can decide that we don't want to trust the server too much,
+in a couple of ways, with options to mount. For example, we can forbid suid
+programs to work off the NFS file system with the nosuid option. Some unix
+programs, such as passwd, are called "suid" programs: They set the id of the
+person running them to whomever is the owner of the file. If a file is owned
+by root and is suid, then the program will execute as root, so that it can
+perform operations (such as writing to the password file) that only root is
+allowed to do. Using the nosuid option is a good idea and you should consider
+using this with all NFS mounted disks. It means that the server's root user
+cannot make a suid-root program on the file system, log in to the client as a
+normal user and then use the suid-root program to become root on the client
+too. One could also forbid execution of files on the mounted file system
+altogether with the noexec option. But this is more likely to be impractical
+than nosuid since a file system is likely to at least contain some scripts or
+programs that need to be executed.
+-----------------------------------------------------------------------------
+
+6.3.2.
The broken_suid mount option
+
+Some older programs (xterm being one of them) used to rely on the idea that
+root can write everywhere. This will break under new kernels on NFS
+mounts. The security implications are that programs that do this type of suid
+action can potentially be used to change your apparent uid on NFS servers
+doing uid mapping. So the default has been to disable this broken_suid in the
+Linux kernel.
+
+The long and short of it is this: If you're using an old Linux distribution,
+some sort of old suid program or an older unix of some type, you might have to
+mount from your clients with the broken_suid option to mount. However, most
+recent unixes and Linux distros ship xterm and such programs as normal
+executables with no suid status; they call helper programs to do their setuid
+work.
+
+You enter the above options in the options column, with the rsize and wsize,
+separated by commas.
+-----------------------------------------------------------------------------
+
+6.3.3. Securing portmapper, rpc.statd, and rpc.lockd on the client
+
+In the current (2.2.18+) implementation of NFS, full file locking is
+supported. This means that rpc.statd and rpc.lockd must be running on the
+client in order for locks to function correctly. These services require the
+portmapper to be running. So, most of the problems you will find with NFS on
+the server may also plague you on the client. Read through the
+portmapper section above for information on securing the portmapper.
+-----------------------------------------------------------------------------
+
+6.4. NFS and firewalls (ipchains and netfilter)
+
+IPchains (under the 2.2.X kernels) and netfilter (under the 2.4.x kernels)
+allow a good level of security - instead of relying on the daemon (or perhaps
+its TCP wrapper) to determine which machines can connect, the connection
+attempt is allowed or disallowed at a lower level.
In this case, you can stop
+the connection much earlier and more globally, which can protect you from all
+sorts of attacks.
+
+Describing how to set up a Linux firewall is well beyond the scope of this
+document. Interested readers may wish to read the [http://www.linuxdoc.org/
+HOWTO/Firewall-HOWTO.html] Firewall-HOWTO or the [http://www.linuxdoc.org/
+HOWTO/IPCHAINS-HOWTO.HTML] IPCHAINS-HOWTO. For users of kernel 2.4 and above
+you might want to visit the netfilter webpage at: [http://
+netfilter.filewatcher.org] http://netfilter.filewatcher.org. If you are
+already familiar with the workings of ipchains or netfilter, this section will
+give you a few tips on how to better set up your NFS daemons so that they are
+easier to firewall and protect.
+
+A good rule to follow for your firewall configuration is to deny all, and
+allow only some - this helps to keep you from accidentally allowing more than
+you intended.
+
+In order to understand how to firewall the NFS daemons, it will help to
+briefly review how they bind to ports.
+
+When a daemon starts up, it requests a free port from the portmapper. The
+portmapper gets the port for the daemon and keeps track of the port currently
+used by that daemon. When other hosts or processes need to communicate with
+the daemon, they request the port number from the portmapper in order to find
+the daemon. So the ports will perpetually float: different ports may be free
+at different times, so the portmapper will allocate them differently each
+time. This is a pain for setting up a firewall. If you never know where the
+daemons are going to be, then you don't know precisely which ports to allow
+access to. This might not be a big deal for many people running on a
+protected or isolated LAN. For those people on a public network, though, this
+is horrible.
+
+In kernels 2.4.13 and later with nfs-utils 0.3.3 or later, you no longer have
+to worry about the floating of ports in the portmapper.
Now all of the
+daemons pertaining to NFS can be "pinned" to a port. Most of them nicely take
+a -p option when they are started; those daemons that are started by the
+kernel take some kernel arguments or module options. They are described
+below.
+
+Some of the daemons involved in sharing data via NFS are already bound to a
+port. portmap is always on port 111 tcp and udp. nfsd is always on port 2049
+TCP and UDP (however, as of kernel 2.4.17, NFS over TCP is considered
+experimental and is not for use on production machines).
+
+The other daemons, statd, mountd, lockd, and rquotad, will normally move
+around to the first available port they are informed of by the portmapper.
+
+To force statd to bind to a particular port, use the -p portnum option. To
+force statd to respond on a particular port, additionally use the -o portnum
+option when starting it.
+
+To force mountd to bind to a particular port, use the -p portnum option.
+
+For example, to have statd broadcast on port 32765 and listen on port 32766,
+and mountd listen on port 32767, you would type:
+# statd -p 32765 -o 32766
+# mountd -p 32767
+
+lockd is started by the kernel when it is needed. Therefore you need to pass
+module options (if you have it built as a module) or kernel options to force
+lockd to listen and respond only on certain ports.
+
+If you are using loadable modules and you would like to specify these options
+in your /etc/modules.conf file, add a line like this to the file:
+options lockd nlm_udpport=32768 nlm_tcpport=32768
+
+The above line specifies both the UDP and TCP port for lockd as 32768.
+
+If you are not using loadable modules or if you have compiled lockd into the
+kernel instead of building it as a module, then you will need to pass it an
+option on the kernel boot line.
+
+It should look something like this:
+ vmlinuz 3 root=/dev/hda1 lockd.udpport=32768 lockd.tcpport=32768
+
+The port numbers do not have to match, but it would simply add unnecessary
+confusion if they didn't.
+
+If you are using quotas and using rpc.rquotad to make these quotas viewable
+over NFS, you will need to also take it into account when setting up your
+firewall. There are two rpc.rquotad source trees. One of those is maintained
+in the nfs-utils tree. The other is in the quota-tools tree. They do not
+operate identically. The one provided with nfs-utils supports binding the
+daemon to a port with the -p directive. The one in quota-tools does not.
+Consult your distribution's documentation to determine if yours does.
+
+For the sake of this discussion, let's describe a network and set up a
+firewall to protect our NFS server. Our NFS server is 192.168.0.42 and our
+only client is 192.168.0.45. As in the example above, statd has been started
+so that it only binds to port 32765 for incoming requests and it must answer
+on port 32766. mountd is forced to bind to port 32767. lockd's module
+parameters have been set to bind to 32768. nfsd is, of course, on port 2049
+and the portmapper is on port 111.
+
+We are not using quotas.
+
+Using IPCHAINS, a simple firewall might look something like this:
+ipchains -A input -f -j ACCEPT -s 192.168.0.45
+ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 6 -j ACCEPT
+ipchains -A input -s 192.168.0.45 -d 0/0 32765:32768 -p 17 -j ACCEPT
+ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 17 -j ACCEPT
+ipchains -A input -s 192.168.0.45 -d 0/0 2049 -p 6 -j ACCEPT
+ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 6 -j ACCEPT
+ipchains -A input -s 192.168.0.45 -d 0/0 111 -p 17 -j ACCEPT
+ipchains -A input -s 0/0 -d 0/0 -p 6 -j DENY -y -l
+ipchains -A input -s 0/0 -d 0/0 -p 17 -j DENY -l
+
+The equivalent set of commands in netfilter is (note that iptables has no
+DENY target and takes destination ports via --dport; logging is done with
+separate LOG rules):
+iptables -A INPUT -f -j ACCEPT -s 192.168.0.45
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p tcp --dport 32765:32768 -j ACCEPT
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p udp --dport 32765:32768 -j ACCEPT
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p udp --dport 2049 -j ACCEPT
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p tcp --dport 2049 -j ACCEPT
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p tcp --dport 111 -j ACCEPT
+iptables -A INPUT -s 192.168.0.45 -d 0/0 -p udp --dport 111 -j ACCEPT
+iptables -A INPUT -s 0/0 -d 0/0 -p tcp --syn -j LOG --log-level 5
+iptables -A INPUT -s 0/0 -d 0/0 -p tcp --syn -j DROP
+iptables -A INPUT -s 0/0 -d 0/0 -p udp -j LOG --log-level 5
+iptables -A INPUT -s 0/0 -d 0/0 -p udp -j DROP
+
+The first line says to accept all packet fragments (except the first packet
+fragment, which will be treated as a normal packet). In theory no packet will
+pass through until it is reassembled, and it won't be reassembled unless the
+first packet fragment is passed. Of course there are attacks that can be
+generated by overloading a machine with packet fragments. But NFS won't work
+correctly unless you let fragments through. See Section 7.8 for details.
+
+The other lines allow specific connections from any port on our client host
+to the specific ports we have made available on our server. This means that
+if, say, 192.168.0.46 attempts to contact the NFS server it will not be able
+to mount or see what mounts are available.
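Rule sets like these are tedious to type and easy to get wrong. As a convenience, a small helper can generate the per-client ACCEPT rules from the client address and the pinned ports used in this example. This is only a sketch: the function name and the CLIENT variable are ours, and it merely prints the rules rather than installing them.

```shell
# Print (not install) the ACCEPT rules for one client, covering the
# pinned NFS ports from the example: portmapper (111), nfsd (2049),
# and statd/mountd/lockd (32765-32768).
CLIENT=192.168.0.45

print_nfs_accept_rules() {
    for port in 111 2049 32765:32768; do
        for proto in tcp udp; do
            echo "iptables -A INPUT -s $CLIENT -p $proto --dport $port -j ACCEPT"
        done
    done
}

print_nfs_accept_rules
```

Piping the output through a careful review, and then into a root shell, beats retyping six nearly identical rules by hand.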
+ +With the new port pinning capabilities it is obviously much easier to control +what hosts are allowed to mount your NFS shares. It is worth mentioning that +NFS is not an encrypted protocol and anyone on the same physical network +could sniff the traffic and reassemble the information being passed back and +forth. +----------------------------------------------------------------------------- + +6.5. Tunneling NFS through SSH + +One method of encrypting NFS traffic over a network is to use the +port-forwarding capabilities of ssh. However, as we shall see, doing so has a +serious drawback if you do not utterly and completely trust the local users +on your server. + +The first step will be to export files to the localhost. For example, to +export the /home partition, enter the following into /etc/exports: +/home 127.0.0.1(rw) + +The next step is to use ssh to forward ports. For example, ssh can tell the +server to forward to any port on any machine from a port on the client. Let +us assume, as in the previous section, that our server is 192.168.0.42, and +that we have pinned mountd to port 32767 using the argument -p 32767. Then, +on the client, we'll type: + # ssh root@192.168.0.42 -L 250:localhost:2049 -f sleep 60m + # ssh root@192.168.0.42 -L 251:localhost:32767 -f sleep 60m + +The above command causes ssh on the client to take any request directed at +the client's port 250 and forward it, first through sshd on the server, and +then on to the server's port 2049. The second line causes a similar type of +forwarding between requests to port 251 on the client and port 32767 on the +server. The localhost is relative to the server; that is, the forwarding will +be done to the server itself. The port could otherwise have been made to +forward to any other machine, and the requests would look to the outside +world as if they were coming from the server. Thus, the requests will appear +to NFSD on the server as if they are coming from the server itself. 
Note that +in order to bind to a port below 1024 on the client, we have to run this +command as root on the client. Doing this will be necessary if we have +exported our filesystem with the default secure option. + +Finally, we are pulling a little trick with the last option, -f sleep 60m. +Normally, when we use ssh, even with the -L option, we will open up a shell +on the remote machine. But instead, we just want the port forwarding to +execute in the background so that we get our shell on the client back. So, we +tell ssh to execute a command in the background on the server to sleep for 60 +minutes. This will cause the port to be forwarded for 60 minutes until it +gets a connection; at that point, the port will continue to be forwarded +until the connection dies or until the 60 minutes are up, whichever happens +later. The above command could be put in our startup scripts on the client, +right after the network is started. + +Next, we have to mount the filesystem on the client. To do this, we tell the +client to mount a filesystem on the localhost, but at a different port from +the usual 2049. Specifically, an entry in /etc/fstab would look like: + localhost:/home /mnt/home nfs rw,hard,intr,port=250,mountport=251 0 0 + +Having done this, we can see why the above will be incredibly insecure if we +have any ordinary users who are able to log in to the server locally. If they +can, there is nothing preventing them from doing what we did and using ssh to +forward a privileged port on their own client machine (where they are +legitimately root) to ports 2049 and 32767 on the server. Thus, any ordinary +user on the server can mount our filesystems with the same rights as root on +our client. 
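Pulling the pieces of this section together, the whole tunnel setup can be sketched as the following dry run. The server address and ports are from the example above; RUN=echo makes each command print instead of execute, so clear RUN and run it as root on the client to do it for real.

```shell
# Dry run of the SSH-tunnel setup: forward two privileged local ports to
# nfsd (2049) and the pinned mountd (32767) on the server, then mount.
# RUN=echo prints each command; set RUN= (empty) to really execute them.
RUN=echo

$RUN ssh root@192.168.0.42 -L 250:localhost:2049 -f sleep 60m
$RUN ssh root@192.168.0.42 -L 251:localhost:32767 -f sleep 60m
$RUN mount -t nfs -o rw,hard,intr,port=250,mountport=251 localhost:/home /mnt/home
```

Remember that the server must also export /home to 127.0.0.1, as described at the start of this section.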
+
+If you are using an NFS server that does not have a way for ordinary users to
+log in, and you wish to use this method, there are two additional caveats:
+First, the connection travels from the client to the server via sshd;
+therefore you will have to leave port 22 (where sshd listens) open to your
+client on the firewall. However, you do not need to leave the other ports,
+such as 2049 and 32767, open anymore. Second, file locking will no longer
+work. It is not possible to ask statd or the locking manager to make requests
+to a particular port for a particular mount; therefore, any locking requests
+will cause statd to connect to statd on localhost, i.e., itself, and it will
+fail with an error. Any attempt to correct this would require a major rewrite
+of NFS.
+
+It may also be possible to use IPSec to encrypt network traffic between your
+client and your server, without compromising any local security on the
+server; this will not be taken up here. See the [http://www.freeswan.org/]
+FreeS/WAN home page for details on using IPSec under Linux.
+-----------------------------------------------------------------------------
+
+6.6. Summary
+
+If you use the hosts.allow, hosts.deny, root_squash, nosuid and privileged
+port features in the portmapper/NFS software, you avoid many of the presently
+known bugs in NFS and can almost feel secure about that at least. But still,
+after all that: When an intruder has access to your network, s/he can make
+strange commands appear in your .forward or read your mail when /home or /var
+/mail is NFS exported. For the same reason, you should never access your PGP
+private key over NFS. Or at least you should know the risk involved. And now
+you know a bit of it.
+
+NFS and the portmapper make up a complex subsystem and therefore it's not
+totally unlikely that new bugs will be discovered, either in the basic design
+or the implementation we use. There might even be holes known now, which
+someone is abusing. But that's life.
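The recommendations in this summary can be captured in a handful of configuration fragments. The network and paths below are taken from the earlier examples and are only illustrative; adjust them to your site.

```
# /etc/exports -- sync, root_squash and secure are the safe choices
/home 192.168.0.0/255.255.255.0(rw,sync,root_squash,secure)

# /etc/hosts.deny -- closed by default
portmap: ALL
lockd: ALL
mountd: ALL
rquotad: ALL
statd: ALL

# /etc/hosts.allow -- then opened only to the local subnet
portmap: 192.168.0.0/255.255.255.0
lockd: 192.168.0.0/255.255.255.0
mountd: 192.168.0.0/255.255.255.0
rquotad: 192.168.0.0/255.255.255.0
statd: 192.168.0.0/255.255.255.0
```

On the client side, the matching habit is to mount with nosuid (and noexec where practical), as described in Section 6.3.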
+----------------------------------------------------------------------------- + +7. Troubleshooting + + + This is intended as a step-by-step guide to what to do when things go + wrong using NFS. Usually trouble first rears its head on the client end, + so this diagnostic will begin there. + + +----------------------------------------------------------------------------- +7.1. Unable to See Files on a Mounted File System + +First, check to see if the file system is actually mounted. There are several +ways of doing this. The most reliable way is to look at the file /proc/ +mounts, which will list all mounted filesystems and give details about them. +If this doesn't work (for example if you don't have the /proc filesystem +compiled into your kernel), you can type mount -f although you get less +information. + +If the file system appears to be mounted, then you may have mounted another +file system on top of it (in which case you should unmount and remount both +volumes), or you may have exported the file system on the server before you +mounted it there, in which case NFS is exporting the underlying mount point +(if so then you need to restart NFS on the server). + +If the file system is not mounted, then attempt to mount it. If this does not +work, see Symptom 3. +----------------------------------------------------------------------------- + +7.2. File requests hang or timeout waiting for access to the file. + +This usually means that the client is unable to communicate with the server. +See Symptom 3 letter b. +----------------------------------------------------------------------------- + +7.3. Unable to mount a file system + +There are two common errors that mount produces when it is unable to mount a +volume. These are: + + a. failed, reason given by server: Permission denied + + This means that the server does not recognize that you have access to the + volume. + + i. 
Check your /etc/exports file and make sure that the volume is
+      exported and that your client has the right kind of access to it. For
+      example, if a client only has read access then you have to mount the
+      volume with the ro option rather than the rw option.
+
+  ii. Make sure that you have told NFS to register any changes you made to
+      /etc/exports since starting nfsd by running the exportfs command. Be
+      sure to type exportfs -ra to be extra certain that the exports are
+      being re-read.
+
+ iii. Check the file /proc/fs/nfs/exports and make sure the volume and
+      client are listed correctly. (You can also look at the file /var/lib/
+      nfs/xtab for an unabridged list of how all the active export options
+      are set.) If they are not, then you have not re-exported properly. If
+      they are listed, make sure the server recognizes your client as being
+      the machine you think it is. For example, you may have an old listing
+      for the client in /etc/hosts that is throwing off the server, or you
+      may not have listed the client's complete address and it may be
+      resolving to a machine in a different domain. One trick is to log in
+      to the server from the client via ssh or telnet; if you then type who,
+      one of the listings should be your login session and the name of your
+      client machine as the server sees it. Try using this machine name in
+      your /etc/exports entry. Finally, try to ping the client from the
+      server, and try to ping the server from the client. If this doesn't
+      work, or if there is packet loss, you may have lower-level network
+      problems.
+
+  iv. It is not possible to export both a directory and its child (for
+      example both /usr and /usr/local). You should export the parent
+      directory with the necessary permissions, and all of its
+      subdirectories can then be mounted with those same permissions.
+
+
+ b. RPC: Program Not Registered: (or another "RPC" error):
+
+    This means that the client does not detect NFS running on the server. 
+ This could be for several reasons. + + i. First, check that NFS actually is running on the server by typing + rpcinfo -p on the server. You should see something like this: + +------------------------------------------------------------+ + | program vers proto port | + | 100000 2 tcp 111 portmapper | + | 100000 2 udp 111 portmapper | + | 100011 1 udp 749 rquotad | + | 100011 2 udp 749 rquotad | + | 100005 1 udp 759 mountd | + | 100005 1 tcp 761 mountd | + | 100005 2 udp 764 mountd | + | 100005 2 tcp 766 mountd | + | 100005 3 udp 769 mountd | + | 100005 3 tcp 771 mountd | + | 100003 2 udp 2049 nfs | + | 100003 3 udp 2049 nfs | + | 300019 1 tcp 830 amd | + | 300019 1 udp 831 amd | + | 100024 1 udp 944 status | + | 100024 1 tcp 946 status | + | 100021 1 udp 1042 nlockmgr | + | 100021 3 udp 1042 nlockmgr | + | 100021 4 udp 1042 nlockmgr | + | 100021 1 tcp 1629 nlockmgr | + | 100021 3 tcp 1629 nlockmgr | + | 100021 4 tcp 1629 nlockmgr | + | | + +------------------------------------------------------------+ + This says that we have NFS versions 2 and 3, rpc.statd version 1, + network lock manager (the service name for rpc.lockd) versions 1, 3, + and 4. There are also different service listings depending on whether + NFS is travelling over TCP or UDP. UDP is usually (but not always) + the default unless TCP is explicitly requested. + + If you do not see at least portmapper, nfs, and mountd, then you need + to restart NFS. If you are not able to restart successfully, proceed + to Symptom 9. + + ii. Now check to make sure you can see it from the client. On the client, + type rpcinfo -p server where server is the DNS name or IP address of + your server. + + If you get a listing, then make sure that the type of mount you are + trying to perform is supported. For example, if you are trying to + mount using Version 3 NFS, make sure Version 3 is listed; if you are + trying to mount using NFS over TCP, make sure that is registered. + (Some non-Linux clients default to TCP). 
Type man rpcinfo for more
+      details on how to read the output. If the type of mount you are
+      trying to perform is not listed, try a different type of mount.
+
+      If you get the error No Remote Programs Registered, then you need to
+      check your /etc/hosts.allow and /etc/hosts.deny files on the server
+      and make sure your client actually is allowed access. Again, if the
+      entries appear correct, check /etc/hosts (or your DNS server) and
+      make sure that the machine is listed correctly, and make sure you can
+      ping the server from the client. Also check the error logs on the
+      system for helpful messages: Authentication errors from bad /etc/
+      hosts.allow entries will usually appear in /var/log/messages, but may
+      appear somewhere else depending on how your system logs are set up.
+      The man pages for syslog can help you figure out how your logs are
+      set up. Finally, some older operating systems may behave badly when
+      routes between the two machines are asymmetric. Try typing tracepath
+      [server] from the client and see if the word "asymmetric" shows up
+      anywhere in the output. If it does then this may be causing packet
+      loss. However, asymmetric routes are not usually a problem on recent
+      Linux distributions.
+
+      If you get the error Remote system error - No route to host, but you
+      can ping the server correctly, then you are the victim of an
+      overzealous firewall. Check any firewalls that may be set up, either
+      on the server or on any routers in between the client and the server.
+      Look at the man pages for ipchains, netfilter, and ipfwadm, as well
+      as the [http://www.linuxdoc.org/HOWTO/IPCHAINS-HOWTO.html]
+      IPChains-HOWTO and the [http://www.linuxdoc.org/HOWTO/
+      Firewall-HOWTO.html] Firewall-HOWTO for help.
+
+
+
+-----------------------------------------------------------------------------
+7.4. I do not have permission to access files on the mounted volume.
+
+This could be one of two problems. 
+ +If it is a write permission problem, check the export options on the server +by looking at /proc/fs/nfs/exports and make sure the filesystem is not +exported read-only. If it is you will need to re-export it read/write (don't +forget to run exportfs -ra after editing /etc/exports). Also, check /proc/ +mounts and make sure the volume is mounted read/write (although if it is +mounted read-only you ought to get a more specific error message). If not +then you need to re-mount with the rw option. + +The second problem has to do with username mappings, and is different +depending on whether you are trying to do this as root or as a non-root user. + +If you are not root, then usernames may not be in sync on the client and the +server. Type id [user] on both the client and the server and make sure they +give the same UID number. If they don't then you are having problems with +NIS, NIS+, rsync, or whatever system you use to sync usernames. Check group +names to make sure that they match as well. Also, make sure you are not +exporting with the all_squash option. If the user names match then the user +has a more general permissions problem unrelated to NFS. + +If you are root, then you are probably not exporting with the no_root_squash +option; check /proc/fs/nfs/exports or /var/lib/nfs/xtab on the server and +make sure the option is listed. In general, being able to write to the NFS +server as root is a bad idea unless you have an urgent need -- which is why +Linux NFS prevents it by default. See Section 6 for details. + +If you have root squashing, you want to keep it, and you're only trying to +get root to have the same permissions on the file that the user nobody should +have, then remember that it is the server that determines which uid root gets +mapped to. By default, the server uses the UID and GID of nobody in the /etc/ +passwd file, but this can also be overridden with the anonuid and anongid +options in the /etc/exports file. 
Make sure that the client and the server +agree about which UID nobody gets mapped to. +----------------------------------------------------------------------------- + +7.5. When I transfer really big files, NFS takes over all the CPU cycles on +the server and it screeches to a halt. + +This is a problem with the fsync() function in 2.2 kernels that causes all +sync-to-disk requests to be cumulative, resulting in a write time that is +quadratic in the file size. If you can, upgrading to a 2.4 kernel should +solve the problem. Also, exporting with the no_wdelay option forces the +program to use o_sync() instead, which may prove faster. +----------------------------------------------------------------------------- + +7.6. Strange error or log messages + + a. Messages of the following format: + + +-------------------------------------------------------------------------------------------+ + | Jan 7 09:15:29 server kernel: fh_verify: mail/guest permission failure, acc=4, error=13 | + | Jan 7 09:23:51 server kernel: fh_verify: ekonomi/test permission failure, acc=4, error=13 | + | | + +-------------------------------------------------------------------------------------------+ + + These happen when a NFS setattr operation is attempted on a file you + don't have write access to. The messages are harmless. + + b. 
The following messages frequently appear in the logs: + + +---------------------------------------------------------------------+ + | kernel: nfs: server server.domain.name not responding, still trying | + | kernel: nfs: task 10754 can't get a request slot | + | kernel: nfs: server server.domain.name OK | + | | + +---------------------------------------------------------------------+ + + The "can't get a request slot" message means that the client-side RPC + code has detected a lot of timeouts (perhaps due to network congestion, + perhaps due to an overloaded server), and is throttling back the number + of concurrent outstanding requests in an attempt to lighten the load. The + cause of these messages is basically sluggish performance. See Section 5 + for details. + + c. After mounting, the following message appears on the client: + + +---------------------------------------------------------------+ + |nfs warning: mount version older than kernel | + | | + +---------------------------------------------------------------+ + + It means what it says: You should upgrade your mount package and/or + am-utils. (If for some reason upgrading is a problem, you may be able to + get away with just recompiling them so that the newer kernel features are + recognized at compile time). + + d. Errors in startup/shutdown log for lockd + + You may see a message of the following kind in your boot log: + +---------------------------------------------------------------+ + |nfslock: rpc.lockd startup failed | + | | + +---------------------------------------------------------------+ + + They are harmless. Older versions of rpc.lockd needed to be started up + manually, but newer versions are started automatically by nfsd. Many of + the default startup scripts still try to start up lockd by hand, in case + it is necessary. You can alter your startup scripts if you want the + messages to go away. + + e. 
The following message appears in the logs:
+
+    +---------------------------------------------------------------+
+    |kmem_create: forcing size word alignment - nfs_fh              |
+    |                                                               |
+    +---------------------------------------------------------------+
+
+    This results from the file handle being 16 bits instead of a multiple of
+    32 bits, which makes the kernel grimace. It is harmless.
+
+
+-----------------------------------------------------------------------------
+7.7. Real permissions don't match what's in /etc/exports.
+
+/etc/exports is very sensitive to whitespace - so the following statements
+are not the same:
+/export/dir hostname(rw,no_root_squash)
+/export/dir hostname (rw,no_root_squash)
+
+The first will grant hostname rw access to /export/dir without squashing root
+privileges. The second will grant hostname rw privileges with root squash and
+it will grant everyone else read/write access, without squashing root
+privileges. Nice huh?
+-----------------------------------------------------------------------------
+
+7.8. Flaky and unreliable behavior
+
+Simple commands such as ls work, but anything that transfers a large amount
+of information causes the mount point to lock.
+
+This could be one of two problems:
+
+  i. It will happen if you have ipchains on at the server and/or the client
+     and you are not allowing fragmented packets through the chains. Allow
+     fragments from the remote host and you'll be able to function again.
+     See Section 6.4 for details on how to do this.
+
+ ii. You may be using a larger rsize and wsize in your mount options than
+     the server supports. Try reducing rsize and wsize to 1024 and seeing if
+     the problem goes away. If it does, then increase them slowly to a more
+     reasonable value.
+
+
+-----------------------------------------------------------------------------
+7.9. nfsd won't start
+
+Check the file /etc/exports and make sure root has read permission. Check the
+binaries and make sure they are executable. 
Make sure your kernel was +compiled with NFS server support. You may need to reinstall your binaries if +none of these ideas helps. +----------------------------------------------------------------------------- + +7.10. File Corruption When Using Multiple Clients + +If a file has been modified within one second of its previous modification +and left the same size, it will continue to generate the same inode number. +Because of this, constant reads and writes to a file by multiple clients may +cause file corruption. Fixing this bug requires changes deep within the +filesystem layer, and therefore it is a 2.5 item. +----------------------------------------------------------------------------- + +8. Using Linux NFS with Other OSes + +Every operating system, Linux included, has quirks and deviations in the +behavior of its NFS implementation -- sometimes because the protocols are +vague, sometimes because they leave gaping security holes. Linux will work +properly with all major vendors' NFS implementations, as far as we know. +However, there may be extra steps involved to make sure the two OSes are +communicating clearly with one another. This section details those steps. + +In general, it is highly ill-advised to attempt to use a Linux machine with a +kernel before 2.2.18 as an NFS server for non-Linux clients. Implementations +with older kernels may work fine as clients; however if you are using one of +these kernels and get stuck, the first piece of advice we would give is to +upgrade your kernel and see if the problems go away. The user-space NFS +implementations also do not work well with non-Linux clients. + +Following is a list of known issues for using Linux together with major +operating systems. +----------------------------------------------------------------------------- + +8.1. AIX + +8.1.1. 
Linux Clients and AIX Servers + +The format for the /etc/exports file for our example in Section 3 is: + /usr slave1.foo.com:slave2.foo.com,access=slave1.foo.com:slave2.foo.com + /home slave1.foo.com:slave2.foo.com,rw=slave1.foo.com:slave2.foo.com + +----------------------------------------------------------------------------- + +8.1.2. AIX clients and Linux Servers + +AIX uses the file /etc/filesystems instead of /etc/fstab. A sample entry, +based on the example in Section 4, looks like this: +/mnt/home: + dev = "/home" + vfs = nfs + nodename = master.foo.com + mount = true + options = bg,hard,intr,rsize=1024,wsize=1024,vers=2,proto=udp + account = false + + + i. Version 4.3.2 of AIX, and possibly earlier versions as well, requires + that file systems be exported with the insecure option, which causes NFS + to listen to requests from insecure ports (i.e., ports above 1024, to + which non-root users can bind). Older versions of AIX do not seem to + require this. + +ii. AIX clients will default to mounting version 3 NFS over TCP. If your + Linux server does not support this, then you may need to specify vers=2 + and/or proto=udp in your mount options. + +iii. Using netmasks in /etc/exports seems to sometimes cause clients to lose + mounts when another client is reset. This can be fixed by listing out + hosts explicitly. + +iv. Apparently automount in AIX 4.3.2 is rather broken. + + +----------------------------------------------------------------------------- +8.2. BSD + +8.2.1. BSD servers and Linux clients + +BSD kernels tend to work better with larger block sizes. +----------------------------------------------------------------------------- + +8.2.2. Linux servers and BSD clients + +Some versions of BSD may make requests to the server from insecure ports, in +which case you will need to export your volumes with the insecure option. See +the man page for exports(5) for more details. 
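+
+To illustrate the insecure option discussed above, a minimal /etc/exports
+entry on the Linux server might look like this (the hostname is purely
+illustrative):
+
```
# /etc/exports -- accept mount requests from unprivileged ports (>1024)
/home   slave1.foo.com(rw,insecure)
```
+
+The same option covers the AIX 4.3.2 requirement noted earlier in this
+section.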
+-----------------------------------------------------------------------------
+
+8.3. Tru64 Unix
+
+8.3.1. Tru64 Unix Servers and Linux Clients
+
+In general, Tru64 Unix servers work quite smoothly with Linux clients. The
+format for the /etc/exports file for our example in Section 3 is:
+
+/usr slave1.foo.com:slave2.foo.com \
+  -access=slave1.foo.com:slave2.foo.com
+
+/home slave1.foo.com:slave2.foo.com \
+  -rw=slave1.foo.com:slave2.foo.com \
+  -root=slave1.foo.com:slave2.foo.com
+
+
+(The root option is listed in the last entry for informational purposes only;
+its use is not recommended unless necessary.)
+
+Tru64 checks the /etc/exports file every time there is a mount request so you
+do not need to run the exportfs command; in fact on many versions of Tru64
+Unix the command does not exist.
+-----------------------------------------------------------------------------
+
+8.3.2. Linux Servers and Tru64 Unix Clients
+
+There are two issues to watch out for here. First, Tru64 Unix mounts using
+Version 3 NFS by default. You will see mount errors if your Linux server does
+not support Version 3 NFS. Second, in Tru64 Unix 4.x, NFS locking requests
+are made by daemon. You will therefore need to specify the insecure_locks
+option on all volumes you export to a Tru64 Unix 4.x client; see the exports
+man pages for details.
+-----------------------------------------------------------------------------
+
+8.4. HP-UX
+
+8.4.1. HP-UX Servers and Linux Clients
+
+A sample /etc/exports entry on HP-UX looks like this:
+/usr -ro,access=slave1.foo.com:slave2.foo.com
+/home -rw=slave1.foo.com:slave2.foo.com:root=slave1.foo.com:slave2.foo.com
+
+(The root option is listed in the last entry for informational purposes only;
+its use is not recommended unless necessary.)
+-----------------------------------------------------------------------------
+
+8.4.2. 
Linux Servers and HP-UX Clients
+
+HP-UX diskless clients will require at least a kernel version 2.2.19 (or
+patched 2.2.18) for device files to export correctly. Also, any exports to an
+HP-UX client will need to be exported with the insecure_locks option.
+-----------------------------------------------------------------------------
+
+8.5. IRIX
+
+8.5.1. IRIX Servers and Linux Clients
+
+A sample /etc/exports entry on IRIX looks like this:
+/usr -ro,access=slave1.foo.com:slave2.foo.com
+/home -rw=slave1.foo.com:slave2.foo.com:root=slave1.foo.com:slave2.foo.com
+
+(The root option is listed in the last entry for informational purposes only;
+its use is not recommended unless necessary.)
+
+There are reportedly problems when using the nohide option on exports to
+Linux 2.2-based systems. This problem is fixed in the 2.4 kernel. As a
+workaround, you can export and mount lower-down file systems separately.
+
+As of kernel 2.4.17, there continue to be several minor interoperability
+issues that may require a kernel upgrade. In particular:
+
+  * Make sure that Trond Myklebust's seekdir (or dir) kernel patch is
+    applied. The latest version (for 2.4.17) is located at:
+
+    [http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif]
+    http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-seekdir.dif
+
+  * IRIX servers do not always use the same fsid attribute field across
+    reboots, which results in inode number mismatch errors on a Linux client
+    if the mounted IRIX server reboots. A patch is available from:
+
+    [http://www.geocrawler.com/lists/3/SourceForge/789/0/7777454/] http://
+    www.geocrawler.com/lists/3/SourceForge/789/0/7777454/
+
+  * Linux kernels v2.4.9 and above have problems reading large directories
+    (hundreds of files) from exported IRIX XFS file systems that were made
+    with naming version=1. 
The reason for the problem can be found at:
+
+    [http://www.geocrawler.com/archives/3/789/2001/9/100/6531172/] http://
+    www.geocrawler.com/archives/3/789/2001/9/100/6531172/
+
+    The naming version can be found by using (on the IRIX server):
+    xfs_growfs -n mount_point
+
+
+    The workaround is to export these file systems using the -32bitclients
+    option in the /etc/exports file. The fix is to convert the file system to
+    'naming version=2'. Unfortunately the only way to do this is by a backup/
+    mkfs/restore.
+
+    mkfs_xfs on IRIX 6.5.14 (and above) creates naming version=2 XFS file
+    systems by default. On IRIX 6.5.5 to 6.5.13, use:
+    mkfs_xfs -n version=2 device
+
+
+    Versions of IRIX prior to 6.5.5 do not support naming version=2 XFS file
+    systems.
+
+
+-----------------------------------------------------------------------------
+8.5.2. IRIX clients and Linux servers
+
+IRIX versions up to 6.5.12 have problems mounting file systems exported from
+Linux boxes - the mount point "gets lost," e.g.,
+ # mount linux:/disk1 /mnt
+ # cd /mnt/xyz/abc
+ # pwd
+ /xyz/abc
+
+
+This is a known IRIX bug (SGI bug 815265 - IRIX not liking file handles of
+less than 32 bytes), which is fixed in IRIX 6.5.13. If it is not possible to
+upgrade to IRIX 6.5.13, then the unofficial workaround is to force the Linux
+nfsd to always use 32 byte file handles.
+
+A number of patches exist - see:
+
+  * [http://www.geocrawler.com/archives/3/789/2001/8/50/6371896/] http://
+    www.geocrawler.com/archives/3/789/2001/8/50/6371896/
+
+  * [http://oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html] http://
+    oss.sgi.com/projects/xfs/mail_archive/0110/msg00006.html
+
+
+-----------------------------------------------------------------------------
+8.6. Solaris
+
+8.6.1. Solaris Servers
+
+Solaris has a slightly different format on the server end from other
+operating systems. Instead of /etc/exports, the configuration file is /etc/
+dfs/dfstab. 
Entries are of the form of a share command, where the syntax for +the example in Section 3 would look like +share -o rw=slave1,slave2 -d "Master Usr" /usr + +and instead of running exportfs after editing, you run shareall. + +Solaris servers are especially sensitive to packet size. If you are using a +Linux client with a Solaris server, be sure to set rsize and wsize to 32768 +at mount time. + +Finally, there is an issue with root squashing on Solaris: root gets mapped +to the user noone, which is not the same as the user nobody. If you are +having trouble with file permissions as root on the client machine, be sure +to check that the mapping works as you expect. +----------------------------------------------------------------------------- + +8.6.2. Solaris Clients + +Solaris clients will regularly produce the following message: ++---------------------------------------------------------------------------+ +|svc: unknown program 100227 (me 100003) | +| | ++---------------------------------------------------------------------------+ + +This happens because Solaris clients, when they mount, try to obtain ACL +information - which Linux obviously does not have. The messages can safely be +ignored. + +There are two known issues with diskless Solaris clients: First, a kernel +version of at least 2.2.19 is needed to get /dev/null to export correctly. +Second, the packet size may need to be set extremely small (i.e., 1024) on +diskless sparc clients because the clients do not know how to assemble +packets in reverse order. This can be done from /etc/bootparams on the +clients. +----------------------------------------------------------------------------- + +8.7. SunOS + +SunOS only has NFS Version 2 over UDP. +----------------------------------------------------------------------------- + +8.7.1. SunOS Servers + +On the server end, SunOS uses the most traditional format for its /etc/ +exports file. 
The example in Section 3 would look like:
+/usr -access=slave1.foo.com,slave2.foo.com
+/home -rw=slave1.foo.com,slave2.foo.com,root=slave1.foo.com,slave2.foo.com
+
+
+Again, the root option is listed for informational purposes and is not
+recommended unless necessary.
+-----------------------------------------------------------------------------
+
+8.7.2. SunOS Clients
+
+Be advised that SunOS makes all NFS locking requests as daemon, and therefore
+you will need to add the insecure_locks option to any volumes you export to a
+SunOS machine. See the exports man page for details.
+
+
+
+
+
+  8.11.  SAMBA - `NetBEUI', `NetBIOS', `CIFS' support.
+
+  SAMBA is an implementation of the Server Message Block (SMB) protocol.
+  Samba allows Microsoft and other systems to mount and use your disks
+  and printers.
+
+  SAMBA and its configuration are covered in detail in the SMB-HOWTO.
+
+  5.2.  Windows Environment
+
+  Samba is a suite of applications that allow most Unices (and in
+  particular Linux) to integrate into a Microsoft network both as a
+  client and a server. Acting as a server it allows Windows 95, Windows
+  for Workgroups, DOS and Windows NT clients to access Linux files and
+  printing services. It can completely replace Windows NT for file and
+  printing services, including the automatic downloading of printer
+  drivers to clients. Acting as a client allows the Linux workstation to
+  mount locally exported Windows file shares.
+
+  According to the SAMBA Meta-FAQ:
+
+  "Many users report that compared to other SMB implementations Samba is more stable,
+  faster, and compatible with more clients. Administrators of some large installations say
+  that Samba is the only SMB server available which will scale to many tens of thousands
+  of users without crashing"
+
+  · Samba project home page
+
+  · SMB HOWTO
+
+  · Printing HOWTO
+
+
+
+samba
+
+
+
+A LanManager like file and printer server for Unix. 
The Samba software suite is a collection of programs that implements the SMB
+protocol for unix systems, allowing you to serve files and printers to
+Windows, NT, OS/2 and DOS clients. This protocol is sometimes also referred
+to as the LanManager or NetBIOS protocol. This package contains all the
+components necessary to turn your Debian GNU/Linux box into a powerful file
+and printer server. Currently, the Samba Debian packages consist of the
+following:
+
+ samba - A LanManager like file and printer server for Unix.
+ samba-common - Samba common files used by both the server and the client.
+ smbclient - A LanManager like simple client for Unix.
+ swat - Samba Web Administration Tool.
+ samba-doc - Samba documentation.
+ smbfs - Mount and umount commands for the smbfs (kernels 2.0.x and above).
+ libpam-smbpass - pluggable authentication module for SMB password database.
+ libsmbclient - Shared library that allows applications to talk to SMB
+    servers.
+ libsmbclient-dev - libsmbclient shared libraries.
+ winbind - Service to resolve user and group information from Windows NT
+    servers.
+
+It is possible to install a subset of these packages depending on your
+particular needs. For example, to access other SMB servers you should only
+need the smbclient and samba-common packages. From Debian 3.0r0 APT
+http://www.tldp.org/LDP/Linux-Dictionary/html/index.html
+
+
+
+
+
+
+Samba
+
+
+
+A lot of emphasis has been placed on peaceful coexistence between UNIX and
+Windows. Unfortunately, the two systems come from very different cultures and
+they have difficulty getting along without mediation. ...and that, of course,
+is Samba's job. Samba <http://samba.org/> runs on UNIX platforms, but speaks
+to Windows clients like a native. It allows a UNIX system to move into a
+Windows ``Network Neighborhood'' without causing a stir. Windows users can
+happily access file and print services without knowing or caring that those
+services are being offered by a UNIX host. 
All of this is managed through a protocol suite which is currently known as
+the ``Common Internet File System,'' or CIFS <http://www.cifs.com>. This name
+was introduced by Microsoft, and provides some insight into their hopes for
+the future. At the heart of CIFS is the latest incarnation of the Server
+Message Block (SMB) protocol, which has a long and tedious history. Samba is
+an open source CIFS implementation, and is available for free from the
+http://samba.org/ mirror sites. Samba and Windows are not the only ones to
+provide CIFS networking. OS/2 supports SMB file and print sharing, and there
+are commercial CIFS products for Macintosh and other platforms (including
+several others for UNIX). Samba has been ported to a variety of non-UNIX
+operating systems, including VMS, AmigaOS, and NetWare. CIFS is also
+supported on dedicated file server platforms from a variety of vendors. In
+other words, this stuff is all over the place. From Rute-Users-Guide
+http://www.tldp.org/LDP/Linux-Dictionary/html/index.html
+
+
+
+
+
+
+Samba
+
+
+
+Samba adds Windows-networking support to UNIX. Whereas NFS is the most
+popular protocol for sharing files among UNIX machines, SMB is the most
+popular protocol for sharing files among Windows machines. The Samba package
+adds the ability for UNIX systems to interact with Windows systems. Key
+point: The Samba package comprises the following:
+
+ smbd - The Samba service allowing other machines (often Windows) to read
+    files from a UNIX machine.
+ nmbd - Provides support for NetBIOS. Logically, the SMB protocol is layered
+    on top of NetBIOS, which is in turn layered on top of TCP/IP.
+ smbmount - An extension to the mount program that allows a UNIX machine to
+    connect to another machine implicitly. Files can be accessed as if they
+    were located on the local machine.
+ smbclient - Allows files to be accessed through SMB in an explicit manner.
+    This is a command-line tool much like the FTP tool that allows files to
+    be copied. 
Unlike smbmount, files cannot be accessed as if they were local. smb.conf The configuration file for Samba. From Hacking-Lexicon
+http://www.tldp.org/LDP/Linux-Dictionary/html/index.html
+
+
+
+
+ Samba Authenticated Gateway HOWTO
+ Ricardo Alexandre Mattar
+ v1.2, 2004-05-21
+
+
+
+
+
+SSH
+
+
+The Secure Shell, or SSH, provides a way of running command line and
+graphical applications, and transferring files, over an encrypted
+connection. SSH uses up to 2,048-bit encryption with a variety of
+cryptographic schemes to make sure that if a cracker intercepts your
+connection, all they can see is useless gibberish. It is both a
+protocol and a suite of small command line applications which can be
+used for various functions.
+
+
+
+SSH replaces the old Telnet application, and can be used for secure
+remote administration of machines across the Internet. However, it
+has more features.
+
+
+
+SSH increases the ease of running applications remotely by setting up
+permissions automatically. If you can log into a machine, it allows you
+to run a graphical application on it, unlike Telnet, which requires users
+to type lots of geeky xhost and xauth commands. SSH also has built-in
+compression, which allows your graphical applications to run much faster
+over the network.
+
+
+
+SCP (Secure Copy) and SFTP (Secure FTP) allow transfer of files over the
+remote link, either via SSH's own command line utilities or graphical tools
+like Gnome's GFTP. Like Telnet, SSH is cross-platform. You can find SSH
+servers and clients for Linux, Unix, all flavours of Windows, BeOS, PalmOS,
+Java and Embedded OSes used in routers.
+
+
+
+Encrypted remote shell sessions are available through SSH
+ (http://www.ssh.fi/sshprotocols2/index.html
+ ) thus effectively
+allowing secure remote administration.
+
+
+
+
+
+
+Telnet
+
+
+Created in the early 1970s, Telnet provides a method of running command
+line applications on a remote computer as if the user were actually at
+the remote site.
Telnet is one of the most powerful tools for Unix, allowing
+for true remote administration. It is also an interesting program from the
+point of view of users, because it allows remote access to all their files
+and programs from anywhere in the Internet. Combined with an X server (as
+well as some rather arcane manipulation of authentication 'cookies' and
+'DISPLAY' environment variables), there is no difference (apart from the
+delay) between being at the console or on the other side of the planet.
+However, since the 'telnet' protocol sends data 'en clair', and there are
+now more efficient protocols with features such as built-in
+compression and 'tunneling' (which make it easier to use graphical
+applications across the network) as well as more secure connections, it is
+effectively a dead protocol. Like the related 'r' protocols (such as rlogin
+and rsh), it is still used within internal networks for ease of
+installation and use, for backwards compatibility, and as a means of
+configuring networking devices such as routers and firewalls.
+
+
+
+Please consult RFC 854 for further details behind its implementation.
+
+
+
+ · Telnet related software
+
+
+
+
+
+
+
+TFTP
+
+
+The Trivial File Transfer Protocol (TFTP) is a bare-bones protocol used by
+devices that boot from the network. It runs on top of UDP, so it
+doesn't require a real TCP/IP stack. Misunderstanding: Many people
+describe TFTP as simply a trivial version of FTP without authentication.
+This misses the point. The purpose of TFTP is not to reduce the complexity
+of file transfer, but to reduce the complexity of the underlying TCP/IP
+stack so that it can fit inside boot ROMs. Key point: TFTP is almost
+always used with BOOTP. BOOTP first configures the device, then TFTP
+transfers the boot image named by BOOTP which is then used to boot the
+device. Key point: Many systems come with unnecessary TFTP servers.
Many
+TFTP servers have bugs, like the backtracking problem or buffer overflows.
+As a consequence, many systems can be exploited with TFTP even though
+virtually nobody really uses it. Key point: A TFTP file transfer client
+is built into many operating systems (UNIX, Windows, etc.). These clients
+are often used to download rootkits when a system is being broken into.
+Therefore, removing the TFTP client should be part of your hardening
+procedure. For further details on the TFTP protocol please see RFCs 1350,
+1782, 1783, 1784, and 1785.
+
+
+
+Most likely, you'll interface with the TFTP protocol using the TFTP command
+line client, 'tftp', which allows users to transfer files to and from a
+remote machine. The remote host may be specified on the command line, in
+which case tftp uses host as the default host for future transfers.
+
+
+
+Setting up TFTP is almost as easy as DHCP.
+First install from the rpm package:
+
+# rpm -ihv tftp-server-*.rpm
+
+
+
+
+Create a directory for the files:
+
+# mkdir /tftpboot
+# chown nobody:nobody /tftpboot
+
+
+
+
+The directory /tftpboot is owned by user nobody, because this is the default
+user id set up by tftpd to access the files. Edit the file /etc/xinetd.d/tftp
+to look like the following:
+
+
+
+
+service tftp
+{
+	socket_type	= dgram
+	protocol	= udp
+	wait		= yes
+	user		= root
+	server		= /usr/sbin/in.tftpd
+	server_args	= -c -s /tftpboot
+	disable		= no
+	per_source	= 11
+	cps		= 100 2
+}
+
+
+
+
+The changes from the default file are the parameter disable = no (to enable
+the service) and the server argument -c. This argument allows for the
+creation of files, which is necessary if you want to save boot or disk
+images. You may want to make TFTP read only in normal operation.
+
+
+
+Then reload xinetd:
+
+/etc/rc.d/init.d/xinetd reload
+
+
+
+
+You can use the tftp command, available from the tftp (client) rpm package,
+to test the server. At the tftp prompt, you can issue the commands put and
+get.
+
+
+
+
+
+
+VNC
+
+ 8.13.
Tunnelling, mobile IP and virtual private networks
+
+ The Linux kernel allows the tunnelling (encapsulation) of protocols.
+ It can do IPX tunnelling through IP, allowing the connection of two
+ IPX networks through an IP only link. It can also do IP-IP tunnelling,
+ which is essential for mobile IP support, multicast support and
+ amateur radio. (see
+ http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.8)
+
+ Mobile IP specifies enhancements that allow transparent routing of IP
+ datagrams to mobile nodes in the Internet. Each mobile node is always
+ identified by its home address, regardless of its current point of
+ attachment to the Internet. While situated away from its home, a
+ mobile node is also associated with a care-of address, which provides
+ information about its current point of attachment to the Internet.
+ The protocol provides for registering the care-of address with a home
+ agent. The home agent sends datagrams destined for the mobile node
+ through a tunnel to the care-of address. After arriving at the end of
+ the tunnel, each datagram is then delivered to the mobile node.
+
+ Point-to-Point Tunneling Protocol (PPTP) is a networking technology
+ that allows the use of the Internet as a secure virtual private
+ network (VPN). PPTP is integrated with the Remote Access Services
+ (RAS) server which is built into Windows NT Server. With PPTP, users
+ can dial into a local ISP, or connect directly to the Internet, and
+ access their network as if they were at their desks. PPTP is a closed
+ protocol and its security has recently been compromised. It is highly
+ recommended to use other Linux-based alternatives, since they rely on
+ open standards which have been carefully examined and tested.
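As a minimal sketch of the kernel's IP-IP tunnelling mentioned above, the following iproute2 commands set up one end of a tunnel. All addresses and the device name are invented for illustration, and root privileges plus ipip support in the kernel are assumed:

```shell
# Load the IP-IP encapsulation module (if not built in).
modprobe ipip

# Create a tunnel endpoint: local public address 198.51.100.1,
# remote public address 203.0.113.1 (both hypothetical).
ip tunnel add tun0 mode ipip local 198.51.100.1 remote 203.0.113.1
ip link set tun0 up

# Give the tunnel an internal address and route the remote
# private network over it.
ip addr add 10.0.1.1/30 dev tun0
ip route add 10.0.2.0/24 dev tun0
```

The other end is configured symmetrically, with the local and remote addresses swapped.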
+
+
+ · A client implementation of the PPTP for Linux is available here
+
+
+ · More on Linux PPTP can be found here
+
+
+ Mobile IP:
+
+ · http://www.hpl.hp.com/personal/Jean_Tourrilhes/MobileIP/mip.html
+
+ · http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.12
+
+ Virtual Private Networks related documents:
+
+
+ · http://metalab.unc.edu/mdw/HOWTO/mini/VPN.html
+
+ · http://sites.inka.de/sites/bigred/devel/cipe.html
+
+
+7.4. VNC
+
+ VNC stands for Virtual Network Computing. It is, in essence, a remote
+ display system which allows one to view a computing 'desktop'
+ environment not only on the machine where it is running, but from
+ anywhere on the Internet and from a wide variety of machine
+ architectures. Both clients and servers exist for Linux as well as for
+ many other platforms. It is possible to execute MS-Word in a Windows
+ NT or 95 machine and have the output displayed in a Linux machine. The
+ opposite is also true; it is possible to execute an application in a
+ Linux machine and have the output displayed in any other Linux or
+ Windows machine. One of the available clients is a Java applet,
+ allowing the remote display to be run inside a web browser. Another
+ client is a port for Linux using the SVGAlib graphics library,
+ allowing 386s with as little as 4 MB of RAM to become fully functional
+ X-Terminals.
+
+ · VNC web site
+
+
+Virtual Network Computing (VNC) allows a user to operate a session running on another machine.
+Although Linux and all other Unix-like OSes already have this functionality built in, VNC
+provides further advantages because it's cross-platform, running on Linux, BSD, Unix, Win32,
+MacOS, and PalmOS. This makes it far more versatile.
+
+For example, let's assume the machine that you are attempting to connect to is running Linux.
+You can use VNC to access applications running on that other Linux desktop.
You can also use
+VNC to provide technical support to users on Windows-based machines by taking control of
+their desktops from the comfort of your server room. VNC is usually installed as separate
+packages for the client and server, typically named 'vnc' and 'vnc-server'.
+
+VNC uses screen numbers to connect clients to servers. This is because Unix machines allow
+multiple graphical sessions to be started simultaneously (check this out by logging in to a
+virtual terminal and typing startx -- :1).
+
+For platforms (Windows, MacOS, Palm, etc) which don't have this capability, you'll connect
+to 'screen 0' and take over the session of the existing user. For Unix systems, you'll need
+to specify a higher number and receive a new desktop.
+
+If you prefer the Windows-style approach where the VNC client takes over the currently
+running display, you can use x0rfbserver - see the sidebox below.
+
+VNC Servers and Clients
+
+On Linux, the VNC server (which allows the machine to be used remotely) is actually
+run as a replacement X server. To be able to start a VNC session to a machine, log
+into it and run vncserver. You'll be prompted for a password - in future you can
+change this password with the vncpasswd command. After you enter the password, you'll
+be told the display number of the newly created session.
+
+It is possible to control a remote machine by using the vncviewer command. If it is
+typed on its own it will prompt for a remote machine, or you can use:
+vncviewer [host]:[screen-number]
+
+> The VPN HOWTO, deprecated!!!!
+> VPN HOWTO
+> Linux VPN Masquerade HOWTO
+
+
+ 10. References
+
+ 10.1. Web Sites
+
+ Cipe Home Page
+
+ Masq Home Page
+
+ Samba Home Page
+
+ Linux HQ ---great site for lots of linux
+ info
+
+ 10.2.
Documentation
+
+ cipe.info: info file included with cipe distribution
+
+ Firewall HOWTO, by Mark Grennan, markg@netplus.net
+
+ IP Masquerade mini-HOWTO, by Ambrose Au, ambrose@writeme.com
+
+ IPChains-Howto, by Paul Russell, Paul.Russell@rustcorp.com.au
+
+
+
+
+
+Web-Serving
+
+
+The World Wide Web provides a simple method of publishing and linking
+information across the Internet, and is responsible for popularising
+the Internet to its current level. In the simplest case, a Web client
+(or browser), such as Netscape or Internet Explorer, connects with a
+Web server using a simple request/response protocol called HTTP
+(Hypertext Transfer Protocol), and requests HTML (Hypertext Markup
+Language) pages, images, Flash and other objects.
+
+
+
+In more modern situations, the Web server can also generate pages
+dynamically based on information returned from the user. Either way,
+setting up your own Web server is extremely simple. There are many
+choices for Web serving under Linux. Some servers are very mature,
+such as Apache, and are perfect for small and large sites alike.
+Other servers are programmed to be light and fast, and to have only a
+limited feature set to reduce complexity. A search on freshmeat.net
+will reveal a multitude of servers.
+
+
+
+Most Linux distributions include Apache.
+Apache is the number one server on the internet according to
+http://www.netcraft.co.uk/survey/ . More than half of all internet
+sites are running Apache or one of its derivatives. Apache's advantages
+include its modular design, stability and speed. Given the appropriate
+hardware and configuration it can support the highest loads: Yahoo,
+Altavista, GeoCities, and Hotmail are based on customized versions of
+this server.
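To give a sense of how simple a basic Apache setup is, a minimal virtual host stanza might look like the following. The hostname and paths are examples only, and exact directive placement varies between Apache versions and distributions:

```apache
# Minimal virtual host for a hypothetical site www.example.com.
<VirtualHost *:80>
    ServerName   www.example.com
    DocumentRoot /var/www/example
    ErrorLog     /var/log/apache/example-error.log
    CustomLog    /var/log/apache/example-access.log combined
</VirtualHost>
```

After adding a stanza like this and reloading the server, Apache serves the contents of the DocumentRoot for requests addressed to that hostname.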
+
+
+
+Optional support for SSL (which enables secure transactions) is also
+available at:
+
+
+ · http://www.apache-ssl.org/
+ · http://raven.covalent.net/
+ · http://www.c2.net/
+
+Dynamic Web content generation
+
+
+Web scripting languages are even more common on Linux than databases
+- basically, every language is available. This includes CGI,
+PHP 3 and 4, Perl, JSP, ASP (via closed source applications from
+Chili!Soft and Halcyon Software) and ColdFusion.
+
+
+
+PHP is an open source scripting language designed to churn out
+dynamically produced Web content ranging from databases to browsers.
+This includes not only HTML, but also graphics, Macromedia Flash and
+XML-based information. The latest versions of PHP provide impressive
+speed improvements, install easily from packages and can be set up
+quickly. PHP is the most popular Apache module and is used by over
+two million sites, including Amazon.com, US telco giant Sprint,
+Xoom Networks and Lycos. And unlike most other server-side scripting
+languages, developers (or those that employ them) can add their own
+functions into the source to improve it. Supported databases include
+those in the Database serving section and most ODBC compliant
+databases. The language itself borrows its structure from Perl and C.
+
+
+ · http://metalab.unc.edu/mdw/HOWTO/WWW-HOWTO.html
+ · http://metalab.unc.edu/mdw/HOWTO/Virtual-Services-HOWTO.html
+ · http://metalab.unc.edu/mdw/HOWTO/Intranet-Server-HOWTO.html
+ · Web servers for Linux
+
+
+
+
+
+
+X11
+
+
+The X Window System was developed at MIT in the late 1980s, rapidly
+becoming the industry standard windowing system for Unix graphics
+workstations. The software is freely available, very versatile, and is
+suitable for a wide range of hardware platforms. Any X environment
+consists of two distinct parts, the X server and one or more X
+clients. It is important to realise the distinction between the server
+and the client.
The server controls the display directly and is
+responsible for all input/output via the keyboard, mouse or display.
+The clients, on the other hand, do not access the screen directly -
+they communicate with the server, which handles all input and output.
+It is the clients which do the "real" computing work - running
+applications or whatever. The clients communicate with the server,
+causing the server to open one or more windows to handle input and
+output for that client.
+
+
+
+In short, the X Window System allows a user to log in to a remote
+machine, execute a process (for example, open a web browser) and have
+the output displayed on his own machine. Because the process is
+actually being executed on the remote system, very little CPU power is
+needed in the local one. Indeed, computers exist whose primary purpose
+is to act as pure X servers. Such systems are called X terminals.
+
+
+
+A free port of the X Window System exists for Linux and can be found
+at: XFree86. It is included in most Linux
+distributions.
+
+
+
+For further information regarding X please see:
+
+
+X11, LBX, DXPC, NXServer, SSH, MAS
+
+Related HOWTOs:
+
+· Remote X Apps HOWTO
+· Linux XDMCP HOWTO
+· XDM and X Terminal mini-HOWTO
+· The Linux XFree86 HOWTO
+· ATI R200 + XFree86 4.x mini-HOWTO
+· Second Mouse in X mini-HOWTO
+· Linux Touch Screen HOWTO
+· XFree86 Video Timings HOWTO
+· Linux XFree-to-Xinside mini-HOWTO
+· XFree Local Multi-User HOWTO
+· Using Xinerama to MultiHead XFree86 V. 4.0+
+· Connecting X Terminals to Linux Mini-HOWTO
+· How to change the title of an xterm
+· X Window System Architecture Overview HOWTO
+· The X Window User HOWTO
+
+
+
+
+
+Email
+
+
+Alongside the Web, mail is the top reason for the popularity of the Internet. Email is an inexpensive and fast method of time-shifted messaging which, much like the Web, is actually based around sending and receiving plain text files. The protocol used is called the Simple Mail Transfer Protocol (SMTP).
The server programs that implement SMTP to move mail from one server to another are called Mail Transfer Agents (MTAs).
+
+
+
+In times gone by, users would Telnet into the SMTP server itself and use a command line program like elm or pine to check their mail. These days, users run email clients like Netscape, Evolution, Kmail or Outlook on their desktop to check their email off a local SMTP server. Additional protocols like POP3 and IMAP4 are used between the SMTP server and desktop mail client to allow clients to manipulate files on, and download from, their local mail server. The programs that implement POP3 and IMAP4 are called Mail Delivery Agents (MDAs). They are generally separate from MTAs.
+
+
+* Linux Mail-Queue mini-HOWTO
+
+* The Linux Mail User HOWTO
+
+
+
+
+
+ 8.11. Proxy Server
+
+ The term proxy means "to do something on behalf of someone else." In
+ networking terms, a proxy server computer can act on the behalf of
+ several clients. An HTTP proxy is a machine that receives requests for
+ web pages from another machine (Machine A). The proxy gets the page
+ requested and returns the result to Machine A. The proxy may have a
+ cache with the requested pages, so if another machine asks for the
+ same page the copy in the cache will be returned instead. This allows
+ efficient use of bandwidth resources and shorter response times. As a
+ side effect, as client machines are not directly connected to the
+ outside world this is a way of securing the internal network. A
+ well-configured proxy can be as effective as a good firewall.
+
+ Several proxy servers exist for Linux. One popular solution is the
+ Apache proxy module. A more complete and robust implementation of an
+ HTTP proxy is SQUID.
+ · Apache
+
+ · Squid
+
+Proxy-Caching
+
+
+When a web browser retrieves information from the Internet, it stores a copy of that information
+in a cache on the local machine.
When a user requests that information in future, the browser will
+check to see if the original source has been updated; if not, the browser will simply use the cached version
+rather than fetch the data again.
+
+By doing this, there is less information that needs to be downloaded, which makes the connection seem more responsive
+to users and reduces bandwidth costs.
+
+But if there are many browsers accessing the Internet through the same connection, it makes better sense to have
+a single, centralised cache so that once a single machine has requested some information, the next
+machine to try and download that information can also access it more quickly. This is the
+theory behind the proxy cache. Squid is by far the most popular cache used on the Web, and can also be used
+to accelerate Web serving.
+
+Although Squid is useful for an ISP, a large business or even a small office can use Squid to
+speed up transfers and save money, and it can easily be used to the same effect in a home with a few
+flatmates sharing a cable or ADSL connection.
+
+
+Traffic Control HOWTO
+
+Version 1.0.1
+
+Martin A. Brown
+
+ [http://www.securepipe.com/] SecurePipe, Inc.
+Network Administration
+
+
+
+"Nov 2003"
+Revision History
+Revision 1.0.1 2003-11-17 Revised by: MAB
+Added link to Leonardo Balliache's documentation
+Revision 1.0 2003-09-24 Revised by: MAB
+reviewed and approved by TLDP
+Revision 0.7 2003-09-14 Revised by: MAB
+incremental revisions, proofreading, ready for TLDP
+Revision 0.6 2003-09-09 Revised by: MAB
+minor editing, corrections from Stef Coene
+Revision 0.5 2003-09-01 Revised by: MAB
+HTB section mostly complete, more diagrams, LARTC pre-release
+Revision 0.4 2003-08-30 Revised by: MAB
+added diagram
+Revision 0.3 2003-08-29 Revised by: MAB
+substantial completion of classless, software, rules, elements and components
+sections
+Revision 0.2 2003-08-23 Revised by: MAB
+major work on overview, elements, components and software sections
+Revision 0.1 2003-08-15 Revised by: MAB
+initial revision (outline complete)
+
+
+ Traffic control encompasses the sets of mechanisms and operations by which
+packets are queued for transmission/reception on a network interface. The
+operations include enqueuing, policing, classifying, scheduling, shaping and
+dropping. This HOWTO provides an introduction and overview of the
+capabilities and implementation of traffic control under Linux.
+
+© 2003, Martin A. Brown
+
+
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.1 or any
+ later version published by the Free Software Foundation; with no
+ invariant sections, with no Front-Cover Texts, with no Back-Cover Text. A
+ copy of the license is located at [http://www.gnu.org/licenses/fdl.html]
+ http://www.gnu.org/licenses/fdl.html.
+
+-----------------------------------------------------------------------------
+Table of Contents
+1. Introduction to Linux Traffic Control
+ 1.1. Target audience and assumptions about the reader
+ 1.2. Conventions
+ 1.3. Recommended approach
+ 1.4. Missing content, corrections and feedback
+
+
+2.
Overview of Concepts
+ 2.1. What is it?
+ 2.2. Why use it?
+ 2.3. Advantages
+ 2.4. Disadvantages
+ 2.5. Queues
+ 2.6. Flows
+ 2.7. Tokens and buckets
+ 2.8. Packets and frames
+
+
+3. Traditional Elements of Traffic Control
+ 3.1. Shaping
+ 3.2. Scheduling
+ 3.3. Classifying
+ 3.4. Policing
+ 3.5. Dropping
+ 3.6. Marking
+
+
+4. Components of Linux Traffic Control
+ 4.1. qdisc
+ 4.2. class
+ 4.3. filter
+ 4.4. classifier
+ 4.5. policer
+ 4.6. drop
+ 4.7. handle
+
+
+5. Software and Tools
+ 5.1. Kernel requirements
+ 5.2. iproute2 tools (tc)
+ 5.3. tcng, Traffic Control Next Generation
+ 5.4. IMQ, Intermediate Queuing device
+
+
+6. Classless Queuing Disciplines (qdiscs)
+ 6.1. FIFO, First-In First-Out (pfifo and bfifo)
+ 6.2. pfifo_fast, the default Linux qdisc
+ 6.3. SFQ, Stochastic Fair Queuing
+ 6.4. ESFQ, Extended Stochastic Fair Queuing
+ 6.5. GRED, Generic Random Early Drop
+ 6.6. TBF, Token Bucket Filter
+
+
+7. Classful Queuing Disciplines (qdiscs)
+ 7.1. HTB, Hierarchical Token Bucket
+ 7.2. PRIO, priority scheduler
+ 7.3. CBQ, Class Based Queuing
+
+
+8. Rules, Guidelines and Approaches
+ 8.1. General Rules of Linux Traffic Control
+ 8.2. Handling a link with a known bandwidth
+ 8.3. Handling a link with a variable (or unknown) bandwidth
+ 8.4. Sharing/splitting bandwidth based on flows
+ 8.5. Sharing/splitting bandwidth based on IP
+
+
+9. Scripts for use with QoS/Traffic Control
+ 9.1. wondershaper
+ 9.2. ADSL Bandwidth HOWTO script (myshaper)
+ 9.3. htb.init
+ 9.4. tcng.init
+ 9.5. cbq.init
+
+
+10. Diagram
+ 10.1. General diagram
+
+
+11. Annotated Traffic Control Links
+
+1. Introduction to Linux Traffic Control
+
+ Linux offers a very rich set of tools for managing and manipulating the
+transmission of packets. The larger Linux community is very familiar with the
+tools available under Linux for packet mangling and firewalling (netfilter,
+and before that, ipchains) as well as hundreds of network services which can
+run on the operating system.
Few inside the community and fewer outside the
+Linux community are aware of the tremendous power of the traffic control
+subsystem which has grown and matured under kernels 2.2 and 2.4.
+
+ This HOWTO purports to introduce the concepts of traffic control, the
+traditional elements (in general), the components of the Linux traffic
+control implementation and provide some guidelines. This HOWTO represents
+the collection, amalgamation and synthesis of the [http://lartc.org/howto/]
+LARTC HOWTO, documentation from individual projects and importantly the LARTC
+mailing list over a period of study.
+
+ The impatient soul, who simply wishes to experiment right now, is
+recommended to the [http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/]
+Traffic Control using tcng and HTB HOWTO and [http://lartc.org/howto/] LARTC
+HOWTO for immediate satisfaction.
+
+
+-----------------------------------------------------------------------------
+
+1.1. Target audience and assumptions about the reader
+
+ The target audience for this HOWTO is the network administrator or savvy
+home user who desires an introduction to the field of traffic control and an
+overview of the tools available under Linux for implementing traffic control.
+
+ I assume that the reader is comfortable with UNIX concepts and the command
+line and has a basic knowledge of IP networking. Users who wish to implement
+traffic control may require the ability to patch, compile and install a
+kernel or software package [1]. For users with newer kernels (2.4.20+, see
+also Section 5.1), however, the ability to install and use software may be
+all that is required.
+
+ Broadly speaking, this HOWTO was written with a sophisticated user in mind,
+perhaps one who has already had experience with traffic control under Linux.
+I assume that the reader may have no prior traffic control experience.
+-----------------------------------------------------------------------------
+
+1.2.
Conventions
+
+ This text was written in [http://www.docbook.org/] DocBook ([http://
+www.docbook.org/xml/4.2/index.html] version 4.2) with vim. All formatting has
+been applied by [http://xmlsoft.org/XSLT/] xsltproc based on DocBook XSL and
+LDP XSL stylesheets. Typeface formatting and display conventions are similar
+to most printed and electronically distributed technical documentation.
+-----------------------------------------------------------------------------
+
+1.3. Recommended approach
+
+ I strongly recommend that the eager reader making a first foray into the
+discipline of traffic control become only casually familiar with the tc
+command line utility before concentrating on tcng. The tcng software package
+defines an entire language for describing traffic control structures. At
+first, this language may seem daunting, but mastery of these basics will
+quickly provide the user with a much wider ability to employ (and deploy)
+traffic control configurations than the direct use of tc would afford.
+
+ Where possible, I'll prefer to describe the behaviour of the Linux
+traffic control system in an abstract manner, although in many cases I'll
+need to supply the syntax of one or the other common systems for defining
+these structures. I may not supply examples in both the tcng language and the
+tc command line, so the wise user will have some familiarity with both.
+
+
+-----------------------------------------------------------------------------
+
+1.4. Missing content, corrections and feedback
+
+ There is content yet missing from this HOWTO. In particular, the following
+items will be added at some point to this documentation.
+
+  *  A description and diagram of GRED, WRR, PRIO and CBQ.
+
+  *  A section of examples.
+
+  *  A section detailing the classifiers.
+
+  *  A section discussing the techniques for measuring traffic.
+
+  *  A section covering meters.
+
+  *  More details on tcng.
+ + + I welcome suggestions, corrections and feedback at . All errors and omissions are strictly my fault. Although I have made every +effort to verify the factual correctness of the content presented herein, I +cannot accept any responsibility for actions taken under the influence of +this documentation. + + +----------------------------------------------------------------------------- + +2. Overview of Concepts + + This section will introduce traffic control and examine reasons for it, +identify a few advantages and disadvantages and introduce key concepts used +in traffic control. +----------------------------------------------------------------------------- + +2.1. What is it? + + Traffic control is the name given to the sets of queuing systems and +mechanisms by which packets are received and transmitted on a router. This +includes deciding which (and whether) packets to accept at what rate on the +input of an interface and determining which packets to transmit in what order +at what rate on the output of an interface. + + In the overwhelming majority of situations, traffic control consists of a +single queue which collects entering packets and dequeues them as quickly as +the hardware (or underlying device) can accept them. This sort of queue is a +FIFO. + +Note The default qdisc under Linux is the pfifo_fast, which is slightly more + complex than the FIFO. + + There are examples of queues in all sorts of software. The queue is a way +of organizing the pending tasks or data (see also Section 2.5). Because +network links typically carry data in a serialized fashion, a queue is +required to manage the outbound data packets. + + In the case of a desktop machine and an efficient webserver sharing the +same uplink to the Internet, the following contention for bandwidth may +occur. 
The web server may be able to fill up the output queue on the router +faster than the data can be transmitted across the link, at which point the +router starts to drop packets (its buffer is full!). Now, the desktop machine +(with an interactive application user) may be faced with packet loss and high +latency. Note that high latency sometimes leads to screaming users! By +separating the internal queues used to service these two different classes of +application, there can be better sharing of the network resource between the +two applications. + + Traffic control is the set of tools which allows the user to have granular +control over these queues and the queuing mechanisms of a networked device. +The power to rearrange traffic flows and packets with these tools is +tremendous and can be complicated, but is no substitute for adequate +bandwidth. + + The term Quality of Service (QoS) is often used as a synonym for traffic +control. +----------------------------------------------------------------------------- + +2.2. Why use it? + + Packet-switched networks differ from circuit based networks in one very +important regard. A packet-switched network itself is stateless. A +circuit-based network (such as a telephone network) must hold state within +the network. IP networks are stateless and packet-switched networks by +design; in fact, this statelessness is one of the fundamental strengths of +IP. + + The weakness of this statelessness is the lack of differentiation between +types of flows. In simplest terms, traffic control allows an administrator to +queue packets differently based on attributes of the packet. It can even be +used to simulate the behaviour of a circuit-based network. This introduces +statefulness into the stateless network. + + There are many practical reasons to consider traffic control, and many +scenarios in which using traffic control makes sense. 
Below are some examples
+of common problems which can be solved or at least ameliorated with these
+tools.
+
+ The list below is not an exhaustive list of the sorts of solutions
+available to users of traffic control, but introduces the types of problems
+that can be solved by using traffic control to maximize the usability of a
+network connection.
+
+Common traffic control solutions
+
+  *  Limit total bandwidth to a known rate; TBF, HTB with child class(es).
+
+  *  Limit the bandwidth of a particular user, service or client; HTB
+ classes and classifying with a filter.
+
+  *  Maximize TCP throughput on an asymmetric link; prioritize transmission
+ of ACK packets, wondershaper.
+
+  *  Reserve bandwidth for a particular application or user; HTB with
+ child classes and classifying.
+
+  *  Prefer latency sensitive traffic; PRIO inside an HTB class.
+
+  *  Manage oversubscribed bandwidth; HTB with borrowing.
+
+  *  Allow equitable distribution of unreserved bandwidth; HTB with
+ borrowing.
+
+  *  Ensure that a particular type of traffic is dropped; policer attached
+ to a filter with a drop action.
+
+
+ Remember, too, that sometimes it is simply better to purchase more
+bandwidth. Traffic control does not solve all problems!
+
+
+-----------------------------------------------------------------------------
+
+2.3. Advantages
+
+ When properly employed, traffic control should lead to more predictable
+usage of network resources and less volatile contention for these resources.
+The network then meets the goals of the traffic control configuration. Bulk
+download traffic can be allocated a reasonable amount of bandwidth even as
+higher priority interactive traffic is simultaneously serviced. Even low
+priority data transfer such as mail can be allocated bandwidth without
+tremendously affecting the other classes of traffic.
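The first solution in the list of common traffic control solutions above, limiting total bandwidth to a known rate with TBF, can be sketched with the tc utility. The device name and rates are examples only, and root privileges plus sched_tbf support in the kernel are assumed:

```shell
# Cap all outbound traffic on eth0 (example device) at 512 kbit/s
# with the Token Bucket Filter qdisc.
tc qdisc add dev eth0 root tbf rate 512kbit burst 16kb latency 50ms

# Inspect the qdisc and its statistics.
tc -s qdisc show dev eth0

# Remove the qdisc to restore the default behaviour.
tc qdisc del dev eth0 root
```

The burst parameter controls how much traffic may be sent at once when tokens have accumulated, and latency bounds how long a packet may wait in the queue before being dropped.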
+
+ In a larger picture, if the traffic control configuration represents policy
+which has been communicated to the users, then users (and, by extension,
+applications) know what to expect from the network.
+
+
+-----------------------------------------------------------------------------
+
+2.4. Disadvantages
+
+ Complexity is easily one of the most significant disadvantages of using
+traffic control. There are ways to become familiar with traffic control tools
+which ease the learning curve about traffic control and its mechanisms, but
+identifying a traffic control misconfiguration can be quite a challenge.
+
+ Traffic control, when used appropriately, can lead to more equitable
+distribution of network resources. It can just as easily be installed in an
+inappropriate manner, leading to further and more divisive contention for
+resources.
+
+ The computing resources required on a router to support a traffic control
+scenario need to be capable of handling the increased cost of maintaining the
+traffic control structures. Fortunately, this is a small incremental cost,
+but it can become more significant as the configuration grows in size and
+complexity.
+
+ For personal use, there's no training cost associated with the use of
+traffic control, but a company may find that purchasing more bandwidth is a
+simpler solution than employing traffic control. Training employees and
+ensuring depth of knowledge may be more costly than investing in more
+bandwidth.
+
+ Although traffic control on packet-switched networks covers a larger
+conceptual area, you can think of traffic control as a way to provide [some
+of] the statefulness of a circuit-based network to a packet-switched network.
+-----------------------------------------------------------------------------
+
+2.5. Queues
+
+ Queues form the backdrop for all of traffic control and are the integral
+concept behind scheduling. 
A queue is a location (or buffer) containing a
+finite number of items waiting for an action or service. In networking, a
+queue is the place where packets (our units) wait to be transmitted by the
+hardware (the service). In the simplest model, packets are transmitted on a
+first-come, first-served basis [2]. In the discipline of computer networking
+(and more generally computer science), this sort of queue is known as a
+FIFO.
+
+ Without any other mechanisms, a queue doesn't offer any promise for traffic
+control. There are only two interesting actions in a queue. Anything entering
+a queue is enqueued into the queue. To remove an item from a queue is to
+dequeue that item.
+
+ A queue becomes much more interesting when coupled with other mechanisms
+which can delay, rearrange, drop and prioritize packets in multiple
+queues. A queue can also use subqueues, which allow for complexity of
+behaviour in a scheduling operation.
+
+ From the perspective of the higher layer software, a packet is simply
+enqueued for transmission, and the manner and order in which the enqueued
+packets are transmitted is immaterial to the higher layer. So, to the higher
+layer, the entire traffic control system may appear as a single queue [3]. It
+is only by examining the internals of this layer that the traffic control
+structures become exposed and available.
+-----------------------------------------------------------------------------
+
+2.6. Flows
+
+ A flow is a distinct connection or conversation between two hosts. Any
+unique set of packets between two hosts can be regarded as a flow. Under TCP
+the concept of a connection with a source IP and port and destination IP and
+port represents a flow. A UDP flow can be similarly defined.
+
+ Traffic control mechanisms frequently separate traffic into classes of
+flows which can be aggregated and transmitted as an aggregated flow (consider
+DiffServ). 
Alternate mechanisms may attempt to divide bandwidth equally based
+on the individual flows.
+
+ Flows become important when attempting to divide bandwidth equally among a
+set of competing flows, especially when some applications deliberately build
+a large number of flows.
+-----------------------------------------------------------------------------
+
+2.7. Tokens and buckets
+
+ Two of the key underpinnings of shaping mechanisms are the interrelated
+concepts of tokens and buckets.
+
+ In order to control the rate of dequeuing, an implementation can count the
+number of packets or bytes dequeued as each item is dequeued, although this
+requires complex usage of timers and measurements to limit accurately.
+Instead of calculating the current usage and time, one method, used widely in
+traffic control, is to generate tokens at a desired rate, and only dequeue
+packets or bytes if a token is available.
+
+ Consider the analogy of an amusement park ride with a queue of people
+waiting to experience the ride. Let's imagine carts traversing a fixed
+track. The carts arrive at the head of the queue at a fixed rate. In
+order to enjoy the ride, each person must wait for an available cart. The
+cart is analogous to a token and the person is analogous to a packet. Again,
+this mechanism is a rate-limiting or shaping mechanism. Only a certain number
+of people can experience the ride in a particular period.
+
+ To extend the analogy, imagine an empty line for the amusement park ride
+and a large number of carts sitting on the track ready to carry people. If a
+large number of people entered the line together, many (maybe all) of them
+could experience the ride because of the carts available and waiting. The
+number of carts available is a concept analogous to the bucket. A bucket
+contains a number of tokens and can use all of the tokens in the bucket
+without regard for passage of time. 
+ + And to complete the analogy, the carts on the amusement park ride (our +tokens) arrive at a fixed rate and are only kept available up to the size of +the bucket. So, the bucket is filled with tokens according to the rate, and +if the tokens are not used, the bucket can fill up. If tokens are used the +bucket will not fill up. Buckets are a key concept in supporting bursty +traffic such as HTTP. + + The TBF qdisc is a classical example of a shaper (the section on TBF +includes a diagram which may help to visualize the token and bucket +concepts). The TBF generates rate tokens and only transmits packets when a +token is available. Tokens are a generic shaping concept. + + In the case that a queue does not need tokens immediately, the tokens can +be collected until they are needed. To collect tokens indefinitely would +negate any benefit of shaping so tokens are collected until a certain number +of tokens has been reached. Now, the queue has tokens available for a large +number of packets or bytes which need to be dequeued. These intangible tokens +are stored in an intangible bucket, and the number of tokens that can be +stored depends on the size of the bucket. + + This also means that a bucket full of tokens may be available at any +instant. Very predictable regular traffic can be handled by small buckets. +Larger buckets may be required for burstier traffic, unless one of the +desired goals is to reduce the burstiness of the flows. + + In summary, tokens are generated at rate, and a maximum of a bucket's worth +of tokens may be collected. This allows bursty traffic to be handled, while +smoothing and shaping the transmitted traffic. + + The concepts of tokens and buckets are closely interrelated and are used in +both TBF (one of the classless qdiscs) and HTB (one of the classful qdiscs). +Within the tcng language, the use of two- and three-color meters is +indubitably a token and bucket concept. 
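 The fill-and-spend behaviour described above can be sketched in a few lines
of Python. This is a simplified illustrative model, not the kernel's
implementation; the class and method names are invented for this sketch.

```python
class TokenBucket:
    """Toy model of a token bucket: tokens accrue at `rate` per second,
    capped at `bucket_size`; a packet conforms only if enough tokens
    have accumulated to cover its size."""

    def __init__(self, rate, bucket_size):
        self.rate = rate                  # tokens generated per second
        self.bucket_size = bucket_size    # maximum tokens the bucket holds
        self.tokens = bucket_size         # start with a full bucket
        self.last = 0.0                   # timestamp of the last update

    def conforms(self, size, now):
        # Refill: tokens arrive at `rate` but never exceed the bucket size.
        self.tokens = min(self.bucket_size,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size           # spend tokens: packet may go out
            return True
        return False                      # insufficient tokens: must wait
```

A full bucket absorbs an initial burst; afterwards, packets conform only as
fast as the rate generates new tokens, which is exactly the smoothing and
burst-handling behaviour described above.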
+
+-----------------------------------------------------------------------------
+
+2.8. Packets and frames
+
+ The term for data sent across a network changes depending on the layer the
+user is examining. This document will rather impolitely (and incorrectly)
+gloss over the technical distinction between packets and frames although they
+are outlined here.
+
+ The word frame is typically used to describe a layer 2 (data link) unit of
+data to be forwarded to the next recipient. Ethernet interfaces, PPP
+interfaces, and T1 interfaces all name their layer 2 data unit a frame. The
+frame is actually the unit on which traffic control is performed.
+
+ A packet, on the other hand, is a higher layer concept, representing layer
+3 (network) units. The term packet is preferred in this documentation,
+although it is slightly inaccurate.
+-----------------------------------------------------------------------------
+
+3. Traditional Elements of Traffic Control
+
+
+-----------------------------------------------------------------------------
+
+3.1. Shaping
+
+ Shapers delay packets to meet a desired rate.
+
+ Shaping is the mechanism by which packets are delayed before transmission
+in an output queue to meet a desired output rate. This is one of the most
+common desires of users seeking bandwidth control solutions. The act of
+delaying a packet as part of a traffic control solution makes every shaping
+mechanism into a non-work-conserving mechanism, meaning roughly: "Work is
+required in order to delay packets."
+
+ Viewed in reverse, a non-work-conserving queuing mechanism is performing a
+shaping function. A work-conserving queuing mechanism (see PRIO) would not be
+capable of delaying a packet.
+
+ Shapers attempt to limit or ration traffic to meet but not exceed a
+configured rate (frequently measured in packets per second or bits/bytes per
+second). As a side effect, shapers can smooth out bursty traffic [4]. 
One of
+the advantages of shaping bandwidth is the ability to control latency of
+packets. The underlying mechanism for shaping to a rate is typically a token
+and bucket mechanism. See also Section 2.7 for further detail on tokens and
+buckets.
+-----------------------------------------------------------------------------
+
+3.2. Scheduling
+
+ Schedulers arrange and/or rearrange packets for output.
+
+ Scheduling is the mechanism by which packets are arranged (or rearranged)
+between input and output of a particular queue. By far the most
+common scheduler is the FIFO (first-in first-out) scheduler. From a larger
+perspective, any set of traffic control mechanisms on an output queue can be
+regarded as a scheduler, because packets are arranged for output.
+
+ Other generic scheduling mechanisms attempt to compensate for various
+networking conditions. A fair queuing algorithm (see SFQ) attempts to prevent
+any single client or flow from dominating the network usage. A round-robin
+algorithm (see WRR) gives each flow or client a turn to dequeue packets.
+Other sophisticated scheduling algorithms attempt to prevent backbone
+overload (see GRED) or refine other scheduling mechanisms (see ESFQ).
+-----------------------------------------------------------------------------
+
+3.3. Classifying
+
+ Classifiers sort or separate traffic into queues.
+
+ Classifying is the mechanism by which packets are separated for different
+treatment, possibly different output queues. During the process of accepting,
+routing and transmitting a packet, a networking device can classify the
+packet a number of different ways. Classification can include marking the
+packet, which usually happens on the boundary of a network under a single
+administrative control, or classification can occur on each hop individually. 
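 The round-robin scheduling mentioned in Section 3.2 can be illustrated with
a toy model: each flow gets its own FIFO, and the scheduler dequeues one
packet from each backlogged flow per pass. This is only a sketch of the idea
behind WRR and SFQ's internal FIFOs; the function name and data layout are
invented.

```python
def round_robin_dequeue(flows):
    """`flows` maps a flow id to a list of queued packets (per-flow FIFOs).
    Dequeue one packet per backlogged flow in turn, so that no single
    flow can monopolize the output."""
    out = []
    while any(flows.values()):
        for queue in flows.values():
            if queue:
                out.append(queue.pop(0))   # one packet per flow per pass
    return out
```

Even when one flow has many packets queued and another has only one, the
single-packet flow is served on the first pass rather than waiting behind
the backlog.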
+
+ The Linux model (see Section 4.3) allows for a packet to cascade across a
+series of classifiers in a traffic control structure and to be classified in
+conjunction with policers (see also Section 4.5).
+-----------------------------------------------------------------------------
+
+3.4. Policing
+
+ Policers measure and limit traffic in a particular queue.
+
+ Policing, as an element of traffic control, is simply a mechanism by which
+traffic can be limited. Policing is most frequently used on the network
+border to ensure that a peer is not consuming more than its allocated
+bandwidth. A policer will accept traffic to a certain rate, and then perform
+an action on traffic exceeding this rate. A rather harsh solution is to drop
+the traffic, although the traffic could be reclassified instead of being
+dropped.
+
+ A policer answers a yes/no question about the rate at which traffic is
+entering a queue. If the packet is about to enter a queue below a given
+rate, take one action (allow the enqueuing). If the packet is about to enter
+a queue above a given rate, take another action. Although the policer uses a
+token bucket mechanism internally, it does not have the capability to delay
+a packet as a shaping mechanism does.
+-----------------------------------------------------------------------------
+
+3.5. Dropping
+
+ Dropping discards an entire packet, flow or classification.
+
+ Dropping a packet is a mechanism by which a packet is discarded.
+
+
+-----------------------------------------------------------------------------
+
+3.6. Marking
+
+ Marking is a mechanism by which the packet is altered.
+
+Note This is not fwmark. The iptables target MARK and the ipchains --mark
+     are used to modify packet metadata, not the packet itself.
+
+ Traffic control marking mechanisms install a DSCP on the packet itself,
+which is then used and respected by other routers inside an administrative
+domain (usually for DiffServ). 
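 Since marking installs a DSCP on the packet itself, it may help to recall
how the six-bit DSCP field sits in the former IPv4 TOS byte: the DSCP
occupies the upper six bits, so the on-the-wire byte is the DSCP shifted
left by two. A small sketch (the helper name is invented for illustration):

```python
def dscp_to_tos(dscp):
    """The DSCP is a six-bit field occupying the upper bits of the old
    IPv4 TOS byte, so the byte seen on the wire is dscp << 2."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a six-bit field (0-63)")
    return dscp << 2

# Expedited Forwarding (EF) is DSCP 46; as a TOS byte this is 0xb8.
```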
+----------------------------------------------------------------------------- + +4. Components of Linux Traffic Control + + + + + + + + +Table 1. Correlation between traffic control elements and Linux components ++-------------------+-------------------------------------------------------+ +|traditional element|Linux component | ++-------------------+-------------------------------------------------------+ +|shaping |The class offers shaping capabilities. | ++-------------------+-------------------------------------------------------+ +|scheduling |A qdisc is a scheduler. Schedulers can be simple such | +| |as the FIFO or complex, containing classes and other | +| |qdiscs, such as HTB. | ++-------------------+-------------------------------------------------------+ +|classifying |The filter object performs the classification through | +| |the agency of a classifier object. Strictly speaking, | +| |Linux classifiers cannot exist outside of a filter. | ++-------------------+-------------------------------------------------------+ +|policing |A policer exists in the Linux traffic control | +| |implementation only as part of a filter. | ++-------------------+-------------------------------------------------------+ +|dropping |To drop traffic requires a filter with a policer which | +| |uses "drop" as an action. | ++-------------------+-------------------------------------------------------+ +|marking |The dsmark qdisc is used for marking. | ++-------------------+-------------------------------------------------------+ +----------------------------------------------------------------------------- + +4.1. qdisc + + Simply put, a qdisc is a scheduler (Section 3.2). Every output interface +needs a scheduler of some kind, and the default scheduler is a FIFO. Other +qdiscs available under Linux will rearrange the packets entering the +scheduler's queue in accordance with that scheduler's rules. 
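 As a toy illustration of the simplest scheduler, consider a packet FIFO
with a finite limit, modelled loosely on the behaviour of the pfifo
described later: packets beyond the limit are tail-dropped on enqueue, and
dequeue returns packets in arrival order. This is a sketch only; the class
name is invented and the kernel's implementation differs.

```python
from collections import deque

class PFifo:
    """Toy model of a packet FIFO with a finite limit (tail-drop)."""

    def __init__(self, limit):
        self.limit = limit        # maximum queue length, in packets
        self.queue = deque()
        self.dropped = 0          # count of tail-dropped packets

    def enqueue(self, packet):
        if len(self.queue) >= self.limit:
            self.dropped += 1     # queue full: drop the arriving packet
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        # Packets leave in arrival order; None when the queue is empty.
        return self.queue.popleft() if self.queue else None
```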
+
+ The qdisc is the major building block on which all of Linux traffic control
+is built, and is also called a queuing discipline.
+
+ The classful qdiscs can contain classes, and provide a handle to which to
+attach filters. There is no prohibition on using a classful qdisc without
+child classes, although this will usually consume cycles and other system
+resources for no benefit.
+
+ The classless qdiscs can contain no classes, nor is it possible to attach
+a filter to a classless qdisc. Because a classless qdisc contains no children
+of any kind, there is no utility in classifying, so no filter can be attached
+to a classless qdisc.
+
+ A source of terminology confusion is the usage of the terms root qdisc and
+ingress qdisc. These are not really queuing disciplines, but rather locations
+onto which traffic control structures can be attached for egress (outbound
+traffic) and ingress (inbound traffic).
+
+ Each interface contains both. The primary and more common one is the egress
+qdisc, known as the root qdisc. It can contain any of the queuing disciplines
+(qdiscs) with potential classes and class structures. The overwhelming
+majority of documentation applies to the root qdisc and its children. Traffic
+transmitted on an interface traverses the egress or root qdisc.
+
+ For traffic accepted on an interface, the ingress qdisc is traversed. With
+its limited utility, it allows no child class to be created, and only exists
+as an object onto which a filter can be attached. For practical purposes, the
+ingress qdisc is merely a convenient object onto which to attach a policer to
+limit the amount of traffic accepted on a network interface.
+
+ In short, you can do much more with an egress qdisc because it contains a
+real qdisc and the full power of the traffic control system. An ingress qdisc
+can only support a policer. 
The remainder of the documentation will concern
+itself with traffic control structures attached to the root qdisc unless
+otherwise specified.
+-----------------------------------------------------------------------------
+
+4.2. class
+
+ Classes only exist inside a classful qdisc (e.g., HTB and CBQ). Classes are
+immensely flexible and can always contain either multiple child classes or
+a single child qdisc [5]. There is no prohibition against a class containing
+a classful qdisc itself, which facilitates tremendously complex traffic
+control scenarios.
+
+ Any class can also have an arbitrary number of filters attached to it,
+which allows the selection of a child class or the use of a filter to
+reclassify or drop traffic entering a particular class.
+
+ A leaf class is a terminal class in a qdisc. It contains a qdisc (default
+FIFO) and will never contain a child class. Any class which contains a child
+class is an inner class (or root class) and not a leaf class.
+-----------------------------------------------------------------------------
+
+4.3. filter
+
+ The filter is the most complex component in the Linux traffic control
+system. The filter provides a convenient mechanism for gluing together
+several of the key elements of traffic control. The simplest and most obvious
+role of the filter is to classify (see Section 3.3) packets. Linux filters
+allow the user to classify packets into an output queue with either several
+different filters or a single filter.
+
+  *  A filter must contain a classifier phrase.
+
+  *  A filter may contain a policer phrase.
+
+
+ Filters can be attached either to classful qdiscs or to classes; however,
+the enqueued packet always enters the root qdisc first. After the filter
+attached to the root qdisc has been traversed, the packet may be directed to
+any subclasses (which can have their own filters) where the packet may
+undergo further classification. 
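 The first-match behaviour of an ordered list of filters can be sketched as
follows. This is a toy model only; the predicates and the class handles
(1:6, 1:7) are invented for illustration.

```python
def classify(packet, filters, default_classid):
    """Walk the ordered filter list; the first matching predicate decides
    the target class, mirroring a packet cascading across filters."""
    for match, classid in filters:
        if match(packet):
            return classid
    return default_classid

# Hypothetical filters: interactive SSH to one class, bulk web to another.
filters = [
    (lambda p: p.get("dport") == 22, "1:6"),
    (lambda p: p.get("dport") == 80, "1:7"),
]
```

A packet matching no filter falls through to the default class, just as
unclassified traffic lands in a qdisc's default class.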
+
+
+-----------------------------------------------------------------------------
+
+4.4. classifier
+
+ Filter objects, which can be manipulated using tc, can use several
+different classifying mechanisms, the most common of which is the u32
+classifier. The u32 classifier allows the user to select packets based on
+attributes of the packet.
+
+ The classifiers are tools which can be used as part of a filter to identify
+characteristics of a packet or a packet's metadata. The Linux classifier
+object is a direct analogue to the basic operation and elemental mechanism of
+traffic control classifying.
+-----------------------------------------------------------------------------
+
+4.5. policer
+
+ This elemental mechanism is only used in Linux traffic control as part of a
+filter. A policer calls one action above and another action below the
+specified rate. Clever use of policers can simulate a three-color meter. See
+also Section 10.
+
+ Although both policing and shaping are basic elements of traffic control
+for limiting bandwidth usage, a policer will never delay traffic. It can only
+perform an action based on specified criteria. See also Example 5.
+
+
+-----------------------------------------------------------------------------
+
+4.6. drop
+
+ This basic traffic control mechanism is only used in Linux traffic control
+as part of a policer. Any policer attached to any filter could have a drop
+action.
+
+Note The only place in the Linux traffic control system where a packet can be
+     explicitly dropped is a policer. A policer can limit packets enqueued at
+     a specific rate, or it can be configured to drop all traffic matching a
+     particular pattern [6].
+
+ There are, however, places within the traffic control system where a packet
+may be dropped as a side effect. For example, a packet will be dropped if the
+scheduler employed uses this method to control flows, as the GRED does. 
+ + Also, a shaper or scheduler which runs out of its allocated buffer space +may have to drop a packet during a particularly bursty or overloaded period. + + +----------------------------------------------------------------------------- + +4.7. handle + + Every class and classful qdisc (see also Section 7) requires a unique +identifier within the traffic control structure. This unique identifier is +known as a handle and has two constituent members, a major number and a minor +number. These numbers can be assigned arbitrarily by the user in accordance +with the following rules [7]. + + + +The numbering of handles for classes and qdiscs + +major + This parameter is completely free of meaning to the kernel. The user + may use an arbitrary numbering scheme, however all objects in the traffic + control structure with the same parent must share a major handle number. + Conventional numbering schemes start at 1 for objects attached directly + to the root qdisc. + +minor + This parameter unambiguously identifies the object as a qdisc if minor + is 0. Any other value identifies the object as a class. All classes + sharing a parent must have unique minor numbers. + + + The special handle ffff:0 is reserved for the ingress qdisc. + + The handle is used as the target in classid and flowid phrases of tc filter +statements. These handles are external identifiers for the objects, usable by +userland applications. The kernel maintains internal identifiers for each +object. +----------------------------------------------------------------------------- + +5. Software and Tools + + +----------------------------------------------------------------------------- + +5.1. Kernel requirements + + Many distributions provide kernels with modular or monolithic support for +traffic control (Quality of Service). Custom kernels may not already provide +support (modular or not) for the required features. If not, this is a very +brief listing of the required kernel options. 
+
+ The user who has little or no experience compiling a kernel is referred
+to the Kernel HOWTO. Experienced kernel compilers should be able to determine
+which of the below options apply to the desired configuration, after reading
+a bit more about traffic control and planning.
+
+
+Example 1. Kernel compilation options [8]
+#
+# QoS and/or fair queueing
+#
+CONFIG_NET_SCHED=y
+CONFIG_NET_SCH_CBQ=m
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_CSZ=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_RED=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TEQL=m
+CONFIG_NET_SCH_TBF=m
+CONFIG_NET_SCH_GRED=m
+CONFIG_NET_SCH_DSMARK=m
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_QOS=y
+CONFIG_NET_ESTIMATOR=y
+CONFIG_NET_CLS=y
+CONFIG_NET_CLS_TCINDEX=m
+CONFIG_NET_CLS_ROUTE4=m
+CONFIG_NET_CLS_ROUTE=y
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+CONFIG_NET_CLS_RSVP=m
+CONFIG_NET_CLS_RSVP6=m
+CONFIG_NET_CLS_POLICE=y
+
+
+ A kernel compiled with the above set of options will provide modular
+support for almost everything discussed in this documentation. The user may
+need to modprobe a module before using a given feature. Again, the confused
+user is referred to the Kernel HOWTO, as this document cannot adequately
+address questions about the use of the Linux kernel.
+-----------------------------------------------------------------------------
+
+5.2. iproute2 tools (tc)
+
+ iproute2 is a suite of command line utilities which manipulate kernel
+structures for IP networking configuration on a machine. For technical
+documentation on these tools, see the iproute2 documentation, and for a more
+expository discussion, the documentation at [http://linux-ip.net/]
+linux-ip.net. Of the tools in the iproute2 package, the binary tc is the only
+one used for traffic control. This HOWTO will ignore the other tools in the
+suite. 
+ + + Because it interacts with the kernel to direct the creation, deletion and +modification of traffic control structures, the tc binary needs to be +compiled with support for all of the qdiscs you wish to use. In particular, +the HTB qdisc is not supported yet in the upstream iproute2 package. See +Section 7.1 for more information. + + The tc tool performs all of the configuration of the kernel structures +required to support traffic control. As a result of its many uses, the +command syntax can be described (at best) as arcane. The utility takes as its +first non-option argument one of three Linux traffic control components, +qdisc, class or filter. + + +Example 2. tc command usage +[root@leander]# tc +Usage: tc [ OPTIONS ] OBJECT { COMMAND | help } +where OBJECT := { qdisc | class | filter } + OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] } + + + Each object accepts further and different options, and will be incompletely +described and documented below. The hints in the examples below are designed +to introduce the vagaries of tc command line syntax. For more examples, +consult the [http://lartc.org/howto/] LARTC HOWTO. For even better +understanding, consult the kernel and iproute2 code. + + +Example 3. tc qdisc +[root@leander]# tc qdisc add \ (1) +> dev eth0 \ (2) +> root \ (3) +> handle 1:0 \ (4) +> htb (5) + + +(1) Add a queuing discipline. The verb could also be del. +(2) Specify the device onto which we are attaching the new queuing + discipline. +(3) This means "egress" to tc. The word root must be used, however. Another + qdisc with limited functionality, the ingress qdisc can be attached to + the same device. +(4) The handle is a user-specified number of the form major:minor. The + minor number for any queueing discipline handle must always be zero (0). + An acceptable shorthand for a qdisc handle is the syntax "1:", where the + minor number is assumed to be zero (0) if not specified. 
+(5) This is the queuing discipline to attach, HTB in this example. Queuing
+    discipline specific parameters will follow this. In the example here, we
+    add no qdisc-specific parameters.
+
+ Above was the simplest use of the tc utility for adding a queuing
+discipline to a device. Here's an example of the use of tc to add a class to
+an existing parent class.
+
+
+Example 4. tc class
+[root@leander]# tc class add \ (1)
+> dev eth0 \ (2)
+> parent 1:1 \ (3)
+> classid 1:6 \ (4)
+> htb \ (5)
+> rate 256kbit \ (6)
+> ceil 512kbit (7)
+
+
+(1) Add a class. The verb could also be del.
+(2) Specify the device onto which we are attaching the new class.
+(3) Specify the parent handle to which we are attaching the new class.
+(4) This is a unique handle (major:minor) identifying this class. The minor
+    number must be a non-zero number.
+(5) Both of the classful qdiscs require that any child classes be
+    classes of the same type as the parent. Thus an HTB qdisc will contain
+    HTB classes.
+(6) (7)
+    This is a class specific parameter. Consult Section 7.1 for more detail
+    on these parameters.
+
+
+
+
+Example 5. tc filter
+[root@leander]# tc filter add \ (1)
+> dev eth0 \ (2)
+> parent 1:0 \ (3)
+> protocol ip \ (4)
+> prio 5 \ (5)
+> u32 \ (6)
+> match ip port 22 0xffff \ (7)
+> match ip tos 0x10 0xff \ (8)
+> flowid 1:6 \ (9)
+> police \ (10)
+> rate 32000bps \ (11)
+> burst 10240 \ (12)
+> mpu 0 \ (13)
+> action drop/continue (14)
+
+
+(1) Add a filter. The verb could also be del.
+(2) Specify the device onto which we are attaching the new filter.
+(3) Specify the parent handle to which we are attaching the new filter.
+(4) This parameter is required. Its use should be obvious, although I
+    don't know more.
+(5) The prio parameter allows a given filter to be preferred above another.
+    The pref is a synonym.
+(6) This is a classifier, and is a required phrase in every tc filter
+    command.
+(7) (8)
+    These are parameters to the classifier. 
In this case, packets with a
+    type of service flag (indicating interactive usage) and matching port 22
+    will be selected by this statement.
+(9) The flowid specifies the handle of the target class (or qdisc) to which
+    a matching filter should send its selected packets.
+(10)
+    This is the policer, and is an optional phrase in every tc filter
+    command.
+(11) The policer will perform one action above this rate, and another action
+    below (see action parameter).
+(12) The burst is an exact analog to burst in HTB (burst is a bucket
+    concept).
+(13) The minimum policed unit. To count all traffic, use an mpu of zero (0).
+(14) The action indicates what should be done based on whether the rate of
+    the policer has been exceeded. The first word specifies the action to
+    take if the policer has been exceeded. The second word specifies the
+    action to take otherwise.
+
+ As evidenced above, the tc command line utility has an arcane and complex
+syntax, even for simple operations such as these examples show. It should
+come as no surprise to the reader that there exists an easier way to
+configure Linux traffic control. See the next section, Section 5.3.
+-----------------------------------------------------------------------------
+
+5.3. tcng, Traffic Control Next Generation
+
+ FIXME; sing the praises of tcng. See also [http://tldp.org/HOWTO/
+Traffic-Control-tcng-HTB-HOWTO/] Traffic Control using tcng and HTB HOWTO
+and tcng documentation.
+
+ Traffic control next generation (hereafter, tcng) provides all of the power
+of traffic control under Linux with twenty percent of the headache.
+
+
+-----------------------------------------------------------------------------
+
+5.4. IMQ, Intermediate Queuing device
+
+ FIXME; must discuss IMQ. See also Patrick McHardy's website on [http://
+trash.net/~kaber/imq/] IMQ.
+
+
+-----------------------------------------------------------------------------
+
+6. 
Classless Queuing Disciplines (qdiscs)
+
+ Each of these queuing disciplines can be used as the primary qdisc on an
+interface, or can be used inside a leaf class of a classful qdisc. These are
+the fundamental schedulers used under Linux. Note that the default scheduler
+is the pfifo_fast.
+
+
+-----------------------------------------------------------------------------
+
+6.1. FIFO, First-In First-Out (pfifo and bfifo)
+
+Note This is not the default qdisc on Linux interfaces. Be certain to see
+     Section 6.2 for the full details on the default (pfifo_fast) qdisc.
+
+ The FIFO algorithm forms the basis for the default qdisc on all Linux
+network interfaces (pfifo_fast). It performs no shaping or rearranging of
+packets. It simply transmits packets as soon as it can after receiving and
+queuing them. This is also the qdisc used inside all newly created classes
+until another qdisc or a class replaces the FIFO.
+
+[fifo-qdisc]
+
+ A real FIFO qdisc must, however, have a size limit (a buffer size) to
+prevent it from overflowing in case it is unable to dequeue packets as
+quickly as it receives them. Linux implements two basic FIFO qdiscs, one
+based on bytes, and one on packets. Regardless of the type of FIFO used, the
+size of the queue is defined by the parameter limit. For a pfifo the unit is
+understood to be packets and for a bfifo the unit is understood to be bytes.
+
+
+Example 6. 
Specifying a limit for a packet or byte FIFO +[root@leander]# cat bfifo.tcc +/* + * make a FIFO on eth0 with 10kbyte queue size + * + */ + +dev eth0 { + egress { + fifo (limit 10kB ); + } +} +[root@leander]# tcc < bfifo.tcc +# ================================ Device eth0 ================================ + +tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 +tc qdisc add dev eth0 handle 2:0 parent 1:0 bfifo limit 10240 +[root@leander]# cat pfifo.tcc +/* + * make a FIFO on eth0 with 30 packet queue size + * + */ + +dev eth0 { + egress { + fifo (limit 30p ); + } +} +[root@leander]# tcc < pfifo.tcc +# ================================ Device eth0 ================================ + +tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 +tc qdisc add dev eth0 handle 2:0 parent 1:0 pfifo limit 30 + +----------------------------------------------------------------------------- + +6.2. pfifo_fast, the default Linux qdisc + + The pfifo_fast qdisc is the default qdisc for all interfaces under Linux. +Based on a conventional FIFO qdisc, this qdisc also provides some +prioritization. It provides three different bands (individual FIFOs) for +separating traffic. The highest priority traffic (interactive flows) is +placed into band 0 and is always serviced first. Similarly, band 1 is always +emptied of pending packets before band 2 is dequeued. + +[pfifo_fast-qdisc] + + There is nothing about the pfifo_fast qdisc which is configurable by the +end user. For exact details on the priomap and use of the ToS bits, see the +pfifo-fast section of the LARTC HOWTO. +----------------------------------------------------------------------------- + +6.3. SFQ, Stochastic Fair Queuing + + The SFQ qdisc attempts to fairly distribute opportunity to transmit data to +the network among an arbitrary number of flows. 
It accomplishes this by using +a hash function to separate the traffic into separate (internally maintained) +FIFOs which are dequeued in a round-robin fashion. Because there is the +possibility for unfairness to manifest in the choice of hash function, this +function is altered periodically. Perturbation (the parameter perturb) sets +this periodicity. + +[sfq-qdisc] + + +Example 7. Creating an SFQ +[root@leander]# cat sfq.tcc +/* + * make an SFQ on eth0 with a 10 second perturbation + * + */ + +dev eth0 { + egress { + sfq( perturb 10s ); + } +} +[root@leander]# tcc < sfq.tcc +# ================================ Device eth0 ================================ + +tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 +tc qdisc add dev eth0 handle 2:0 parent 1:0 sfq perturb 10 + + + Unfortunately, some clever software (e.g. Kazaa and eMule among others) +obliterates the benefit of this attempt at fair queuing by opening as many TCP +sessions (flows) as can be sustained. In many networks, with well-behaved +users, SFQ can adequately distribute the network resources to the contending +flows, but other measures may be called for when obnoxious applications have +invaded the network. + + See also Section 6.4 for an SFQ qdisc with more exposed parameters for the +user to manipulate. +----------------------------------------------------------------------------- + +6.4. ESFQ, Extended Stochastic Fair Queuing + + Conceptually, this qdisc is no different from SFQ, although it allows the +user to control more parameters than its simpler cousin. This qdisc was +conceived to overcome the shortcoming of SFQ identified above. By allowing +the user to control which hashing algorithm is used for distributing access +to network bandwidth, it is possible for the user to reach a fairer real +distribution of bandwidth. + + +Example 8. ESFQ usage +Usage: ... 
esfq [ perturb SECS ] [ quantum BYTES ] [ depth FLOWS ] + [ divisor HASHBITS ] [ limit PKTS ] [ hash HASHTYPE] + +Where: +HASHTYPE := { classic | src | dst } + + + FIXME; need practical experience and/or attestation here. +----------------------------------------------------------------------------- + +6.5. GRED, Generic Random Early Drop + + FIXME; I have never used this. Need practical experience or attestation. + + Theory declares that a RED algorithm is useful on a backbone or core +network, but not as useful near the end-user. See the section on flows for +a general discussion of the thirstiness of TCP. +----------------------------------------------------------------------------- + +6.6. TBF, Token Bucket Filter + + This qdisc is built on tokens and buckets. It simply shapes traffic +transmitted on an interface. To limit the speed at which packets will be +dequeued from a particular interface, the TBF qdisc is the perfect solution. +It simply slows down transmitted traffic to the specified rate. + + Packets are only transmitted if there are sufficient tokens available. +Otherwise, packets are deferred. Delaying packets in this fashion will +introduce an artificial latency into the packet's round trip time. + +[tbf-qdisc] + + +Example 9. Creating a 256kbit/s TBF +[root@leander]# cat tbf.tcc +/* + * make a 256kbit/s TBF on eth0 + * + */ + +dev eth0 { + egress { + tbf( rate 256 kbps, burst 20 kB, limit 20 kB, mtu 1514 B ); + } +} +[root@leander]# tcc < tbf.tcc +# ================================ Device eth0 ================================ + +tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0 +tc qdisc add dev eth0 handle 2:0 parent 1:0 tbf burst 20480 limit 20480 mtu 1514 rate 32000bps + + + +----------------------------------------------------------------------------- + +7. Classful Queuing Disciplines (qdiscs) + + The flexibility and control of Linux traffic control can be unleashed +through the agency of the classful qdiscs. 
Remember that the classful queuing +disciplines can have filters attached to them, allowing packets to be +directed to particular classes and subqueues. + + There are several common terms to describe classes directly attached to the +root qdisc and terminal classes. Classes attached to the root qdisc are +known as root classes, and more generically inner classes. Any terminal class +in a particular queuing discipline is known as a leaf class by analogy to the +tree structure of the classes. Besides the use of figurative language +depicting the structure as a tree, the language of family relationships is +also quite common. +----------------------------------------------------------------------------- + +7.1. HTB, Hierarchical Token Bucket + + HTB uses the concepts of tokens and buckets along with the class-based +system and filters to allow for complex and granular control over traffic. +With a complex borrowing model, HTB can perform a variety of sophisticated +traffic control techniques. One of the easiest ways to use HTB immediately is +that of shaping. + + For anyone who understands tokens and buckets or has grasped the function +of TBF, HTB should be merely a logical step. This queuing discipline allows +the user to define the characteristics of the tokens and buckets used and +allows the user to nest these buckets in an arbitrary fashion. When coupled +with a classifying scheme, traffic can be controlled in a very granular +fashion. + + + + Below is example output of the syntax for HTB on the command line with the +tc tool. Although the syntax for tcng is a language of its own, the rules for +HTB are the same. + + +Example 10. tc usage for HTB +Usage: ... qdisc add ... htb [default N] [r2q N] + default minor id of class to which unclassified packets are sent {0} + r2q DRR quantums are computed as rate in Bps/r2q {10} + debug string of 16 numbers each 0-3 {0} + +... class add ... 
htb rate R1 burst B1 [prio P] [slot S] [pslot PS] + [ceil R2] [cburst B2] [mtu MTU] [quantum Q] + rate rate allocated to this class (class can still borrow) + burst max bytes burst which can be accumulated during idle period {computed} + ceil definite upper class rate (no borrows) {rate} + cburst burst but for ceil {computed} + mtu max packet size we create rate map for {1600} + prio priority of leaf; lower are served first {0} + quantum how much bytes to serve from leaf at once {use r2q} + +TC HTB version 3.3 + + + +----------------------------------------------------------------------------- + +7.1.1. Software requirements + + Unlike almost all of the other software discussed, HTB is a newer queuing +discipline and your distribution may not have all of the tools and capability +you need to use HTB. The kernel must support HTB; kernel version 2.4.20 and +later support it in the stock distribution, although earlier kernel versions +require patching. To enable userland support for HTB, see [http:// +luxik.cdi.cz/~devik/qos/htb/] HTB for an iproute2 patch to tc. +----------------------------------------------------------------------------- + +7.1.2. Shaping + + One of the most common applications of HTB involves shaping transmitted +traffic to a specific rate. + + All shaping occurs in leaf classes. No shaping occurs in inner or root +classes as they only exist to suggest how the borrowing model should +distribute available tokens. + + + + +----------------------------------------------------------------------------- + +7.1.3. Borrowing + + A fundamental part of the HTB qdisc is the borrowing mechanism. Children +classes borrow tokens from their parents once they have exceeded rate. A +child class will continue to attempt to borrow until it reaches ceil, at +which point it will begin to queue packets for transmission until more tokens +/ctokens are available. 
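The borrowing mechanism described here can be sketched with a few raw tc commands (a minimal illustration only; the device name, rates and class IDs below are arbitrary choices, not a recommended configuration):

```shell
# Root HTB qdisc; unclassified traffic falls into class 1:20 (default 20).
tc qdisc add dev eth0 root handle 1: htb default 20

# Parent (root) class capping the link; it only lends tokens, it does not shape.
tc class add dev eth0 parent 1: classid 1:1 htb rate 1mbit ceil 1mbit

# Two leaf classes, each guaranteed 512kbit; either may borrow its
# sibling's idle tokens up to the parent's 1mbit ceiling.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 512kbit ceil 1mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 512kbit ceil 1mbit
```

While a leaf is below its rate it dequeues freely; between rate and ceil it must borrow from class 1:1; above ceil it queues packets until enough tokens/ctokens accumulate.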
As there are only two primary types of classes which +can be created with HTB the following table and diagram identify the various +possible states and the behaviour of the borrowing mechanisms. + + + + +Table 2. HTB class states and potential actions taken ++------+-----+--------------+-----------------------------------------------+ +|type |class|HTB internal |action taken | +|of |state|state | | +|class | | | | ++------+-----+--------------+-----------------------------------------------+ +|leaf |< |HTB_CAN_SEND |Leaf class will dequeue queued bytes up to | +| |rate | |available tokens (no more than burst packets) | ++------+-----+--------------+-----------------------------------------------+ +|leaf |> |HTB_MAY_BORROW|Leaf class will attempt to borrow tokens/ | +| |rate,| |ctokens from parent class. If tokens are | +| |< | |available, they will be lent in quantum | +| |ceil | |increments and the leaf class will dequeue up | +| | | |to cburst bytes | ++------+-----+--------------+-----------------------------------------------+ +|leaf |> |HTB_CANT_SEND |No packets will be dequeued. This will cause | +| |ceil | |packet delay and will increase latency to meet | +| | | |the desired rate. | ++------+-----+--------------+-----------------------------------------------+ +|inner,|< |HTB_CAN_SEND |Inner class will lend tokens to children. | +|root |rate | | | ++------+-----+--------------+-----------------------------------------------+ +|inner,|> |HTB_MAY_BORROW|Inner class will attempt to borrow tokens/ | +|root |rate,| |ctokens from parent class, lending them to | +| |< | |competing children in quantum increments per | +| |ceil | |request. | ++------+-----+--------------+-----------------------------------------------+ +|inner,|> |HTB_CANT_SEND |Inner class will not attempt to borrow from its| +|root |ceil | |parent and will not lend tokens/ctokens to | +| | | |children classes. 
| ++------+-----+--------------+-----------------------------------------------+ + + This diagram identifies the flow of borrowed tokens and the manner in which +tokens are charged to parent classes. In order for the borrowing model to +work, each class must have an accurate count of the number of tokens used by +itself and all of its children. For this reason, any token used in a child or +leaf class is charged to each parent class until the root class is reached. + + Any child class which wishes to borrow a token will request a token from +its parent class, which, if it is also over its rate, will in turn request to +borrow from its own parent class until either a token is located or the root +class is reached. So the borrowing of tokens flows toward the leaf classes and +the charging of the usage of tokens flows toward the root class. + +[htb-borrow] + + Note in this diagram that there are several HTB root classes. Each of these +root classes can simulate a virtual circuit. +----------------------------------------------------------------------------- + +7.1.4. HTB class parameters + + + +default + An optional parameter with every HTB qdisc object, the default default + is 0, which causes any unclassified traffic to be dequeued at hardware + speed, completely bypassing any of the classes attached to the root + qdisc. + +rate + Used to set the minimum desired speed to which to limit transmitted + traffic. This can be considered the equivalent of a committed information + rate (CIR), or the guaranteed bandwidth for a given leaf class. + +ceil + Used to set the maximum desired speed to which to limit the transmitted + traffic. The borrowing model should illustrate how this parameter is + used. This can be considered the equivalent of "burstable bandwidth". + +burst + This is the size of the rate bucket (see Tokens and buckets). HTB will + dequeue burst bytes before awaiting the arrival of more tokens. + +cburst + This is the size of the ceil bucket (see Tokens and buckets). 
HTB will + dequeue cburst bytes before awaiting the arrival of more ctokens. + +quantum + This is a key parameter used by HTB to control borrowing. Normally, the + correct quantum is calculated by HTB, not specified by the user. Tweaking + this parameter can have tremendous effects on borrowing and shaping under + contention, because it is used both to split traffic between children + classes over rate (but below ceil) and to transmit packets from these + same classes. + +r2q + Also usually calculated for the user, r2q is a hint to HTB to help + determine the optimal quantum for a particular class. + +mtu + + +prio + + + + +----------------------------------------------------------------------------- + +7.1.5. Rules + + Below are some general guidelines to using HTB culled from [http:// +docum.org/] http://docum.org/ and the LARTC mailing list. These rules are +simply a recommendation for beginners to maximize the benefit of HTB until +gaining a better understanding of the practical application of HTB. + + + +  *  Shaping with HTB occurs only in leaf classes. See also Section 7.1.2. + +  *  Because HTB does not shape in any class except the leaf class, the sum + of the rates of leaf classes should not exceed the ceil of a parent + class. Ideally, the sum of the rates of the children classes would match + the rate of the parent class, allowing the parent class to distribute + leftover bandwidth (ceil - rate) among the children classes. + + This key concept in employing HTB bears repeating. Only leaf classes + actually shape packets; packets are only delayed in these leaf classes. + The inner classes (all the way up to the root class) exist to define how + borrowing/lending occurs (see also Section 7.1.3). + +  *  The quantum is only used when a class is over rate but below ceil. + +  *  The quantum should be set at MTU or higher. HTB will dequeue a single + packet at least per service opportunity even if quantum is too small. 
In + such a case, it will not be able to calculate accurately the real + bandwidth consumed [9]. + +  *  Parent classes lend tokens to children in increments of quantum, so for + maximum granularity and most instantaneously evenly distributed + bandwidth, quantum should be as low as possible while still no less than + MTU. + +  *  A distinction between tokens and ctokens is only meaningful in a leaf + class, because non-leaf classes only lend tokens to child classes. + +  *  HTB borrowing could more accurately be described as "using". + + + +----------------------------------------------------------------------------- + +7.2. PRIO, priority scheduler + + The PRIO classful qdisc works on a very simple precept. When it is ready to +dequeue a packet, the first class is checked for a packet. If there's a +packet, it gets dequeued. If there's no packet, then the next class is +checked, until the queuing mechanism has no more classes to check. + + This section will be completed at a later date. +----------------------------------------------------------------------------- + +7.3. CBQ, Class Based Queuing + + CBQ is the classic implementation (also called venerable) of a traffic +control system. This section will be completed at a later date. + + +----------------------------------------------------------------------------- + +8. Rules, Guidelines and Approaches + + +----------------------------------------------------------------------------- + +8.1. General Rules of Linux Traffic Control + + There are a few general rules which ease the study of Linux traffic +control. Traffic control structures under Linux are the same whether the +initial configuration has been done with tcng or with tc. + +  *  Any router performing a shaping function should be the bottleneck on + the link, and should be shaping slightly below the maximum available link + bandwidth. 
This prevents queues from forming in other routers, affording + maximum control of packet latency/deferral to the shaping device. + +  *  A device can only shape traffic it transmits [10]. Because the traffic + has already been received on an input interface, the traffic cannot be + shaped. A traditional solution to this problem is an ingress policer. + +  *  Every interface must have a qdisc. The default qdisc (the pfifo_fast + qdisc) is used when another qdisc is not explicitly attached to the + interface. + +  *  One of the classful qdiscs added to an interface with no children + classes typically only consumes CPU for no benefit. + +  *  Any newly created class contains a FIFO. This qdisc can be replaced + explicitly with any other qdisc. The FIFO qdisc will be removed + implicitly if a child class is attached to this class. + +  *  Classes directly attached to the root qdisc can be used to simulate + virtual circuits. + +  *  A filter can be attached to classes or one of the classful qdiscs. + + + + + + + + + +----------------------------------------------------------------------------- + +8.2. Handling a link with a known bandwidth + + HTB is an ideal qdisc to use on a link with a known bandwidth, because the +innermost (root-most) class can be set to the maximum bandwidth available on +a given link. Flows can be further subdivided into children classes, allowing +either guaranteed bandwidth to particular classes of traffic or allowing +preference to specific kinds of traffic. + + + + +----------------------------------------------------------------------------- + +8.3. Handling a link with a variable (or unknown) bandwidth + + In theory, the PRIO scheduler is an ideal match for links with variable +bandwidth, because it is a work-conserving qdisc (which means that it +provides no shaping). 
In the case of a link with an unknown or fluctuating +bandwidth, the PRIO scheduler simply prefers to dequeue any available packet +in the highest priority band first, then falls back to the lower priority +queues. + + + + +----------------------------------------------------------------------------- + +8.4. Sharing/splitting bandwidth based on flows + + Of the many types of contention for network bandwidth, this is one of the +easier types of contention to address in general. By using the SFQ qdisc, +traffic in a particular queue can be separated into flows, each of which will +be serviced fairly (inside that queue). Well-behaved applications (and users) +will find that SFQ and ESFQ are sufficient for most sharing needs. + + The Achilles heel of these fair queuing algorithms is a misbehaving user or +application which opens many connections simultaneously (e.g., eMule, +eDonkey, Kazaa). By creating a large number of individual flows, the +application can dominate slots in the fair queuing algorithm. Restated, the +fair queuing algorithm has no idea that a single application is generating +the majority of the flows, and cannot penalize the user. Other methods are +called for. + + +----------------------------------------------------------------------------- + +8.5. Sharing/splitting bandwidth based on IP + + For many administrators this is the ideal method of dividing bandwidth +amongst their users. Unfortunately, there is no easy solution, and it becomes +increasingly complex with the number of machines sharing a network link. + + To divide bandwidth equitably between N IP addresses, there must be N +classes. + + +----------------------------------------------------------------------------- + +9. Scripts for use with QoS/Traffic Control + + + + + + +----------------------------------------------------------------------------- + +9.1. wondershaper + + More to come, see [http://lartc.org/wondershaper/] wondershaper. 
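Until this section is fleshed out, the general shape of such a shaping script can be sketched (a hypothetical fragment only, not the actual wondershaper; the interface, rates and the ToS match are illustrative assumptions):

```shell
#!/bin/sh
# Hypothetical shaping script in the spirit of wondershaper; the interface,
# rates and classification rule below are assumptions, not the real script.
DEV=eth0
UPLINK=220   # kbit/s, set slightly below the real uplink (see Section 8.1)

tc qdisc del dev $DEV root 2>/dev/null    # clear any existing root qdisc

# HTB root; unclassified traffic goes to the bulk class (1:20).
tc qdisc add dev $DEV root handle 1: htb default 20
tc class add dev $DEV parent 1: classid 1:1 htb rate ${UPLINK}kbit

# Interactive class gets priority; both leaves may borrow up to the ceiling.
tc class add dev $DEV parent 1:1 classid 1:10 htb rate $((UPLINK/3))kbit \
    ceil ${UPLINK}kbit prio 1
tc class add dev $DEV parent 1:1 classid 1:20 htb rate $((UPLINK*2/3))kbit \
    ceil ${UPLINK}kbit prio 2

# SFQ inside each leaf so competing flows share their class fairly.
tc qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev $DEV parent 1:20 handle 20: sfq perturb 10

# Steer minimum-delay (interactive) traffic into the high-priority class.
tc filter add dev $DEV parent 1: protocol ip prio 10 u32 \
    match ip tos 0x10 0xff flowid 1:10
```

The script must run as root, and the rates only shape transmitted traffic on $DEV, as Section 8.1 explains.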
+----------------------------------------------------------------------------- + +9.2. ADSL Bandwidth HOWTO script (myshaper) + + More to come, see [http://www.tldp.org/HOWTO/ +ADSL-Bandwidth-Management-HOWTO/implementation.html] myshaper. +----------------------------------------------------------------------------- + +9.3. htb.init + + More to come, see htb.init. +----------------------------------------------------------------------------- + +9.4. tcng.init + + More to come, see tcng.init. +----------------------------------------------------------------------------- + +9.5. cbq.init + + More to come, see cbq.init. +----------------------------------------------------------------------------- + +10. Diagram + + + + +----------------------------------------------------------------------------- + +10.1. General diagram + + Below is a general diagram of the relationships of the components of a +classful queuing discipline (HTB pictured). A larger version of the diagram +is [http://linux-ip.net/traffic-control/htb-class.png] available. + + + + +Example 11. An example HTB tcng configuration +/* + * + * possible mock up of diagram shown at + * http://linux-ip.net/traffic-control/htb-class.png + * + */ + +$m_web = trTCM ( + cir 512 kbps, /* committed information rate */ + cbs 10 kB, /* burst for CIR */ + pir 1024 kbps, /* peak information rate */ + pbs 10 kB /* burst for PIR */ + ) ; + +dev eth0 { + egress { + + class ( <$web> ) if tcp_dport == PORT_HTTP && __trTCM_green( $m_web ); + class ( <$bulk> ) if tcp_dport == PORT_HTTP && __trTCM_yellow( $m_web ); + drop if __trTCM_red( $m_web ); + class ( <$bulk> ) if tcp_dport == PORT_SSH ; + + htb () { /* root qdisc */ + + class ( rate 1544kbps, ceil 1544kbps ) { /* root class */ + + $web = class ( rate 512kbps, ceil 512kbps ) { sfq ; } ; + $bulk = class ( rate 512kbps, ceil 1544kbps ) { sfq ; } ; + + } + } + } +} + + +[htb-class] + + +----------------------------------------------------------------------------- + +11. 
Annotated Traffic Control Links + + This section identifies a number of links to documentation about traffic +control and Linux traffic control software. Each link will be listed with a +brief description of the content at that site. + +  *  HTB site, HTB user guide and HTB theory (Martin "devik" Devera) + + Hierarchical Token Bucket, HTB, is a classful queuing discipline. + Widely used and supported, it is also fairly well documented in the user + guide and at [http://www.docum.org/] Stef Coene's site (see below). + +  *  General Quality of Service docs (Leonardo Balliache) + + + There is a good deal of understandable and introductory documentation on + his site, including some excellent overview material. See in particular + the detailed [http://opalsoft.net/qos/DS.htm] Linux QoS document among + others. +  *  tcng (Traffic Control Next Generation) and tcng manual (Werner + Almesberger) + + The tcng software includes a language and a set of tools for creating + and testing traffic control structures. In addition to generating tc + commands as output, it is also capable of providing output for non-Linux + applications. A key piece of the tcng suite which is ignored in this + documentation is the tcsim traffic control simulator. + + The user manual provided with the tcng software has been converted to + HTML with latex2html. The distribution comes with the TeX documentation. + +  *  iproute2 and iproute2 manual (Alexey Kuznetsov) + + This is the source code for the iproute2 suite, which includes the + essential tc binary. Note that, as of + iproute2-2.4.7-now-ss020116-try.tar.gz, the package did not support HTB, + so a patch available from the [http://luxik.cdi.cz/~devik/qos/htb/] HTB + site will be required. + + The manual documents the entire suite of tools, although the tc utility + is not adequately documented here. The ambitious reader is recommended to + the LARTC HOWTO after consuming this introduction. 
+ +  *  Documentation, graphs, scripts and guidelines to traffic control under + Linux (Stef Coene) + + Stef Coene has been gathering statistics and test results, scripts and + tips for the use of QoS under Linux. There are some particularly useful + graphs and guidelines available for implementing traffic control at + Stef's site. + +  *  [http://lartc.org/howto/] LARTC HOWTO (bert hubert, et al.) + + The Linux Advanced Routing and Traffic Control HOWTO is one of the key + sources of data about the sophisticated techniques which are available + for use under Linux. The Traffic Control Introduction HOWTO should + provide the reader with enough background in the language and concepts of + traffic control. The LARTC HOWTO is the next place the reader should look + for general traffic control information. + +  *  Guide to IP Networking with Linux (Martin A. Brown) + + Not directly related to traffic control, this site includes articles + and general documentation on the behaviour of the Linux IP layer. + +  *  Werner Almesberger's Papers + + Werner Almesberger is one of the main developers and champions of + traffic control under Linux (he's also the author of tcng, above). One of + the key documents describing the entire traffic control architecture of + the Linux kernel is his Linux Traffic Control - Implementation Overview + which is available in [http://www.almesberger.net/cv/papers/tcio8.pdf] + PDF or [http://www.almesberger.net/cv/papers/tcio8.ps.gz] PS format. + +  *  Linux DiffServ project + + Mercilessly snipped from the main page of the DiffServ site... + + Differentiated Services (short: Diffserv) is an architecture for + providing different types or levels of service for network traffic. + One key characteristic of Diffserv is that flows are aggregated in + the network, so that core routers only need to distinguish a + comparably small number of aggregated flows, even if those flows + contain thousands or millions of individual flows. 
+ + +Notes + +[1] See Section 5 for more details on the use or installation of a + particular traffic control mechanism, kernel or command line utility. +[2] This queueing model has long been used in civilized countries to + distribute scant food or provisions equitably. William Faulkner is + reputed to have walked to the front of the line to fetch his share + of ice, proving that not everybody likes the FIFO model, and providing + us a model for considering priority queuing. +[3] Similarly, the entire traffic control system appears as a queue or + scheduler to the higher layer which is enqueuing packets into this + layer. +[4] This smoothing effect is not always desirable, hence the HTB parameters + burst and cburst. +[5] A classful qdisc can only have children classes of its type. For + example, an HTB qdisc can only have HTB classes as children. A CBQ qdisc + cannot have HTB classes as children. +[6] In this case, you'll have a filter which uses a classifier to select the + packets you wish to drop. Then you'll use a policer with a drop + action like this: police rate 1bps burst 1 action drop/drop. +[7] I do not know the range nor base of these numbers. I believe they are + u32 hexadecimal, but need to confirm this. +[8] The options listed in this example are taken from a 2.4.20 kernel source + tree. The exact options may differ slightly from kernel release to + kernel release depending on patches and new schedulers and classifiers. +[9] HTB will report bandwidth usage in this scenario incorrectly. It will + calculate the bandwidth used by quantum instead of the real dequeued + packet size. This can skew results quickly. +[10] In fact, the Intermediate Queuing Device (IMQ) simulates an output + device onto which traffic control structures can be attached. This + clever solution allows a networking device to shape ingress traffic in + the same fashion as egress traffic. 
Despite the apparent contradiction + of the rule, IMQ appears as a device to the kernel. Thus, there has been + no violation of the rule, but rather a sneaky reinterpretation of that + rule. + +ProxyARP Subnetting HOWTO + +Bob Edwards + + Robert.Edwards@anu.edu.au + + v2.0, 27 August 2000 + + This HOWTO discusses using Proxy Address Resolution Protocol (ARP) + with subnetting in order to make a small network of machines visible + on another Internet Protocol (IP) subnet (I call it sub-subnetting). + This makes all the machines on the local network (network 0 from now + on) appear as if they are connected to the main network (network 1). + + This is only relevant if all machines are connected by Ethernet or + ether devices (ie. it won't work for SLIP/PPP/CSLIP etc.) + _________________________________________________________________ + + Table of Contents + 1. [1]Acknowledgements + 2. [2]Why use Proxy ARP with subnetting? + 3. [3]How Proxy ARP with subnetting works + 4. [4]Setting up Proxy ARP with subnetting + 5. [5]Other alternatives to Proxy ARP with subnetting + 6. [6]Other Applications of Proxy ARP with subnetting + 7. [7]Copying conditions + +1. Acknowledgements + + This document, and my Proxy ARP implementation, would not have been + possible without the help of: + + * Andrew Tridgell, who implemented the subnetting options for arp in + Linux, and who personally assisted me in getting it working + * the Proxy-ARP mini-HOWTO, by Al Longyear + * the Multiple-Ethernet mini-HOWTO, by Don Becker + * the arp(8) source code and man page by Fred N. van Kempen and + Bernd Eckenfels + _________________________________________________________________ + +2. Why use Proxy ARP with subnetting? + + The applications for using Proxy ARP with subnetting are fairly + specific. + + In my case, I had a wireless Ethernet card that plugs into an 8-bit + ISA slot. I wanted to use this card to provide connectivity for a + number of machines at once. 
Being an ISA card, I could use it on a + Linux machine, after I had written an appropriate device driver for it + - this is the subject of another document. From here, it was only + necessary to add a second Ethernet interface to the Linux machine and + then use some mechanism to join the two networks together. + + For the purposes of discussion, let network 0 be the local Ethernet + connected to the Linux box via an NE-2000 clone Ethernet interface on + eth0. Network 1 is the main network connected via the wireless + Ethernet card on eth1. Machine A is the Linux box with both + interfaces. Machine B is any TCP/IP machine on network 0 and machine C + is likewise on network 1. + + Normally, to provide the connectivity, I would have done one of the + following: + + * Used the IP-Bridge software (see the Bridge mini-HOWTO) to bridge + the traffic between the two network interfaces. Unfortunately, the + wireless Ethernet interface cannot be put into "Promiscuous" mode + (ie. it can't see all packets on network 1). This is mainly due to + the lower bandwidth of the wireless Ethernet (2MBit/sec) meaning + that we don't want to carry any traffic not specifically destined + to another wireless Ethernet machine - in our case machine A - or + broadcasts. Also, bridging is rather CPU intensive! + * Alternatively, use subnets and an IP-router to pass packets + between the two networks (see the IP-Subnetworking mini-HOWTO). + This is a protocol specific solution, where the Linux kernel can + handle the Internet Protocol (IP) packets, but other protocols + (such as AppleTalk) need extra software to route. This also + requires the allocation of a new IP subnet (network) number, which + is not always an option. + + In my case, getting a new subnet (network) number was not an option, + so I wanted a solution that allowed all the machines on network 0 to + appear as if they were on network 1. This is where Proxy ARP comes in. 
+   Other solutions are used to connect other (non-IP) protocols, such as
+   netatalk to provide AppleTalk routing.
+   _________________________________________________________________
+
+3. How Proxy ARP with subnetting works
+
+   The Proxy ARP is actually only used to get packets from network 1 to
+   network 0. To get packets back the other way, the normal IP routing
+   functionality is employed.
+
+   In my case, network 1 has an 8-bit host part (netmask 255.255.255.0).
+   I have chosen a 4-bit host part for network 0 (netmask
+   255.255.255.240), allowing 14 IP nodes on network 0 (2 ^ 4 = 16, less
+   two for the all zeros and all ones cases). Note that any size of
+   subnet mask up to, but not including, the size of the mask of the
+   other network is allowable here (eg. 2, 3, 4, 5, 6 or 7 bits in this
+   case - for one bit, just use normal Proxy ARP!)
+
+   All the IP numbers for network 0 (16 in total) appear in network 1 as
+   a subset. Note that it is very important, in this case, not to allow
+   any machine connected directly to network 1 to have an IP number in
+   this range! In my case, I have "reserved" the IP numbers of network 1
+   ending in 64 .. 79 for network 0. In this case, the IP numbers ending
+   in 64 and 79 can't actually be used by nodes - 64 is the network
+   address and 79 is the broadcast address for network 0.
+
+   Machine A is allocated two IP numbers, one within the network 0 range
+   for its real Ethernet interface (eth0) and the other within the
+   network 1 range, but outside of the network 0 range, for the wireless
+   Ethernet interface (eth1).
+
+   Say machine C (on network 1) wants to send a packet to machine B (on
+   network 0). Because the IP number of machine B makes it look to
+   machine C as though it is on the same physical network, machine C will
+   use the Address Resolution Protocol (ARP) to send a broadcast message
+   on network 1 requesting the machine with the IP number of machine B to
+   respond with its hardware (Ethernet or MAC layer) address.
Machine B
+   won't see this request, as it isn't actually on network 1, but machine
+   A, on both networks, will see it.
+
+   The first bit of magic now happens as the Linux kernel arp code on
+   machine A, with a properly configured Proxy ARP with subnetting entry,
+   determines that the ARP request has come in on the network 1 interface
+   (eth1) and that the IP number being ARP'd for is in the subnet range
+   for network 0. Machine A then sends its own hardware (Ethernet)
+   address back to machine C as an ARP response packet.
+
+   Machine C then updates its ARP cache with an entry for machine B, but
+   with the hardware (Ethernet) address of machine A (in this case, the
+   wireless Ethernet interface). Machine C can now send the packet for
+   machine B to this hardware (Ethernet) address, and machine A receives
+   it.
+
+   Machine A notices that the destination IP number in the packet is that
+   of machine B, not itself. Machine A's Linux kernel IP routing code
+   attempts to forward the packet to machine B by looking at its routing
+   tables to determine which interface contains the network number for
+   machine B. However, the IP number for machine B is valid for both the
+   network 0 interface (eth0), and for the network 1 interface (eth1).
+
+   At this point, something else clever happens. Because the subnet mask
+   for the network 0 interface has more 1 bits (it is more specific) than
+   the subnet mask for the network 1 interface, the Linux kernel routing
+   code will match the IP number for machine B to the network 0
+   interface, and not keep looking for the potential match with the
+   network 1 interface (the one the packet came in on).
+
+   Now machine A needs to find out the "real" hardware (Ethernet) address
+   for machine B (assuming that it doesn't already have it in the ARP
+   cache).
Machine A uses an ARP request, but this time the Linux kernel
+   arp code notes that the request isn't coming from the network 1
+   interface (eth1), and so doesn't respond with the Proxy address of
+   eth1. Instead, it sends the ARP request on the network 0 interface
+   (eth0), where machine B will see it and respond with its own (real)
+   hardware (Ethernet) address. Now machine A can send the packet (from
+   machine C) on to machine B.
+
+   Machine B gets the packet from machine C (via machine A) and then
+   wants to send back a response. This time, machine B notices that
+   machine C is on a different subnet (machine B's subnet mask of
+   255.255.255.240 excludes all machines not in the network 0 IP address
+   range). Machine B is set up with a "default" route to machine A's
+   network 0 IP number and sends the packet to machine A. This time,
+   machine A's Linux kernel routing code determines the destination IP
+   number (of machine C) as being on network 1 and sends the packet on to
+   machine C via Ethernet interface eth1.
+
+   Similar (less complicated) things occur for packets originating from
+   and destined to machine A from other machines on either of the two
+   networks.
+
+   Similarly, it should be obvious that if another machine (D) on network
+   0 ARPs for machine B, machine A will receive the ARP request on its
+   network 0 interface (eth0) and won't respond to the request, as it is
+   set up to only Proxy on its network 1 interface (eth1).
+
+   Note also that machines B and C (and D) are not required to do
+   anything unusual, IP-wise. In my case, there is a mixture of Suns,
+   Macs and PC/Windoze 95 machines on network 0, all connecting through
+   Linux machine A to the rest of the world.
+
+   Finally, note that once the hardware (Ethernet) addresses are
+   discovered by each of machines A, B, C (and D), they are placed in the
+   ARP cache and subsequent packet transfers occur without the ARP
+   overhead.
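The two decisions machine A makes in this exchange (whether to answer an ARP request as a proxy, and which interface to forward out of) can be sketched in Python. The addresses are hypothetical (192.0.2.0/24 standing in for network 1 and 192.0.2.64/28 for network 0), and `should_proxy_reply` and `route_iface` are illustrative helpers, not real kernel interfaces:

```python
import ipaddress

# Hypothetical addressing: network 1 is 192.0.2.0/24 (eth1, wireless),
# the sub-subnet for network 0 is 192.0.2.64/28 (eth0, local Ethernet).
NET0 = ipaddress.ip_network("192.0.2.64/28")
NET1 = ipaddress.ip_network("192.0.2.0/24")

def should_proxy_reply(target_ip: str, request_iface: str) -> bool:
    """Answer an ARP request with our own MAC only if it arrived on
    eth1 and the target IP masks into the network 0 range, i.e.
    (target & netmask) == (network & netmask)."""
    return request_iface == "eth1" and ipaddress.ip_address(target_ip) in NET0

def route_iface(dest_ip: str) -> str:
    """Pick the interface whose route has the most 1 bits in its
    netmask (the most specific match), as the kernel routing code does."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [(net.prefixlen, iface)
               for net, iface in [(NET0, "eth0"), (NET1, "eth1")]
               if addr in net]
    return max(matches)[1]

# Machine B (say 192.0.2.66) is proxied for on eth1 only, and packets
# to it leave via eth0 even though its address also matches network 1:
print(should_proxy_reply("192.0.2.66", "eth1"))  # True
print(should_proxy_reply("192.0.2.66", "eth0"))  # False
print(route_iface("192.0.2.66"))   # eth0 (/28 wins over /24)
print(route_iface("192.0.2.130"))  # eth1
```

The same masking test is what the `netmask`/`pub` arp entry and the routing table encode in the setup section.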
The ARP caches normally expire entries after 5 minutes of
+   non-activity.
+   _________________________________________________________________
+
+4. Setting up Proxy ARP with subnetting
+
+   I set up Proxy ARP with subnetting on a Linux kernel version 2.0.30
+   machine, but I am told that the code works right back to some kernel
+   version in the 1.2.x era.
+
+   The first thing to note is that the ARP code is in two parts: the part
+   inside the kernel that sends and receives ARP requests and responses
+   and updates the ARP cache etc.; and the other part, the arp(8)
+   command, which allows the super user to modify the ARP cache manually
+   and anyone to examine it.
+
+   The first problem I had was that the arp(8) command that came with my
+   Slackware 3.1 distribution was ancient (1994 era!!!) and didn't
+   communicate with the kernel arp code correctly at all (mainly
+   evidenced by the strange output that it gave for "arp -a").
+
+   The arp(8) command in "net-tools-1.33a", available from a variety of
+   places including (from the README file that came with it)
+   [8]ftp.linux.org.uk:/pub/linux/Networking/base/, works properly and
+   includes new man pages that explain stuff a lot better than the older
+   arp(8) man page.
+
+   Armed with a decent arp(8) command, all the changes I made were in the
+   /etc/rc.d/rc.inet1 script (for Slackware - probably different for
+   other flavours).
First of all, we need to change the broadcast + address, network number and netmask of eth0: + +NETMASK=255.255.255.240 # for a 4-bit host part +NETWORK=x.y.z.64 # our new network number (replace x.y.z with your net) +BROADCAST=x.y.z.79 # in my case + + Then a line needs to be added to configure the second Ethernet port + (after any module loading that might be required to load the driver + code): + +/sbin/ifconfig eth1 (name on net 1) broadcast (x.y.z.255) netmask 255.255.255.0 + + Then we add a route for the new interface: + +/sbin/route add -net (x.y.z.0) netmask 255.255.255.0 + + And you will probably need to change the default gateway to the one + for network 1. + + At this point, it is appropriate to add the Proxy ARP entry: + +/sbin/arp -i eth1 -Ds ${NETWORK} eth1 netmask ${NETMASK} pub + + This tells ARP to add a static entry (the s) to the cache for network + ${NETWORK}. The -D tells ARP to use the same hardware address as + interface eth1 (the second eth1), thus saving us from having to look + up the hardware address for eth1 and hardcoding it in. The netmask + option tells ARP that we want to use subnetting (ie. Proxy for all (IP + number) & ${NETMASK} == ${NETWORK} & ${NETMASK}). The pub option tells + ARP to publish this ARP entry, ie. it is a Proxy entry, so respond on + behalf of these IP numbers. The -i eth1 option tells ARP to only + respond to requests that come in on interface eth1. + + Hopefully, at this point, when the machine is rebooted, all the + machines on network 0 will appear to be on network 1. You can check + that the Proxy ARP with subnetting entry has been correctly installed + on machine A. 
On my machine (names changed to protect the innocent) it + is: + +bash$ /sbin/arp -an +Address HWtype HWaddress Flags Mask Iface +x.y.z.1 ether 00:00:0C:13:6F:17 C * eth1 +x.y.z.65 ether 00:40:05:49:77:01 C * eth0 +x.y.z.67 ether 08:00:20:0B:79:47 C * eth0 +x.y.z.5 ether 00:00:3B:80:18:E5 C * eth1 +x.y.z.64 ether 00:40:96:20:CD:D2 CMP 255.255.255.240 eth1 + + Alternatively, you can examine the /proc/net/arp file with eg. cat(1). + + The last line is the proxy entry for the subnet. The CMP flags + indicate that it is a static (Manually entered) entry and that it is + to be Published. The entry is only going to reply to ARP requests on + eth1 where the requested IP number, once masked, matches the network + number, also masked. Note that arp(8) has automatically determined the + hardware address of eth1 and inserted this for the address to use (the + -Ds option). + + Likewise, it is probably prudent to check that the routing table has + been set up correctly. Here is mine (again, the names are changed to + protect the innocent): + +#/bin/netstat -rn +Kernel routing table +Destination Gateway Genmask Flags Metric Ref Use Iface +x.y.z.64 0.0.0.0 255.255.255.240 U 0 0 71 eth0 +x.y.z.0 0.0.0.0 255.255.255.0 U 0 0 389 eth1 +127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 7 lo +0.0.0.0 x.y.z.1 0.0.0.0 UG 1 0 573 eth1 + + Alternatively, you can examine the /proc/net/route file with eg. + cat(1). + + Note that the first entry is a proper subset of the second, but the + routing table has ranked them in netmask order, so the eth0 entry will + be checked before the eth1 entry. + _________________________________________________________________ + +5. 
Other alternatives to Proxy ARP with subnetting
+
+   There are several other alternatives to using Proxy ARP with
+   subnetting in this situation, apart from the ones mentioned above
+   (bridging and straight routing):
+
+     * IP-Masquerading (see the IP-Masquerade mini-HOWTO), in which
+       network 0 is "hidden" behind machine A from the rest of the
+       Internet. As machines on network 0 attempt to connect outside
+       through machine A, it re-addresses the source address and port
+       number of the packets and makes them look like they are coming
+       from itself, rather than from the machine on the hidden network 0.
+       This is an elegant solution, although it prevents any machine on
+       network 1 from initiating a connection to any machine on network
+       0, as the machines on network 0 effectively don't exist outside of
+       network 0. This effectively increases security of the machines on
+       network 0, but it also means that servers on network 1 cannot
+       check the identity of clients on network 0 using IP numbers (eg.
+       NFS servers use IP hostnames for access to mountable file
+       systems).
+     * Another option is IP in IP tunneling, which isn't supported on all
+       platforms (such as Macs and Windoze machines), so I opted not to
+       go this way.
+     * Use Proxy ARP without subnetting. This is certainly possible; it
+       just means that a separate entry needs to be created for each
+       machine on network 0, instead of a single entry for all machines
+       (current and future) on network 0.
+     * Possibly IP Aliasing might also be useful here, but I haven't
+       looked into this at all.
+   _________________________________________________________________
+
+6. Other Applications of Proxy ARP with subnetting
+
+   There is only one other application that I know about that uses Proxy
+   ARP with subnetting, also here at the Australian National University.
+   It is the one that Andrew Tridgell originally wrote the subnetting
+   extensions to Proxy ARP for.
However, Andrew reliably informs me that + there are, in fact, several other sites around the world using it as + well (I don't have any details). + + The other A.N.U. application involves a teaching lab set up to teach + students how to configure machines to use TCP/IP, including setting up + the gateway. The network used is a Class C network, and Andrew needed + to "subnet" it for security, traffic control and the educational + reason mentioned above. He did this using Proxy ARP, and then decided + that a single entry in the ARP cache for the whole subnet would be + faster and cleaner than one for each host on the subnet. Voila...Proxy + ARP with subnetting! + _________________________________________________________________ + +7. Copying conditions + + Copyright 1997 by Bob Edwards <[9]Robert.Edwards@anu.edu.au> + + Voice: (+61) 2 6249 4090 + + Unless otherwise stated, Linux HOWTO documents are copyrighted by + their respective authors. Linux HOWTO documents may be reproduced and + distributed in whole or in part, in any medium physical or electronic, + as long as this copyright notice is retained on all copies. Commercial + redistribution is allowed and encouraged; however, the author would + like to be notified of any such distributions. All translations, + derivative works, or aggregate works incorporating any Linux HOWTO + documents must be covered under this copyright notice. That is, you + may not produce a derivative work from a HOWTO and impose additional + restrictions on its distribution. Exceptions to these rules may be + granted under certain conditions; please contact the Linux HOWTO + coordinator at the address given below. In short, we wish to promote + dissemination of this information through as many channels as + possible. However, we do wish to retain copyright on the HOWTO + documents, and would like to be notified of any plans to redistribute + the HOWTOs. 
If you have questions, please contact the Linux HOWTO
+   coordinator, at <[10]linux-howto@metalab.unc.edu> via email.
+
+References
+
+   1. Proxy-ARP-Subnet.html#INTRO
+   2. Proxy-ARP-Subnet.html#WHY
+   3. Proxy-ARP-Subnet.html#HOW
+   4. Proxy-ARP-Subnet.html#SETUP
+   5. Proxy-ARP-Subnet.html#ALTERNATIVES
+   6. Proxy-ARP-Subnet.html#APPLICATIONS
+   7. Proxy-ARP-Subnet.html#COPYING
+   8. ftp://ftp.linux.org.uk/pub/linux/Networking/base/
+   9. mailto:Robert.Edwards@anu.edu.au
+  10. mailto:linux-howto@metalab.unc.edu
+
+
+
+
+
+NTP
+
+
+Time synchronisation is generally considered important in the computing
+environment. There are a number of reasons why this is important: it makes
+sure your scheduled cron tasks on your various servers run well together,
+it allows better use of log files between various machines to help
+troubleshoot problems, and synchronised, correct logs are also useful if
+your servers are ever attacked by crackers (either to report the attempt
+to organisations such as AusCERT or in court to use against the bad guys).
+Users who have overclocked their machines might also use time synchronisation
+techniques to bring the time on their machines back to an accurate figure
+at regular intervals, say every 20 minutes or so. This section contains an
+overview of time keeping under Linux and some information about NTP, a
+protocol which can be used to accurately reset the time across a computer
+network.
+
+
+2. How Linux Keeps Track of Time
+
+2.1. Basic Strategies
+
+
+A Linux system actually has two clocks: One is the battery powered
+"Real Time Clock" (also known as the "RTC", "CMOS clock", or "Hardware
+clock") which keeps track of time when the system is turned off but is
+not used when the system is running. The other is the "system clock"
+(sometimes called the "kernel clock" or "software clock") which is a
+software counter based on the timer interrupt.
It does not exist when
+the system is not running, so it has to be initialized from the RTC
+(or some other time source) at boot time. References to "the clock" in
+the ntpd documentation refer to the system clock, not the RTC.
+
+
+
+The two clocks will drift at different rates, so they will gradually
+drift apart from each other, and also away from the "real" time. The
+simplest way to keep them on time is to measure their drift rates and
+apply correction factors in software. Since the RTC is only used when
+the system is not running, the correction factor is applied when the
+clock is read at boot time, using clock(8) or hwclock(8). The system
+clock is corrected by adjusting the rate at which the system time is
+advanced with each timer interrupt, using adjtimex(8).
+
+
+
+A crude alternative to adjtimex(8) is to have cron run clock(8) or
+hwclock(8) periodically to sync the system time to the (corrected)
+RTC. This was recommended in the clock(8) man page, and it works if
+you do it often enough that you don't cause large "jumps" in the
+system time, but adjtimex(8) is a more elegant solution. Some
+applications may complain if the time jumps backwards.
+
+
+
+The next step up in accuracy is to use a program like ntpd to read the
+time periodically from a network time server or radio clock, and
+continuously adjust the rate of the system clock so that the times
+always match, without causing sudden "jumps" in the system time. If
+you always have a network connection at boot time, you can ignore the
+RTC completely and use ntpdate (which comes with the ntpd package) to
+initialize the system clock from a time server -- either a local server
+on a LAN, or a remote server on the internet. But if you sometimes
+don't have a network connection, or if you need the time to be
+accurate during the boot sequence before the network is active, then
+you need to maintain the time in the RTC as well.
+
+
+2.2.
Potential Conflicts + + +It might seem obvious that if you're using a program like ntpd, you +would want to sync the RTC to the (corrected) system clock. But this +turns out to be a bad idea if the system is going to stay shut down +longer than a few minutes, because it interferes with the programs +that apply the correction factor to the RTC at boot time. + + + +If the system runs 24/7 and is always rebooted immediately whenever +it's shut down, then you can just set the RTC from the system clock +right before you reboot. The RTC won't drift enough to make a +difference in the time it takes to reboot, so you don't need to know +its drift rate. + + + +Of course the system may go down unexpectedly, so some versions of the +kernel sync the RTC to the system clock every 11 minutes if the system +clock has been adjusted by another program. The RTC won't drift enough +in 11 minutes to make any difference, but if the system is down long +enough for the RTC to drift significantly, then you have a problem: +the programs that apply the drift correction to the RTC need to know +*exactly* when it was last reset, and the kernel doesn't record that +information anywhere. + + + +Some unix "traditionalists" might wonder why anyone would run a linux +system less than 24/7, but some of us run dual-boot systems with +another OS running some of the time, or run Linux on laptops that have +to be shut down to conserve battery power when they're not being used. +Other people just don't like to leave machines running unattended for +long periods of time (even though we've heard all the arguments in +favor of it). So the "every 11 minutes" feature becomes a bug. + + + +This "feature/bug" appears to behave differently in different versions +of the kernel (and possibly in different versions of xntpd and ntpd as +well), so if you're running both ntpd and hwclock you may need to test +your system to see what it actually does. 
If you can't keep the kernel +from resetting the RTC, you might have to run without a correction +factor on the RTC. + + + +The part of the kernel that controls this can be found in +/usr/src/linux-2.0.34/arch/i386/kernel/time.c (where the version +number in the path will be the version of the kernel you're running). +If the variable time_status is set to TIME_OK then the kernel will +write the system time to the RTC every 11 minutes, otherwise it leaves +the RTC alone. Calls to adjtimex(2) (as used by ntpd and timed, for +example) may turn this on. Calls to settimeofday(2) will set +time_status to TIME_UNSYNC, which tells the kernel not to adjust the +RTC. I have not found any real documentation on this. + + + +I've heard reports that some versions of the kernel may have problems +with "sleep modes" that shut down the CPU to save energy. The best +solution is to keep your kernel up to date, and refer any problems to +the people who maintain the kernel. + + + +If you get bizarre results from the RTC you may have a hardware +problem. Some RTC chips include a lithium battery that can run down, +and some motherboards have an option for an external battery (be sure +the jumper is set correctly). The same battery maintains the CMOS RAM, +but the clock takes more power and is likely to fail first. Bizarre +results from the system clock may mean there is a problem with +interrupts. + + +2.3. Should the RTC use Local Time or UTC, and What About DST? + + +The Linux "system clock" actually just counts the number of seconds +past Jan 1, 1970, and is always in UTC (or GMT, which is technically +different but close enough that casual users tend to use both terms +interchangeably). UTC does not change as DST comes and goes-- what +changes is the conversion between UTC and local time. The translation +to local time is done by library functions that are linked into the +application programs. 
+ + + +This has two consequences: First, any application that needs to know +the local time also needs to know what time zone you're in, and +whether DST is in effect or not (see the next section for more on time +zones). Second, there is no provision in the kernel to change either +the system clock or the RTC as DST comes and goes, because UTC doesn't +change. Therefore, machines that only run Linux should have the RTC +set to UTC, not local time. + + + +However, many people run dual-boot systems with other OS's that expect +the RTC to contain the local time, so hwclock needs to know whether +your RTC is in local time or UTC, which it then converts to seconds +past Jan 1, 1970 (UTC). This still does not provide for seasonal +changes to the RTC, so the change must be made by the other OS (this +is the one exception to the rule against letting more than one program +change the time in the RTC). + + + +Unfortunately, there are no flags in the RTC or the CMOS RAM to +indicate standard time vs DST, so each OS stores this information +someplace where the other OS's can't find it. This means that hwclock +must assume that the RTC always contains the correct local time, even +if the other OS has not been run since the most recent seasonal time +change. + + + +If Linux is running when the seasonal time change occurs, the system +clock is unaffected and applications will make the correct conversion. +But if linux has to be rebooted for any reason, the system clock will +be set to the time in the RTC, which will be off by one hour until the +other OS (usually Windows) has a chance to run. + + + +There is no way around this, but Linux doesn't crash very often, so +the most likely reason to reboot on a dual-boot system is to run the +other OS anyway. But beware if you're one of those people who shuts +down Linux whenever you won't be using it for a while-- if you haven't +had a chance to run the other OS since the last time change, the RTC +will be off by an hour until you do. 
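That one-hour error can be illustrated with a short Python sketch. The UTC-5/UTC-4 pair below is an arbitrary example standing in for any zone's standard/daylight offsets; the point is that the UTC system clock counter never moves when DST changes, only its interpretation does:

```python
from datetime import datetime, timezone, timedelta

# Hypothetical zone: standard time is UTC-5, daylight time is UTC-4.
STD = timezone(timedelta(hours=-5), "STD")
DST = timezone(timedelta(hours=-4), "DST")

# One instant in summer; the UTC system clock value is the same in
# every season, only the local rendering differs:
instant = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)
print(instant.astimezone(STD).hour)  # 7
print(instant.astimezone(DST).hour)  # 8

# An RTC kept in local (daylight) time but read under the standard-time
# assumption yields a system clock exactly one hour ahead:
rtc_wall_time = instant.astimezone(DST)        # what the RTC stores
misread = rtc_wall_time.replace(tzinfo=STD)    # wrong offset applied
print(misread - instant == timedelta(hours=1)) # True
```

An RTC kept in UTC has no such ambiguity, which is the reasoning behind the recommendation above.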
+ + + +Some other documents have stated that setting the RTC to UTC allows +Linux to take care of DST properly. This is not really wrong, but it +doesn't tell the whole story-- as long as you don't reboot, it does +not matter which time is in the RTC (or even if the RTC's battery +dies). Linux will maintain the correct time either way, until the next +reboot. In theory, if you only reboot once a year (which is not +unreasonable for Linux), DST could come and go and you'd never notice +that the RTC had been wrong for several months, because the system +clock would have stayed correct all along. But since you can't predict +when you'll want to reboot, it's better to have the RTC set to UTC if +you're not running another OS that requires local time. + + + +The Dallas Semiconductor RTC chip (which is a drop-in replacement for +the Motorola chip used in the IBM AT and clones) actually has the +ability to do the DST conversion by itself, but this feature is not +used because the changeover dates are hard-wired into the chip and +can't be changed. Current versions change on the first Sunday in April +and the last Sunday in October, but earlier versions used different +dates (and obviously this doesn't work in countries that use other +dates). Also, the RTC is often integrated into the motherboard's +"chipset" (rather than being a separate chip) and I don't know if they +all have this ability. + + +2.4. How Linux keeps Track of Time Zones + + +You probably set your time zone correctly when you installed Linux. +But if you have to change it for some reason, or if the local laws +regarding DST have changed (as they do frequently in some countries), +then you'll need to know how to change it. If your system time is off +by some exact number of hours, you may have a time zone problem (or a +DST problem). + + + +Time zone and DST information is stored in /usr/share/zoneinfo (or +/usr/lib/zoneinfo on older systems). 
The local time zone is
+determined by a symbolic link from /etc/localtime to one of these
+files. The way to change your timezone is to change the link. If
+your local DST dates have changed, you'll have to edit the file.
+
+
+
+You can also use the TZ environment variable to change the current
+time zone, which is handy if you're logged in remotely to a machine in
+another time zone. Also see the man pages for tzset and tzfile.
+This is nicely summarized at
+
+
+
+2.5. The Bottom Line
+
+
+If you don't need sub-second accuracy, hwclock(8) and adjtimex(8) may
+be all you need. It's easy to get enthused about time servers and
+radio clocks and so on, but I ran the old clock(8) program for years
+with excellent results. On the other hand, if you have several
+machines on a LAN it can be handy (and sometimes essential) to have
+them automatically sync their clocks to each other. And the other
+stuff can be fun to play with even if you don't really need it.
+
+
+
+On machines that only run Linux, set the RTC to UTC (or GMT). On
+dual-boot systems that require local time in the RTC, be aware that if
+you have to reboot Linux after the seasonal time change, the clock may
+be temporarily off by one hour, until you have a chance to run the
+other OS. If you run more than two OS's, be sure only one of them is
+trying to adjust for DST.
+
+
+
+NTP is a standard method of synchronising time across a network of
+computers. NTP clients are typically installed on servers.
+Most business class ISPs provide NTP servers.
Otherwise, there are a
+number of free NTP servers in Australia:
+
+
+
+The University of Melbourne ntp.cs.mu.oz.au
+University of Adelaide ntp.saard.net
+CSIRO Marine Labs, Tasmania ntp.ml.csiro.au
+CSIRO National Measurements Laboratory, Sydney ntp.syd.dms.csiro.au
+
+
+
+Xntpd (NTPv3) has been replaced by ntpd (NTPv4); the earlier version
+is no longer being maintained.
+
+
+
+Ntpd is the standard program for synchronizing clocks across a
+network, and it comes with a list of public time servers you can
+connect to. It can be a little more complicated to set up, but if
+you're interested in this kind of thing I highly recommend that you
+take a look at it.
+
+
+
+The "home base" for information on ntpd is the NTP website at
+ which also includes links to all
+kinds of interesting time-related stuff (including software for other
+OS's). Some linux distributions include ntpd on the CD. There is a
+list of public time servers at
+.
+
+
+
+A relatively new feature in ntpd is a "burst mode" which is designed
+for machines that have only intermittent dial-up access to the
+internet.
+
+
+
+Ntpd includes drivers for quite a few radio clocks (although some
+appear to be better supported than others). Most radio clocks are
+designed for commercial use and cost thousands of dollars, but there
+are some cheaper alternatives (discussed in later sections). In the
+past most were WWV or WWVB receivers, but now most of them seem to be
+GPS receivers. NIST has a PDF file that lists manufacturers of radio
+clocks on their website at
+ (near the bottom of
+the page). The NTP website also includes many links to manufacturers
+of radio clocks at and
+. Either list may
+or may not be up to date at any given time :-). The list of drivers
+for ntpd is at
+.
+
+
+
+Ntpd also includes drivers for several dial-up time services. These
+are all long-distance (toll) calls, so be sure to calculate the effect
+on your phone bill before using them.
+
+
+3.4.
Chrony
+
+
+Xntpd was originally written for machines that have a full-time
+connection to a network time server or radio clock. In theory it can
+also be used with machines that are only connected intermittently, but
+Richard Curnow couldn't get it to work the way he wanted it to, so he
+wrote "chrony" as an alternative for those of us who only have network
+access when we're dialed in to an ISP (this is the same problem that
+ntpd's new "burst mode" was designed to solve). The current version
+of chrony includes drift correction for the RTC, for machines that are
+turned off for long periods of time.
+
+
+
+You can get more information from Richard Curnow's website at
+ or .
+There are also two chrony mailing lists, one for announcements and one
+for discussion by users. For information send email to
+chrony-users-subscribe@egroups.com or chrony-announce-subscribe@egroups.com
+
+
+
+Chrony is normally distributed as source code only, but Debian has
+been including a binary in their "unstable" collection. The source
+file is also available at the usual Linux archive sites.
+
+
+3.5. Clockspeed
+
+
+Another option is the clockspeed program by DJ Bernstein. It gets the
+time from a network time server and simply resets the system clock
+every three seconds. It can also be used to synchronize several
+machines on a LAN.
+
+
+
+I've sometimes had trouble reaching his website at
+, so if you get a DNS error try again
+on another day. I'll try to update this section if I get some better
+information.
+
+
+
+Note
+You must be logged in as "root" to run any program that affects
+the RTC or the system time, which includes most of the programs
+described here. If you normally use a graphical interface for
+everything, you may also need to learn some basic unix shell
+commands.
+
+
+
+Note
+If you run more than one OS on your machine, you should only let
+one of them set the RTC, so they don't confuse each other.
The
+exception is the twice-a-year adjustment for Daylight Saving(s)
+Time.
+
+
+
+If you run a dual-boot system that spends a lot of time running
+Windows, you may want to check out some of the clock software
+available for that OS instead. Follow the links on the NTP website at
+.
+
+
+
+
diff --git a/LDP/guide/docbook/Linux-Networking/TFTP.xml b/LDP/guide/docbook/Linux-Networking/TFTP.xml
deleted file mode 100644
index 392a85f8..00000000
--- a/LDP/guide/docbook/Linux-Networking/TFTP.xml
+++ /dev/null
@@ -1,92 +0,0 @@
-
-
-TFTP
-
-
-Trivial File Transfer Protocol (TFTP) is a bare-bones protocol used by
-devices that boot from the network. It runs on top of UDP, so it
-doesn't require a real TCP/IP stack. Misunderstanding: Many people
-describe TFTP as simply a trivial version of FTP without authentication.
-This misses the point. The purpose of TFTP is not to reduce the complexity
-of file transfer, but to reduce the complexity of the underlying TCP/IP
-stack so that it can fit inside boot ROMs. Key point: TFTP is almost
-always used with BOOTP. BOOTP first configures the device, then TFTP
-transfers the boot image named by BOOTP, which is then used to boot the
-device. Key point: Many systems come with unnecessary TFTP servers. Many
-TFTP servers have bugs, like the backtracking problem or buffer overflows.
-As a consequence, many systems can be exploited with TFTP even though
-virtually nobody really uses it. Key point: A TFTP file transfer client
-is built into many operating systems (UNIX, Windows, etc....). These clients
-are often used by intruders to download rootkits after breaking in.
-Therefore, removing the TFTP client should be part of your hardening
-procedure. For further details on the TFTP protocol please see RFCs 1350,
-1782, 1783, 1784, and 1785.
-
-
-
-Most likely, you'll interface with the TFTP protocol using the TFTP command
-line client, 'tftp', which allows users to transfer files to and from a
-remote machine.
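For a sense of just how bare-bones the protocol is, here is a sketch of the read-request (RRQ) packet from RFC 1350, built in Python: a 2-byte opcode followed by two NUL-terminated ASCII strings, sent in a single UDP datagram to port 69. The filename is made up for illustration.

```python
import struct

def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1 (RRQ), then the filename
    and transfer mode as NUL-terminated ASCII strings (RFC 1350)."""
    return (struct.pack("!H", 1)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")

pkt = build_rrq("bootimage.img")   # hypothetical boot image name
print(pkt[:2])                     # b'\x00\x01' (opcode 1 = RRQ)
print(pkt.endswith(b"octet\x00"))  # True
```

The server replies with DATA packets (opcode 3) in 512-byte blocks, which is essentially the whole protocol.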
The remote host may be specified on the command line, in which case
tftp uses that host as the default for future transfers.

Setting up TFTP is almost as easy as DHCP. First install it from the
rpm package:

# rpm -ihv tftp-server-*.rpm

Create a directory for the files:

# mkdir /tftpboot
# chown nobody:nobody /tftpboot

The directory /tftpboot is owned by user nobody, because this is the
default user id set up by tftpd to access the files. Edit the file
/etc/xinetd.d/tftp to look like the following:

service tftp
{
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -c -s /tftpboot
        disable         = no
        per_source      = 11
        cps             = 100 2
}

The changes from the default file are the parameter disable = no (to
enable the service) and the server argument -c. This argument allows
for the creation of files, which is necessary if you want to save boot
or disk images. You may want to make TFTP read-only in normal
operation.

Then reload xinetd:

/etc/rc.d/init.d/xinetd reload

You can use the tftp command, available from the tftp (client) rpm
package, to test the server. At the tftp prompt, you can issue the
commands put and get.

diff --git a/LDP/guide/docbook/Linux-Networking/Telnet.xml b/LDP/guide/docbook/Linux-Networking/Telnet.xml deleted file mode 100644 index af3dcd2f..00000000 --- a/LDP/guide/docbook/Linux-Networking/Telnet.xml +++ /dev/null @@ -1,35 +0,0 @@

Telnet

Created in the early 1970s, Telnet provides a method of running
command-line applications on a remote computer as if the user were
actually at the remote site. Telnet is one of the most powerful tools
for Unix, allowing for true remote administration. It is also an
interesting program from the point of view of users, because it allows
remote access to all their files and programs from anywhere on the
Internet.
Combined with an X server (as well as some rather arcane manipulation
of authentication 'cookies' and 'DISPLAY' environment variables),
there is no difference (apart from the delay) between being at the
console or on the other side of the planet. However, the telnet
protocol sends data in the clear ('en clair'), and there are now more
efficient protocols offering features such as built-in compression and
'tunnelling' (which makes it much easier to run graphical applications
across the network), as well as more secure connections, so telnet is
effectively a dead protocol. Like the 'r' protocols (such as rlogin
and rsh), it is still used within internal networks, though: for ease
of installation and use, for backwards compatibility, and as a means
of configuring networking devices such as routers and firewalls.

Please consult RFC 854 for further details of its implementation.

· Telnet-related software

diff --git a/LDP/guide/docbook/Linux-Networking/VNC.xml b/LDP/guide/docbook/Linux-Networking/VNC.xml deleted file mode 100644 index cfaa3d77..00000000 --- a/LDP/guide/docbook/Linux-Networking/VNC.xml +++ /dev/null @@ -1,138 +0,0 @@

VNC

8.13. Tunnelling, mobile IP and virtual private networks

The Linux kernel allows the tunnelling (encapsulation) of protocols.
It can do IPX tunnelling through IP, allowing the connection of two
IPX networks through an IP-only link. It can also do IP-IP tunnelling,
which is essential for mobile IP support, multicast support and
amateur radio. (See
http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.8)

Mobile IP specifies enhancements that allow transparent routing of IP
datagrams to mobile nodes in the Internet. Each mobile node is always
identified by its home address, regardless of its current point of
attachment to the Internet.
While situated away from its home, a mobile node is also associated
with a care-of address, which provides information about its current
point of attachment to the Internet. The protocol provides for
registering the care-of address with a home agent. The home agent
sends datagrams destined for the mobile node through a tunnel to the
care-of address. After arriving at the end of the tunnel, each
datagram is then delivered to the mobile node.

Point-to-Point Tunneling Protocol (PPTP) is a networking technology
that allows the use of the Internet as a secure virtual private
network (VPN). PPTP is integrated with the Remote Access Services
(RAS) server which is built into Windows NT Server. With PPTP, users
can dial into a local ISP, or connect directly to the Internet, and
access their network as if they were at their desks. PPTP is a closed
protocol and its security has recently been compromised. It is highly
advisable to use Linux-based alternatives instead, since they rely on
open standards which have been carefully examined and tested.

· A client implementation of PPTP for Linux is available here

· More on Linux PPTP can be found here

Mobile IP:

· http://www.hpl.hp.com/personal/Jean_Tourrilhes/MobileIP/mip.html

· http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.12

Documents related to Virtual Private Networks:

· http://metalab.unc.edu/mdw/HOWTO/mini/VPN.html

· http://sites.inka.de/sites/bigred/devel/cipe.html

7.4. VNC

VNC stands for Virtual Network Computing. It is, in essence, a remote
display system which allows one to view a computing 'desktop'
environment not only on the machine where it is running, but from
anywhere on the Internet and from a wide variety of machine
architectures. Both clients and servers exist for Linux as well as for
many other platforms.
It is possible to run MS Word on a Windows NT or 95 machine and have
the output displayed on a Linux machine. The opposite is also true: it
is possible to run an application on a Linux machine and have the
output displayed on any other Linux or Windows machine. One of the
available clients is a Java applet, allowing the remote display to be
run inside a web browser. Another client is a port for Linux using the
SVGAlib graphics library, allowing 386s with as little as 4 MB of RAM
to become fully functional X terminals.

· VNC web site

Virtual Network Computing (VNC) allows a user to operate a session
running on another machine. Although Linux and all other Unix-like
OSes already have this functionality built in, VNC provides further
advantages because it is cross-platform, running on Linux, BSD, Unix,
Win32, MacOS, and PalmOS. This makes it far more versatile.

For example, let's assume the machine that you are attempting to
connect to is running Linux. You can use VNC to access applications
running on that other Linux desktop. You can also use VNC to provide
technical support to users on Windows-based machines by taking control
of their desktops from the comfort of your server room. VNC is usually
installed as separate packages for the client and server, typically
named 'vnc' and 'vnc-server'.

VNC uses screen numbers to connect clients to servers. This is because
Unix machines allow multiple graphical sessions to be started
simultaneously (check this out by logging in to a virtual terminal and
typing startx -- :1).

For platforms (Windows, MacOS, Palm, etc.) which don't have this
capability, you'll connect to 'screen 0' and take over the session of
the existing user. For Unix systems, you'll need to specify a higher
number and receive a new desktop.

If you prefer the Windows-style approach, where the VNC client takes
over the currently running display, you can use x0rfbserver - see the
sidebox below.
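A practical detail behind the screen numbers discussed above: by convention, a VNC server for display N accepts viewer (RFB) connections on TCP port 5900+N, and serves its Java-applet viewer on port 5800+N. The helper below is a hypothetical illustration (it is not part of any VNC package) that parses a vncviewer-style "host:screen" target, assuming those standard port conventions:

```python
# Hypothetical helper: map a vncviewer-style "host:screen" target to the
# TCP ports used by convention (RFB viewer on 5900+N, Java applet on 5800+N).

def vnc_ports(target):
    """Parse 'host:screen' (screen defaults to 0) and return
    (host, rfb_port, java_port)."""
    host, sep, screen = target.partition(":")
    n = int(screen) if sep else 0
    return host, 5900 + n, 5800 + n

print(vnc_ports("myserver:1"))   # ('myserver', 5901, 5801)
print(vnc_ports("myserver"))     # ('myserver', 5900, 5800)
```

So "vncviewer myserver:1" talks to TCP port 5901 on myserver, which is worth knowing when a firewall sits between client and server.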
VNC Servers and Clients

On Linux, the VNC server (which allows the machine to be used
remotely) is actually run as a replacement X server. To start a VNC
session on a machine, log into it and run vncserver. You'll be
prompted for a password - in future you can change this password with
the vncpasswd command. After you enter the password, you'll be told
the display number of the newly created desktop.

It is possible to control a remote machine by using the vncviewer
command. If it is typed on its own it will prompt for a remote
machine, or you can use:
vncviewer [host]:[screen-number]

> The VPN HOWTO (deprecated)
> VPN HOWTO
> Linux VPN Masquerade HOWTO

10. References

10.1. Web Sites

Cipe Home Page

Masq Home Page

Samba Home Page

Linux HQ --- a great site for lots of Linux info

10.2. Documentation

cipe.info: info file included with the cipe distribution

Firewall HOWTO, by Mark Grennan, markg@netplus.net

IP Masquerade mini-HOWTO, by Ambrose Au, ambrose@writeme.com

IPChains-HOWTO, by Paul Russell, Paul.Russell@rustcorp.com.au

diff --git a/LDP/guide/docbook/Linux-Networking/WaveLAN.xml b/LDP/guide/docbook/Linux-Networking/WaveLAN.xml deleted file mode 100644 index f5b6cd48..00000000 --- a/LDP/guide/docbook/Linux-Networking/WaveLAN.xml +++ /dev/null @@ -1,31 +0,0 @@

WaveLAN

The WaveLAN card is a spread-spectrum wireless LAN card. The card
looks very much like an Ethernet card in practice and is configured in
much the same way.

You can get information on the WaveLAN card from wavelan.com.

WaveLAN device names are `eth0', `eth1', etc.

Kernel Compile Options:

    Network device support --->
        [*] Network device support
        ....
        [*] Radio network interfaces
        ....
        <*> WaveLAN support

diff --git a/LDP/guide/docbook/Linux-Networking/Web-Serving.xml b/LDP/guide/docbook/Linux-Networking/Web-Serving.xml deleted file mode 100644 index 5914e480..00000000 --- a/LDP/guide/docbook/Linux-Networking/Web-Serving.xml +++ /dev/null @@ -1,76 +0,0 @@

Web-Serving

The World Wide Web provides a simple method of publishing and linking
information across the Internet, and is responsible for popularising
the Internet to its current level. In the simplest case, a Web client
(or browser), such as Netscape or Internet Explorer, connects with a
Web server using a simple request/response protocol called HTTP
(Hypertext Transfer Protocol), and requests HTML (Hypertext Markup
Language) pages, images, Flash and other objects.

In more modern situations, the Web server can also generate pages
dynamically based on information returned from the user. Either way,
setting up your own Web server is extremely simple. There are many
choices for Web serving under Linux. Some servers are very mature,
such as Apache, and are perfect for small and large sites alike.
Other servers are programmed to be light and fast, and to have only a
limited feature set to reduce complexity. A search on freshmeat.net
will reveal a multitude of servers.

Most Linux distributions include Apache. Apache is the number one
server on the Internet according to http://www.netcraft.co.uk/survey/.
More than half of all Internet sites are running Apache or one of its
derivatives. Apache's advantages include its modular design, stability
and speed. Given the appropriate hardware and configuration it can
support the highest loads: Yahoo, AltaVista, GeoCities, and Hotmail
are based on customized versions of this server.
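Because HTTP is a plain-text request/response protocol, the exchange described above is easy to sketch by hand. The snippet below is a minimal illustration only (the host name is a placeholder, and a real client would use an HTTP library rather than formatting requests itself):

```python
# Minimal sketch of the HTTP request/response exchange described above.
# "www.example.com" is a placeholder host, not a recommendation.

def build_get_request(host, path="/"):
    # An HTTP request is plain text: a request line, headers, then a blank line.
    return ("GET %s HTTP/1.0\r\n"
            "Host: %s\r\n"
            "\r\n" % (path, host))

def parse_status_line(line):
    # The server's first response line, e.g. "HTTP/1.0 200 OK".
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

req = build_get_request("www.example.com", "/index.html")
print(req.splitlines()[0])                    # GET /index.html HTTP/1.0
print(parse_status_line("HTTP/1.0 200 OK"))   # ('HTTP/1.0', 200, 'OK')
```

Sending that request string over a TCP connection to port 80 and reading the reply is all a browser does in the simplest case, which is why HTTP servers are so easy to test with telnet.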
Optional support for SSL (which enables secure transactions) is also
available at:

· http://www.apache-ssl.org/
· http://raven.covalent.net/
· http://www.c2.net/

Dynamic Web content generation

Web scripting languages are even more common on Linux than databases -
basically, every language is available. This includes CGI, PHP 3 and
4, Perl, JSP, ASP (via closed-source applications from Chili!Soft and
Halcyon Software) and ColdFusion.

PHP is an open-source scripting language designed to churn out
dynamically produced Web content, ranging from databases to browsers.
This includes not only HTML, but also graphics, Macromedia Flash and
XML-based information. The latest versions of PHP provide impressive
speed improvements, install easily from packages and can be set up
quickly. PHP is the most popular Apache module and is used by over
two million sites, including Amazon.com, US telco giant Sprint,
Xoom Networks and Lycos. And unlike most other server-side scripting
languages, developers (or those that employ them) can add their own
functions into the source to improve it. Supported databases include
those in the Database serving section and most ODBC-compliant
databases. The language itself borrows its structure from Perl and C.

· http://metalab.unc.edu/mdw/HOWTO/WWW-HOWTO.html
· http://metalab.unc.edu/mdw/HOWTO/Virtual-Services-HOWTO.html
· http://metalab.unc.edu/mdw/HOWTO/Intranet-Server-HOWTO.html
· Web servers for Linux

diff --git a/LDP/guide/docbook/Linux-Networking/X11.xml b/LDP/guide/docbook/Linux-Networking/X11.xml deleted file mode 100644 index f96513bc..00000000 --- a/LDP/guide/docbook/Linux-Networking/X11.xml +++ /dev/null @@ -1,61 +0,0 @@

X11

The X Window System was developed at MIT in the late 1980s, rapidly
becoming the industry standard windowing system for Unix graphics
workstations.
The software is freely available, very versatile, and is suitable for
a wide range of hardware platforms. Any X environment consists of two
distinct parts, the X server and one or more X clients. It is
important to realise the distinction between the server and the
client. The server controls the display directly and is responsible
for all input/output via the keyboard, mouse or display. The clients,
on the other hand, do not access the screen directly - they
communicate with the server, which handles all input and output. It is
the clients which do the "real" computing work - running applications
and so on. A client communicates with the server, causing the server
to open one or more windows to handle that client's input and output.

In short, the X Window System allows a user to log in to a remote
machine, execute a process (for example, open a web browser) and have
the output displayed on their own machine. Because the process is
actually being executed on the remote system, very little CPU power is
needed on the local one. Indeed, computers exist whose primary purpose
is to act as pure X servers. Such systems are called X terminals.

A free port of the X Window System, XFree86, exists for Linux and is
included in most Linux distributions.

For further information regarding X please see:

X11, LBX, DXPC, NXServer, SSH, MAS

Related HOWTOs:

· Remote X Apps HOWTO
· Linux XDMCP HOWTO
· XDM and X Terminal mini-HOWTO
· The Linux XFree86 HOWTO
· ATI R200 + XFree86 4.x mini-HOWTO
· Second Mouse in X mini-HOWTO
· Linux Touch Screen HOWTO
· XFree86 Video Timings HOWTO
· Linux XFree-to-Xinside mini-HOWTO
· XFree Local Multi-User HOWTO
· Using Xinerama to MultiHead XFree86 V. 4.0+
· Connecting X Terminals to Linux Mini-HOWTO
· How to change the title of an xterm
· X Window System Architecture Overview HOWTO
· The X Window User HOWTO
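To make the client/server relationship above concrete: an X client finds its server through a DISPLAY value of the form hostname:display.screen, and a server reached over TCP for display N conventionally listens on port 6000+N. The parser below is an illustrative sketch under those conventions, not code from any X distribution:

```python
# Illustrative sketch: parse an X11 DISPLAY string ("host:display.screen").
# An X server reached over TCP for display N conventionally listens on 6000+N.

def parse_display(display):
    host, _, rest = display.partition(":")
    disp_str, _, screen_str = rest.partition(".")
    disp = int(disp_str)
    screen = int(screen_str) if screen_str else 0
    return {"host": host or None,   # an empty host part means a local connection
            "display": disp,
            "screen": screen,
            "tcp_port": 6000 + disp}

print(parse_display("remote.example.org:0.0"))
print(parse_display(":1"))   # local display 1 -> TCP port 6001
```

This is why an X terminal serving display 0 accepts client connections on TCP port 6000, and why a second local session started with "startx -- :1" corresponds to port 6001.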