<sect1 id="Protocols-Standards-Services">
<title>Protocols-and-Standards-Services</title>
<para>
IEEE (Institute of Electrical and Electronics Engineers) 802 Standards
802.1 Internetworking
802.2 Logical Link Control (LLC)
802.3 CSMA/CD (Ethernet) media access method
802.4 Token bus media access method
802.5 Token Ring media access method
802.6 Metropolitan Area Networks (MANs)
802.7 Broadband technologies
802.8 Fiber optic technologies
802.9 Hybrid (voice and data) networking
802.10 Network security
802.11 Wireless Networking
802.12 High-speed LANs
</para>
Protocols
So it is clear that data units can be transmitted from a sender Data-link Layer to a (peer) receiver Data-link Layer. The data unit (DU) is encapsulated in a frame. Each frame contains additional information. The meaning of this additional information and the rules that the sender and receiver must follow when processing this information constitute the protocol. Hence, the frame constitutes a Protocol Data Unit (PDU). To distinguish between PDU's of different layers the PDU may be referred to as a DPDU (D for Data-link).
Piggy-backing
Each PDU may or may not contain data. In the latter case, the PDU is being used expressly for the purpose of the protocol, e.g. for the receiver to signal that a frame was corrupted.
Sometimes a PDU used for the purpose of protocol alone is called a control message. Sending an entire PDU without any data can lead to a waste of resources. It is possible to include control information in a PDU which contains data. This is called piggy-backing and is a commonly used technique to save resources. The drawback is that sometimes there is no data available or ready to be sent, in which case the control message may be delayed until data becomes available.
Utopia protocol
The Utopia protocol assumes that the logical-link and receiver are ideal, i.e. the logical-link is error free and provides unlimited capacity, and the receiver can receive PDU's at any rate. With these assumptions, each DU is transmitted as soon as it is given, without any other mechanisms or support. Similarly the receiver delivers each DU as soon as it arrives without checking for errors, duplicates or out of order delivery.
Out of order delivery is possible, even if the sender sent the frames in order! One possibility is that an entire frame is lost (never to be received) due to noise. Utopia assumes that this cannot happen.
Stop-and-Wait Protocol
Consider sending frames from a fast machine to a slow machine. What happens if the slow machine is reading the data at half the rate that the fast machine is sending it? Eventually the sent data will be lost, never to be received.
For this reason it is important to provide some kind of flow control.
The Stop-and-Wait protocol requires the receiver to send an acknowledgement PDU in return for every frame received. Such a PDU is often called an ACK. The sender will wait for the ACK before sending the next frame.
Kinds of errors
Three kinds of errors can occur:
- the bits in the frame can be inverted, anywhere within the frame including the data bits or the frame's control bits.
- additional bits can be inserted into the frame, before the frame or after the frame and
- bits can be deleted from the frame
Such errors could cause the entire frame to be deleted. Errors don't necessarily happen because of noise. Sometimes an intermediate device decides to "drop" a frame, e.g. because its buffer is full. These kinds of errors may lead to frames being mistaken for other frames, or to ACK's being lost... even to ACK's magically appearing!
Echo checking
Clearly, error control is required. Error control can be implemented at all layers with the choice of when, where and how being made by those who develop the layers.
At the highest layer, the user must implement manual error control, e.g. by inspecting the result of a file transfer to ensure that the contents are as expected, or by seeing that the keys typed on the terminal are echoed properly onto the display. In fact, echo checking is a simple form of error control. In this case, the receiver sends back (echoes) a copy of the received data unit, for the sender to check.
Automatic Repeat Request
Echo checking is not feasible when the computer is transmitting long data unit sequences. Typically the error control involves the receiver checking the frame for possible errors (perhaps implementing some amount of error correcting) and then either:
1. sending a positive acknowledgement as a form of receipt or
2. sending a negative acknowledgement (NACK) to request another copy of the frame be sent.
This type of error control is known as automatic repeat request (ARQ). ARQ is used in two ways:
- idle RQ and
- continuous RQ.
In either case, the use of ACK's, NACK's or both is implementation dependent and this can lead to a variety of protocols which differ slightly in their specification/operation. The continuous RQ typically uses one of the following transmission strategies:
- selective repeat or
- go-back-N
Sequence numbers
For all the ARQ error control methods, it is necessary to distinguish between different frames. The principal reason is so that duplicate frames can be detected and to ensure that all frames are eventually received.
Furthermore, because the sequence number must be recorded as part of the frame it will occupy some number of bits. For frames with fixed formats, the maximum size of the sequence number is also fixed. So e.g. the sequence number may be designated as 1 bit, 2 bits, 8 bits, etc.
Although a fixed size sequence number may suggest that only a fixed number of frames can be transmitted, as will be found when examining the protocols, distinguishing each frame from every other frame is not necessary and in some cases it is quite possible to have a 1-bit sequence number.
Timers
A timer provides an event after a given time interval. Timers are used to trigger protocol state transitions. For example, recall the stop-and-wait protocol as previously described. If the ACK is lost then the sender will wait indefinitely. In this case, a timer may be used to trigger the protocol in the extended absence of an ACK.
Idle RQ
The stop-and-wait protocol given earlier is an incomplete form of idle RQ. Basic idle RQ uses a timer at the sender and either ACK's only (implicit retransmission) or a combination of ACK's and NACK's (explicit retransmission).
When sending a frame, F(n), the sender starts a timer. The sender waits to receive either an ACK(n) or a NACK(n). If nothing is received before the timer expires then the timer is reset and F(n) is retransmitted (implicit). If ACK(n) is received then the sender repeats the process for F(n+1). If NACK(n) is received before the timer expires then the sender resets the timer and retransmits F(n) (explicit).
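The sender's behaviour described above can be sketched as a small simulation. This is only an illustration, not a real implementation: the function name `idle_rq_send` and the scripted list of responses ("ack", "nack" or "timeout") are assumptions made for the example, and the link itself is not modelled.

```python
from typing import Iterable, List, Tuple


def idle_rq_send(num_frames: int, responses: Iterable[str]) -> List[Tuple[int, str]]:
    """Simulate an idle RQ sender and return a log of (frame number, event)."""
    script = iter(responses)
    log: List[Tuple[int, str]] = []
    n = 0
    while num_frames > n:
        log.append((n, "send"))            # transmit F(n) and start the timer
        event = next(script, "ack")        # assume ACKs once the script runs out
        if event not in ("ack", "nack", "timeout"):
            raise ValueError(f"unknown response: {event!r}")
        log.append((n, event))
        if event == "ack":
            n += 1                         # ACK(n): carry on with F(n+1)
        # "nack" (explicit) and "timeout" (implicit) both resend F(n)
    return log


if __name__ == "__main__":
    for entry in idle_rq_send(2, ["timeout", "ack", "nack", "ack"]):
        print(entry)
```

In this sketch the only difference between the implicit and explicit cases is how long the sender waits before resending, which the simulation does not measure.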
Using a NACK
Note that the NACK is used when a corrupted frame is received; it signals the sender to stop the timer and resend F(n) immediately. Intuitively this decreases the amount of time wasted, and so increases the link utilization.
Link utilization
Clearly, the use of a NACK increases link utilization. Quantifying this requires first examining the basic link utilization of idle RQ.
See Page 48.
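As a rough guide, the usual textbook estimate for an error-free idle RQ link is U = Tf / (Tf + 2 Tp), where Tf is the time to transmit one frame and Tp is the one-way propagation delay. The sketch below computes this estimate. It assumes that the ACK transmission time and all processing times are negligible; these assumptions, and the function name, are not taken from the notes above.

```python
def idle_rq_utilization(frame_bits: float, bit_rate: float, propagation_delay: float) -> float:
    """Estimate idle RQ link utilization for an error-free link.

    Assumes ACK transmission time and processing delays are negligible.
    """
    t_frame = frame_bits / bit_rate                    # time to put one frame on the link
    return t_frame / (t_frame + 2 * propagation_delay)  # the sender idles for a round trip


if __name__ == "__main__":
    # 1000-bit frames at 1 Mbps over a link with 0.5 ms one-way delay
    print(idle_rq_utilization(1000, 1_000_000, 0.0005))
    # the same frames over a 250 ms satellite hop
    print(idle_rq_utilization(1000, 1_000_000, 0.25))
```

The second call illustrates why idle RQ suits short links: when the propagation delay dominates, the link sits idle for almost the whole cycle.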
Idle RQ and sequence numbers
Clearly, the sequence numbers for idle RQ may be 1 bit. For example, the sender transmits frame F(0) and does not transmit frame F(1) until F(0) has been acknowledged. At any one time, there is only a single frame in question. The two sequence numbers are required in case ACK(0) is lost: the sender times out and resends F(0), but the receiver is now waiting for F(1) and can reject the duplicate F(0).
In this way, the sender has to buffer only one frame, and the receiver has to record only the sequence number of the frame that was previously received.
Because of this, idle RQ does not require much buffer space and is therefore used when simple devices are communicating, such as between a terminal and a display.
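A minimal sketch of the receiver side shows how a 1-bit sequence number is enough to reject a duplicate. The class name and the choice to acknowledge a duplicate again (so that the sender can move on after a lost ACK) are assumptions made for this example.

```python
class IdleRQReceiver:
    """Idle RQ receiver that records only the sequence number it expects next."""

    def __init__(self) -> None:
        self.expected = 0          # 1-bit sequence number: 0 or 1
        self.delivered = []        # data units passed up to the next layer

    def receive(self, seq: int, data: str) -> int:
        """Handle frame F(seq) and return the number carried by the ACK."""
        if seq == self.expected:
            self.delivered.append(data)
            self.expected ^= 1     # flip between 0 and 1
        # otherwise the frame is a duplicate: discard the data, but
        # acknowledge it again because the earlier ACK was probably lost
        return seq


if __name__ == "__main__":
    receiver = IdleRQReceiver()
    receiver.receive(0, "first")
    receiver.receive(0, "first")   # resent after a lost ACK(0)
    receiver.receive(1, "second")
    print(receiver.delivered)
```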
Continuous RQ
In the previous slides the idle RQ protocol was shown to provide poor link utilization when the time to transmit a frame was significantly less than the propagation delay.
A number of physical layer services provide links with significantly long propagation delays, such as satellite communication and high bandwidth Ethernet.
Continuous RQ allows the sender to continue to transmit frames, even though no ACK's may have been received for previously transmitted frames.
Sliding windows
Continuous RQ may require buffering at both the sender and the receiver (as will be seen) and, with finite buffers, it is necessary to ensure that neither the sender's nor the receiver's buffer becomes overloaded. A sliding window is a flow control mechanism which is used to maintain finite buffers at the sender and receiver.
There are numerous methods of implementing a sliding window. However this subject will introduce and use only one method.
Parameters for a sliding window
Each sliding window uses three parameters:
1. the size of the window K,
2. the lower window edge LWE and
3. the upper window edge UWE.
Normally, K remains as a constant while LWE and UWE vary as frames are transmitted and acknowledged. Both the sender and receiver have their own set of parameters and their own interpretations.
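One possible interpretation of the three parameters at the sender is sketched below. It is an assumption made for illustration, not the method these notes go on to use: LWE is taken to be the oldest unacknowledged frame, UWE the next frame to be sent, and an ACK(n) is treated as acknowledging every frame up to and including F(n). Frame numbers are left unbounded here rather than wrapped to a fixed number of bits.

```python
class SenderWindow:
    """Sender-side sliding window with size K and edges LWE and UWE."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.lwe = 0   # lower window edge: oldest frame not yet acknowledged
        self.uwe = 0   # upper window edge: next frame number to send

    def can_send(self) -> bool:
        # at most K frames may be outstanding at any time
        return self.k > self.uwe - self.lwe

    def send(self) -> int:
        if not self.can_send():
            raise RuntimeError("window full: wait for an ACK")
        n = self.uwe
        self.uwe += 1
        return n

    def ack(self, n: int) -> None:
        # ignore ACKs for frames outside the window
        if n >= self.lwe and self.uwe > n:
            self.lwe = n + 1       # slide the window forward


if __name__ == "__main__":
    window = SenderWindow(3)
    print([window.send() for _ in range(3)], window.can_send())
    window.ack(0)
    print(window.can_send(), window.send())
```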
Sequence numbers and window size
The size of the sequence number and the window size are related and the relationship depends on the protocol.
Consider the idle RQ from previous slides. Idle RQ uses a 1-bit sequence number which gives two different frame identifiers {0,1}. However the window size is K=1 at both the sender and receiver. The reason for this will be explained with the go-back-N protocol (idle RQ is a simplified go-back-N). For now, examine the next slide to see how sequence numbers and sliding windows relate.
Go-back-N protocol
With an understanding of sequence numbers and sliding window it is possible to quickly understand the Go-back-N protocol.
The sender uses a window size of K. The receiver uses a window size of 1.
The receiver sends an ACK for each frame as it is received. If a corrupted frame, F(n), is received or F(n) is missing (didn't arrive) then the receiver sends a special NACK(n) to request that the sender start retransmitting from frame F(n).
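The effect of going back to F(n) can be seen in a simplified simulation. The sketch below is an assumption-laden model, not a full protocol: the sender transmits a whole window as a burst, ACK's and NACK's are never lost, timing is ignored, and `lost` is a hypothetical set naming which transmissions (counted from 0) fail to arrive intact.

```python
from typing import List, Set, Tuple


def go_back_n(num_frames: int, k: int, lost: Set[int]) -> Tuple[List[int], List[int]]:
    """Return (frame numbers in the order transmitted, frame numbers delivered)."""
    transmissions: List[int] = []
    delivered: List[int] = []
    base = 0                                   # first frame not yet acknowledged
    while num_frames > base:
        expected = base                        # receiver window of 1: next frame it accepts
        for n in range(base, min(base + k, num_frames)):
            index = len(transmissions)
            transmissions.append(n)
            if index in lost:
                continue                       # F(n) is lost or corrupted
            if n == expected:
                delivered.append(n)
                expected += 1
            # any other frame is out of order and is discarded by the receiver
        base = expected                        # NACK(expected): go back and resend from there
    return transmissions, delivered


if __name__ == "__main__":
    sent, received = go_back_n(5, 3, {1})
    print(sent)
    print(received)
```

Losing a single frame forces the frames sent after it to be transmitted again, which is the cost go-back-N pays for keeping the receiver simple.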
Selective repeat protocol
Selective repeat uses a window size of K at both the sender and receiver.
The protocol can be implemented in one of two ways: either the receiver sends only ACK's and the sender implicitly retransmits, or the receiver also sends NACK's to explicitly request a frame retransmission.
The receiver buffers out-of-order frames until the missing frames are retransmitted.
As the name suggests, the receiver can "selectively" request retransmission of a frame.
When using ACK's only, the absence of an ACK triggers a retransmission.
When using ACK's only, if an ACK is lost, then a retransmission is triggered, but the retransmitted frame is discarded by the receiver as a duplicate. If NACK's are used, then the NACK explicitly triggers a retransmission.
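Receiver-side buffering in selective repeat can be sketched as follows. The class and method names are assumptions for this example, frame numbers are left unbounded instead of being wrapped to a fixed number of bits, and the sending of ACK's and NACK's is left out.

```python
class SelectiveRepeatReceiver:
    """Receiver with a window of K frames that buffers out-of-order arrivals."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.lwe = 0               # next frame number to deliver in order
        self.buffer = {}           # frames that arrived ahead of a missing one
        self.delivered = []

    def receive(self, n: int, data: str) -> None:
        if self.lwe > n:
            return                 # duplicate of a frame already delivered: discard
        if n >= self.lwe + self.k:
            return                 # beyond the window: no buffer space, discard
        self.buffer[n] = data
        # deliver every frame that is now in sequence
        while self.lwe in self.buffer:
            self.delivered.append(self.buffer.pop(self.lwe))
            self.lwe += 1


if __name__ == "__main__":
    receiver = SelectiveRepeatReceiver(4)
    for number, data in [(0, "a"), (2, "c"), (3, "d"), (1, "b"), (1, "b")]:
        receiver.receive(number, data)
    print(receiver.delivered)
```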
Explicit and implicit strategies
Explicit retransmission strategy uses NACK's. Implicit retransmission strategy does not use NACK's.
For explicit retransmission strategy an ACK(n) is sufficient to acknowledge all frames prior to and including F(n).
For implicit retransmission strategy an ACK(n) does not acknowledge all frames prior to F(n). The reason for this is that the strategy relies on the sender detecting a missing ACK as the trigger for a retransmission.
Sequence numbers and K for selective repeat
Duplex protocols and links
So far we have discussed the operation of protocols that have data being transmitted from a sender to a receiver, called simplex protocols. In most cases these protocols required that some PDU's be sent back to the sender.
Many links allow for data to be sent in both directions simultaneously. Indeed, a single medium may be divided logically into a send and receive channel using FDM. Such links are called duplex links and the protocols that allow data to flow in both directions simultaneously are duplex protocols.
In contrast, recall the primitive stop-and-wait protocol. At any one time, the link was being used either to send a frame or to receive an acknowledgement. In this case it is sufficient for the link to be half-duplex. Duplex protocols can operate over a half-duplex link but with less efficiency.
The concepts are studied further when considering the medium access control sub-layer.
Protocol efficiency
The concept of link utilization does not consider that the amount of data actually sent is less than the length of the frame. The protocol efficiency refers to the fraction of total data transmitted that represents the DU's themselves. This value is always less than 1 because each frame contains additional protocol information such as the CRC and sequence number.
The link utilization is further reduced to obtain the protocol efficiency. If the length of the DU is D and the length of the frame is L, then only D/L of the utilized link is actually used to transmit DU's.
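The D/L relationship can be checked with a few lines of arithmetic. The function name and the sample figures are made up for this example; the link utilization is simply passed in as a number between 0 and 1.

```python
def protocol_efficiency(du_length: float, frame_length: float, link_utilization: float = 1.0) -> float:
    """Scale the link utilization by D/L, the share of each frame that is data."""
    if not (du_length > 0 and frame_length >= du_length):
        raise ValueError("expected a positive DU length no greater than the frame length")
    return (du_length / frame_length) * link_utilization


if __name__ == "__main__":
    # a 1000-bit DU carried in a 1250-bit frame on a link that is 50% utilized
    print(protocol_efficiency(1000, 1250, 0.5))
```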
Medium Access Control
The protocols given in the previous lectures assumed that the sender had a direct link to the receiver, and also that the link provided duplex communication, i.e. so that frames could be sent at the same time that ACK's were being received. Recall that the repeater is a simple Layer 1 device, and that any number of repeaters may exist between two data-link layer peers, transparently to the communication.
Also recall that the hub is a general repeater.
Frames from any of the data-link layers will be transmitted to every other data-link layer.
The use of a hub introduces new problems for the protocols because a transmitted frame will potentially be received by a number of different data-link layers.
So each device needs an address. The address is called the medium access control (MAC) address - it is a unique identifier for every device connected to the shared medium.
The MAC sub-layer
To handle issues of MAC addressing, and other issues to do with avoiding collision of frames transmitted by two or more Layer 2 devices, Layer 2 is broken into a MAC sub-layer and a Logical Link Control (LLC) sub-layer.
The protocols discussed so far are those implemented by the LLC sub-layer. The MAC sub-layer abstracts the complexity of medium access from the LLC sub-layer.
Channels
Recall that the Physical Layer is responsible for transmitting a bit stream on the transmission medium. The Physical Layer may share several bit streams over the one transmission medium using either TDM or FDM. In this way the transmission medium has been divided into a number of channels. For a particular (shared) transmission medium, each MAC sub-layer has access to the same set of channels. The problem for the MAC sub-layer is to decide how a particular channel will be shared.
MAC sub-layer protocols
What protocol can the MAC sub-layers implement to ensure that channels are used efficiently? (read Section 4.1 and 4.2 of your textbook)
You will see that the protocol of choice is one which takes into consideration the types of channels and also factors such as the propagation delay, frame length, error rate and the frame rate or load.
Classifying MAC protocols
There are two kinds of MAC protocols:
- contention protocols: these protocols allow the possibility of frames colliding.
- contention-less protocols: these protocols don't allow the possibility of frames colliding.
The choice of a contention or contention-less protocol is mainly made on the basis of the transmission medium. In other words different MAC sub-layer protocols have been developed according to the type of Physical Layer.
Logical Token Bus
Logical Token Bus is a contention-less protocol. With this protocol the MAC sub-layers pass around a token and only the MAC sub-layer with the token can use the channel. When a particular MAC sub-layer has finished sending a frame, it sends the token to the next MAC sub-layer, and so on. If a MAC sub-layer has no frame to transmit then it simply passes the token to the next MAC sub-layer. Clearly the protocol can be applied for both half duplex and full duplex channels.
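Token passing can be sketched as a loop over the stations. The function name, the fixed station order and the rule of one frame per token hold are assumptions chosen to match the description above; token loss and stations joining or leaving are not modelled.

```python
from collections import deque
from typing import List, Tuple


def token_bus(queues: List[List[str]], rounds: int) -> List[Tuple[int, str]]:
    """Pass the token around the stations and return (station, frame) in send order."""
    pending = [deque(q) for q in queues]
    sent: List[Tuple[int, str]] = []
    holder = 0                                   # station currently holding the token
    for _ in range(rounds * len(pending)):
        if pending[holder]:
            # only the token holder may use the channel: send one frame
            sent.append((holder, pending[holder].popleft()))
        holder = (holder + 1) % len(pending)     # pass the token to the next station
    return sent


if __name__ == "__main__":
    print(token_bus([["a1", "a2"], [], ["c1"]], rounds=2))
```

Because only one station can hold the token, two frames can never be on the channel at once, but an idle station still costs a token pass.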
CSMA
To overcome the low efficiency of the Logical Token Bus protocol the Carrier Sense Multiple Access (CSMA) protocol can be used. In this case devices are allowed to transmit a frame at any time (multiple access). However, a device will not transmit a frame while it is receiving a frame (carrier sense).
For the case of persistent CSMA, if a device detects that a frame is being received it will send its own frame immediately after the frame has been received. In the case of non-persistent CSMA the device will wait for some random period before transmitting.
Collisions can occur so this protocol is a contention protocol. If collision detection (CD) is used then colliding frames can be aborted.
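The difference between the persistent and non-persistent variants can be sketched as a decision about when to start transmitting. This is a simplified model with assumed names: the channel is busy with a single frame until `busy_until`, and a non-persistent device backs off for a random time of up to `max_backoff` and then senses the channel again.

```python
import random
from typing import Optional


def csma_start_time(now: float, busy_until: float, persistent: bool,
                    max_backoff: float = 1.0,
                    rng: Optional[random.Random] = None) -> float:
    """Return the time at which a device with a frame to send starts transmitting."""
    if not max_backoff > 0:
        raise ValueError("max_backoff must be positive")
    rng = rng or random.Random()
    t = now
    while busy_until > t:                        # carrier sensed: the channel is busy
        if persistent:
            t = busy_until                       # transmit the moment the channel is free
        else:
            t += rng.uniform(0.0, max_backoff)   # back off, then sense again
    return t


if __name__ == "__main__":
    print(csma_start_time(1.0, 3.0, persistent=True))
    print(csma_start_time(1.0, 3.0, persistent=False, rng=random.Random(0)))
```

Two persistent devices waiting on the same frame would both pick the same start time and collide, whereas random back-off makes that less likely at the cost of some idle channel time.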
ALOHA
ALOHA is a contention protocol and is used when carrier sense is not available. Mainly this is for satellite communication. The protocol allows devices to transmit at any time. Read your textbook to see the analysis of ALOHA for the cases when slotted and unslotted ALOHA are used.
<AX25>
3.8. Amateur Radio
The Linux kernel has built-in support for amateur radio protocols.
Especially interesting is the AX.25 support. The AX.25 protocol offers
both connected and connectionless modes of operation, and is used
either by itself for point-point links, or to carry other protocols
such as TCP/IP and NetRom.
It is similar to X.25 level 2 in structure, with some extensions to
make it more useful in the amateur radio environment.
Amateur radio on Linux web site: http://radio.linux.org.au/
</AX25>
NDIS and ODI
The Network Device Interface Specification (NDIS) is a standard developed
by Microsoft and IBM to enable communication between protocols and network
card drivers. The purpose of NDIS is to abstract the functions of the
network driver so that protocols can work with any driver. NDIS works
within the data link layer of the OSI model.
NDIS allows software components to be written in a modular fashion, and
components that conform to a version of the NDIS specification are
guaranteed to communicate with each other. The current version of NDIS
is 4.0.
The process of assigning a protocol to a network card is called binding.
NDIS allows multiple protocols to be bound to a single network card,
and multiple network cards to be bound to a single protocol (or multiple
protocols).
ODI (Open Datalink Interface), developed by Novell and Apple, is an
implementation of the same functionality. While designed primarily for
the IPX protocol, ODI can be used with any protocol. Netware clients and
servers can have network cards bound to multiple protocols. Microsoft's
implementation of the IPX protocol, NWLink, also supports the ODI standard.
</sect1>
<sect1 id="Appletalk">
<title>Appletalk</title>
<para>
Appletalk is the network architecture/internetworking stack developed
by Apple to work with Macintosh computers. It allows a peer-to-peer
network model which provides basic functionality such as file and printer
sharing. Each machine can simultaneously act as a client and a server,
and the software and hardware necessary are included with every Apple
computer. Appletalk actually supports three network transports:
Ethernet, Token Ring, and a dedicated system called Localtalk.
</para>
<para>
LocalTalk is traditionally wired in a star or hybrid topology using custom
connectors and STP cable. A popular third-party system allows ordinary phone
cable to be used instead of STP. LocalTalk supports up to 32 nodes per network.
The implementations of Ethernet and Token Ring (EtherTalk and TokenTalk)
support more sophisticated networks. LocalTalk uses the CSMA/CA access method.
Rather than detect collisions as with Ethernet, this method requires nodes to
wait a certain amount of time after detecting an existing signal on the network
before attempting to transmit, avoiding most collisions.
</para>
<para>
Linux provides full Appletalk networking. Netatalk is a kernel-level
implementation of the AppleTalk Protocol Suite, originally for BSD-
derived systems. It includes support for routing AppleTalk, serving
Unix and AFS filesystems over AFP (AppleShare), serving Unix printers
and accessing AppleTalk printers over PAP. Linux systems just show up
as another Macintosh on the network.
</para>
<para>
To enable the Appletalk ( AF_APPLETALK ) protocol in the kernel
please add the following options to your kernel configuration.
The Appletalk support has no special device names as it uses
existing network devices.
</para>
<para>
<screen>
Kernel Compile Options:
Networking options --->
<*> Appletalk DDP
</screen>
</para>
<para>
Appletalk support allows your Linux machine to interwork with Apple
networks. An important use for this is to share resources such as
printers and disks between both your Linux and Apple computers.
Additional software is required; this is called netatalk. Wesley Craig
netatalk@umich.edu represents a team called the `Research Systems Unix
Group' at the University of Michigan and they have produced the
netatalk package which provides software that implements the Appletalk
protocol stack and some useful utilities. The netatalk package will
either have been supplied with your Linux distribution, or you will
have to ftp it from its home site at the University of Michigan.
</para>
<para>
To build and install the package do something like:
</para>
<para>
<screen>
user% tar xvfz .../netatalk-1.4b2.tar.Z
user% make
root# make install
</screen>
</para>
<para>
You may want to edit the `Makefile' before calling make to actually
compile the software. Specifically, you might want to change the
DESTDIR variable which defines where the files will be installed
later. The default of /usr/local/atalk is fairly safe.
</para>
8.2.1. Configuring the Appletalk software.
<para>
The first thing you need to do to make it all work is to ensure that
the appropriate entries in the /etc/services file are present. The
entries you need are:
</para>
<para>
<screen>
rtmp 1/ddp # Routing Table Maintenance Protocol
nbp 2/ddp # Name Binding Protocol
echo 4/ddp # AppleTalk Echo Protocol
zip 6/ddp # Zone Information Protocol
</screen>
</para>
<para>
The next step is to create the Appletalk configuration files in the
/usr/local/atalk/etc directory (or wherever you installed the
package).
</para>
<para>
The first file to create is the /usr/local/atalk/etc/atalkd.conf file.
Initially this file needs only one line that gives the name of the
network device that supports the network that your Apple machines are
on:
</para>
<para>
<screen>
eth0
</screen>
</para>
<para>
The Appletalk daemon program will add extra details after it is run.
</para>
8.2.2. Exporting Linux filesystems via Appletalk.
<para>
You can export filesystems from your Linux machine to the network so
that Apple machines on the network can share them.
</para>
<para>
To do this you must configure the
/usr/local/atalk/etc/AppleVolumes.system file. There is another
configuration file called /usr/local/atalk/etc/AppleVolumes.default
which has exactly the same format and describes which filesystems
users connecting with guest privileges will receive.
</para>
<para>
Full details on how to configure these files and what the various
options are can be found in the afpd man page.
</para>
<para>
A simple example might look like:
</para>
<para>
<screen>
/tmp Scratch
/home/ftp/pub "Public Area"
</screen>
</para>
<para>
Which would export your /tmp filesystem as AppleShare Volume `Scratch'
and your ftp public directory as AppleShare Volume `Public Area'. The
volume names are not mandatory, the daemon will choose some for you,
but it won't hurt to specify them anyway.
</para>
8.2.3. Sharing your Linux printer across Appletalk.
<para>
You can share your linux printer with your Apple machines quite
simply. You need to run the papd program which is the Appletalk
Printer Access Protocol Daemon. When you run this program it will
accept requests from your Apple machines and spool the print job to
your local line printer daemon for printing.
</para>
<para>
You need to edit the /usr/local/atalk/etc/papd.conf file to configure
the daemon. The syntax of this file is the same as that of your usual
/etc/printcap file. The name you give to the definition is registered
with the Appletalk naming protocol, NBP.
</para>
<para>
A sample configuration might look like:
</para>
<para>
<screen>
TricWriter:\
:pr=lp:op=cg:
</screen>
</para>
<para>
Which would make a printer named `TricWriter' available to your
Appletalk network and all accepted jobs would be printed to the linux
printer `lp' (as defined in the /etc/printcap file) using lpd. The
entry `op=cg' says that the linux user `cg' is the operator of the
printer.
</para>
8.2.4. Starting the appletalk software.
<para>
Ok, you should now be ready to test this basic configuration. There is
an rc.atalk file supplied with the netatalk package that should work
ok for you, so all you should have to do is:
</para>
<para>
<screen>
root# /usr/local/atalk/etc/rc.atalk
</screen>
</para>
<para>
and all should start up and run ok. You should see no error messages
and the software will send messages to the console indicating each
stage as it starts.
</para>
8.2.5. Testing the appletalk software.
<para>
To test that the software is functioning properly, go to one of your
Apple machines, pull down the Apple menu, select the Chooser, click on
AppleShare, and your Linux box should appear.
</para>
8.2.6. Caveats of the appletalk software.
- You may need to start the Appletalk support before you configure
your IP network. If you have problems starting the Appletalk
programs, or if after you start them you have trouble with your IP
network, then try starting the Appletalk software before you run
your /etc/rc.d/rc.inet1 file.
- The afpd (Apple Filing Protocol Daemon) severely messes up your
hard disk. Below the mount points it creates a couple of
directories called ``.AppleDesktop'' and Network Trash Folder.
Then, for each directory you access it will create a .AppleDouble
below it so it can store resource forks, etc. So think twice before
exporting /, you will have a great time cleaning up afterwards.
- The afpd program expects clear text passwords from the Macs.
Security could be a problem, so be very careful when you run this
daemon on a machine connected to the Internet, you have yourself to
blame if somebody nasty does something bad.
- The existing diagnostic tools such as netstat and ifconfig don't
support Appletalk. The raw information is available in the
/proc/net/ directory if you need it.
8.2.7. More information
<para>
For a much more detailed description of how to configure Appletalk for
Linux refer to Anders Brownworth's Linux Netatalk-HOWTO page at
thehamptons.com.
</para>
<para>
Netatalk faq and HOWTO:
</para>
<para>
- http://thehamptons.com/anders/netatalk/
- http://www.umich.edu/~rsug/netatalk/
- http://www.umich.edu/~rsug/netatalk/faq.html
</para>
</sect1>
<sect1 id="ARCnet">
<title>ARCnet</title>
<para>
ARCnet, developed in 1977 by Datapoint Corporation, is an older standard
that has largely been replaced by Ethernet in current networks. ARCnet
uses RG-62 coaxial cable in a star, bus, or hybrid physical topology. This
networking scheme supports active hubs and passive hubs; a passive hub must
be connected to an active hub. ARCnet requires 93-ohm terminators at the end of bus
cables, and on unused ports of passive hubs. It supports UTP, coaxial, or
fiber-optic cable. The distance between nodes is 400 feet with UTP cable,
and higher for coaxial or fiber-optic cable.
</para>
<para>
ARCnet uses a token-passing scheme similar to that of token ring. ARCnet
networks support a bandwidth of 2.5 Mbps. Newer standards (ARCnet Plus and
TCNS) support speeds of 20 Mbps and 100 Mbps, but have not really caught on.
</para>
<para>
ARCNet device names are `arc0e', `arc1e', `arc2e' etc. or `arc0s',
`arc1s', `arc2s' etc. The first card detected by the kernel is
assigned `arc0e' or `arc0s' and the rest are assigned sequentially in
the order they are detected. The letter at the end signifies whether
you've selected ethernet encapsulation packet format or RFC1051 packet
format.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
[*] Network device support
<*> ARCnet support
[ ] Enable arc0e (ARCnet "Ether-Encap" packet format)
[ ] Enable arc0s (ARCnet RFC1051 packet format)
</screen>
</para>
<para>
Once you have your kernel properly built to support your ARCnet card
then configuration of the card is easy.
</para>
<para>
Typically you would use something like:
</para>
<para>
<screen>
root# ifconfig arc0e 192.168.0.1 netmask 255.255.255.0 up
root# route add -net 192.168.0.0 netmask 255.255.255.0 arc0e
</screen>
</para>
<para>
Please refer to the /usr/src/linux/Documentation/networking/arcnet.txt
and /usr/src/linux/Documentation/networking/arcnet-hardware.txt files
for further information.
</para>
<para>
ARCNet support on Linux was developed by Avery Pennarun, apenwarr@foxnet.net.
</para>
</sect1>
<sect1 id="ATM">
<title>ATM</title>
<para>
ATM (Asynchronous Transfer Mode) is a high speed packet switching format
that supports up to 622 Mbps. ATM can be used with T1 and T3 lines, FDDI,
and SONET OC1 and OC3 lines. ATM uses a technology called cell switching.
Data is sent in 53-byte packets called cells. Because packets are small and
uniform in size, they can be quickly routed by hardware switches. ATM uses
a virtual circuit between connection points for high reliability over
high-speed links.
</para>
<para>
ATM support for Linux is currently in pre-alpha stage. There is an
experimental release, which supports raw ATM connections (PVCs and
SVCs), IP over ATM, LAN emulation....
</para>
<para>
The ATM on Linux home page is at http://lrcwww.epfl.ch/linux-atm/
</para>
<para>
Werner Almesberger &lt;werner.almesberger@lrc.di.epfl.ch&gt; is managing a
project to provide Asynchronous Transfer Mode support for Linux.
Current information on the status of the project may be obtained from,
http://lrcwww.epfl.ch
</para>
</sect1>
<sect1 id="DDS-Switched56">
<title>DDS-Switched56</title>
<para>
DDS (Digital Data Service) and Switched 56 are types of dedicated
digital line provided by phone carriers. DDS lines are more
expensive than dedicated analog lines, but offer more consistent quality.
DDS lines support a speed of 56 Kbps. A device called a CSU/DSU (Channel
Service Unit/Digital Service Unit) is used to connect the network to the
dedicated line.
</para>
<para>
Switched 56 is an alternative to DDS that provides the same type of
connection, but in a circuit-switched format. The line is available
on demand rather than continuously, and you are billed for the hours that
you use it. ISDN has largely replaced Switched 56 for this purpose.
</para>
</sect1>
<sect1 id="DECnet">
<title>DECnet</title>
<para>
Support for DECnet is currently being worked on. You should expect it
to appear in a late 2.1.* kernel.
</para>
</sect1>
<sect1 id="DLC">
<title>DLC</title>
<para>
DLC (Data Link Control) is a transport protocol developed by IBM for SNA
(Systems Network Architecture), a protocol suite for network communication
with mainframe computers. Particular versions of DLC are called SDLC
(Synchronous Data Link Control) and HDLC (High-level Data Link Control).
Along with its main uses in mainframe communication, DLC is the protocol
used by many network-aware printers such as Hewlett-Packard's JetDirect
interface.
</para>
</sect1>
<sect1 id="EQL">
<title>EQL</title>
<para>
EQL provides a means of utilizing multiple point to point lines such
as PPP, SLIP or PLIP as a single logical link to carry TCP/IP. Often,
it is cheaper to use multiple lower speed lines than to have one high
speed line installed. In short, EQL is a multiple line traffic equaliser.
</para>
<para>
EQL is integrated into the Linux kernel. The EQL device name is `eql'.
With the standard kernel source you may have only one EQL device per
machine.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
[*] Network device support
<*> EQL (serial line load balancing) support
</screen>
</para>
<para>
To support this mechanism the machine at the other end of the lines
must also support EQL. Linux, Livingston PortMasters and newer dial-in
servers support compatible facilities.
</para>
<para>
To configure EQL you will need the EQL tools which are available from:
metalab.unc.edu.
</para>
<para>
Configuration is fairly straightforward. You start by configuring the
eql interface. The eql interface is just like any other network
device. You configure the IP address and mtu using the ifconfig
utility, so something like:
</para>
<para>
<screen>
root# ifconfig eql 192.168.10.1 mtu 1006
</screen>
</para>
<para>
Next you need to manually initiate each of the lines you will use.
These may be any combination of point to point network devices. How
you initiate the connections will depend on what sort of link they
are; refer to the appropriate sections for further information.
</para>
<para>
Lastly you need to associate the serial link with the EQL device. This
is called `enslaving' and is done with the eql_enslave command as
shown:
</para>
<para>
<screen>
root# eql_enslave eql sl0 28800
root# eql_enslave eql ppp0 14400
</screen>
</para>
<para>
The `estimated speed' parameter you supply eql_enslave doesn't do
anything directly. It is used by the EQL driver to determine what
share of the datagrams that device should receive, so you can fine
tune the balancing of the lines by playing with this value.
</para>
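<para>
As a rough illustration of how the speed values relate to each other, the
sketch below computes the share of traffic each line would carry if the load
were split in simple proportion to the estimated speeds. That proportional
split is an assumption made for the example; the real driver's scheduling is
more involved. eql_share is a hypothetical helper, not part of the EQL tools.
</para>

```shell
#!/bin/sh
# Hypothetical helper (not part of the EQL tools): estimate the percentage
# of traffic one line would carry if load were split in proportion to the
# `estimated speed' values given to eql_enslave.
# usage: eql_share LINE_SPEED ALL_SPEEDS...
eql_share() {
    line=$1
    shift
    total=0
    for speed in "$@"; do
        total=$((total + speed))
    done
    # Integer arithmetic, so the result is rounded down.
    echo $((line * 100 / total))
}

# The two lines from the eql_enslave example above.
echo "sl0:  $(eql_share 28800 28800 14400)%"
echo "ppp0: $(eql_share 14400 28800 14400)%"
```

<para>
With the values from the example, sl0 would carry roughly two thirds of the
datagrams and ppp0 one third. Raising or lowering one of the speed values
shifts that balance.
</para>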
<para>
To disassociate a line from an EQL device you use the eql_emancipate
command as shown:
</para>
<para>
<screen>
root# eql_emancipate eql sl0
</screen>
</para>
<para>
You add routing as you would for any other point to point link, except
your routes should refer to the eql device rather than the actual
serial devices themselves. Typically you would use:
</para>
<para>
<screen>
root# route add default eql
</screen>
</para>
<para>
The EQL driver was developed by Simon Janes, simon@ncm.com.
</para>
</sect1>
<sect1 id="Ethernet">
<title>Ethernet</title>
Ethernet
<para>
Ethernet is the most common network architecture worldwide. It was developed by Xerox
in the 1970s, published as a joint standard by DEC, Intel and Xerox in 1980, and revised
as Ethernet 2.0 in 1982. Ethernet networks use the CSMA/CD (carrier sense multiple access
with collision detection) media access method, defined in IEEE 802.3.
</para>
<para>
There are three Ethernet standards for different media
(see P41 of Oreilly "MSCE Networking"):
10BaseT (twisted-pair cable)
10Base2 (thin coaxial cable)
10Base5 (thick coaxial cable)
</para>
Fast Ethernet
<para>
Fast Ethernet, also known as 100BaseT, is a new standard for 100 Mbps Ethernet. Fast Ethernet
can use two-pair Category 5 cable or four-pair Category 3-5 cable.
100BaseT uses a physical star topology identical to that used by 10BaseT, but requires that
all equipment (hubs, NICs, and repeaters) support 100 Mbps speeds. Some NICs and hubs can support
both standards, but all devices on the network need to be configured to use the same standard.
</para>
<para>
Several manufacturers developed 100 Mbps Ethernet devices before 100BaseT became a standard. The
most popular of these, 100VG-AnyLAN, is still widely used. This standard uses a demand priority
access method rather than CSMA/CD, and also supports networks that combine Ethernet and Token
Ring packets.
</para>
GigE
<para>
GigE, also known as 1000BaseT or Gigabit Ethernet, runs at 1000 Mbps. 1000BaseT requires
four-pair Category 5 or better cable. GigE uses the same physical star topology as Fast
Ethernet and, like Fast Ethernet, requires the hubs or switches on the LAN to be GigE
capable. If they are not, the link falls back to 100BaseT and, if that is not available
either, to 10BaseT.
</para>
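<para>
The fallback behaviour can be pictured as choosing the fastest speed that both
ends of a link support. The sketch below is only an illustration of that idea,
assuming each device supports every standard speed up to its own maximum. Real
hardware settles this through auto-negotiation, and common_speed is a made-up
name.
</para>

```shell
#!/bin/sh
# Illustration only: pick the fastest speed (in Mbps) that both ends of a
# link support, assuming each end handles every speed up to its maximum.
# usage: common_speed NIC_MAX SWITCH_MAX
common_speed() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# A GigE card plugged into a GigE, Fast Ethernet and 10BaseT device in turn.
common_speed 1000 1000
common_speed 1000 100
common_speed 1000 10
```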
* Ethernet-Howto
</sect1>
<sect1 id="FDDI">
<title>FDDI</title>
<para>
FDDI (Fiber Distributed Data Interface) is a high-speed, reliable, long-distance
networking scheme often used for network backbones and networks that require
high bandwidth. FDDI uses fiber optic cable wired in a true ring. It supports
speeds up to 100 Mbps and a maximum ring length of 100 kilometers
(62 miles).
</para>
<para>
FDDI uses a token-passing scheme wired into two rings, primary and secondary. The
primary ring is used for normal networking. When a failure is detected, the
secondary ring is used in the opposite direction to compensate for the failure
in the primary ring.
</para>
<para>
The advantages of FDDI are its high speed, long distance, and reliability.
The token-passing scheme used by FDDI is also more sophisticated than that
of Token Ring: it allows multiple packets to be on the ring at once, and
allows certain nodes to be given higher priority than the rest. The
disadvantages of FDDI are its high cost and the difficulty of installing and
maintaining fiber optic cable.
</para>
<para>
FDDI device names are `fddi0', `fddi1', `fddi2' etc. The first card
detected by the kernel is assigned `fddi0' and the rest are assigned
sequentially in the order they are detected.
</para>
<para>
Larry Stefani, lstefani@ultranet.com, has developed a driver for the
Digital Equipment Corporation FDDI EISA and PCI cards.
</para>
<para>
When you have your kernel built to support the FDDI driver and
installed (the compilation options are given below), configuration
of the FDDI interface is virtually identical to that of an Ethernet
interface. You just need to replace the Ethernet interface
names with appropriate FDDI interface names in the ifconfig and route commands.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
[*] FDDI driver support
[*] Digital DEFEA and DEFPA adapter support
</screen>
</para>
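<para>
To make the interface name substitution concrete, the following sketch prints
the same pair of ifconfig and route commands once for an Ethernet interface and
once for an FDDI interface. It is a dry run that only echoes the commands. The
addresses and netmask are example values, and show_iface_setup is a
hypothetical helper.
</para>

```shell
#!/bin/sh
# Dry run: print the commands rather than running them, since they need
# root privileges and real hardware. All addresses are example values.
# usage: show_iface_setup INTERFACE ADDRESS NETMASK NETWORK
show_iface_setup() {
    echo "ifconfig $1 $2 netmask $3 up"
    echo "route add -net $4 netmask $3 dev $1"
}

# The same configuration, first for Ethernet and then for FDDI.
show_iface_setup eth0  192.168.20.1 255.255.255.0 192.168.20.0
show_iface_setup fddi0 192.168.20.1 255.255.255.0 192.168.20.0
```

<para>
Only the interface name differs between the two sets of commands.
</para>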
</sect1>
<sect1 id="Frame-Relay">
<title>Frame-Relay</title>
<para>
Frame relay is a protocol used with leased lines to support speeds up to
1.544 Mbps. Frame relay uses packet switching over a phone company's
network. Frame relay connections use a virtual circuit, called
a PVC (permanent virtual circuit), to establish connections. Once established,
connections have low overhead and do not provide error correction.
</para>
<para>
A frame relay compatible router is used to attach the LAN to the frame
relay line. Frame relay lines are available in speeds ranging from 56 Kbps
to 1.544 Mbps, and varying proportionally in cost. One advantage of frame
relay is that bandwidth is available on demand: you can install a line
at 56 Kbps and later upgrade it to a higher speed by ordering the service
from the carrier, usually without replacing any equipment.
</para>
<para>
Frame Relay was specifically designed for, and is well suited to, data
communications traffic that is of a `bursty' or intermittent nature. You
connect to a Frame Relay network using a Frame Relay Access Device (FRAD).
Linux Frame Relay support implements IP over Frame Relay as described in RFC-1490.
</para>
<para>
The Frame Relay device names are `dlci00', `dlci01' etc for the DLCI
encapsulation devices and `sdla0', `sdla1' etc for the FRAD(s).
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
<*> Frame relay DLCI support (EXPERIMENTAL)
(24) Max open DLCI
(8) Max DLCI per device
<*> SDLA (Sangoma S502/S508) support
</screen>
</para>
<para>
Mike McLagan, mike.mclagan@linux.org, developed the Frame Relay
support and configuration tools.
</para>
<para>
Currently the only FRADs supported are the Sangoma Technologies S502A,
S502E and S508.
</para>
<para>
To configure the FRAD and DLCI devices after you have rebuilt your
kernel you will need the Frame Relay configuration tools. These are
available from ftp.invlogic.com. Compiling and installing the tools
is straightforward, but the lack of a top level Makefile makes it a
fairly manual process:
</para>
<para>
<screen>
user% tar xvfz .../frad-0.15.tgz
user% cd frad-0.15
user% for i in common dlci frad; do make -C $i clean; make -C $i; done
root# mkdir /etc/frad
root# install -m 644 -o root -g root bin/*.sfm /etc/frad
root# install -m 700 -o root -g root frad/fradcfg /sbin
root# install -m 700 -o root -g root dlci/dlcicfg /sbin
</screen>
</para>
<para>
Note that the previous commands use sh syntax. If you use a csh
flavour instead (like tcsh), the for loop will look different.
</para>
<para>
After installing the tools you need to create an /etc/frad/router.conf
file. You can use this template, which is a modified version of one of
the example files:
</para>
<para>
<screen>
# /etc/frad/router.conf
# This is a template configuration for frame relay.
# All tags are included. The default values are based on the code
# supplied with the DOS drivers for the Sangoma S502A card.
#
# A '#' anywhere in a line constitutes a comment
# Blanks are ignored (you can indent with tabs too)
# Unknown [] entries and unknown keys are ignored
#
[Devices]
Count=1 # number of devices to configure
Dev_1=sdla0 # the name of a device
#Dev_2=sdla1 # the name of a device
# Specified here, these are applied to all devices and can be overridden for
# each individual board.
#
Access=CPE
Clock=Internal
KBaud=64
Flags=TX
#
# MTU=1500 # Maximum transmit IFrame length, default is 4096
# T391=10 # T391 value 5 - 30, default is 10
# T392=15 # T392 value 5 - 30, default is 15
# N391=6 # N391 value 1 - 255, default is 6
# N392=3 # N392 value 1 - 10, default is 3
# N393=4 # N393 value 1 - 10, default is 4
# Specified here, these set the defaults for all boards
# CIRfwd=16 # CIR forward 1 - 64
# Bc_fwd=16 # Bc forward 1 - 512
# Be_fwd=0 # Be forward 0 - 511
# CIRbak=16 # CIR backward 1 - 64
# Bc_bak=16 # Bc backward 1 - 512
# Be_bak=0 # Be backward 0 - 511
#
#
# Device specific configuration
#
#
#
# The first device is a Sangoma S502E
#
[sdla0]
Type=Sangoma # Type of the device to configure, currently only
# SANGOMA is recognized
#
# These keys are specific to the 'Sangoma' type
#
# The type of Sangoma board - S502A, S502E, S508
Board=S502E
#
# The name of the test firmware for the Sangoma board
# Testware=/usr/src/frad-0.10/bin/sdla_tst.502
#
# The name of the FR firmware
# Firmware=/usr/src/frad-0.10/bin/frm_rel.502
#
Port=360 # Port for this particular card
Mem=C8 # Address of memory window, A0-EE, depending on card
IRQ=5 # IRQ number, do not supply for S502A
DLCIs=1 # Number of DLCI's attached to this device
DLCI_1=16 # DLCI #1's number, 16 - 991
# DLCI_2=17
# DLCI_3=18
# DLCI_4=19
# DLCI_5=20
#
# Specified here, these apply to this device only,
# and override defaults from above
#
# Access=CPE # CPE or NODE, default is CPE
# Flags=TXIgnore,RXIgnore,BufferFrames,DropAborted,Stats,MCI,AutoDLCI
# Clock=Internal # External or Internal, default is Internal
# Baud=128 # Specified baud rate of attached CSU/DSU
# MTU=2048 # Maximum transmit IFrame length, default is 4096
# T391=10 # T391 value 5 - 30, default is 10
# T392=15 # T392 value 5 - 30, default is 15
# N391=6 # N391 value 1 - 255, default is 6
# N392=3 # N392 value 1 - 10, default is 3
# N393=4 # N393 value 1 - 10, default is 4
#
# The second device is some other card
#
# [sdla1]
# Type=FancyCard # Type of the device to configure.
# Board= # Type of Sangoma board
# Key=Value # values specific to this type of device
#
# DLCI Default configuration parameters
# These may be overridden in the DLCI specific configurations
#
CIRfwd=64 # CIR forward 1 - 64
# Bc_fwd=16 # Bc forward 1 - 512
# Be_fwd=0 # Be forward 0 - 511
# CIRbak=16 # CIR backward 1 - 64
# Bc_bak=16 # Bc backward 1 - 512
# Be_bak=0 # Be backward 0 - 511
#
# DLCI Configuration
# These are all optional. The naming convention is
# [DLCI_D<devicenum>_<DLCI_Num>]
#
[DLCI_D1_16]
# IP=
# Net=
# Mask=
# Flags defined by Sangoma: TXIgnore,RXIgnore,BufferFrames
# DLCIFlags=TXIgnore,RXIgnore,BufferFrames
# CIRfwd=64
# Bc_fwd=512
# Be_fwd=0
# CIRbak=64
# Bc_bak=512
# Be_bak=0
[DLCI_D2_16]
# IP=
# Net=
# Mask=
# Flags defined by Sangoma: TXIgnore,RXIgnore,BufferFrames
# DLCIFlags=TXIgnore,RXIgnore,BufferFrames
# CIRfwd=16
# Bc_fwd=16
# Be_fwd=0
# CIRbak=16
# Bc_bak=16
# Be_bak=0
</screen>
</para>
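<para>
The section naming convention for the per-DLCI blocks can be checked with a
small helper. This is a sketch only: dlci_section is a hypothetical function,
not part of the frad tools, and it assumes that the device number in the
section name is the index used in the Dev_N keys of the [Devices] section.
</para>

```shell
#!/bin/sh
# Hypothetical helper, not part of the frad tools: build the section header
# for one DLCI from the device index (assumed to be the N in Dev_N) and the
# DLCI number.
# usage: dlci_section DEVICE_INDEX DLCI_NUMBER
dlci_section() {
    # The template gives 16 - 991 as the valid range for DLCI numbers.
    if [ "$2" -lt 16 ] || [ "$2" -gt 991 ]; then
        return 1
    fi
    echo "[DLCI_D${1}_${2}]"
}

# The two sections used in the template.
dlci_section 1 16
dlci_section 2 16
```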
<para>
When you've built your /etc/frad/router.conf file the only step
remaining is to configure the actual devices themselves. This is only
a little trickier than a normal network device configuration: you need
to remember to bring up the FRAD device before the DLCI encapsulation
devices. These commands are best kept in a shell script, due to
their number:
</para>
<para>
<screen>
#!/bin/sh
# Configure the frad hardware and the DLCI parameters
/sbin/fradcfg /etc/frad/router.conf || exit 1
/sbin/dlcicfg file /etc/frad/router.conf
#
# Bring up the FRAD device
ifconfig sdla0 up
#
# Configure the DLCI encapsulation interfaces and routing
ifconfig dlci00 192.168.10.1 pointopoint 192.168.10.2 up
route add -net 192.168.10.0 netmask 255.255.255.0 dlci00
#
ifconfig dlci01 192.168.11.1 pointopoint 192.168.11.2 up
route add -net 192.168.11.0 netmask 255.255.255.0 dlci01
#
route add default dev dlci00
#
</screen>
</para>
</sect1>
<sect1 id="NetBEUI">
<title>NetBEUI</title>
<para>
NetBEUI (NetBIOS Extended User Interface) is a transport-layer protocol
developed by Microsoft and IBM. NetBEUI was mainly intended as a basic
protocol to support NetBIOS (Network Basic Input/Output System), the
Windows standard for workstation naming, communications, and file sharing.
</para>
<para>
NetBEUI is a fast protocol with a low overhead, which makes it a good
choice for small networks. However, it is a non-routable protocol.
Networks that use NetBEUI can use bridges for traffic management,
but cannot use routers. Another disadvantage is its proprietary nature.
NetBEUI is supported by few systems other than Windows.
</para>
<para>
Although NetBEUI was developed by Microsoft and was the default protocol
for some operating systems (such as Windows for Workgroups and Windows 95),
Microsoft recommends TCP/IP over NetBEUI for most Windows NT networks.
</para>
</sect1>
<sect1 id="IPX">
<title>IPX</title>
<para>
IPX and SPX are proprietary protocols that were developed during the
early 1980s by Novell for use in NetWare networks. They were based on
protocols used in Xerox's XNS (Xerox Network Systems) network architecture.
NetWare became the de facto standard network operating system (NOS) of
first generation LANs. Novell complemented its NOS with a
business-oriented application suite and client-side connection utilities.
IPX (Internetwork Packet Exchange) is a connectionless protocol that works
at the network layer of the OSI model, and SPX (Sequenced Packet Exchange)
is a connection-oriented protocol that works at the transport layer.
</para>
<para>
These protocols are often easier to configure than TCP/IP and are routable,
so they make a good alternative for some networks, particularly small
peer-to-peer networks. However, TCP/IP is more suitable for larger
LANs and WANs.
</para>
<para>
Frame types are one aspect of IPX networks that sometimes does require
configuration. The frame type determines the order and type of data included
in the packet. Typical frame types used in NetWare networks are
802.2 and 802.3.
</para>
<para>
Linux has a very clean IPX/SPX implementation, allowing it to be
configured as:
· IPX router
· IPX bridge
· NCP client and/or NCP Server (for sharing files)
· Novell Print Client, Novell Print Server
And to:
· Enable PPP/IPX, allowing a Linux box to act as a PPP server/client
· Perform IPX tunnelling through IP, allowing the connection of two
  IPX networks through an IP only link
</para>
How do I configure the kernel for IPX networking support?
<para>
<screen>
IPX ( AF_IPX )
Kernel Compile Options:
Networking options --->
[*] The IPX protocol
[ ] Full internal IPX network
</screen>
</para>
* IPX-SPX HOWTO
</sect1>
<sect1 id="Leased-Line">
<title>Leased-Line</title>
<sect>
<para>
Configuring your modem and pppd to use a 2 wire twisted pair leased
line.
</para>
1.2. What is a leased line
<para>
Any fixed, that is permanent, point to point data communications link,
which is leased from a telco or similar organisation. The leased line
involves cables, such as twisted pair, coax or fiber optic, and may
involve all sorts of other hardware such as (pupin) coils,
transformers, amplifiers and regenerators.
</para>
<para>
This document deals with:
Configuring your modem and pppd to use a 2 wire twisted pair
leased line.
</para>
<para>
This document does NOT deal with:
SLIP, getting or installing pppd, synchronous data
communication, baseband modems, xDSL.
</para>
1.3. Assumptions
<para>
You should already have a working pppd on your system. You also need
Minicom or a similar program to configure your modems.
</para>
2. Modem
<para>
A leased line is not connected to a telephone exchange and does not
provide DC power, dial tone, busy tone or ring signal. This means that
your modems are on their own and have to be able to deal with this
situation.
</para>
<para>
You should have 2 identical (including firmware version) external
modems supporting both leased line and dumb mode. Make sure your
modems can actually do this! Also make sure your modem is properly
documented. You also need:
</para>
· 2 fully wired shielded RS232 cables. The shield should be connected
  to the connector shell (not pin 1) at both ends (not at one end).
· An RS232 test plug may be handy for test purposes.
· 2 RJ11 cords, one for each end of the leased line.
· A basic understanding of `AT' commands.
2.1. Modem Configuration
<para>
A note on modem configuration and init strings in general: Configure
your modem software such as minicom or (m)getty to use the highest
possible speed; 57600 bps for 14k4 and 115200 bps for 28k8 or faster
modems. Lots of people use very long and complicated init strings,
often starting with AT&F and containing lots of modem brand and -type
specific commands. This however is needlessly complicated. Most
programs feel happy with the same modem settings, so why not write
these settings in the non volatile memory of all your modems, and only
use `ATZ' as an init string in all your programs. This way you can
swap or upgrade your modems without ever having to reconfigure any of
your software.
</para>
<para>
Most programs require you to use the following settings;
</para>
· Fixed baud rate (no auto baud)
· Hardware bidirectional RTS-CTS flow control (no x-on/x-off)
· 8 Bits, no parity, 1 stopbit
· The modem should produce the TRUE DCD status (&C1)
· The modem should NOT ignore the DTR status (&D2 or &D3)
<para>
Check this with AT&V or AT&Ix (consult your modem documentation)
</para>
<para>
These settings are not necessarily the same as the default factory
profile (&F), so starting an init string with AT&F is probably not a
good idea in the first place. The smart thing to do is probably to use
AT&F only when you have reason to believe that the modem setup stored
in the non volatile memory is really screwed up. If you think you
have found the right setup for your modems, write it to non volatile
memory with AT&W and test it thoroughly with Z-modem file transfers of
both ASCII text and binary files. Only if all of this works perfectly
should you configure your modems for leased line.
</para>
<para>
Find out how to put your modem into dumb mode and, more importantly,
how to get it out of dumb mode; the modem can only be reconfigured
when it is not in dumb mode. Make sure you actually configure your
modems at the highest possible speed. Once in dumb mode it will
ignore all `AT' commands and consequently will not adjust its speed to
that of the COM port, but will use the speed at which it was
configured instead (this speed is stored in an S-register by the AT&W
command).
</para>
<para>
Now configure your modem as follows;
</para>
· Reset on DTR toggle (&D3, this is sometimes an S register). This
  setting is required by some ISPs!
· Leased line mode (&L1 or &L2, consult your modem documentation)
· The remote modem auto answer (S0=1), the local originate (S0=0)
· Disable result codes (Q1, sometimes the dumb mode does this for
  you)
· Dumb mode (\D1 or %D1, this is sometimes a jumper) In dumb mode the
  modem will ignore all AT commands (sometimes you need to disable
  the ESC char as well).
<para>
Write the configuration to non-volatile memory (&W).
</para>
2.2. Test
<para>
Now connect the modems to 2 computers using the RS232 cables and
connect the modems to each other using an RJ11 lead. Use a modem
program such as Minicom (Linux), Procomm or Telix (DOS) on both
computers to test the modems. You should be able to type text from
one computer to the other and vice versa. If the screen produces
garbage check your COM port speed and other settings. Now disconnect
and reconnect the RJ11 cord. Wait for the connection to reestablish
itself. Disconnect and reconnect the RS232 cables, switch the modems
on and off, stop and restart Minicom. The modems should always
reconnect at the highest possible speed (some modems have speed
indicator LEDs). Check whether the modems actually ignore the ESC
(+++) character. If necessary disable the ESC character.
</para>
<para>
If all of this works you may want to reconfigure your modems: switch
off the sound at the remote modem (M0) and put the local modem at low
volume (L1).
</para>
2.3. Examples
2.3.1. Hi-Tech
<para>
This is a rather vague `no name clone modem'. Its config string is
however typical and should work on most modems.
</para>
<para>
Originate (local):
ATL1&C1&D3&L2%D1&W&W1
</para>
<para>
Answer (remote):
ATM0L1&C1&D3&L2%D1S0=1&W&W1
</para>
2.3.2. Tornado FM 228 E
<para>
This is what should work;
</para>
<para>
Originate (local):
ATB15L1Q1&C1&D3&L2&W&W1
</para>
<para>
Answer (remote):
ATM0B15M0Q1&C1&D3&L2S0=1&W&W1
</para>
<para>
Move the dumb jumper from position 2-3 to 1-2.
</para>
<para>
Due to a firmware bug, the modems will only connect after being hard
reset (power off and on) while DTR is high. I designed a circuit which
hard resets the modem on the low to high transition of DTR. The
FreeBSD pppd however, isn't very happy about this. By combining the
setting &D0 with a circuit which resets on the high to low transition
instead, this problem can be avoided.
</para>
2.3.3. Tron DF
<para>
The ESC char should be disabled by setting S2 > 127;
</para>
<para>
Originate:
ATL1&L1Q1&C1&D3S2=171\D1&W
</para>
<para>
Answer:
ATM0&L2Q1&C1&D3S0=1S2=171\D1&W
</para>
2.3.4. US Robotics Courier V-Everything
<para>
The USR Sportster and USR Courier-I do not support leased line. You
need the Courier V-everything version for this job. There is a
webpage on the USR site `explaining' how to set-up your Courier for
leased line. However, if you follow these instructions you will end up
with a completely brain dead modem, which can not be controlled or
monitored by your pppd.
</para>
<para>
The USR Courier can be configured with dip switches, however you need
to feed it the config string first. First make sure it uses the right
factory profile. Unlike most other modems it has three: &F0, &F1 and
&F2. The default, which is also the one you should use, is &F1. If you
send it an AT&F, however, it will load the factory profile &F0! For
the reset on DTR toggle you set bit 0 of S register 13. This means you
have to set S13 to 1. Furthermore you need to set it to leased line mode
with &L1: ATS13=1&L1&W. The dip switches are all default except for the
following:
</para>
<para>
3 OFF Disable result codes
4 ON Disable offline commands
5 ON For originate, OFF For answer
8 OFF Dumb mode
</para>
3. PPPD
<para>
You need a pppd (Point to Point Protocol Daemon) and a reasonable
knowledge of how it works. Consult the relevant RFCs or the Linux PPP
HOWTO if necessary. Since you are not going to use a login procedure,
you don't use (m)getty and you do not need a (fake) user associated
with the pppd controlling your link. You are not going to dial so you
don't need any chat scripts either. In fact, the modem circuit and
configuration you have just built are rather like a fully wired null
modem cable. This means you have to configure your pppd the same way
as you would with a null modem cable.
</para>
<para>
For a reliable link, your setup should meet the following criteria;
</para>
· Shortly after booting your system, pppd should raise the DTR signal
  in your RS232 port, wait for DCD to go up, and negotiate the link.
· If the remote system is down, pppd should wait until it is up
  again.
· If the link is up and then goes down, pppd should reset the modem
  (it does this by dropping and then raising DTR), and then try to
  reconnect.
· If the quality of the link deteriorates too much, pppd should reset
  the modem and then reestablish the link.
· If the process controlling the link, that is the pppd, dies, a
  watchdog should restart the pppd.
3.1. Configuration
<para>
Suppose the modem is connected to COM2, the local IP address is
`Loc_Ip' and the remote IP address is `Rem_Ip'. We want to use 576 as
our MTU. The /etc/ppp/options.ttyS1 would now be:
</para>
<para>
<screen>
crtscts
mru 576
mtu 576
passive
Loc_Ip:Rem_Ip
-chap
modem
#noauth
-pap
persist
</screen>
</para>
<para>
Stuff like `asyncmap 0', `lock', `modem' and `-detach' are probably
already in /etc/ppp/options. If not, add them to your
/etc/ppp/options.ttyS1. So, if the local system is 192.168.1.1 and
the remote system is 10.1.1.1, then /etc/ppp/options.ttyS1 on the
local system would be:
</para>
<para>
<screen>
crtscts
mru 576
mtu 576
passive
192.168.1.1:10.1.1.1
-chap
modem
#noauth
-pap
persist
</screen>
</para>
<para>
The options.ttyS1 on the remote system would be:
</para>
<para>
<screen>
crtscts
mru 576
mtu 576
passive
10.1.1.1:192.168.1.1
-chap
modem
#noauth
-pap
persist
</screen>
</para>
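<para>
The two options files differ only in the order of the two IP addresses. As a
sketch of that symmetry, the hypothetical helper below prints the options for
one end of the link from a local and a remote address. The option list is
copied from the examples above. The function name is made up for this example,
and the output still has to be saved as /etc/ppp/options.ttyS1 by hand.
</para>

```shell
#!/bin/sh
# Sketch: print the pppd options for one end of the leased line.
# The function name is hypothetical; the options are those shown above.
# usage: leased_line_options LOCAL_IP REMOTE_IP
leased_line_options() {
    printf '%s\n' crtscts 'mru 576' 'mtu 576' passive "$1:$2" \
        -chap modem '#noauth' -pap persist
}

echo "# options.ttyS1 on the local system"
leased_line_options 192.168.1.1 10.1.1.1
echo "# options.ttyS1 on the remote system"
leased_line_options 10.1.1.1 192.168.1.1
```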
<para>
The passive option makes pppd wait for the peer to start negotiating
when its own attempt gets no reply, instead of giving up. The
persist option will keep pppd alive in case of a disconnect or when it
can't connect in the first place. If you telnet a lot while doing
file transfers (FTP or web browsing) at the same time, you might want to
use a smaller MTU and MRU such as 296. This will make the remote system
more responsive. If you don't care much about telnetting during
FTP, you could set the MTU and MRU to 1500. Keep in mind though, that
UDP cannot be fragmented. Speakfreely for instance uses 512 byte UDP
packets. So the minimum MTU for speakfreely is 552 bytes. The noauth
option may be necessary with some newer distributions.
</para>
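<para>
The effect of the MTU on responsiveness can be estimated with a little
arithmetic: a keystroke may have to wait behind one full-size packet that is
already being sent. The sketch below computes how long such a packet occupies
the line. It assumes a 28800 bps line and ignores start and stop bits, PPP
framing and compression, so the figures are only rough. packet_ms is a
hypothetical helper.
</para>

```shell
#!/bin/sh
# Rough estimate of how long one full-size packet occupies the line, in
# milliseconds (rounded down). Ignores start/stop bits, PPP framing and
# compression.
# usage: packet_ms MTU_BYTES LINE_BPS
packet_ms() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

# 28800 bps is an assumed line speed for this example.
for mtu in 296 576 1500; do
    echo "MTU $mtu at 28800 bps: about $(packet_ms "$mtu" 28800) ms per packet"
done
```

<para>
Under these assumptions an MTU of 1500 can delay an interactive packet by
about 0.4 seconds, while an MTU of 296 keeps the delay under 0.1 seconds.
</para>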
3.2. Scripts
3.2.1. Starting the pppd and keeping it alive
<para>
You could start the pppd from a boot (rc) script. However, if you do
this, and the pppd dies, you are without a link. A more stable
solution is to start the pppd from /etc/inittab:
</para>
<para>
<screen>
s1:23:respawn:/usr/sbin/pppd /dev/ttyS1 115200
</screen>
</para>
<para>
This way, the pppd will be restarted if it dies. Make sure you have a
`-detach' option (nodetach on newer systems) though, otherwise init
will start numerous instances of pppd while complaining about
`respawning too fast'.
</para>
<para>
Note: Some older systems will not accept the speed `115200'. In this
case you will have to set the speed to 38400 and set the `spd_vhi' flag
with setserial. Some systems expect you to use a `cua' device instead of
a `ttyS' device.
</para>
3.2.2. Setting the routes
<para>
The default route can be set with the defaultroute option or with the
/etc/ppp/ip-up script;
</para>
<para>
<screen>
#!/bin/bash
case $2 in
/dev/ttyS1)
/sbin/route add -net 0.0.0.0 gw Rem_Ip netmask 0.0.0.0
;;
esac
</screen>
</para>
<para>
Ip-up can also be used to sync your clock using netdate.
</para>
<para>
Of course the route set in ip-up is not necessarily the default route.
Your ip-up sets the route to the remote network while the ip-up script
on the remote system sets the route to your network. If your network
is 192.168.1.0 and your ppp interface 192.168.1.1, the ip-up script on
the remote machine looks like this;
</para>
<para>
<screen>
#!/bin/bash
case $2 in
/dev/ttyS1)
/sbin/route add -net 192.168.1.0 gw 192.168.1.1 netmask 255.255.255.0
;;
esac
</screen>
</para>
<para>
The `case $2' and `/dev/ttyS1)' bits are there in case you use more
than one ppp link. Ip-up will run each time a link comes up, but only
the part between `/dev/ttySx)' and `;;' will be executed, setting the
right route for the right ttyS. You can find more about routing in
the Linux Networking HOWTO section on routing.
</para>
3.3. Test
<para>
Test the whole thing just like the modem test. If it works, get on
your bike and bring the remote modem to the remote side of your link.
If it doesn't work, one of the things you should check is the COM port
speed. Apparently, a common mistake is to configure the modems with
Minicom using one speed and then configure the pppd to use another.
This will NOT work! You have to use the same speed all of the time!
</para>
</sect>
<sect>
T1-T4
<para>
A T1 line is a high-speed, dedicated, point-to-point leased line that
includes 24 separate 64 Kbps channels for voice and data. Other lines
of this type, called T-carrier lines, support larger numbers of channels.
T1 and T3 lines are the most commonly used.
</para>
<para>
<screen>
Carrier Channels Total Bandwidth
T1 24 1.544 Mbps
T2 96 6.312 Mbps
T3 672 44.736 Mbps
T4 4032 274.176 Mbps
</screen>
</para>
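<para>
The total bandwidth in the table is slightly more than the channel count
multiplied by 64 Kbps, because each carrier adds framing and multiplexing
overhead. The following sketch works out both figures from the table.
payload_kbps is a hypothetical helper name.
</para>

```shell
#!/bin/sh
# Channel payload in kbps: each channel carries 64 kbps.
# usage: payload_kbps CHANNELS
payload_kbps() {
    echo $(( $1 * 64 ))
}

# Carrier name, channel count and total line rate in kbps, from the table.
for entry in T1:24:1544 T2:96:6312 T3:672:44736 T4:4032:274176; do
    name=${entry%%:*}
    rest=${entry#*:}
    channels=${rest%%:*}
    rate=${rest#*:}
    payload=$(payload_kbps "$channels")
    echo "$name: $payload kbps payload, $((rate - payload)) kbps overhead"
done
```

<para>
For T1 this gives 1536 kbps of channel payload plus 8 kbps of framing, which
adds up to the 1.544 Mbps in the table.
</para>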
<para>
While the specification for T-carrier lines does not mandate a particular
media type, T1 and T2 are typically carried on copper, and T3 and T4
typically use fiber optic media. DS1, DS2, DS3, and DS4 are an alternate
type of line equivalent to T1-T4, and typically use fiber optic media.
</para>
SONET (Synchronous Optical Network)
<para>
A leased-line system using fiber optic media to support data speeds up to
2.4 Gbps. SONET services are sold based on optical carrier (OC) levels. OC
levels are calculated as multiples of the OC-1 speed, 51.840 Mbps. For
example, OC-3 level would correspond with a data speed of 155 Mbps and
OC-12 level would equate to a data transfer rate of 622 Mbps. OC-1 and
OC-3 are the most commonly used SONET lines.
</para>
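<para>
The OC arithmetic is easy to reproduce. The sketch below multiplies the OC-1
rate by the level number, working in kbps to stay within integer arithmetic.
oc_kbps is a hypothetical helper, and OC-48 is included here only as the level
that corresponds to the 2.4 Gbps figure.
</para>

```shell
#!/bin/sh
# OC-n line rate in kbps: n times the OC-1 rate of 51840 kbps (51.840 Mbps).
# usage: oc_kbps LEVEL
oc_kbps() {
    echo $(( $1 * 51840 ))
}

for level in 1 3 12 48; do
    echo "OC-$level: $(oc_kbps "$level") kbps"
done
```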
</sect>
</sect1>
<sect1 id="PLIP">
<title>PLIP</title>
<para>
PLIP (Parallel Line IP), is like SLIP, in that it is used for
providing a point to point network connection between two machines,
except that it is designed to use the parallel printer ports on your
machine instead of the serial ports (a cabling diagram is included in
the cabling diagram section later in this document). Because it is
possible to transfer more than one bit at a time with a parallel port,
it is possible to attain higher speeds with the plip interface than
with a standard serial device. In addition, even the simplest of
parallel ports, printer ports, can be used in lieu of you having to
purchase comparatively expensive 16550AFN UARTs for your serial
ports. PLIP uses a lot of CPU compared to a serial link and is most
certainly not a good option if you can obtain some cheap ethernet
cards, but it will work when nothing else is available and will work
quite well. You should expect a data transfer rate of about 20
kilobytes per second when a link is running well.
</para>
7.2. PLIP for Linux-2.0
<para>
PLIP device names are `plip0', `plip1' and `plip2'.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
<*> PLIP (parallel port) support
</screen>
</para>
<para>
The PLIP device driver competes with the parallel device driver for
the parallel port hardware. If you wish to use both drivers then you
should compile them both as modules to ensure that you are able to
select which port you want to use for PLIP and which ports you want
for the printer driver. Refer to the ``Modules mini-HOWTO'' for more
information on kernel module configuration.
</para>
<para>
Please note that some laptops use chipsets that will not work with
PLIP, because they do not allow some combinations of signals that PLIP
relies on but printers do not use.
</para>
<para>
The Linux plip interface is compatible with the Crynwr Packet Driver
PLIP, which means that you can connect your Linux machine via plip to
a DOS machine running any sort of tcp/ip software.
</para>
<para>
In the 2.0.* series kernel the plip devices are mapped to i/o port and
IRQ as follows:
</para>
<para>
<screen>
device i/o IRQ
------ ----- ---
plip0 0x3bc 5
plip1 0x378 7
plip2 0x278 2
</screen>
</para>
<para>
If your parallel ports don't match any of the above combinations, you
can change the IRQ of a port with the `irq' parameter of the ifconfig
command (be sure to enable IRQs on your printer ports in your ROM BIOS
if it supports this option). As an alternative, if you use modules you
can specify the ``io='' and ``irq='' options on the insmod command
line. For example:
</para>
<para>
<screen>
root# insmod plip.o io=0x288 irq=5
</screen>
</para>
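<para>
The default mapping and the module options described above can be
captured in a few lines of Python. This is only an illustrative sketch:
the names PLIP_DEFAULTS, default_device and insmod_command are
invented here and belong to no PLIP tool; the table values and the
command format are copied from the text.
</para>

```python
# Default PLIP device mapping for 2.0.* kernels, copied from the table
# in the text: device name -> (I/O port, IRQ).
PLIP_DEFAULTS = {
    "plip0": (0x3BC, 5),
    "plip1": (0x378, 7),
    "plip2": (0x278, 2),
}


def default_device(io, irq):
    """Return the device whose default (io, irq) pair matches, else None."""
    for name, pair in PLIP_DEFAULTS.items():
        if pair == (io, irq):
            return name
    return None


def insmod_command(io, irq):
    """Build a module load command in the form shown in the text."""
    return f"insmod plip.o io={io:#x} irq={irq}"


if __name__ == "__main__":
    print(default_device(0x378, 7))    # a port that matches a default
    print(default_device(0x288, 5))    # a port that needs explicit options
    print(insmod_command(0x288, 5))
```

<para>
A port that matches no default pair returns None, which is the case
where the ``io='' and ``irq='' options have to be given by hand.
</para>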
<para>
PLIP operation is controlled by two timeouts, whose default values are
probably ok in most cases. You will probably need to increase them if
you have an especially slow computer, in which case the timers to
increase are actually on the other computer. A program called
plipconfig exists that allows you to change these timer settings
without recompiling your kernel. It is supplied with many Linux
distributions.
</para>
<para>
To configure a plip interface, you will need to invoke the following
commands (or add them to your initialization scripts):
</para>
<para>
<screen>
root# /sbin/ifconfig plip1 localplip pointopoint remoteplip
root# /sbin/route add remoteplip plip1
</screen>
</para>
<para>
Here, the port being used is the one at I/O address 0x378; localplip
and remoteplip are the names or IP addresses used over the PLIP cable.
I personally keep them in my /etc/hosts database:
</para>
<para>
<screen>
# plip entries
192.168.3.1 localplip
192.168.3.2 remoteplip
</screen>
</para>
<para>
The pointopoint parameter has the same meaning as for SLIP, in that it
specifies the address of the machine at the other end of the link.
</para>
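<para>
To make the relation between the /etc/hosts entries and the two
commands concrete, the following Python sketch resolves the names and
prints the equivalent commands with numeric addresses. The helper
names parse_hosts and plip_commands are hypothetical, and the hosts
text is the sample from above rather than a file read from disk.
</para>

```python
# Sample entries from the text; a real script would read /etc/hosts.
HOSTS_TEXT = """\
# plip entries
192.168.3.1 localplip
192.168.3.2 remoteplip
"""


def parse_hosts(text):
    """Map every host name in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name] = address
    return table


def plip_commands(device, local, remote):
    """Return the ifconfig and route commands in the form used above."""
    return [
        f"/sbin/ifconfig {device} {local} pointopoint {remote}",
        f"/sbin/route add {remote} {device}",
    ]


if __name__ == "__main__":
    hosts = parse_hosts(HOSTS_TEXT)
    for command in plip_commands("plip1", hosts["localplip"], hosts["remoteplip"]):
        print(command)
```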
<para>
In almost all respects you can treat a plip interface as though it
were a SLIP interface, except that neither dip nor slattach need be,
nor can be, used.
</para>
<para>
Further information on PLIP may be obtained from the ``PLIP mini-
HOWTO''.
</para>
7.3. PLIP for Linux-2.2
<para>
During development of the 2.1 kernel versions, support for the
parallel port was changed to a better setup.
</para>
<para>
<screen>
Kernel Compile Options:
General setup --->
[*] Parallel port support
Network device support --->
<*> PLIP (parallel port) support
</screen>
</para>
<para>
The new code for PLIP behaves like the old one (use the same ifconfig
and route commands as in the previous section), but initialization of
the device is different due to the advanced parallel port support.
</para>
<para>
The ``first'' PLIP device is always called ``plip0'', where first is
the first device detected by the system, similarly to what happens for
Ethernet devices. The actual parallel port being used is one of the
available ports, as shown in /proc/parport. For example, if you have
only one parallel port, you'll only have a directory called
/proc/parport/0.
</para>
<para>
If your kernel didn't detect the IRQ number used by your port,
``insmod plip'' will fail; in this case just write the right number to
/proc/parport/0/irq and reinvoke insmod.
</para>
<para>
Complete information about parallel port management is available in
the file Documentation/parport.txt, part of your kernel sources.
</para>
<para>
PLIP information can be found in The Network Administrator Guide
(http://metalab.unc.edu/mdw/LDP/nag/nag.html).
PLIP allows the cheap connection of two machines.
It uses a parallel port and a special cable, achieving speeds of
10kBps to 20kBps.
</para>
</sect1 id="PLIP">
<sect1 id="PPP-and-SLIP">
<title>PPP and SLIP</title>
<para>
The Linux kernel has built-in support for PPP (Point-to-Point
Protocol) and SLIP (Serial Line IP). PPP is the most popular
way individual users access their ISPs (Internet Service
Providers).
</para>
* Linux PPP HOWTO: http://metalab.unc.edu/mdw/HOWTO/PPP-HOWTO.html
* PPP/SLIP emulator: http://metalab.unc.edu/mdw/HOWTO/mini/SLIP-PPP-Emulator.html
</sect1 id="PPP-and-SLIP">
<sect1 id="Token-Ring">
<title>Token-Ring</title>
<para>
The Token Ring architecture is defined in IEEE 802.5. IBM has further defined
the standard to include particular types of devices and cables. Token Ring uses
a logical ring topology and a physical star topology. The hubs for Token Ring
are called multistation access units, or MAUs.
</para>
<para>
The Token Ring standard supports either 4 Mbps or 16 Mbps speeds. Cable can be
STP, UTP, or fiber. One popular wiring scheme uses Category 5 cable. There are
also a variety of cable types defined by IBM (referred to as Type 1 through
Type 9). Distances between nodes can range from 45 meters for UTP to a kilometer
or more for fiber optic cable.
</para>
<para>
Token Ring networks use a token-passing access scheme. A token data frame is
passed from one computer to the next around the ring. Each computer can
transmit data only when it has the token. This access method provides equal
access to the network for all nodes, and handles heavy loads better than
Ethernet's contention-based method.
</para>
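<para>
The token-passing rule can be illustrated with a small simulation.
This is a deliberately simplified sketch and not an implementation of
IEEE 802.5: it assumes that a station sends at most one frame each
time it holds the token, and the function name simulate_token_ring and
the station names are invented for the example.
</para>

```python
from collections import deque


def simulate_token_ring(queues, rotations):
    """Return the frames sent, in order, as (station, frame) pairs.

    queues maps each station name to the frames it wants to send; the
    dictionary order is taken as the order of stations around the ring.
    """
    stations = list(queues)
    pending = {name: deque(frames) for name, frames in queues.items()}
    sent = []
    for _ in range(rotations):
        for name in stations:              # the token visits each station in turn
            if pending[name]:              # only the token holder may transmit
                sent.append((name, pending[name].popleft()))
    return sent


if __name__ == "__main__":
    traffic = {"A": ["a1", "a2"], "B": [], "C": ["c1"]}
    for station, frame in simulate_token_ring(traffic, rotations=2):
        print(station, "sends", frame)
```

<para>
Because every station gets exactly one turn per rotation, a busy
station such as A cannot starve C, which is the equal-access property
described above.
</para>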
<para>
The nodes in a Token Ring network monitor each other for reliability. The
first computer in the network becomes an Active Monitor, and the others
are Passive Monitors. Each computer monitors its nearest upstream
neighbour. When an error occurs, the computer broadcasts a beacon packet
indicating the error.
</para>
<para>
The NICs in all computers respond to the beacon by running self-tests, and
removing themselves from the network if necessary. Nodes in the network can
also automatically remove packets sent to a computer that is having a
problem. This makes Token Ring a reliable choice for networking.
</para>
<para>
See the Token-Ring HOWTO for more details on running Token Ring on your local
network.
</para>
</sect1 id="Token-Ring">
<sect1 id="X25">
<title>X25</title>
<para>
X.25 is a virtual-circuit protocol for packet switching, developed in the
1970s by the C.C.I.T.T. (a standards body recognized by telecommunications
companies in most parts of the world). It allows customers to share access to
a PDN (Public Data Network). These networks, such as Sprintnet and Tymnet,
were the most practical way to connect large companies at the time,
and are still used by some companies. PDNs are networks that have local
dial-up access points in cities throughout the country and use dedicated lines
to network between these cities. Companies would dial up in two locations to
connect their computers.
</para>
<para>
Computers, routers, or other devices that access a PDN using the X.25
protocols are called data terminal equipment, or DTEs. DTEs without built-in
support for X.25 typically reach the network through a packet
assembler/disassembler (PAD).
</para>
<para>
The X.25 protocol supports speeds up to 64 Kbps. This makes it impractical for
many networks, but it is an inexpensive alternative for low-bandwidth
applications. X.25 is a protocol with a relatively high overhead, since it
provides error control and accounting for users of the network.
</para>
<para>
An implementation of X.25 and LAPB is being worked on, and recent
2.1.* kernels include the work in progress. Jonathon Naylor
jsn@cs.nott.ac.uk is leading the development and a
mailing list has been established to discuss Linux X.25 related
matters. To subscribe send a message to: majordomo@vger.rutgers.edu
with the text "subscribe linux-x25" in the body of the message.
Early versions of the configuration tools may be obtained from
Jonathon's ftp site at ftp.cs.nott.ac.uk.
</para>
</sect1 id="X25">
<sect1 id="IPv6">
<title>IPv6</title>
<para>
2.1. What is IPv6?
IPv6, sometimes also referred to as IPng (IP Next Generation),
is a new layer 3 protocol (see
[http://www.linuxports.com/howto/intro_to_networking/c4412.htm#PAGE103HTML]
linuxports/howto/intro_to_networking/ISO - OSI Model) which will supersede
IPv4 (also known as IP).
It was designed to address many issues, including the shortage of
available IP addresses, lack of mechanisms to handle time-sensitive
traffic, lack of network layer security, etc.
IPv4 was designed a long time ago ([http://www.faqs.org/rfcs/rfc760.html]
RFC 760 / Internet Protocol from January 1980) and since its inception there
have been many requests for more addresses and enhanced capabilities. The
latest RFC is [http://www.faqs.org/rfcs/rfc2460.html] RFC 2460 / Internet
Protocol Version 6 Specification. The major changes in IPv6 are the redesign
of the header, including the increase of the address size from 32 bits to
128 bits. Because layer 3 is responsible for end-to-end packet transport using
packet routing based on addresses, it must include the new IPv6 addresses
(source and destination), like IPv4. It is anticipated that the larger address
space and the accompanying improved addressing scheme will provide a major
improvement in routing performance.
For more information about the IPv6 history take a look at older IPv6 related
RFCs listed e.g. at [http://www.switch.ch/lan/ipv6/references.html] SWITCH
IPv6 Pilot / References.
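The difference in address size can be checked with Python's standard
ipaddress module. This is only a small illustration: the helper name
address_facts is invented here, and the two addresses are documentation
examples, not addresses used anywhere in this guide.

```python
import ipaddress


def address_facts(text):
    """Return (IP version, address size in bits, address size in bytes)."""
    addr = ipaddress.ip_address(text)
    return addr.version, addr.max_prefixlen, len(addr.packed)


if __name__ == "__main__":
    print(address_facts("192.0.2.1"))      # an IPv4 documentation address
    print(address_facts("2001:db8::1"))    # an IPv6 documentation address
    # How many times larger the IPv6 address space is than the IPv4 one.
    print(2 ** 128 // 2 ** 32)
```

The ratio printed on the last line is 2 to the power of 96, which is
why address shortage is not expected to recur with IPv6.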
-----------------------------------------------------------------------------
2.2. History of IPv6 in Linux
The years 1992, 1993 and 1994 of the IPv6 History (in general) are covered by
following document: [http://www.laynetworks.com/users/webs/IPv6.htm#CH3] IPv6
or IPng (IP next generation).
To-do: better time-line, more content...
-----------------------------------------------------------------------------
2.2.1. Beginning
The first IPv6 related network code was added to the Linux kernel 2.1.8 in
November 1996 by Pedro Roque. It was based on the BSD API:
<screen>
diff -u --recursive --new-file v2.1.7/linux/include/linux/in6.h linux/include/linux/in6.h
--- v2.1.7/linux/include/linux/in6.h Thu Jan 1 02:00:00 1970
+++ linux/include/linux/in6.h Sun Nov 3 11:04:42 1996
@@ -0,0 +1,99 @@
+/*
+ * Types and definitions for AF_INET6
+ * Linux INET6 implementation
+ *
+ * Authors:
+ * Pedro Roque &lt;******&gt;
+ *
+ * Source:
+ * IPv6 Program Interfaces for BSD Systems
+ * &lt;draft-ietf-ipngwg-bsd-api-05.txt&gt;
</screen>
The shown lines were copied from patch-2.1.8 (e-mail address was blanked on
copy&paste).
-----------------------------------------------------------------------------
2.2.2. In between
Because of a lack of manpower, the IPv6 implementation in the kernel was
unable to follow the discussed drafts or newly released RFCs. In October 2000,
a project was started in Japan, called [http://www.linux-ipv6.org/] USAGI,
whose aim was to implement all missing or outdated IPv6 support in Linux. It
tracks the current IPv6 implementation in FreeBSD made by the
[http://www.kame.net/] KAME project. From time to time they create snapshots
against current vanilla Linux kernel sources.
-----------------------------------------------------------------------------
2.2.3. Current
Unfortunately, the [http://www.linux-ipv6.org/] USAGI patch is so big that
current Linux networking maintainers are unable to include it in the
production source of the Linux kernel 2.4.x series. Therefore the 2.4.x
series is missing some (many) extensions and also does not conform to all
current drafts and RFCs (see
[http://www.ietf.org/html.charters/ipv6-charter.html] IP Version 6 Working
Group (ipv6) Charter). This can cause some interoperability problems with
other operating systems.
-----------------------------------------------------------------------------
2.2.4. Future
[http://www.linux-ipv6.org/] USAGI is now making use of the new Linux kernel
development series 2.5.x to insert all of their current extensions into this
development release. Hopefully the 2.6.x kernel series will contain a true
and up-to-date IPv6 implementation.
</para>
</sect1 id="IPv6">
<sect1 id="STRIP">
<title>STRIP</title>
<para>
STRIP (Starmode Radio IP) is a protocol designed specifically for
a range of Metricom radio modems for a research project being
conducted by Stanford University called the MosquitoNet Project.
There is a lot of interesting reading here, even if you aren't
directly interested in the project.
</para>
<para>
The Metricom radios connect to a serial port, employ spread spectrum
technology and are typically capable of about 100kbps. Information on
the Metricom radios is available from the: Metricom Web Server.
</para>
<para>
At present the standard network tools and utilities do not support the
STRIP driver, so you will have to download some customized tools from
the MosquitoNet web server. Details on what software you need are
available at the: MosquitoNet STRIP Page.
</para>
<para>
A summary of configuration is that you use a modified slattach program
to set the line discipline of a serial tty device to STRIP and then
configure the resulting `st[0-9]' device as you would for ethernet,
with one important exception: for technical reasons STRIP does not
support the ARP protocol, so you must manually configure the ARP
entries for each of the hosts on your subnet. This shouldn't prove too
onerous. STRIP device names are `st0', `st1', etc. The relevant
kernel compilation options are given below.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
[*] Network device support
....
[*] Radio network interfaces
< > STRIP (Metricom starmode radio IP)
</screen>
</para>
</sect1 id="STRIP">
<sect1 id="WaveLAN">
<title>WaveLAN</title>
<para>
The WaveLAN card is a spread spectrum wireless LAN card. In practice
the card looks very much like an ethernet card and is configured in
much the same way.
</para>
<para>
You can get information on the Wavelan card from wavelan.com.
</para>
<para>
Wavelan device names are `eth0', `eth1', etc.
</para>
<para>
<screen>
Kernel Compile Options:
Network device support --->
[*] Network device support
....
[*] Radio network interfaces
....
<*> WaveLAN support
</screen>
</para>
</sect1 id="WaveLAN">
<sect1 id="ISDN">
<title>ISDN</title>
<para>
The Integrated Services Digital Network (ISDN) is a series of
standards that specify a general purpose switched digital data
network. An ISDN `call' creates a synchronous point to point data
service to the destination. ISDN is generally delivered on a high
speed link that is broken down into a number of discrete channels.
There are two different types of channels, the `B Channels' which will
actually carry the user data and a single channel called the `D
channel' which is used to send control information to the ISDN
exchange to establish calls and other functions. In Australia for
example, ISDN may be delivered on a 2Mbps link that is broken into 30
discrete 64kbps B channels with one 64kbps D channel. Any number of
channels may be used at a time and in any combination. You could for
example establish 30 separate calls to 30 different destinations at
64kbps each, or you could establish 15 calls to 15 different
destinations at 128kbps each (two channels used per call), or just a
small number of calls and leave the rest idle. A channel may be used
for either incoming or outgoing calls. The original intention of ISDN
was to allow Telecommunications companies to provide a single data
service which could deliver either telephone (via digitised voice) or
data services to your home or business without requiring you to make
any special configuration changes.
</para>
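<para>
The channel arithmetic in the Australian example can be worked
through in a few lines of Python. This is a sketch under stated
assumptions: the "2Mbps" link is taken to be 2048 kbps, which the text
does not say explicitly, and the names call_plan and
non_bearer_kbps are invented for the illustration.
</para>

```python
B_CHANNELS = 30        # bearer channels on the link, from the text
CHANNEL_KBPS = 64      # rate of each B channel and of the D channel, from the text
LINK_KBPS = 2048       # assumption: the "2Mbps" link carries 2048 kbps in total


def call_plan(channels_per_call):
    """Return (number of simultaneous calls, kbps per call)."""
    if channels_per_call not in range(1, B_CHANNELS + 1):
        raise ValueError("channels_per_call must be between 1 and 30")
    calls = B_CHANNELS // channels_per_call
    return calls, channels_per_call * CHANNEL_KBPS


def non_bearer_kbps():
    """Capacity left after the B channels: the D channel plus link overhead."""
    return LINK_KBPS - B_CHANNELS * CHANNEL_KBPS


if __name__ == "__main__":
    print(call_plan(1))         # one channel per call
    print(call_plan(2))         # two channels bonded per call
    print(non_bearer_kbps())
```

<para>
With one channel per call this gives 30 calls at 64kbps, and with two
channels per call it gives 15 calls at 128kbps, matching the figures
in the paragraph above.
</para>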
<para>
There are a few different ways to connect your computer to an ISDN
service. One way is to use a device called a `Terminal Adaptor' which
plugs into the Network Terminating Unit that your telecommunications
carrier will have installed when you got your ISDN service and
presents a number of serial interfaces. One of those interfaces is
used to enter commands to establish calls and configuration and the
others are actually connected to the network devices that will use the
data circuits when they are established. Linux will work in this sort
of configuration without modification; you just treat the port on the
Terminal Adaptor like you would treat any other serial device.
Another way, which is the one the kernel ISDN support is designed
for, is to install an ISDN card into your Linux machine and have your
Linux software handle the protocols and make the calls itself.
</para>
<para>
The Linux kernel has built-in ISDN capabilities. Isdn4linux controls
ISDN PC cards and can emulate a modem with the Hayes command set ("AT"
commands). The possibilities range from simply using a terminal
program to connections via HDLC (using included devices) to full
connection to the Internet with PPP to audio applications.
FAQ for isdn4linux: http://ww.isdn4linux.de/faq/
</para>
<para>
<screen>
Kernel Compile Options:
ISDN subsystem --->
<*> ISDN support
[ ] Support synchronous PPP
[ ] Support audio via ISDN
< > ICN 2B and 4B support
< > PCBIT-D support
< > Teles/NICCY1016PC/Creatix support
</screen>
</para>
<para>
The Linux implementation of ISDN supports a number of different types
of internal ISDN cards. These are those listed in the kernel
configuration options:
</para>
* ICN 2B and 4B
* Octal PCBIT-D
* Teles ISDN-cards and compatibles
<para>
Some of these cards require software to be downloaded to them to make
them operational. There is a separate utility to do this with.
</para>
<para>
Full details on how to configure the Linux ISDN support are available
from the /usr/src/linux/Documentation/isdn/ directory and an FAQ
dedicated to isdn4linux is available at www.lrz-muenchen.de. (You can
click on the English flag to get an English version).
</para>
<para>
A note about PPP. The PPP suite of protocols will operate over either
asynchronous or synchronous serial lines. The commonly distributed PPP
daemon for Linux `pppd' supports only asynchronous mode. If you wish
to run the PPP protocols over your ISDN service you need a specially
modified version. Details of where to find it are available in the
documentation referred to above.
</para>
</sect1 id="ISDN">
<sect1 id="NIS">
<title>NIS</title>
<para>
The Network Information Service (NIS) provides a simple network lookup
service consisting of databases and processes. Its purpose is to
provide information that has to be known throughout the network to all
machines on the network. For example, it enables an administrator to
allow users access to any machine in a network running NIS without a
password entry existing on each machine; only the main database needs
to be maintained.
</para>
</sect1 id="NIS">
<sect1 id="Services">
<title>Services</title>
<para>
</para>
</sect1 id="Services">
<sect1 id="Database">
<title>Database</title>
<para>
Most databases are supported under Linux, including Oracle, DB2, Sybase, Informix, MySQL, PostgreSQL,
InterBase and Paradox. Databases, and the Structured Query Language they work with, are complex, and this
chapter has neither the space nor the depth to deal with them. Read the next section on PHP to learn how to set
up a dynamically generated Web portal in about five minutes.
We'll be using MySQL because it's extremely fast, capable of handling large databases (200G databases aren't
unheard of), and has recently been made open source. It also works well with PHP. While currently
lacking transaction support (due to speed concerns), a future version of MySQL will have this option.
* Connecting to MS SQL 6.x+ via Openlink/PHP/ODBC mini-HOWTO
* Sybase Adaptive Server Anywhere for Linux HOWTO
</sect1 id="Database">
<sect1 id="DHCP">
<title>DHCP</title>
<para>
Endeavouring to maintain static IP addressing
information, such as IP addresses, subnet masks, DNS names and other
information on client machines can be difficult. Documentation becomes lost or
out-of-date, and network reconfigurations require details to be modified
manually on every machine.
</para>
<para>
DHCP (Dynamic Host Configuration Protocol) solves this problem by providing
arbitrary information (including IP addressing) to clients upon request.
Almost all client OSes support it and it is standard in most large networks.
</para>
<para>
Its greatest impact is that it eases network administration,
especially in large networks or networks which have lots of mobile users.
</para>
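<para>
The idea of handing out addresses on request, instead of configuring
each machine by hand, can be sketched as a small address pool. This is
a toy model and not the DHCP protocol: it ignores lease times, message
exchange and all other options, and the class name LeasePool, the
network and the client identifiers are made up for the illustration.
</para>

```python
import ipaddress


class LeasePool:
    """Hand out addresses from a fixed range, one per client identifier."""

    def __init__(self, network, first_host, last_host):
        hosts = list(ipaddress.ip_network(network).hosts())
        # hosts[0] is the first usable address, so host number n is hosts[n - 1].
        self.free = hosts[first_host - 1:last_host]
        self.leases = {}

    def request(self, client_id):
        """Return the client's address, allocating one on first request."""
        if client_id not in self.leases:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            self.leases[client_id] = self.free.pop(0)
        return str(self.leases[client_id])

    def release(self, client_id):
        """Give the client's address back to the pool, if it has one."""
        address = self.leases.pop(client_id, None)
        if address is not None:
            self.free.append(address)


if __name__ == "__main__":
    pool = LeasePool("192.168.3.0/24", 100, 102)
    print(pool.request("00:11:22:33:44:55"))
    print(pool.request("66:77:88:99:aa:bb"))
    print(pool.request("00:11:22:33:44:55"))   # same client, same address
```

<para>
The point of the sketch is that only the pool needs to be maintained;
a network reconfiguration changes one definition rather than the
settings on every client.
</para>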
2. DHCP protocol
DHCP (Dynamic Host Configuration Protocol) is used to control
vital networking parameters of hosts (running clients) with the help
of a server. DHCP is backward compatible with BOOTP. For more
information see RFC 2131 (old RFC 1541) and others. (See Internet
Resources section at the end of the document). You can also read
http://web.syr.edu/~jmwobus/comfaqs/dhcp.faq.html.
</sect1 id="DHCP">
<sect1 id="DNS">
<title>DNS</title>
Setting Up Your New Domain Mini-HOWTO.
</sect1 id="DNS">
<sect1 id="FTP">
<title>FTP</title>
<para>
File Transfer Protocol (FTP) is an efficient way to transfer files between
machines across networks. Clients and servers exist for almost all platforms,
making FTP the most convenient (and therefore popular) method of transferring
files. The protocol itself dates from the early 1970s; the ftp programs
familiar to Unix users were first developed at the University of California,
Berkeley for inclusion in 4.2BSD (Berkeley Unix). The RFC (Request for
Comments) document for the protocol is RFC 959, available at
ftp://nic.merit.edu/documents/rfc/rfc0959.txt.
</para>
<para>
There are two typical modes of running an FTP server - either anonymously or
account-based. Anonymous FTP servers are by far the most popular; they allow
any machine to access the FTP server and the files stored on it with the same
permissions. No usernames or passwords are transmitted down the wire.
Account-based FTP allows users to log in with real usernames and passwords.
While it provides greater access control than anonymous FTP, transmitting real
usernames and passwords unencrypted over the Internet is generally avoided for
security reasons.
</para>
<para>
An FTP client is the userland application that provides access to FTP
servers. There are many FTP clients available. Some are graphical, and
some are text-based.
</para>
* FTP HOWTO
</sect1 id="FTP">
<sect1 id="LDAP">
<title>LDAP</title>
<para>
This section presents information about installing, configuring, running
and maintaining an LDAP (Lightweight Directory Access Protocol) server on a
Linux machine. It also presents details about how to create LDAP databases
and how to add, update and delete information in the directory. This material
is mostly based on the University of Michigan LDAP information pages and on
the OpenLDAP Administrator's Guide.
</para>
</sect1 id="LDAP">
<sect1 id="NFS">
<title>NFS</title>
NFS (Network File System)
The TCP/IP suite's equivalent of file sharing. This protocol operates at the Process/Application
layer of the DOD model, similar to the application layer of the OSI model.
SLIP (Serial Line Internet Protocol) and PPP (Point-to-Point Protocol)
Two protocols commonly used for dial-up access to the Internet. They are typically used with
TCP/IP; while SLIP works only with TCP/IP, PPP can be used with other protocols.
SLIP was the first protocol for dial-up Internet access. It operates at the physical layer of the
OSI model, and provides a simple interface to a UNIX or other dial-up host for Internet access.
SLIP does not provide security, so authentication is handled through prompts before initiating
the SLIP connection.
PPP is a more recent development. It operates at the physical and data link layers of the OSI
model. In addition to the features of SLIP, PPP supports data compression, security (authentication),
and error control. PPP can also dynamically assign network addresses.
Since PPP provides easier authentication and better security, it should be used for dial-up connections
whenever possible. However, you may need to use SLIP to communicate with dial-up servers (particularly
older UNIX machines and dedicated hardware servers) that don't support PPP.
> Start Config-HOWTO
2.15. Automount Points
If you don't like the mounting/unmounting thing, consider using autofs(5). You tell the autofs daemon what to automount and where, starting with a file, /etc/auto.master. Its structure is simple:
/misc /etc/auto.misc
/mnt /etc/auto.mnt
In this example you tell autofs to automount media in /misc and /mnt, while the mountpoints are specified in /etc/auto.misc and /etc/auto.mnt. An example of /etc/auto.misc:
# an NFS export
server -ro my.buddy.net:/pub/export
# removable media
cdrom -fstype=iso9660,ro :/dev/hdb
floppy -fstype=auto :/dev/fd0
Start the automounter. From now on, whenever you try to access the nonexistent mount point /misc/cdrom, it will be created and the CD-ROM will be mounted.
>End Config-HOWTO
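Each entry in a map such as /etc/auto.misc has three parts: the key
(the directory name created under the mount point), optional mount
options introduced by a dash, and the location to mount. The Python
sketch below splits an entry into those parts. It is an illustration
only: parse_map_line is a hypothetical helper, it assumes the fields
are separated by whitespace, and it does not cover the full autofs map
syntax.

```python
def parse_map_line(line):
    """Split one map entry into (key, options, location).

    Returns None for blank lines and comments.
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split()
    if len(fields) == 1:
        raise ValueError(f"map entry has no location: {line!r}")
    key = fields[0]
    if len(fields) >= 3 and fields[1].startswith("-"):
        options = fields[1][1:].split(",")   # "-fstype=iso9660,ro" gives two options
        location = fields[2]
    else:
        options = []
        location = fields[1]
    return key, options, location


if __name__ == "__main__":
    sample = [
        "# removable media",
        "cdrom -fstype=iso9660,ro :/dev/hdb",
        "server -ro my.buddy.net:/pub/export",
    ]
    for entry in sample:
        print(parse_map_line(entry))
```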
5.4. Unix Environment
The preferred way to share files in a Unix networking environment is
through NFS. NFS stands for Network File System and it is a protocol
originally developed by Sun Microsystems. It is a way to share files
between machines as if they were local. A client "mounts" a filesystem
"exported" by an NFS server. The mounted filesystem will appear to the
client machine as if it was part of the local filesystem.
It is possible to mount the root filesystem at startup time, thus
allowing diskless clients to boot up and access all files from a
server. In other words, it is possible to have a fully functional
computer without a hard disk.
Coda is a network filesystem (like NFS) that supports disconnected
operation and persistent caching, among other goodies. It's included in
2.2.x kernels. Really handy for slow or unreliable networks and
laptops.
NFS-related documents:
* http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root.html
* http://metalab.unc.edu/mdw/HOWTO/Diskless-HOWTO.html
* http://metalab.unc.edu/mdw/HOWTO/mini/NFS-Root-Client-mini-HOWTO/index.html
* http://www.redhat.com/support/docs/rhl/NFS-Tips/NFS-Tips.html
* http://metalab.unc.edu/mdw/HOWTO/NFS-HOWTO.html
CODA can be found at: http://www.coda.cs.cmu.edu/
<para>
Samba is the Linux implementation of SMB. NFS is the Unix equivalent: a way to import and
export local files to and from remote machines. Like SMB, NFS sends information including user
passwords unencrypted, so it is best to limit it to within your local network.
As you know, all storage in Linux is visible within a single tree structure, and new hard disks,
CD-ROMs, Zip drives and other spaces are mounted on a particular directory. NFS shares are also
attached to the system in this manner. NFS is included in most Linux kernels, and the tools
necessary to be an NFS server and client come in most distributions.
However, users of Linux kernel 2.2 hoping to use NFS may wish to upgrade to
kernel 2.4; while the earlier version of Linux NFS did work well, it was far slower than
most other Unix implementations of this protocol.
> Linux NFS-HOWTO
> NFS-Root mini-HOWTO
> NFS-Root-Client Mini-HOWTO
> The Linux NIS(YP)/NYS/NIS+ HOWTO
</para>
Linux NFS-HOWTO
2. Introduction
2.1. What is NFS?
The Network File System (NFS) was developed to allow machines to mount a disk
partition on a remote machine as if it were on a local hard drive. This
allows for fast, seamless sharing of files across a network.
It also gives the potential for unwanted people to access your hard drive
over the network (and thereby possibly read your email and delete all your
files as well as break into your system) if you set it up incorrectly. So
please read the Security section of this document carefully if you intend to
implement an NFS setup.
There are other systems that provide similar functionality to NFS. Samba
(http://www.samba.org) provides file services to Windows clients. The Andrew
File System from IBM (http://www.transarc.com/Product/EFS/AFS/index.html),
recently open-sourced, provides a file sharing mechanism with
some additional security and performance features. The Coda File System
(http://www.coda.cs.cmu.edu/) is still in
development as of this writing but is designed to work well with disconnected
clients. Many of the features of the Andrew and Coda file systems are slated
for inclusion in the next version of NFS (Version 4)
(http://www.nfsv4.org). The advantage of NFS today is that it is mature,
standard, well understood, and supported robustly across a variety of
platforms.
</sect1 id="NFS">
<sect1 id="Samba">
8.11. SAMBA - `NetBEUI', `NetBios', `CIFS' support.
SAMBA is an implementation of the Server Message Block protocol.
Samba allows Microsoft and other systems to mount and use your disks
and printers.
SAMBA and its configuration are covered in detail in the SMB-HOWTO.
5.2. Windows Environment
Samba is a suite of applications that allow most Unices (and in
particular Linux) to integrate into a Microsoft network both as a
client and a server. Acting as a server it allows Windows 95, Windows
for Workgroups, DOS and Windows NT clients to access Linux files and
printing services. It can completely replace Windows NT for file and
printing services, including the automatic downloading of printer
drivers to clients. Acting as a client allows the Linux workstation to
mount locally exported windows file shares.
According to the SAMBA Meta-FAQ:
"Many users report that compared to other SMB implementations Samba is more stable,
faster, and compatible with more clients. Administrators of some large installations say
that Samba is the only SMB server available which will scale to many tens of thousands
of users without crashing"
* Samba project home page: http://samba.anu.edu.au/samba/
* SMB HOWTO: http://metalab.unc.edu/mdw/HOWTO/SMB-HOWTO.html
* Printing HOWTO: http://metalab.unc.edu/mdw/HOWTO/Printing-HOWTO.html
<glossentry>
<glossterm>
samba
</glossterm>
<glossdef>
<para>
A LanManager-like file and printer server for Unix. The Samba software suite is a collection of programs that implements the SMB protocol for Unix systems, allowing you to serve files and printers to Windows, NT, OS/2 and DOS clients. This protocol is sometimes also referred to as the LanManager or NetBIOS protocol. This package contains all the components necessary to turn your Debian GNU/Linux box into a powerful file and printer server. Currently, the Samba Debian packages consist of the following:
* samba - A LanManager-like file and printer server for Unix.
* samba-common - Samba common files used by both the server and the client.
* smbclient - A LanManager-like simple client for Unix.
* swat - Samba Web Administration Tool.
* samba-doc - Samba documentation.
* smbfs - Mount and umount commands for the smbfs (kernels 2.0.x and above).
* libpam-smbpass - Pluggable authentication module for the SMB password database.
* libsmbclient - Shared library that allows applications to talk to SMB servers.
* libsmbclient-dev - libsmbclient shared libraries.
* winbind - Service to resolve user and group information from Windows NT servers.
It is possible to install a subset of these packages depending on your particular needs. For example, to access other SMB servers you should only need the smbclient and samba-common packages. From Debian 3.0r0 APT
<ulink url="http://www.tldp.org/LDP/Linux-Dictionary/html/index.html">http://www.tldp.org/LDP/Linux-Dictionary/html/index.html</ulink>
</para>
</glossdef>
</glossentry>
<glossentry>
<glossterm>
Samba
</glossterm>
<glossdef>
<para>
A lot of emphasis has been placed on peaceful coexistence between UNIX and Windows. Unfortunately, the two systems come from very different cultures and they have difficulty getting along without mediation. ...and that, of course, is Samba&apos;s job. Samba &lt;http://samba.org/&gt; runs on UNIX platforms, but speaks to Windows clients like a native. It allows a UNIX system to move into a Windows ``Network Neighborhood&apos;&apos; without causing a stir. Windows users can happily access file and print services without knowing or caring that those services are being offered by a UNIX host. All of this is managed through a protocol suite which is currently known as the ``Common Internet File System,&apos;&apos; or CIFS &lt;http://www.cifs.com&gt;. This name was introduced by Microsoft, and provides some insight into their hopes for the future. At the heart of CIFS is the latest incarnation of the Server Message Block (SMB) protocol, which has a long and tedious history. Samba is an open source CIFS implementation, and is available for free from the http://samba.org/ mirror sites. Samba and Windows are not the only ones to provide CIFS networking. OS/2 supports SMB file and print sharing, and there are commercial CIFS products for Macintosh and other platforms (including several others for UNIX). Samba has been ported to a variety of non-UNIX operating systems, including VMS, AmigaOS, and NetWare. CIFS is also supported on dedicated file server platforms from a variety of vendors. In other words, this stuff is all over the place. From Rute-Users-Guide
<ulink url="http://www.tldp.org/LDP/Linux-Dictionary/html/index.html">http://www.tldp.org/LDP/Linux-Dictionary/html/index.html</ulink>
</para>
</glossdef>
</glossentry>
<glossentry>
<glossterm>
Samba
</glossterm>
<glossdef>
<para>
Samba adds Windows-networking support to UNIX. Whereas NFS is the most popular protocol for sharing files among UNIX machines, SMB is the most popular protocol for sharing files among Windows machines. The Samba package adds the ability for UNIX systems to interact with Windows systems. Key point: The Samba package comprises the following:
* smbd - The Samba service allowing other machines (often Windows) to read files from a UNIX machine.
* nmbd - Provides support for NetBIOS. Logically, the SMB protocol is layered on top of NetBIOS, which is in turn layered on top of TCP/IP.
* smbmount - An extension to the mount program that allows a UNIX machine to connect to another machine implicitly. Files can be accessed as if they were located on the local machine.
* smbclient - Allows files to be accessed through SMB in an explicit manner. This is a command-line tool, much like the FTP tool, that allows files to be copied. Unlike smbmount, files cannot be accessed as if they were local.
* smb.conf - The configuration file for Samba.
From Hacking-Lexicon
<ulink url="http://www.tldp.org/LDP/Linux-Dictionary/html/index.html">http://www.tldp.org/LDP/Linux-Dictionary/html/index.html</ulink>
</para>
</glossdef>
</glossentry>
<para>
See also the Samba Authenticated Gateway HOWTO by Ricardo Alexandre Mattar
(v1.2, 2004-05-21).
</para>
</sect1>
<sect1 id="SSH">
<title>SSH</title>
<para>
The Secure Shell, or SSH, provides a way of running command line and
graphical applications, and transferring files, over an encrypted
connection. SSH uses up to 2,048-bit encryption with a variety of
cryptographic schemes to make sure that if a cracker intercepts your
connection, all they can see is useless gibberish. It is both a
protocol and a suite of small command line applications which can be
used for various functions.
</para>
<para>
SSH replaces the old Telnet application, and can be used for secure
remote administration of machines across the Internet. However, it
has more features.
</para>
<para>
SSH increases the ease of running applications remotely by setting up
permissions automatically. If you can log into a machine, SSH allows you
to run a graphical application on it, unlike Telnet, which requires users
to type a series of xhost and xauth commands first. SSH also has built-in
compression, which can make graphical applications noticeably faster
over slow network links.
</para>
<para>
SCP (Secure Copy) and SFTP (Secure FTP) allow transfer of files over the
remote link, either via SSH's own command line utilities or graphical tools
like Gnome's GFTP. Like Telnet, SSH is cross-platform. You can find SSH
servers and clients for Linux, Unix, all flavours of Windows, BeOS, PalmOS,
Java and Embedded OSes used in routers.
</para>
<para>
Encrypted remote shell sessions are available through SSH
(http://www.ssh.fi/sshprotocols2/index.html), thus effectively
allowing secure remote administration.
</para>
</sect1>
<sect1 id="Telnet">
<title>Telnet</title>
<para>
Created in the early 1970s, Telnet provides a method of running command
line applications on a remote computer as if the user were actually at
the remote site. Telnet is one of the most powerful tools for Unix, allowing
for true remote administration. It is also an interesting program from the
point of view of users, because it allows remote access to all their files
and programs from anywhere on the Internet. Combined with an X server (as
well as some rather arcane manipulation of authentication 'cookies' and
'DISPLAY' environment variables), there is no difference (apart from the
delay) between being at the console and being on the other side of the planet.
However, the 'telnet' protocol sends its data unencrypted ('en clair'), and
newer protocols such as SSH offer more secure connections as well as built-in
compression and 'tunneling', which make graphical applications easier to use
across the network. Telnet is therefore effectively a dead protocol. Like
the 'r' protocols (such as rlogin and rsh), it is still used within internal
networks because it is easy to install and use, for backwards compatibility,
and as a means of configuring networking devices such as routers
and firewalls.
</para>
<para>
Please consult RFC 854 for further details behind its implementation.
</para>
<para>
* Telnet related software:
http://metalab.unc.edu/pub/Linux/system/network/telnet/
</para>
</sect1>
<sect1 id="TFTP">
<title>TFTP</title>
<para>
The Trivial File Transfer Protocol (TFTP) is a bare-bones protocol used by
devices that boot from the network. It runs on top of UDP, so it
doesn&apos;t require a full TCP/IP stack.
</para>
<para>
Misunderstanding: Many people
describe TFTP as simply a trivial version of FTP without authentication.
This misses the point. The purpose of TFTP is not to reduce the complexity
of file transfer, but to reduce the complexity of the underlying TCP/IP
stack so that it can fit inside boot ROMs.
</para>
<para>
Key point: TFTP is almost
always used with BOOTP. BOOTP first configures the device, then TFTP
transfers the boot image named by BOOTP, which is then used to boot the
device.
</para>
<para>
Key point: Many systems come with unnecessary TFTP servers. Many
TFTP servers have bugs, like the backtracking problem or buffer overflows.
As a consequence, many systems can be exploited through TFTP even though
virtually nobody really uses it.
</para>
<para>
Key point: A TFTP file transfer client
is built into many operating systems (UNIX, Windows, etc.). These clients
are often used to download rootkits during a break-in. Therefore,
removing the TFTP client should be part of your hardening procedure.
</para>
<para>
For further details on the TFTP protocol please see RFCs 1350, 1782,
1783, 1784, and 1785.
</para>
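<para>
To make the bare-bones nature of the protocol concrete, the following Python
sketch builds the read request (RRQ) packet laid out in RFC 1350: a two-byte
opcode (1 for a read request), the file name, a zero byte, the transfer mode
and a final zero byte. The function name build_rrq and the file name
pxelinux.0 are illustrative assumptions, not part of any TFTP package.
</para>

```python
import struct

OPCODE_RRQ = 1  # read request opcode defined in RFC 1350


def build_rrq(filename, mode="octet"):
    """Return the bytes of a TFTP read request (RRQ) packet."""
    return (
        struct.pack("!H", OPCODE_RRQ)       # 2-byte opcode, network byte order
        + filename.encode("ascii") + b"\0"  # file name, NUL-terminated
        + mode.encode("ascii") + b"\0"      # transfer mode, NUL-terminated
    )


if __name__ == "__main__":
    # "pxelinux.0" is only an example file name.
    print(build_rrq("pxelinux.0"))
```

<para>
A client would send this datagram to UDP port 69 on the server. The whole
request fits in a few dozen bytes and needs no connection setup, which is
what makes the protocol practical for boot ROMs.
</para>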
<para>
Most likely, you'll interface with the TFTP protocol using the TFTP command
line client, 'tftp', which allows users to transfer files to and from a
remote machine. The remote host may be specified on the command line, in
which case tftp uses host as the default host for future transfers.
</para>
<para>
Setting up TFTP is almost as easy as DHCP.
First install from the rpm package:
<screen>
# rpm -ihv tftp-server-*.rpm
</screen>
</para>
<para>
Create a directory for the files:
<screen>
# mkdir /tftpboot
# chown nobody:nobody /tftpboot
</screen>
</para>
<para>
The directory /tftpboot is owned by user nobody, because this is the default
user id set up by tftpd to access the files. Edit the file /etc/xinetd.d/tftp
to look like the following:
</para>
<para>
<screen>
service tftp
{
    socket_type = dgram
    protocol    = udp
    wait        = yes
    user        = root
    server      = /usr/sbin/in.tftpd
    server_args = -c -s /tftpboot
    disable     = no
    per_source  = 11
    cps         = 100 2
}
</screen>
</para>
<para>
The changes from the default file are the parameter disable = no (to enable
the service) and the server argument -c. This argument allows for the
creation of files, which is necessary if you want to save boot or disk
images. You may want to make TFTP read only in normal operation.
</para>
<para>
Then reload xinetd:
<screen>
# /etc/rc.d/init.d/xinetd reload
</screen>
</para>
<para>
You can use the tftp command, available from the tftp (client) rpm package,
to test the server. At the tftp prompt, you can issue the commands put and
get.
</para>
</sect1>
<sect1 id="VNC">
<title>VNC</title>
Tunnelling, mobile IP and virtual private networks
The Linux kernel allows the tunnelling (encapsulation) of protocols.
It can do IPX tunnelling through IP, allowing the connection of two
IPX networks through an IP-only link. It can also do IP-IP tunnelling,
which is essential for mobile IP support, multicast support and
amateur radio (see
http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.8).
Mobile IP specifies enhancements that allow transparent routing of IP
datagrams to mobile nodes in the Internet. Each mobile node is always
identified by its home address, regardless of its current point of
attachment to the Internet. While situated away from its home, a
mobile node is also associated with a care-of address, which provides
information about its current point of attachment to the Internet.
The protocol provides for registering the care-of address with a home
agent. The home agent sends datagrams destined for the mobile node
through a tunnel to the care-of address. After arriving at the end of
the tunnel, each datagram is then delivered to the mobile node.
Point-to-Point Tunneling Protocol (PPTP) is a networking technology
that allows the use of the Internet as a secure virtual private
network (VPN). PPTP is integrated with the Remote Access Services
(RAS) server which is built into Windows NT Server. With PPTP, users
can dial into a local ISP, or connect directly to the Internet, and
access their network as if they were at their desks. PPTP is a closed
protocol and its security has recently been compromised. It is highly
recommended to use other Linux-based alternatives, since they rely on
open standards which have been carefully examined and tested.
* A client implementation of PPTP for Linux:
http://www.pdos.lcs.mit.edu/~cananian/Projects/PPTP/
* More on Linux PPTP:
http://bmrc.berkeley.edu/people/chaffee/linux_pptp.html
Mobile IP:
* http://www.hpl.hp.com/personal/Jean_Tourrilhes/MobileIP/mip.html
* http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.12
Virtual Private Networks related documents:
* http://metalab.unc.edu/mdw/HOWTO/mini/VPN.html
* http://sites.inka.de/sites/bigred/devel/cipe.html
VNC
VNC stands for Virtual Network Computing. It is, in essence, a remote
display system which allows one to view a computing 'desktop'
environment not only on the machine where it is running, but from
anywhere on the Internet and from a wide variety of machine
architectures. Both clients and servers exist for Linux as well as for
many other platforms. It is possible to run MS-Word on a Windows
NT or 95 machine and have the output displayed on a Linux machine. The
opposite is also true; it is possible to run an application on a
Linux machine and have the output displayed on any other Linux or
Windows machine. One of the available clients is a Java applet,
allowing the remote display to be run inside a web browser. Another
client is a port for Linux using the SVGAlib graphics library,
allowing 386s with as little as 4 MB of RAM to become fully functional
X terminals.
* VNC web site: http://www.orl.co.uk/vnc/
<para>
Virtual Network Computing (VNC) allows a user to operate a session running on another machine.
Although Linux and all other Unix-like OSes already have this functionality built in, VNC
provides further advantages because it is cross-platform, running on Linux, BSD, Unix, Win32,
MacOS, and PalmOS. This makes it far more versatile.
For example, let's assume the machine that you are attempting to connect to is running Linux.
You can use VNC to access applications running on that other Linux desktop. You can also use
VNC to provide technical support to users on Windows-based machines by taking control of
their desktops from the comfort of your server room. VNC is usually installed as separate
packages for the client and server, typically named 'vnc' and 'vnc-server'.
VNC uses screen numbers to connect clients to servers. This is because Unix machines allow
multiple graphical sessions to be started simultaneously (check this out by logging in to a
virtual terminal and typing startx -- :1).
For platforms (Windows, MacOS, Palm, etc.) which don't have this capability, you'll connect
to 'screen 0' and take over the session of the existing user. For Unix systems, you'll need
to specify a higher number and receive a new desktop.
If you prefer the Windows-style approach where the VNC client takes over the currently
running display, you can use x0rfbserver - see the sidebox below.
VNC Servers and Clients
On Linux, the VNC server (which allows the machine to be used remotely) is actually
run as a replacement X server. To be able to start a VNC session to a machine, log
into it and run vncserver. You'll be prompted for a password - in future you can
change this password with the vncpasswd command. After you enter the password, you'll
be told the display number of the newly created desktop.
It is possible to control a remote machine by using the vncviewer command. If it is
typed on its own it will prompt for a remote machine, or you can use:
vncviewer [host]:[screen-number]
Related HOWTOs:
* The VPN HOWTO (deprecated)
* VPN HOWTO
* Linux VPN Masquerade HOWTO
</para>
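<para>
The screen number in a host:screen-number pair also determines which TCP
port the viewer contacts. The Python sketch below assumes the common
convention that a VNC server for screen N listens on port 5900 + N; the
function name vnc_target and the host name are hypothetical, and a
particular server may be configured to use a different port.
</para>

```python
VNC_BASE_PORT = 5900  # conventional base port; an assumption, not a guarantee


def vnc_target(spec):
    """Split 'host:screen' into (host, screen_number, tcp_port)."""
    host, _, screen = spec.rpartition(":")
    number = int(screen)
    return host, number, VNC_BASE_PORT + number


if __name__ == "__main__":
    # "server.example.com" is a placeholder host name.
    print(vnc_target("server.example.com:1"))
```

<para>
This is why a firewall between viewer and server typically needs one port
opened per screen rather than a single fixed port.
</para>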
</sect1>
<sect1 id="Web-Serving">
<title>Web-Serving</title>
<para>
The World Wide Web provides a simple method of publishing and linking
information across the Internet, and is responsible for popularising
the Internet to its current level. In the simplest case, a Web client
(or browser), such as Netscape or Internet Explorer, connects with a
Web server using a simple request/response protocol called HTTP
(Hypertext Transfer Protocol), and requests HTML (Hypertext Markup
Language) pages, images, Flash and other objects.
</para>
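<para>
The request/response exchange can be illustrated with a few lines of Python.
The sketch below only builds the text of a minimal GET request and picks
apart the first line of a reply; it opens no network connection. The helper
names build_get_request and parse_status_line and the host www.example.com
are assumptions made for illustration.
</para>

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.0 GET request."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, path, version
        f"Host: {host}",         # header naming the site being asked for
        "",                      # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response):
    """Return (version, status_code, reason) from a raw HTTP response."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_get_request("www.example.com", "/index.html"))
    print(parse_status_line("HTTP/1.0 404 Not Found\r\n\r\n"))
```

<para>
Everything on the wire is plain text, which is one reason the protocol is
easy to debug with generic tools.
</para>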
<para>
In more modern situations, the Web server can also generate pages
dynamically based on information supplied by the user. Either way,
setting up your own Web server is extremely simple. There are many
choices for Web serving under Linux. Some servers are very mature,
such as Apache, and are perfect for small and large sites alike.
Other servers are programmed to be light and fast, with only a
limited feature set to reduce complexity. A search on freshmeat.net
will reveal a multitude of servers.
</para>
<para>
Most Linux distributions include Apache (http://www.apache.org).
Apache is the number one server on the Internet according to
http://www.netcraft.co.uk/survey/. More than half of all Internet
sites are running Apache or one of its derivatives. Apache's advantages
include its modular design, stability and speed. Given the appropriate
hardware and configuration it can support the highest loads: Yahoo,
Altavista, GeoCities, and Hotmail are based on customized versions of
this server.
</para>
<para>
Optional support for SSL (which enables secure transactions) is also
available at:
</para>
* http://www.apache-ssl.org/
* http://raven.covalent.net/
* http://www.c2.net/
Dynamic Web content generation
<para>
Web scripting languages are even more common on Linux than databases;
practically every option is available. These include CGI programs,
PHP 3 and 4, Perl, JSP, ASP (via closed source applications from
Chili!Soft and Halcyon Software) and ColdFusion.
</para>
<para>
PHP is an open source scripting language designed to generate
dynamic Web content, typically pulling data from databases and
delivering it to browsers.
This includes not only HTML, but also graphics, Macromedia Flash and
XML-based information. The latest versions of PHP provide impressive
speed improvements, install easily from packages and can be set up
quickly. PHP is the most popular Apache module and is used by over
two million sites, including Amazon.com, US telco giant Sprint,
Xoom Networks and Lycos. Unlike most other server-side scripting
languages, developers (or those that employ them) can add their own
functions into the source to improve it. Supported databases include
those in the Database serving section and most ODBC compliant
databases. The language itself borrows its structure from Perl and C.
</para>
* http://metalab.unc.edu/mdw/HOWTO/WWW-HOWTO.html
* http://metalab.unc.edu/mdw/HOWTO/Virtual-Services-HOWTO.html
* http://metalab.unc.edu/mdw/HOWTO/Intranet-Server-HOWTO.html
* Web servers for Linux:
http://www.linuxlinks.com/Software/Internet/WebServers/
</sect1>
<sect1 id="X11">
<title>X11</title>
<para>
The X Window System was developed at MIT in the mid-1980s, rapidly
becoming the industry standard windowing system for Unix graphics
workstations. The software is freely available, very versatile, and is
suitable for a wide range of hardware platforms. Any X environment
consists of two distinct parts, the X server and one or more X
clients. It is important to realise the distinction between the server
and the client. The server controls the display directly and is
responsible for all input/output via the keyboard, mouse or display.
The clients, on the other hand, do not access the screen directly -
they communicate with the server, which handles all input and output.
It is the clients which do the "real" computing work - running
applications or whatever. The clients communicate with the server,
causing the server to open one or more windows to handle input and
output for that client.
</para>
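<para>
A client finds its server through the DISPLAY environment variable, whose
value has the form host:display.screen. The Python sketch below splits such
a value into its parts and, for connections made over TCP, derives the
conventional port number 6000 + display. The function name parse_display and
the example host are illustrative assumptions; an empty host part normally
means the local machine, which is often reached through a local socket
rather than TCP.
</para>

```python
X11_BASE_PORT = 6000  # conventional TCP base port for X servers


def parse_display(value):
    """Split an X DISPLAY string 'host:display.screen' into its parts."""
    host, _, rest = value.rpartition(":")
    display, _, screen = rest.partition(".")
    return {
        "host": host or "localhost",  # empty host means the local machine
        "display": int(display),
        "screen": int(screen or 0),   # screen defaults to 0 when omitted
        "tcp_port": X11_BASE_PORT + int(display),  # only relevant for TCP
    }


if __name__ == "__main__":
    # "remote.example.com" is a placeholder host name.
    print(parse_display("remote.example.com:0.0"))
    print(parse_display(":1"))
```

<para>
Because the server is named by the client's environment, the same
application can draw on any X server it is allowed to reach simply by
changing this one value.
</para>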
<para>
In short, the X Window System allows a user to log in to a remote
machine, execute a process (for example, open a web browser) and have
the output displayed on their own machine. Because the process is
actually being executed on the remote system, very little CPU power is
needed on the local one. Indeed, computers exist whose primary purpose
is to act as pure X servers. Such systems are called X terminals.
</para>
<para>
A free port of the X Window System exists for Linux and can be found
at XFree86 (http://www.xfree86.org/). It is included in most Linux
distributions.
</para>
<para>
For further information regarding X please see:
</para>
X11, LBX, DXPC, NXServer, SSH, MAS
Related HOWTOs:
* Remote X Apps HOWTO
* Linux XDMCP HOWTO
* XDM and X Terminal mini-HOWTO
* The Linux XFree86 HOWTO
* ATI R200 + XFree86 4.x mini-HOWTO
* Second Mouse in X mini-HOWTO
* Linux Touch Screen HOWTO
* XFree86 Video Timings HOWTO
* Linux XFree-to-Xinside mini-HOWTO
* XFree Local Multi-User HOWTO
* Using Xinerama to MultiHead XFree86 V. 4.0+
* Connecting X Terminals to Linux Mini-HOWTO
* How to change the title of an xterm
* X Window System Architecture Overview HOWTO
* The X Window User HOWTO
</sect1>
<sect1 id="Email">
<title>Email</title>
<para>
Alongside the Web, mail is the top reason for the popularity of the Internet. Email is an inexpensive and fast method of time-shifted messaging which, much like the Web, is actually based around sending and receiving plain text files. The protocol used is called the Simple Mail Transfer Protocol (SMTP). The server programs that implement SMTP to move mail from one server to another are called Mail Transfer Agents (MTAs).
</para>
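<para>
To show what "plain text" means in practice, the following Python sketch
uses the standard library's email package to assemble a message and print
the text that an MTA would carry. The function name build_message and the
example.org addresses are illustrative assumptions.
</para>

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return an EmailMessage; its text form is what SMTP transports."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # plain-text body
    return msg


if __name__ == "__main__":
    # The addresses below are placeholders.
    message = build_message("alice@example.org", "bob@example.org",
                            "Hello", "Sent as plain text.")
    print(message.as_string())
```

<para>
Handing the result to a mail server would be a separate step, for example
with smtplib.SMTP(host).send_message(message), where host is whichever MTA
accepts mail for your network.
</para>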
<para>
In times gone by, users would Telnet into the SMTP server itself and use a command line program like elm or pine to check their mail. These days, users run email clients like Netscape, Evolution, KMail or Outlook on their desktop to check their email from a local SMTP server. Additional protocols like POP3 and IMAP4 are used between the SMTP server and desktop mail client to allow clients to manipulate files on, and download from, their local mail server. The programs that implement POP3 and IMAP4 are called Mail Delivery Agents (MDAs). They are generally separate from MTAs.
</para>
* Linux Mail-Queue mini-HOWTO
* The Linux Mail User HOWTO
</sect1>
<sect1 id="Proxy-Caching">
<title>Proxy-Caching</title>
Proxy Server
<para>
The term proxy means "to do something on behalf of someone else." In
networking terms, a proxy server can act on behalf of
several clients. An HTTP proxy is a machine that receives requests for
web pages from another machine (Machine A). The proxy gets the page
requested and returns the result to Machine A. The proxy may have a
cache of requested pages, so if another machine asks for the
same page, the copy in the cache will be returned instead. This allows
efficient use of bandwidth and shorter response times. As a side
effect, because client machines are not directly connected to the outside
world, this is a way of securing the internal network. A well-configured
proxy can be as effective as a good firewall.
</para>
<para>
Several proxy servers exist for Linux. One popular solution is the
Apache proxy module. A more complete and robust implementation of an
HTTP proxy is Squid.
</para>
* Apache: http://www.apache.org
* Squid: http://squid.nlanr.net/
<para>
When a web browser retrieves information from the Internet, it stores a copy of that information
in a cache on the local machine. When a user requests that information in future, the browser will
check to see if the original source has been updated; if not, the browser will simply use the cached version
rather than fetch the data again.
By doing this, there is less information that needs to be downloaded, which makes the connection seem more responsive
to users and reduces bandwidth costs.
But if there are many browsers accessing the Internet through the same connection, it makes better sense to have
a single, centralised cache so that once a single machine has requested some information, the next
machine to try and download that information can also access it more quickly. This is the
theory behind the proxy cache. Squid is by far the most popular cache used on the Web, and can also be used
to accelerate Web serving.
Squid is not only useful for ISPs: large businesses and even small offices can use it to
speed up transfers and save money, and it can easily be used to the same effect in a home with a few
flatmates sharing a cable or ADSL connection.
</para>
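<para>
The idea of a single shared cache can be sketched in a few lines of Python.
The class below is a toy model, not a description of how Squid works: it
keeps every response forever and never checks whether the original has
changed, both of which a real proxy cache must handle. The class name
CachingProxy and the fetch callable are assumptions made for illustration.
</para>

```python
class CachingProxy:
    """Toy shared cache: fetch each URL from the origin at most once."""

    def __init__(self, fetch):
        self._fetch = fetch       # callable that retrieves a URL from the origin
        self._cache = {}          # URL -> cached response body
        self.origin_requests = 0  # how often the origin was actually contacted

    def get(self, url):
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


if __name__ == "__main__":
    # A stand-in for a real download; the URL is a placeholder.
    proxy = CachingProxy(lambda url: "contents of " + url)
    proxy.get("http://www.example.com/")  # first client: goes to the origin
    proxy.get("http://www.example.com/")  # second client: served from cache
    print(proxy.origin_requests)
```

<para>
However many clients ask for the same page, the origin is contacted only
once, which is where the bandwidth saving comes from.
</para>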
Traffic Control HOWTO
Version 1.0.1
Martin A. Brown
[http://www.securepipe.com/] SecurePipe, Inc.
Network Administration
<mabrown@securepipe.com>
"Nov 2003"
Revision History
Revision 1.0.1 2003-11-17 Revised by: MAB
Added link to Leonardo Balliache's documentation
Revision 1.0 2003-09-24 Revised by: MAB
reviewed and approved by TLDP
Revision 0.7 2003-09-14 Revised by: MAB
incremental revisions, proofreading, ready for TLDP
Revision 0.6 2003-09-09 Revised by: MAB
minor editing, corrections from Stef Coene
Revision 0.5 2003-09-01 Revised by: MAB
HTB section mostly complete, more diagrams, LARTC pre-release
Revision 0.4 2003-08-30 Revised by: MAB
added diagram
Revision 0.3 2003-08-29 Revised by: MAB
substantial completion of classless, software, rules, elements and components
sections
Revision 0.2 2003-08-23 Revised by: MAB
major work on overview, elements, components and software sections
Revision 0.1 2003-08-15 Revised by: MAB
initial revision (outline complete)
Traffic control encompasses the sets of mechanisms and operations by which
packets are queued for transmission/reception on a network interface. The
operations include enqueuing, policing, classifying, scheduling, shaping and
dropping. This HOWTO provides an introduction and overview of the
capabilities and implementation of traffic control under Linux.
© 2003, Martin A. Brown
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or any
later version published by the Free Software Foundation; with no
invariant sections, with no Front-Cover Texts, with no Back-Cover Text. A
copy of the license is located at [http://www.gnu.org/licenses/fdl.html]
http://www.gnu.org/licenses/fdl.html.
-----------------------------------------------------------------------------
Table of Contents
1. Introduction to Linux Traffic Control
1.1. Target audience and assumptions about the reader
1.2. Conventions
1.3. Recommended approach
1.4. Missing content, corrections and feedback
2. Overview of Concepts
2.1. What is it?
2.2. Why use it?
2.3. Advantages
2.4. Disadvantages
2.5. Queues
2.6. Flows
2.7. Tokens and buckets
2.8. Packets and frames
3. Traditional Elements of Traffic Control
3.1. Shaping
3.2. Scheduling
3.3. Classifying
3.4. Policing
3.5. Dropping
3.6. Marking
4. Components of Linux Traffic Control
4.1. qdisc
4.2. class
4.3. filter
4.4. classifier
4.5. policer
4.6. drop
4.7. handle
5. Software and Tools
5.1. Kernel requirements
5.2. iproute2 tools (tc)
5.3. tcng, Traffic Control Next Generation
5.4. IMQ, Intermediate Queuing device
6. Classless Queuing Disciplines (qdiscs)
6.1. FIFO, First-In First-Out (pfifo and bfifo)
6.2. pfifo_fast, the default Linux qdisc
6.3. SFQ, Stochastic Fair Queuing
6.4. ESFQ, Extended Stochastic Fair Queuing
6.5. GRED, Generic Random Early Drop
6.6. TBF, Token Bucket Filter
7. Classful Queuing Disciplines (qdiscs)
7.1. HTB, Hierarchical Token Bucket
7.2. PRIO, priority scheduler
7.3. CBQ, Class Based Queuing
8. Rules, Guidelines and Approaches
8.1. General Rules of Linux Traffic Control
8.2. Handling a link with a known bandwidth
8.3. Handling a link with a variable (or unknown) bandwidth
8.4. Sharing/splitting bandwidth based on flows
8.5. Sharing/splitting bandwidth based on IP
9. Scripts for use with QoS/Traffic Control
9.1. wondershaper
9.2. ADSL Bandwidth HOWTO script (myshaper)
9.3. htb.init
9.4. tcng.init
9.5. cbq.init
10. Diagram
10.1. General diagram
11. Annotated Traffic Control Links
1. Introduction to Linux Traffic Control
Linux offers a very rich set of tools for managing and manipulating the
transmission of packets. The larger Linux community is very familiar with the
tools available under Linux for packet mangling and firewalling (netfilter,
and before that, ipchains) as well as hundreds of network services which can
run on the operating system. Few inside the community and fewer outside the
Linux community are aware of the tremendous power of the traffic control
subsystem which has grown and matured under kernels 2.2 and 2.4.
This HOWTO purports to introduce the concepts of traffic control, the
traditional elements (in general), the components of the Linux traffic
control implementation and to provide some guidelines. This HOWTO represents
the collection, amalgamation and synthesis of the [http://lartc.org/howto/]
LARTC HOWTO, documentation from individual projects and importantly the LARTC
mailing list over a period of study.
The impatient soul, who simply wishes to experiment right now, is
recommended to the [http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/]
Traffic Control using tcng and HTB HOWTO and [http://lartc.org/howto/] LARTC
HOWTO for immediate satisfaction.
-----------------------------------------------------------------------------
1.1. Target audience and assumptions about the reader
The target audience for this HOWTO is the network administrator or savvy
home user who desires an introduction to the field of traffic control and an
overview of the tools available under Linux for implementing traffic control.
I assume that the reader is comfortable with UNIX concepts and the command
line and has a basic knowledge of IP networking. Users who wish to implement
traffic control may require the ability to patch, compile and install a
kernel or software package [1]. For users with newer kernels (2.4.20+, see
also Section 5.1), however, the ability to install and use software may be
all that is required.
Broadly speaking, this HOWTO was written with a sophisticated user in mind,
perhaps one who has already had experience with traffic control under Linux.
I assume that the reader may have no prior traffic control experience.
-----------------------------------------------------------------------------
1.2. Conventions
This text was written in [http://www.docbook.org/] DocBook ([http://
www.docbook.org/xml/4.2/index.html] version 4.2) with vim. All formatting has
been applied by [http://xmlsoft.org/XSLT/] xsltproc based on DocBook XSL and
LDP XSL stylesheets. Typeface formatting and display conventions are similar
to most printed and electronically distributed technical documentation.
-----------------------------------------------------------------------------
1.3. Recommended approach
I strongly recommend to the eager reader making a first foray into the
discipline of traffic control, to become only casually familiar with the tc
command line utility, before concentrating on tcng. The tcng software package
defines an entire language for describing traffic control structures. At
first, this language may seem daunting, but mastery of these basics will
quickly provide the user with a much wider ability to employ (and deploy)
traffic control configurations than the direct use of tc would afford.
Where possible, I'll try to prefer describing the behaviour of the Linux
traffic control system in an abstract manner, although in many cases I'll
need to supply the syntax of one or the other common systems for defining
these structures. I may not supply examples in both the tcng language and the
tc command line, so the wise user will have some familiarity with both.
-----------------------------------------------------------------------------
1.4. Missing content, corrections and feedback
There is content yet missing from this HOWTO. In particular, the following
items will be added at some point to this documentation.
  * A description and diagram of GRED, WRR, PRIO and CBQ.
  * A section of examples.
  * A section detailing the classifiers.
  * A section discussing the techniques for measuring traffic.
  * A section covering meters.
  * More details on tcng.
I welcome suggestions, corrections and feedback at
<mabrown@securepipe.com>. All errors and omissions are strictly my fault.
Although I have made every effort to verify the factual correctness of the
content presented herein, I cannot accept any responsibility for actions
taken under the influence of this documentation.
-----------------------------------------------------------------------------
2. Overview of Concepts
This section will introduce traffic control and examine reasons for it,
identify a few advantages and disadvantages and introduce key concepts used
in traffic control.
-----------------------------------------------------------------------------
2.1. What is it?
Traffic control is the name given to the sets of queuing systems and
mechanisms by which packets are received and transmitted on a router. This
includes deciding which (and whether) packets to accept at what rate on the
input of an interface and determining which packets to transmit in what order
at what rate on the output of an interface.
In the overwhelming majority of situations, traffic control consists of a
single queue which collects entering packets and dequeues them as quickly as
the hardware (or underlying device) can accept them. This sort of queue is a
FIFO.
Note The default qdisc under Linux is the pfifo_fast, which is slightly more
complex than the FIFO.
There are examples of queues in all sorts of software. The queue is a way
of organizing the pending tasks or data (see also Section 2.5). Because
network links typically carry data in a serialized fashion, a queue is
required to manage the outbound data packets.
In the case of a desktop machine and an efficient webserver sharing the
same uplink to the Internet, the following contention for bandwidth may
occur. The web server may be able to fill up the output queue on the router
faster than the data can be transmitted across the link, at which point the
router starts to drop packets (its buffer is full!). Now, the desktop machine
(with an interactive application user) may be faced with packet loss and high
latency. Note that high latency sometimes leads to screaming users! By
separating the internal queues used to service these two different classes of
application, there can be better sharing of the network resource between the
two applications.
Traffic control is the set of tools which allows the user to have granular
control over these queues and the queuing mechanisms of a networked device.
The power to rearrange traffic flows and packets with these tools is
tremendous and can be complicated, but is no substitute for adequate
bandwidth.
The term Quality of Service (QoS) is often used as a synonym for traffic
control.
-----------------------------------------------------------------------------
2.2. Why use it?
Packet-switched networks differ from circuit-based networks in one very
important regard. A packet-switched network itself is stateless. A
circuit-based network (such as a telephone network) must hold state within
the network. IP networks are stateless and packet-switched networks by
design; in fact, this statelessness is one of the fundamental strengths of
IP.
The weakness of this statelessness is the lack of differentiation between
types of flows. In simplest terms, traffic control allows an administrator to
queue packets differently based on attributes of the packet. It can even be
used to simulate the behaviour of a circuit-based network. This introduces
statefulness into the stateless network.
There are many practical reasons to consider traffic control, and many
scenarios in which using traffic control makes sense. Below are some examples
of common problems which can be solved or at least ameliorated with these
tools.
The list below is not an exhaustive list of the sorts of solutions
available to users of traffic control, but introduces the types of problems
that can be solved by using traffic control to maximize the usability of a
network connection.
Common traffic control solutions
  * Limit total bandwidth to a known rate; TBF, HTB with child class(es).
  * Limit the bandwidth of a particular user, service or client; HTB
    classes and classifying with a filter.
  * Maximize TCP throughput on an asymmetric link; prioritize transmission
    of ACK packets, wondershaper.
  * Reserve bandwidth for a particular application or user; HTB with
    child classes and classifying.
  * Prefer latency-sensitive traffic; PRIO inside an HTB class.
  * Manage oversubscribed bandwidth; HTB with borrowing.
  * Allow equitable distribution of unreserved bandwidth; HTB with
    borrowing.
  * Ensure that a particular type of traffic is dropped; policer attached
    to a filter with a drop action.
Remember, too, that sometimes it is simply better to purchase more
bandwidth. Traffic control does not solve all problems!
-----------------------------------------------------------------------------
2.3. Advantages
When properly employed, traffic control should lead to more predictable
usage of network resources and less volatile contention for these resources.
The network then meets the goals of the traffic control configuration. Bulk
download traffic can be allocated a reasonable amount of bandwidth even as
higher priority interactive traffic is simultaneously serviced. Even low
priority data transfer such as mail can be allocated bandwidth without
tremendously affecting the other classes of traffic.
In a larger picture, if the traffic control configuration represents policy
which has been communicated to the users, then users (and, by extension,
applications) know what to expect from the network.
-----------------------------------------------------------------------------
2.4. Disadvantages
Complexity is easily one of the most significant disadvantages of using
traffic control. There are ways to become familiar with traffic control tools
which ease the learning curve about traffic control and its mechanisms, but
identifying a traffic control misconfiguration can be quite a challenge.
Traffic control when used appropriately can lead to more equitable
distribution of network resources. It can just as easily be installed in an
inappropriate manner leading to further and more divisive contention for
resources.
The computing resources required on a router to support a traffic control
scenario need to be capable of handling the increased cost of maintaining the
traffic control structures. Fortunately, this is a small incremental cost,
but can become more significant as the configuration grows in size and
complexity.
For personal use, there's no training cost associated with the use of
traffic control, but a company may find that purchasing more bandwidth is a
simpler solution than employing traffic control. Training employees and
ensuring depth of knowledge may be more costly than investing in more
bandwidth.
Although traffic control on packet-switched networks covers a larger
conceptual area, you can think of traffic control as a way to provide [some
of] the statefulness of a circuit-based network to a packet-switched network.
-----------------------------------------------------------------------------
2.5. Queues
Queues form the backdrop for all of traffic control and are the integral
concept behind scheduling. A queue is a location (or buffer) containing a
finite number of items waiting for an action or service. In networking, a
queue is the place where packets (our units) wait to be transmitted by the
hardware (the service). In the simplest model, packets are transmitted on a
first-come, first-served basis [2]. In the discipline of computer networking
(and more generally computer science), this sort of a queue is known as a
FIFO.
Without any other mechanisms, a queue doesn't offer any promise for traffic
control. There are only two interesting actions in a queue. Anything entering
a queue is enqueued into the queue. To remove an item from a queue is to
dequeue that item.
A queue becomes much more interesting when coupled with other mechanisms
which can delay packets, rearrange, drop and prioritize packets in multiple
queues. A queue can also use subqueues, which allow for complexity of
behaviour in a scheduling operation.
From the perspective of the higher layer software, a packet is simply
enqueued for transmission, and the manner and order in which the enqueued
packets are transmitted is immaterial to the higher layer. So, to the higher
layer, the entire traffic control system may appear as a single queue [3]. It
is only by examining the internals of this layer that the traffic control
structures become exposed and available.
-----------------------------------------------------------------------------
2.6. Flows
A flow is a distinct connection or conversation between two hosts. Any
unique set of packets between two hosts can be regarded as a flow. Under TCP
the concept of a connection with a source IP and port and destination IP and
port represents a flow. A UDP flow can be similarly defined.
Traffic control mechanisms frequently separate traffic into classes of
flows which can be aggregated and transmitted as an aggregated flow (consider
DiffServ). Alternate mechanisms may attempt to divide bandwidth equally based
on the individual flows.
Flows become important when attempting to divide bandwidth equally among a
set of competing flows, especially when some applications deliberately build
a large number of flows.
-----------------------------------------------------------------------------
2.7. Tokens and buckets
Two of the key underpinnings of shaping mechanisms are the interrelated
concepts of tokens and buckets.
In order to control the rate of dequeuing, an implementation can count the
number of packets or bytes dequeued as each item is dequeued, although this
requires complex usage of timers and measurements to limit accurately.
Instead of calculating the current usage and time, one method, used widely in
traffic control, is to generate tokens at a desired rate, and only dequeue
packets or bytes if a token is available.
Consider the analogy of an amusement park ride with a queue of people
waiting to experience the ride. Let's imagine a track on which carts traverse
a fixed track. The carts arrive at the head of the queue at a fixed rate. In
order to enjoy the ride, each person must wait for an available cart. The
cart is analogous to a token and the person is analogous to a packet. Again,
this mechanism is a rate-limiting or shaping mechanism. Only a certain number
of people can experience the ride in a particular period.
To extend the analogy, imagine an empty line for the amusement park ride
and a large number of carts sitting on the track ready to carry people. If a
large number of people entered the line together many (maybe all) of them
could experience the ride because of the carts available and waiting. The
number of carts available is a concept analogous to the bucket. A bucket
contains a number of tokens and can use all of the tokens in the bucket
without regard for passage of time.
And to complete the analogy, the carts on the amusement park ride (our
tokens) arrive at a fixed rate and are only kept available up to the size of
the bucket. So, the bucket is filled with tokens according to the rate, and
if the tokens are not used, the bucket can fill up. If tokens are used the
bucket will not fill up. Buckets are a key concept in supporting bursty
traffic such as HTTP.
The TBF qdisc is a classical example of a shaper (the section on TBF
includes a diagram which may help to visualize the token and bucket
concepts). The TBF generates rate tokens and only transmits packets when a
token is available. Tokens are a generic shaping concept.
In the case that a queue does not need tokens immediately, the tokens can
be collected until they are needed. To collect tokens indefinitely would
negate any benefit of shaping so tokens are collected until a certain number
of tokens has been reached. Now, the queue has tokens available for a large
number of packets or bytes which need to be dequeued. These intangible tokens
are stored in an intangible bucket, and the number of tokens that can be
stored depends on the size of the bucket.
This also means that a bucket full of tokens may be available at any
instant. Very predictable regular traffic can be handled by small buckets.
Larger buckets may be required for burstier traffic, unless one of the
desired goals is to reduce the burstiness of the flows.
In summary, tokens are generated at a fixed rate, and a maximum of a
bucket's worth of tokens may be collected. This allows bursty traffic to be
handled, while smoothing and shaping the transmitted traffic.
The concepts of tokens and buckets are closely interrelated and are used in
both TBF (one of the classless qdiscs) and HTB (one of the classful qdiscs).
Within the tcng language, the use of two- and three-color meters is
indubitably a token and bucket concept.
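The interplay of rate and bucket size can be made concrete with a small
simulation. This is only a minimal sketch and not how the kernel implements
TBF or HTB: the function name token_bucket_sim and its arguments are
hypothetical, time advances in abstract ticks, and one token is assumed to
pay for exactly one packet.

```shell
# token_bucket_sim RATE BUCKET ARRIVALS...
#   RATE      tokens added per tick (assumed unit: one token per packet)
#   BUCKET    maximum number of tokens that can be stored
#   ARRIVALS  number of packets arriving in each successive tick
# Prints one line per tick: packets sent, packets still queued, tokens left.
token_bucket_sim() (
    rate=$1 bucket=$2
    shift 2
    tokens=$bucket      # an idle link starts with a full bucket
    backlog=0 tick=0
    for arrivals in "$@"; do
        backlog=$((backlog + arrivals))
        # Dequeue as many waiting packets as there are tokens.
        if [ "$backlog" -lt "$tokens" ]; then
            sent=$backlog
        else
            sent=$tokens
        fi
        backlog=$((backlog - sent))
        tokens=$((tokens - sent))
        echo "tick=$tick sent=$sent queued=$backlog tokens=$tokens"
        # Refill at the configured rate, but never beyond the bucket size.
        tokens=$((tokens + rate))
        if [ "$tokens" -gt "$bucket" ]; then
            tokens=$bucket
        fi
        tick=$((tick + 1))
    done
)

# A burst of 8 packets meets a full bucket of 5 tokens refilled at 2 per tick.
token_bucket_sim 2 5 8 0 0 1
```

In this run the first five packets of the burst leave immediately (the
bucket's worth), and the remainder drain at the token rate over the following
ticks.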
-----------------------------------------------------------------------------
2.8. Packets and frames
The terms for data sent across a network change depending on the layer the
user is examining. This document will rather impolitely (and incorrectly)
gloss over the technical distinction between packets and frames, although
the two terms are outlined here.
The word frame is typically used to describe a layer 2 (data link) unit of
data to be forwarded to the next recipient. Ethernet interfaces, PPP
interfaces, and T1 interfaces all name their layer 2 data unit a frame. The
frame is actually the unit on which traffic control is performed.
A packet, on the other hand, is a higher layer concept, representing layer
3 (network) units. The term packet is preferred in this documentation,
although it is slightly inaccurate.
-----------------------------------------------------------------------------
3. Traditional Elements of Traffic Control
-----------------------------------------------------------------------------
3.1. Shaping
Shapers delay packets to meet a desired rate.
Shaping is the mechanism by which packets are delayed before transmission
in an output queue to meet a desired output rate. This is one of the most
common desires of users seeking bandwidth control solutions. The act of
delaying a packet as part of a traffic control solution makes every shaping
mechanism into a non-work-conserving mechanism, meaning roughly that the
link may be left idle even while packets are waiting to be sent.
Viewed in reverse, a non-work-conserving queuing mechanism is performing a
shaping function. A work-conserving queuing mechanism (see PRIO) would not be
capable of delaying a packet.
Shapers attempt to limit or ration traffic to meet but not exceed a
configured rate (frequently measured in packets per second or bits/bytes per
second). As a side effect, shapers can smooth out bursty traffic [4]. One of
the advantages of shaping bandwidth is the ability to control latency of
packets. The underlying mechanism for shaping to a rate is typically a token
and bucket mechanism. See also Section 2.7 for further detail on tokens and
buckets.
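As a hedged illustration of shaping, the sketch below attaches a TBF qdisc to
an interface. The interface name eth0, the helper name shape_uplink and the
rate, latency and burst values are assumptions chosen for illustration only.
By default the sketch merely prints the command; set TC to the real tc binary
(and run as root) to apply it.

```shell
TC=${TC:-echo tc}   # dry run by default; e.g. TC=/sbin/tc applies the command
DEV=${DEV:-eth0}    # assumed interface name

# Shape everything leaving $DEV to 220kbit. "burst" is the bucket size in
# bytes; "latency" bounds how long a packet may wait in the qdisc before it
# is dropped instead of delayed further.
shape_uplink() {
    $TC qdisc add dev "$DEV" root tbf rate 220kbit latency 50ms burst 1540
}

shape_uplink
```

The shaper delays packets that arrive faster than the configured rate rather
than discarding them outright, which is what distinguishes it from a policer.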
-----------------------------------------------------------------------------
3.2. Scheduling
Schedulers arrange and/or rearrange packets for output.
Scheduling is the mechanism by which packets are arranged (or rearranged)
between input and output of a particular queue. The overwhelmingly most
common scheduler is the FIFO (first-in first-out) scheduler. From a larger
perspective, any set of traffic control mechanisms on an output queue can be
regarded as a scheduler, because packets are arranged for output.
Other generic scheduling mechanisms attempt to compensate for various
networking conditions. A fair queuing algorithm (see SFQ) attempts to prevent
any single client or flow from dominating the network usage. A round-robin
algorithm (see WRR) gives each flow or client a turn to dequeue packets.
Other sophisticated scheduling algorithms attempt to prevent backbone
overload (see GRED) or refine other scheduling mechanisms (see ESFQ).
-----------------------------------------------------------------------------
3.3. Classifying
Classifiers sort or separate traffic into queues.
Classifying is the mechanism by which packets are separated for different
treatment, possibly different output queues. During the process of accepting,
routing and transmitting a packet, a networking device can classify the
packet a number of different ways. Classification can include marking the
packet, which usually happens on the boundary of a network under a single
administrative control or classification can occur on each hop individually.
The Linux model (see Section 4.3) allows for a packet to cascade across a
series of classifiers in a traffic control structure and to be classified in
conjunction with policers (see also Section 4.5).
-----------------------------------------------------------------------------
3.4. Policing
Policers measure and limit traffic in a particular queue.
Policing, as an element of traffic control, is simply a mechanism by which
traffic can be limited. Policing is most frequently used on the network
border to ensure that a peer is not consuming more than its allocated
bandwidth. A policer will accept traffic to a certain rate, and then perform
an action on traffic exceeding this rate. A rather harsh solution is to drop
the traffic, although the traffic could be reclassified instead of being
dropped.
A policer is a yes/no question about the rate at which traffic is entering
a queue. If the packet is about to enter a queue below a given rate, take one
action (allow the enqueuing). If the packet is about to enter a queue above a
given rate, take another action. Although the policer uses a token bucket
mechanism internally, it does not have the capability to delay a packet as a
shaping mechanism does.
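The yes/no nature of a policer can be sketched as a small simulation. This is
an illustration under stated assumptions, not the kernel's implementation:
the function name police and its arguments are hypothetical, time advances in
abstract ticks, and one token pays for one packet. Note that nothing is ever
queued; each packet is either accepted or handed to the exceed action.

```shell
# police RATE BURST ARRIVALS...
#   RATE      tokens added per tick
#   BURST     maximum number of tokens that can be stored
#   ARRIVALS  number of packets arriving in each successive tick
police() (
    rate=$1 burst=$2
    shift 2
    tokens=$burst tick=0
    for arrivals in "$@"; do
        # Below the rate: accept. Above the rate: take the other action.
        if [ "$arrivals" -le "$tokens" ]; then
            accepted=$arrivals
        else
            accepted=$tokens
        fi
        exceeded=$((arrivals - accepted))   # e.g. dropped or reclassified
        tokens=$((tokens - accepted + rate))
        if [ "$tokens" -gt "$burst" ]; then
            tokens=$burst
        fi
        echo "tick=$tick accepted=$accepted exceeded=$exceeded"
        tick=$((tick + 1))
    done
)

police 2 3 5 1 0 4
```

Packets counted as exceeded are never delayed until tokens become available;
that capability belongs only to a shaper.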
-----------------------------------------------------------------------------
3.5. Dropping
Dropping discards an entire packet, flow or classification.
Dropping a packet is a mechanism by which a packet is discarded.
-----------------------------------------------------------------------------
3.6. Marking
Marking is a mechanism by which the packet is altered.
Note This is not fwmark. The iptables target MARK and the ipchains --mark are
used to modify packet metadata, not the packet itself.
Traffic control marking mechanisms install a DSCP on the packet itself,
which is then used and respected by other routers inside an administrative
domain (usually for DiffServ).
-----------------------------------------------------------------------------
4. Components of Linux Traffic Control
Table 1. Correlation between traffic control elements and Linux components
+-------------------+-------------------------------------------------------+
|traditional element|Linux component |
+-------------------+-------------------------------------------------------+
|shaping |The class offers shaping capabilities. |
+-------------------+-------------------------------------------------------+
|scheduling |A qdisc is a scheduler. Schedulers can be simple such |
| |as the FIFO or complex, containing classes and other |
| |qdiscs, such as HTB. |
+-------------------+-------------------------------------------------------+
|classifying |The filter object performs the classification through |
| |the agency of a classifier object. Strictly speaking, |
| |Linux classifiers cannot exist outside of a filter. |
+-------------------+-------------------------------------------------------+
|policing |A policer exists in the Linux traffic control |
| |implementation only as part of a filter. |
+-------------------+-------------------------------------------------------+
|dropping |To drop traffic requires a filter with a policer which |
| |uses "drop" as an action. |
+-------------------+-------------------------------------------------------+
|marking |The dsmark qdisc is used for marking. |
+-------------------+-------------------------------------------------------+
-----------------------------------------------------------------------------
4.1. qdisc
Simply put, a qdisc is a scheduler (Section 3.2). Every output interface
needs a scheduler of some kind, and the default scheduler is a FIFO. Other
qdiscs available under Linux will rearrange the packets entering the
scheduler's queue in accordance with that scheduler's rules.
The qdisc is the major building block on which all of Linux traffic control
is built, and is also called a queuing discipline.
The classful qdiscs can contain classes, and provide a handle to which to
attach filters. There is no prohibition on using a classful qdisc without
child classes, although this will usually consume cycles and other system
resources for no benefit.
The classless qdiscs can contain no classes, nor is it possible to attach a
filter to a classless qdisc. Because a classless qdisc contains no children
of any kind, there is no utility to classifying.
A source of terminology confusion is the usage of the terms root qdisc and
ingress qdisc. These are not really queuing disciplines, but rather locations
onto which traffic control structures can be attached for egress (outbound
traffic) and ingress (inbound traffic).
Each interface contains both. The primary and more common is the egress
qdisc, known as the root qdisc. It can contain any of the queuing disciplines
(qdiscs) with potential classes and class structures. The overwhelming
majority of documentation applies to the root qdisc and its children. Traffic
transmitted on an interface traverses the egress or root qdisc.
For traffic accepted on an interface, the ingress qdisc is traversed. With
its limited utility, it allows no child class to be created, and only exists
as an object onto which a filter can be attached. For practical purposes, the
ingress qdisc is merely a convenient object onto which to attach a policer to
limit the amount of traffic accepted on a network interface.
In short, you can do much more with an egress qdisc because it contains a
real qdisc and the full power of the traffic control system. An ingress qdisc
can only support a policer. The remainder of the documentation will concern
itself with traffic control structures attached to the root qdisc unless
otherwise specified.
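A minimal sketch of the one practical use of the ingress qdisc, attaching a
policer to limit accepted traffic, might look as follows. The interface name
eth0, the helper name police_ingress and the rate and burst values are
assumptions for illustration. By default the sketch only prints the commands;
set TC to the real tc binary (and run as root) to apply them.

```shell
TC=${TC:-echo tc}   # dry run by default; e.g. TC=/sbin/tc applies the commands
DEV=${DEV:-eth0}    # assumed interface name

police_ingress() {
    # Attach the ingress qdisc; its handle is always ffff:.
    $TC qdisc add dev "$DEV" handle ffff: ingress
    # Match every IP packet and drop whatever arrives faster than 1mbit.
    $TC filter add dev "$DEV" parent ffff: protocol ip prio 50 u32 \
        match ip src 0.0.0.0/0 \
        police rate 1mbit burst 10k drop flowid :1
}

police_ingress
```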
-----------------------------------------------------------------------------
4.2. class
Classes only exist inside a classful qdisc (e.g., HTB and CBQ). Classes are
immensely flexible and can always contain either multiple children classes or
a single child qdisc [5]. There is no prohibition against a class containing
a classful qdisc itself, which facilitates tremendously complex traffic
control scenarios.
Any class can also have an arbitrary number of filters attached to it,
which allows the selection of a child class or the use of a filter to
reclassify or drop traffic entering a particular class.
A leaf class is a terminal class in a qdisc. It contains a qdisc (default
FIFO) and will never contain a child class. Any class which contains a child
class is an inner class (or root class) and not a leaf class.
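The relationship between a classful qdisc, an inner class, leaf classes and a
leaf qdisc can be sketched with HTB. The interface name eth0, the helper name
build_htb_tree, the handle numbers and all rates are assumptions chosen for
illustration. By default the sketch only prints the commands; set TC to the
real tc binary (and run as root) to apply them.

```shell
TC=${TC:-echo tc}   # dry run by default; e.g. TC=/sbin/tc applies the commands
DEV=${DEV:-eth0}    # assumed interface name

build_htb_tree() {
    # Classful root qdisc; unclassified traffic falls into class 1:20.
    $TC qdisc add dev "$DEV" root handle 1: htb default 20
    # Inner (root) class representing the whole link.
    $TC class add dev "$DEV" parent 1: classid 1:1 htb rate 1mbit
    # Two leaf classes which may borrow from the parent up to ceil.
    $TC class add dev "$DEV" parent 1:1 classid 1:10 htb rate 256kbit ceil 1mbit
    $TC class add dev "$DEV" parent 1:1 classid 1:20 htb rate 768kbit ceil 1mbit
    # Replace the default FIFO inside leaf class 1:10 with an SFQ qdisc.
    $TC qdisc add dev "$DEV" parent 1:10 handle 10: sfq perturb 10
}

build_htb_tree
```

Class 1:1 contains child classes and is therefore an inner class, while 1:10
and 1:20 are leaf classes, each holding a single qdisc.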
-----------------------------------------------------------------------------
4.3. filter
The filter is the most complex component in the Linux traffic control
system. The filter provides a convenient mechanism for gluing together
several of the key elements of traffic control. The simplest and most obvious
role of the filter is to classify (see Section 3.3) packets. Linux filters
allow the user to classify packets into an output queue with either several
different filters or a single filter.
  * A filter must contain a classifier phrase.
  * A filter may contain a policer phrase.
Filters can be attached either to classful qdiscs or to classes; however,
the enqueued packet always enters the root qdisc first. After the filter
attached to the root qdisc has been traversed, the packet may be directed to
any subclasses (which can have their own filters) where the packet may
undergo further classification.
-----------------------------------------------------------------------------
4.4. classifier
Filter objects, which can be manipulated using tc, can use several
different classifying mechanisms, the most common of which is the u32
classifier. The u32 classifier allows the user to select packets based on
attributes of the packet.
The classifiers are tools which can be used as part of a filter to identify
characteristics of a packet or a packet's metadata. The Linux classifier
object is a direct analogue to the basic operation and elemental mechanism of
traffic control classifying.
-----------------------------------------------------------------------------
4.5. policer
This elemental mechanism is only used in Linux traffic control as part of a
filter. A policer calls one action above and another action below the
specified rate. Clever use of policers can simulate a three-color meter. See
also Section 10.
Although both policing and shaping are basic elements of traffic control
for limiting bandwidth usage, a policer will never delay traffic. It can only
perform an action based on specified criteria. See also Example 5.
-----------------------------------------------------------------------------
4.6. drop
This basic traffic control mechanism is only used in Linux traffic control
as part of a policer. Any policer attached to any filter could have a drop
action.
Note The only place in the Linux traffic control system where a packet can be
explicitly dropped is a policer. A policer can limit packets enqueued at
a specific rate, or it can be configured to drop all traffic matching a
particular pattern [6].
There are, however, places within the traffic control system where a packet
may be dropped as a side effect. For example, a packet will be dropped if the
scheduler employed uses this method to control flows as the GRED does.
Also, a shaper or scheduler which runs out of its allocated buffer space
may have to drop a packet during a particularly bursty or overloaded period.
-----------------------------------------------------------------------------
4.7. handle
Every class and classful qdisc (see also Section 7) requires a unique
identifier within the traffic control structure. This unique identifier is
known as a handle and has two constituent members, a major number and a minor
number. These numbers can be assigned arbitrarily by the user in accordance
with the following rules [7].
The numbering of handles for classes and qdiscs
major
This parameter is completely free of meaning to the kernel. The user
may use an arbitrary numbering scheme; however, all objects in the traffic
control structure with the same parent must share a major handle number.
Conventional numbering schemes start at 1 for objects attached directly
to the root qdisc.
minor
This parameter unambiguously identifies the object as a qdisc if minor
is 0. Any other value identifies the object as a class. All classes
sharing a parent must have unique minor numbers.
The special handle ffff:0 is reserved for the ingress qdisc.
The handle is used as the target in classid and flowid phrases of tc filter
statements. These handles are external identifiers for the objects, usable by
userland applications. The kernel maintains internal identifiers for each
object.
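The minor-number rule can be expressed as a small helper. This is a sketch
only: the function name handle_kind is hypothetical and is not part of tc. It
assumes that both numbers are written in hexadecimal (as in ffff:0) and that
an omitted minor number means zero.

```shell
# handle_kind HANDLE
# Print "qdisc" if the minor number of a major:minor handle is zero,
# "class" otherwise, and "invalid" for anything that is not hex:hex.
handle_kind() (
    case $1 in
        *:*) major=${1%%:*}; minor=${1#*:} ;;
        *)   echo "invalid"; exit 1 ;;
    esac
    minor=${minor:-0}           # "1:" is shorthand for "1:0"
    case $major$minor in
        *[!0-9a-fA-F]*) echo "invalid"; exit 1 ;;
    esac
    if [ $((0x$minor)) -eq 0 ]; then
        echo "qdisc"
    else
        echo "class"
    fi
)

handle_kind 1:0      # a qdisc
handle_kind 1:6      # a class
handle_kind ffff:0   # the ingress qdisc
```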
-----------------------------------------------------------------------------
5. Software and Tools
-----------------------------------------------------------------------------
5.1. Kernel requirements
Many distributions provide kernels with modular or monolithic support for
traffic control (Quality of Service). Custom kernels may not already provide
support (modular or not) for the required features. If not, this is a very
brief listing of the required kernel options.
The user who has little or no experience compiling a kernel is referred
to the Kernel HOWTO. Experienced kernel compilers should be able to determine
which of the below options apply to the desired configuration, after reading
a bit more about traffic control and planning.
Example 1. Kernel compilation options [8]
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_CSZ=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y
A kernel compiled with the above set of options will provide modular
support for almost everything discussed in this documentation. The user may
need to load the relevant module with modprobe before using a given feature.
Again, the confused user is referred to the Kernel HOWTO, as this document
cannot adequately address questions about the use of the Linux kernel.
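To see how a running or installed kernel was configured, a saved kernel
configuration file can be searched for these options. The following is a
minimal sketch: the function name check_tc_config and the default list of
options are assumptions, and the location of the configuration file varies
(many distributions install it as /boot/config-<version>).

```shell
# check_tc_config FILE [OPTION...]
# Report how each kernel option is set in FILE: y, m or "not set".
check_tc_config() (
    config=$1
    shift
    # Default to a handful of the options listed above.
    [ $# -gt 0 ] || set -- CONFIG_NET_SCHED CONFIG_NET_SCH_HTB \
                           CONFIG_NET_SCH_TBF CONFIG_NET_CLS_U32
    for opt in "$@"; do
        value=$(sed -n "s/^$opt=//p" "$config" | head -n 1)
        echo "$opt=${value:-not set}"
    done
)

# Typical use where the distribution installs the config file in /boot:
# check_tc_config "/boot/config-$(uname -r)"
```

An option reported as m was built as a module; for example, HTB support
built this way is provided by the sch_htb module.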
-----------------------------------------------------------------------------
5.2. iproute2 tools (tc)
iproute2 is a suite of command line utilities which manipulate kernel
structures for IP networking configuration on a machine. For technical
documentation on these tools, see the iproute2 documentation and for a more
expository discussion, the documentation at [http://linux-ip.net/]
linux-ip.net. Of the tools in the iproute2 package, the binary tc is the only
one used for traffic control. This HOWTO will ignore the other tools in the
suite.
Because it interacts with the kernel to direct the creation, deletion and
modification of traffic control structures, the tc binary needs to be
compiled with support for all of the qdiscs you wish to use. In particular,
the HTB qdisc is not supported yet in the upstream iproute2 package. See
Section 7.1 for more information.
The tc tool performs all of the configuration of the kernel structures
required to support traffic control. As a result of its many uses, the
command syntax can be described (at best) as arcane. The utility takes as its
first non-option argument one of three Linux traffic control components,
qdisc, class or filter.
Example 2. tc command usage
[root@leander]# tc
Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
where OBJECT := { qdisc | class | filter }
OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] }
Each object accepts further and different options, and will be incompletely
described and documented below. The hints in the examples below are designed
to introduce the vagaries of tc command line syntax. For more examples,
consult the [http://lartc.org/howto/] LARTC HOWTO. For even better
understanding, consult the kernel and iproute2 code.
Example 3. tc qdisc
[root@leander]# tc qdisc add \ (1)
> dev eth0 \ (2)
> root \ (3)
> handle 1:0 \ (4)
> htb (5)
(1) Add a queuing discipline. The verb could also be del.
(2) Specify the device onto which we are attaching the new queuing
discipline.
(3) This means "egress" to tc. The word root must be used, however. Another
qdisc with limited functionality, the ingress qdisc, can be attached to
the same device.
(4) The handle is a user-specified number of the form major:minor. The
minor number for any queueing discipline handle must always be zero (0).
An acceptable shorthand for a qdisc handle is the syntax "1:", where the
minor number is assumed to be zero (0) if not specified.
(5) This is the queuing discipline to attach, HTB in this example. Queuing
discipline specific parameters will follow this. In the example here, we
add no qdisc-specific parameters.
Above was the simplest use of the tc utility for adding a queuing
discipline to a device. Here's an example of the use of tc to add a class to
an existing parent class.
Example 4. tc class
[root@leander]# tc class add \ (1)
> dev eth0 \ (2)
> parent 1:1 \ (3)
> classid 1:6 \ (4)
> htb \ (5)
> rate 256kbit \ (6)
> ceil 512kbit (7)
(1) Add a class. The verb could also be del.
(2) Specify the device onto which we are attaching the new class.
(3) Specify the parent handle to which we are attaching the new class.
(4) This is a unique handle (major:minor) identifying this class. The minor
number must be non-zero.
(5) Both of the classful qdiscs require that any children classes be
classes of the same type as the parent. Thus an HTB qdisc will contain
HTB classes.
(6) (7)
These are class-specific parameters. Consult Section 7.1 for more detail
on these parameters.
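The handle rules from Examples 3 and 4 can be checked mechanically. The following Python sketch is illustrative only: parse_handle, is_qdisc_handle and is_class_handle are hypothetical helpers, not part of iproute2. The sketch relies on the fact that tc reads both halves of a handle as hexadecimal numbers.

```python
def parse_handle(text):
    """Split a tc handle such as '1:6' or '1:' into (major, minor).

    tc reads both halves as hexadecimal; a missing minor means zero.
    """
    major_text, sep, minor_text = text.partition(":")
    if not sep or not major_text:
        raise ValueError(f"not a major:minor handle: {text!r}")
    major = int(major_text, 16)
    minor = int(minor_text, 16) if minor_text else 0
    return major, minor


def is_qdisc_handle(text):
    """A qdisc handle always has a minor number of zero."""
    return parse_handle(text)[1] == 0


def is_class_handle(text):
    """A class handle always has a non-zero minor number."""
    return parse_handle(text)[1] != 0
```

With these helpers, "1:" and "1:0" both parse to the same qdisc handle, while "1:6" is accepted only as a class handle.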
Example 5. tc filter
[root@leander]# tc filter add \ (1)
> dev eth0 \ (2)
> parent 1:0 \ (3)
> protocol ip \ (4)
> prio 5 \ (5)
> u32 \ (6)
> match ip dport 22 0xffff \ (7)
> match ip tos 0x10 0xff \ (8)
> flowid 1:6 \ (9)
> police \ (10)
> rate 32000bps \ (11)
> burst 10240 \ (12)
> mpu 0 \ (13)
> action drop/continue (14)
(1) Add a filter. The verb could also be del.
(2) Specify the device onto which we are attaching the new filter.
(3) Specify the parent handle to which we are attaching the new filter.
(4) This parameter is required. It names the protocol whose packets the
filter examines, ip in this example.
(5) The prio parameter allows a given filter to be preferred above another.
The keyword pref is a synonym.
(6) This is a classifier, and is a required phrase in every tc filter
command.
(7) (8)
These are parameters to the classifier. In this case, packets with a
type of service flag (indicating interactive usage) and matching port 22
will be selected by this statement.
(9) The flowid specifies the handle of the target class (or qdisc) to which
a matching filter should send its selected packets.
(10)
This is the policer, and is an optional phrase in every tc filter
command.
(11) The policer will perform one action above this rate, and another action
below (see action parameter).
(12) The burst is an exact analog to burst in HTB (burst is a bucket
concept).
(13) The minimum policed unit. To count all traffic, use an mpu of zero (0).
(14) The action indicates what should be done with a packet, based on the
rate and the other attributes of the policer. The first word specifies the
action to take if the policed rate has been exceeded. The second word
specifies the action to take otherwise.
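The policer described in items (10) through (14) can be pictured as a token bucket that never delays a packet. It only decides which of the two actions applies. The Python sketch below is a toy model under stated assumptions: the class name Policer and its arguments are hypothetical, the mpu parameter is ignored, and the kernel implementation differs in detail. The rate is given in bytes per second, which is what the bps suffix means to tc.

```python
class Policer:
    """Toy token-bucket policer: it never queues, it only picks an action."""

    def __init__(self, rate_bytes_per_s, burst_bytes,
                 exceed="drop", conform="continue"):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes      # the bucket starts full
        self.last = 0.0                # time of the previous packet
        self.exceed = exceed
        self.conform = conform

    def decide(self, now, size):
        """Return the action for a packet of `size` bytes seen at time `now`."""
        # Refill for the elapsed time, never beyond the bucket size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return self.conform
        return self.exceed
```

With rate 32000 and burst 10240, as in Example 5, two back-to-back 4000 byte packets fit in the bucket and a third is handed the exceed action. After a second of silence the bucket is full again.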
As evidenced above, the tc command line utility has an arcane and complex
syntax, even for operations as simple as those in these examples. It should
come as no surprise to the reader that there exists an easier way to
configure Linux traffic control. See the next section, Section 5.3.
-----------------------------------------------------------------------------
5.3. tcng, Traffic Control Next Generation
FIXME; sing the praises of tcng. See also [http://tldp.org/HOWTO/
Traffic-Control-tcng-HTB-HOWTO/] Traffic Control using tcng and HTB HOWTO
and tcng documentation.
Traffic control next generation (hereafter, tcng) provides all of the power
of traffic control under Linux with twenty percent of the headache.
-----------------------------------------------------------------------------
5.4. IMQ, Intermediate Queuing device
FIXME; must discuss IMQ. See also Patrick McHardy's website on [http://
trash.net/~kaber/imq/] IMQ.
-----------------------------------------------------------------------------
6. Classless Queuing Disciplines (qdiscs)
Each of these queuing disciplines can be used as the primary qdisc on an
interface, or can be used inside a leaf class of a classful qdisc. These are
the fundamental schedulers used under Linux. Note that the default scheduler
is pfifo_fast.
-----------------------------------------------------------------------------
6.1. FIFO, First-In First-Out (pfifo and bfifo)
Note This is not the default qdisc on Linux interfaces. Be certain to see
Section 6.2 for the full details on the default (pfifo_fast) qdisc.
The FIFO algorithm forms the basis for the default qdisc on all Linux
network interfaces (pfifo_fast). It performs no shaping or rearranging of
packets. It simply transmits packets as soon as it can after receiving and
queuing them. This is also the qdisc used inside all newly created classes
until another qdisc or a class replaces the FIFO.
[fifo-qdisc]
A real FIFO qdisc must, however, have a size limit (a buffer size) to
prevent it from overflowing in case it is unable to dequeue packets as
quickly as it receives them. Linux implements two basic FIFO qdiscs, one
based on bytes, and one on packets. Regardless of the type of FIFO used, the
size of the queue is defined by the parameter limit. For a pfifo the unit is
understood to be packets and for a bfifo the unit is understood to be bytes.
Example 6. Specifying a limit for a packet or byte FIFO
[root@leander]# cat bfifo.tcc
/*
* make a FIFO on eth0 with 10kbyte queue size
*
*/
dev eth0 {
egress {
fifo (limit 10kB );
}
}
[root@leander]# tcc < bfifo.tcc
# ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 bfifo limit 10240
[root@leander]# cat pfifo.tcc
/*
* make a FIFO on eth0 with 30 packet queue size
*
*/
dev eth0 {
egress {
fifo (limit 30p );
}
}
[root@leander]# tcc < pfifo.tcc
# ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 pfifo limit 30
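The difference between the two FIFO flavours is only the unit in which the limit is counted. A minimal Python sketch follows, assuming that a full queue drops the arriving packet (tail drop). The class name Fifo and its unit argument are hypothetical and exist only for illustration.

```python
from collections import deque


class Fifo:
    """Toy tail-drop FIFO counting packets (pfifo) or bytes (bfifo)."""

    def __init__(self, limit, unit="packets"):
        if unit not in ("packets", "bytes"):
            raise ValueError("unit must be 'packets' or 'bytes'")
        self.limit = limit
        self.unit = unit
        self.queue = deque()
        self.backlog = 0               # packets or bytes currently queued

    def _cost(self, size):
        return 1 if self.unit == "packets" else size

    def enqueue(self, size):
        """Queue a packet of `size` bytes; return False if it was dropped."""
        if self.backlog + self._cost(size) > self.limit:
            return False               # queue full: drop the arriving packet
        self.queue.append(size)
        self.backlog += self._cost(size)
        return True

    def dequeue(self):
        """Return the oldest packet size, or None when the queue is empty."""
        if not self.queue:
            return None
        size = self.queue.popleft()
        self.backlog -= self._cost(size)
        return size
```

A Fifo(10240, "bytes") accepts six 1500 byte packets and drops the seventh, whereas a Fifo(30) accepts thirty packets of any size.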
-----------------------------------------------------------------------------
6.2. pfifo_fast, the default Linux qdisc
The pfifo_fast qdisc is the default qdisc for all interfaces under Linux.
Based on a conventional FIFO qdisc, this qdisc also provides some
prioritization. It provides three different bands (individual FIFOs) for
separating traffic. The highest priority traffic (interactive flows) is
placed into band 0 and is always serviced first. Similarly, band 1 is always
emptied of pending packets before band 2 is dequeued.
[pfifo_fast-qdisc]
There is nothing configurable to the end user about the pfifo_fast qdisc.
For exact details on the priomap and use of the ToS bits, see the pfifo-fast
section of the LARTC HOWTO.
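The strict ordering between the three bands can be sketched in a few lines of Python. This is a toy model, not the kernel code: the class name ThreeBandFifo is hypothetical, and the caller picks the band directly, whereas the real qdisc derives the band from the packet's priority through the priomap.

```python
from collections import deque


class ThreeBandFifo:
    """Toy model of three FIFOs served in strict band order."""

    def __init__(self):
        self.bands = [deque(), deque(), deque()]

    def enqueue(self, packet, band):
        """Place `packet` in band 0, 1 or 2 (0 is the highest priority)."""
        self.bands[band].append(packet)

    def dequeue(self):
        """Serve the first non-empty band; None means all bands are empty."""
        for band in self.bands:
            if band:
                return band.popleft()
        return None
```

Packets queued in band 2 leave only once bands 0 and 1 are both empty, so a steady stream of band 0 traffic can starve the lower bands.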
-----------------------------------------------------------------------------
6.3. SFQ, Stochastic Fair Queuing
The SFQ qdisc attempts to fairly distribute opportunity to transmit data to
the network among an arbitrary number of flows. It accomplishes this by using
a hash function to separate the traffic into separate (internally maintained)
FIFOs which are dequeued in a round-robin fashion. Because there is the
possibility for unfairness to manifest in the choice of hash function, this
function is altered periodically. Perturbation (the parameter perturb) sets
this periodicity.
[sfq-qdisc]
Example 7. Creating an SFQ
[root@leander]# cat sfq.tcc
/*
* make an SFQ on eth0 with a 10 second perturbation
*
*/
dev eth0 {
egress {
sfq( perturb 10s );
}
}
[root@leander]# tcc < sfq.tcc
# ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 sfq perturb 10
Unfortunately, some clever software (e.g. Kazaa and eMule among others)
obliterates the benefit of this attempt at fair queuing by opening as many TCP
sessions (flows) as can be sustained. In many networks, with well-behaved
users, SFQ can adequately distribute the network resources to the contending
flows, but other measures may be called for when obnoxious applications have
invaded the network.
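Both the fairness and its weakness are easy to see in a toy model. The Python sketch below hashes a flow identifier into one of a few internal FIFOs and serves the FIFOs in rotation. Everything here is an assumption made for illustration: the class name ToySfq, the use of CRC32 as the hash, and the salt argument standing in for the periodic perturbation. The real SFQ differs in its hash, its bucket count and its byte-based quantum.

```python
from collections import deque
import zlib


class ToySfq:
    """Toy stochastic fair queue: hash flows into buckets, serve round-robin."""

    def __init__(self, buckets=8, salt=0):
        self.buckets = [deque() for _ in range(buckets)]
        self.salt = salt               # changing the salt models `perturb`
        self.next_bucket = 0

    def _bucket_for(self, flow):
        key = f"{self.salt}:{flow}".encode()
        return zlib.crc32(key) % len(self.buckets)

    def enqueue(self, flow, packet):
        self.buckets[self._bucket_for(flow)].append((flow, packet))

    def dequeue(self):
        """Take one packet from the next non-empty bucket, in rotation."""
        for offset in range(len(self.buckets)):
            index = (self.next_bucket + offset) % len(self.buckets)
            if self.buckets[index]:
                self.next_bucket = (index + 1) % len(self.buckets)
                return self.buckets[index].popleft()
        return None
```

Two flows that land in different buckets take turns, whatever their backlog. An application that opens many flows occupies many buckets and therefore collects many turns per rotation, which is exactly the problem described above.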
See also Section 6.4 for an SFQ qdisc with more exposed parameters for the
user to manipulate.
-----------------------------------------------------------------------------
6.4. ESFQ, Extended Stochastic Fair Queuing
Conceptually, this qdisc is no different from SFQ, although it allows the
user to control more parameters than its simpler cousin. This qdisc was
conceived to overcome the shortcoming of SFQ identified above. By allowing
the user to control which hashing algorithm is used for distributing access
to network bandwidth, it is possible for the user to reach a fairer real
distribution of bandwidth.
Example 8. ESFQ usage
Usage: ... esfq [ perturb SECS ] [ quantum BYTES ] [ depth FLOWS ]
[ divisor HASHBITS ] [ limit PKTS ] [ hash HASHTYPE]
Where:
HASHTYPE := { classic | src | dst }
FIXME; need practical experience and/or attestation here.
-----------------------------------------------------------------------------
6.5. GRED, Generic Random Early Drop
FIXME; I have never used this. Need practical experience or attestation.
Theory declares that a RED algorithm is useful on a backbone or core
network, but not as useful near the end-user. See the section on flows to see
a general discussion of the thirstiness of TCP.
-----------------------------------------------------------------------------
6.6. TBF, Token Bucket Filter
This qdisc is built on tokens and buckets. It simply shapes traffic
transmitted on an interface. To limit the speed at which packets will be
dequeued from a particular interface, the TBF qdisc is the perfect solution.
It simply slows down transmitted traffic to the specified rate.
Packets are only transmitted if there are sufficient tokens available.
Otherwise, packets are deferred. Delaying packets in this fashion will
introduce an artificial latency into the packet's round trip time.
[tbf-qdisc]
Example 9. Creating a 256kbit/s TBF
[root@leander]# cat tbf.tcc
/*
* make a 256kbit/s TBF on eth0
*
*/
dev eth0 {
egress {
tbf( rate 256 kbps, burst 20 kB, limit 20 kB, mtu 1514 B );
}
}
[root@leander]# tcc < tbf.tcc
# ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 tbf burst 20480 limit 20480 mtu 1514 rate 32000bps
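Two things in Example 9 deserve a worked illustration: the unit conversion (256 kbit/s in tcng becomes 32000 bytes per second, written 32000bps, in tc) and the way an empty bucket delays packets. The Python sketch below is a simplified model, not the kernel algorithm. The function names are hypothetical, the queue is assumed to be unbounded (so limit is not modelled), and the mtu parameter is ignored.

```python
def kbit_to_bytes_per_s(kbit):
    """Convert kilobits per second to bytes per second (1 kbit = 1000 bits)."""
    return kbit * 1000 // 8


def tbf_departures(arrivals, rate, burst):
    """Return the send time of each (arrival_time, size) packet, in order.

    `rate` is in bytes per second and `burst` is the bucket size in bytes.
    """
    tokens = burst                     # the bucket starts full
    clock = 0.0                        # time of the last bucket update
    departures = []
    for arrival, size in arrivals:
        if size > burst:
            raise ValueError("a packet larger than the bucket can never be sent")
        now = max(arrival, clock)      # packets leave in FIFO order
        tokens = min(burst, tokens + (now - clock) * rate)
        if tokens < size:
            # Wait until enough tokens have dripped into the bucket.
            now += (size - tokens) / rate
            tokens = size
        tokens -= size
        clock = now
        departures.append(now)
    return departures
```

With a rate of 32000 bytes per second and a 3000 byte bucket, four 1500 byte packets arriving together leave at 0, 0, 0.046875 and 0.09375 seconds: the first two drain the bucket and the rest are paced at the configured rate.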
-----------------------------------------------------------------------------
7. Classful Queuing Disciplines (qdiscs)
The flexibility and control of Linux traffic control can be unleashed
through the agency of the classful qdiscs. Remember that the classful queuing
disciplines can have filters attached to them, allowing packets to be
directed to particular classes and subqueues.
There are several common terms to describe classes directly attached to the
root qdisc and terminal classes. Classes attached to the root qdisc are
known as root classes, and more generically inner classes. Any terminal class
in a particular queuing discipline is known as a leaf class by analogy to the
tree structure of the classes. Besides the use of figurative language
depicting the structure as a tree, the language of family relationships is
also quite common.
-----------------------------------------------------------------------------
7.1. HTB, Hierarchical Token Bucket
HTB uses the concepts of tokens and buckets along with the class-based
system and filters to allow for complex and granular control over traffic.
With a complex borrowing model, HTB can perform a variety of sophisticated
traffic control techniques. One of the easiest ways to use HTB immediately is
that of shaping.
For anyone who understands tokens and buckets, or who has grasped the
function of TBF, HTB should be merely a logical step. This queuing
discipline allows the user to
define the characteristics of the tokens and bucket used and allows the user
to nest these buckets in an arbitrary fashion. When coupled with a
classifying scheme, traffic can be controlled in a very granular fashion.
Below is example output of the syntax for HTB on the command line with the
tc tool. Although the syntax for tcng is a language of its own, the rules for
HTB are the same.
Example 10. tc usage for HTB
Usage: ... qdisc add ... htb [default N] [r2q N]
default minor id of class to which unclassified packets are sent {0}
r2q DRR quantums are computed as rate in Bps/r2q {10}
debug string of 16 numbers each 0-3 {0}
... class add ... htb rate R1 burst B1 [prio P] [slot S] [pslot PS]
[ceil R2] [cburst B2] [mtu MTU] [quantum Q]
rate rate allocated to this class (class can still borrow)
burst max bytes burst which can be accumulated during idle period {computed}
ceil definite upper class rate (no borrows) {rate}
cburst burst but for ceil {computed}
mtu max packet size we create rate map for {1600}
prio priority of leaf; lower are served first {0}
quantum how much bytes to serve from leaf at once {use r2q}
TC HTB version 3.3
-----------------------------------------------------------------------------
7.1.1. Software requirements
Unlike almost all of the other software discussed, HTB is a newer queuing
discipline and your distribution may not have all of the tools and capability
you need to use HTB. The kernel must support HTB; kernel version 2.4.20 and
later support it in the stock distribution, although earlier kernel versions
require patching. To enable userland support for HTB, see [http://
luxik.cdi.cz/~devik/qos/htb/] HTB for an iproute2 patch to tc.
-----------------------------------------------------------------------------
7.1.2. Shaping
One of the most common applications of HTB involves shaping transmitted
traffic to a specific rate.
All shaping occurs in leaf classes. No shaping occurs in inner or root
classes as they only exist to suggest how the borrowing model should
distribute available tokens.
-----------------------------------------------------------------------------
7.1.3. Borrowing
A fundamental part of the HTB qdisc is the borrowing mechanism. Children
classes borrow tokens from their parents once they have exceeded rate. A
child class will continue to attempt to borrow until it reaches ceil, at
which point it will begin to queue packets for transmission until more
tokens/ctokens are available. As there are only two primary types of classes
which can be created with HTB, the following table and diagram identify the
various possible states and the behaviour of the borrowing mechanisms.
Table 2. HTB class states and potential actions taken
+------+-----+--------------+-----------------------------------------------+
|type |class|HTB internal |action taken |
|of |state|state | |
|class | | | |
+------+-----+--------------+-----------------------------------------------+
|leaf |< |HTB_CAN_SEND |Leaf class will dequeue queued bytes up to |
| |rate | |available tokens (no more than burst packets) |
+------+-----+--------------+-----------------------------------------------+
|leaf |> |HTB_MAY_BORROW|Leaf class will attempt to borrow tokens/ |
| |rate,| |ctokens from parent class. If tokens are |
| |< | |available, they will be lent in quantum |
| |ceil | |increments and the leaf class will dequeue up |
| | | |to cburst bytes |
+------+-----+--------------+-----------------------------------------------+
|leaf |> |HTB_CANT_SEND |No packets will be dequeued. This will cause |
| |ceil | |packet delay and will increase latency to meet |
| | | |the desired rate. |
+------+-----+--------------+-----------------------------------------------+
|inner,|< |HTB_CAN_SEND |Inner class will lend tokens to children. |
|root |rate | | |
+------+-----+--------------+-----------------------------------------------+
|inner,|> |HTB_MAY_BORROW|Inner class will attempt to borrow tokens/ |
|root |rate,| |ctokens from parent class, lending them to |
| |< | |competing children in quantum increments per |
| |ceil | |request. |
+------+-----+--------------+-----------------------------------------------+
|inner,|> |HTB_CANT_SEND |Inner class will not attempt to borrow from its|
|root |ceil | |parent and will not lend tokens/ctokens to |
| | | |children classes. |
+------+-----+--------------+-----------------------------------------------+
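The class state column of Table 2 reduces to two comparisons. The Python sketch below mirrors the table only. The function name htb_state is hypothetical, the kernel decides on the basis of token counts rather than a measured rate, and the treatment of a class sitting exactly at rate or at ceil is an assumption.

```python
def htb_state(measured_rate, rate, ceil):
    """Map a class's measured rate to the state names used in Table 2."""
    if ceil < rate:
        raise ValueError("ceil must not be lower than rate")
    if measured_rate < rate:
        return "HTB_CAN_SEND"
    if measured_rate < ceil:
        return "HTB_MAY_BORROW"
    return "HTB_CANT_SEND"
```

For a class with rate 256 and ceil 512 (any consistent unit), a measured rate of 100 gives HTB_CAN_SEND, 300 gives HTB_MAY_BORROW and 600 gives HTB_CANT_SEND.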
This diagram identifies the flow of borrowed tokens and the manner in which
tokens are charged to parent classes. In order for the borrowing model to
work, each class must have an accurate count of the number of tokens used by
itself and all of its children. For this reason, any token used in a child or
leaf class is charged to each parent class until the root class is reached.
Any child class which wishes to borrow a token will request a token from
its parent class, which if it is also over its rate will request to borrow
from its parent class until either a token is located or the root class is
reached. So the borrowing of tokens flows toward the leaf classes and the
charging of the usage of tokens flows toward the root class.
[htb-borrow]
Note in this diagram that there are several HTB root classes. Each of these
root classes can simulate a virtual circuit.
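The two directions of flow described above (requests and charges travel toward the root, borrowed tokens travel toward the leaves) can be sketched with a small tree. This Python model is deliberately crude and rests on several assumptions: the class name ToyHtbClass is hypothetical, a fixed byte budget per class stands in for the token buckets, ceil is not modelled, and nothing is refilled over time.

```python
class ToyHtbClass:
    """Toy class node: a byte budget stands in for the class's tokens."""

    def __init__(self, name, budget, parent=None):
        self.name = name
        self.budget = budget           # bytes this class may hand out
        self.used = 0                  # bytes charged to this class so far
        self.parent = parent

    def find_lender(self):
        """Walk toward the root until a class with spare budget is found."""
        node = self
        while node is not None:
            if node.used < node.budget:
                return node
            node = node.parent
        return None                    # even the root is exhausted

    def send(self, size):
        """Try to send `size` bytes; return the lending class name or None."""
        lender = self.find_lender()
        if lender is None:
            return None
        # Usage is charged to this class and to every ancestor up to the root.
        node = self
        while node is not None:
            node.used += size
            node = node.parent
        return lender.name
```

A leaf with a budget of 1000 under a root with a budget of 3000 sends its first 1000 bytes on its own account, the next 2000 bytes on the root's account, and is then refused.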
-----------------------------------------------------------------------------
7.1.4. HTB class parameters
default
An optional parameter with every HTB qdisc object. The default value of
default is 0, which causes any unclassified traffic to be dequeued at
hardware speed, completely bypassing any of the classes attached to the
root qdisc.
rate
Used to set the minimum desired speed to which to limit transmitted
traffic. This can be considered the equivalent of a committed information
rate (CIR), or the guaranteed bandwidth for a given leaf class.
ceil
Used to set the maximum desired speed to which to limit the transmitted
traffic. The borrowing model should illustrate how this parameter is
used. This can be considered the equivalent of "burstable bandwidth".
burst
This is the size of the rate bucket (see Tokens and buckets). HTB will
dequeue burst bytes before awaiting the arrival of more tokens.
cburst
This is the size of the ceil bucket (see Tokens and buckets). HTB will
dequeue cburst bytes before awaiting the arrival of more ctokens.
quantum
This is a key parameter used by HTB to control borrowing. Normally, the
correct quantum is calculated by HTB, not specified by the user. Tweaking
this parameter can have tremendous effects on borrowing and shaping under
contention, because it is used both to split traffic between children
classes over rate (but below ceil) and to transmit packets from these
same classes.
r2q
Also, usually calculated for the user, r2q is a hint to HTB to help
determine the optimal quantum for a particular class.
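mtu
The maximum packet size for which HTB creates its rate map. The tc usage
text in Example 10 gives a default of 1600.
prio
The priority of a leaf class. Classes with a lower prio are served first.
The tc usage text in Example 10 gives a default of 0.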
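The relationship between rate, r2q and quantum stated in the usage text of Example 10 (quantum is the rate in bytes per second divided by r2q, with r2q defaulting to 10) is simple arithmetic. The Python sketch below works it through and applies the guideline from Section 7.1.5 that quantum should be no smaller than the MTU. The function names are hypothetical, and the MTU of 1500 bytes is an assumption typical of Ethernet.

```python
def htb_quantum(rate_bytes_per_s, r2q=10):
    """Quantum in bytes, following the usage text: rate in Bps divided by r2q."""
    return rate_bytes_per_s // r2q


def quantum_is_reasonable(quantum, mtu=1500):
    """Apply the guideline that quantum should be no smaller than the MTU."""
    return quantum >= mtu
```

A class with a rate of 256kbit (32000 bytes per second) gets a quantum of 3200 bytes, which is comfortably above the MTU. A class with a rate of 64kbit (8000 bytes per second) gets only 800 bytes, which suggests setting quantum by hand or lowering r2q.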
-----------------------------------------------------------------------------
7.1.5. Rules
Below are some general guidelines to using HTB culled from [http://
docum.org/] http://docum.org/ and the LARTC mailing list. These rules are
simply a recommendation for beginners to maximize the benefit of HTB until
gaining a better understanding of the practical application of HTB.
  * Shaping with HTB occurs only in leaf classes. See also Section 7.1.2.
  * Because HTB does not shape in any class except the leaf class, the sum
    of the rates of leaf classes should not exceed the ceil of a parent
    class. Ideally, the sum of the rates of the children classes would match
    the rate of the parent class, allowing the parent class to distribute
    leftover bandwidth (ceil - rate) among the children classes.
    This key concept in employing HTB bears repeating. Only leaf classes
    actually shape packets; packets are only delayed in these leaf classes.
    The inner classes (all the way up to the root class) exist to define how
    borrowing/lending occurs (see also Section 7.1.3).
  * The quantum is only used when a class is over rate but below ceil.
  * The quantum should be set at MTU or higher. HTB will dequeue at least a
    single packet per service opportunity even if quantum is too small. In
    such a case, it will not be able to calculate accurately the real
    bandwidth consumed [9].
  * Parent classes lend tokens to children in increments of quantum, so for
    maximum granularity and most instantaneously evenly distributed
    bandwidth, quantum should be as low as possible while still no less than
    MTU.
  * A distinction between tokens and ctokens is only meaningful in a leaf
    class, because non-leaf classes only lend tokens to child classes.
  * HTB borrowing could more accurately be described as "using".
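The second guideline above lends itself to a quick sanity check before a configuration is loaded. The Python sketch below is a hypothetical helper (check_children is not part of any tool mentioned here). It compares the summed child rates with the parent's rate and ceil, all expressed in the same unit.

```python
def check_children(parent_rate, parent_ceil, child_rates):
    """Return a list of guideline warnings; an empty list means no complaints."""
    warnings = []
    total = sum(child_rates)
    if total > parent_ceil:
        warnings.append("sum of child rates exceeds parent ceil")
    elif total != parent_rate:
        warnings.append("sum of child rates does not match parent rate")
    return warnings
```

Applied to the numbers of Example 11 (a root class with rate and ceil of 1544 and two children with a rate of 512 each), the check reports that the children's rates do not add up to the parent's rate. That is a mismatch with the ideal, not an error.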
-----------------------------------------------------------------------------
7.2. PRIO, priority scheduler
The PRIO classful qdisc works on a very simple precept. When it is ready to
dequeue a packet, the first class is checked for a packet. If there's a
packet, it gets dequeued. If there's no packet, then the next class is
checked, until the queuing mechanism has no more classes to check.
This section will be completed at a later date.
-----------------------------------------------------------------------------
7.3. CBQ, Class Based Queuing
CBQ is the classic implementation (also called venerable) of a traffic
control system. This section will be completed at a later date.
-----------------------------------------------------------------------------
8. Rules, Guidelines and Approaches
-----------------------------------------------------------------------------
8.1. General Rules of Linux Traffic Control
There are a few general rules which ease the study of Linux traffic
control. Traffic control structures under Linux are the same whether the
initial configuration has been done with tcng or with tc.
  * Any router performing a shaping function should be the bottleneck on
    the link, and should be shaping slightly below the maximum available link
    bandwidth. This prevents queues from forming in other routers, affording
    maximum control of packet latency/deferral to the shaping device.
  * A device can only shape traffic it transmits [10]. Because the traffic
    has already been received on an input interface, the traffic cannot be
    shaped. A traditional solution to this problem is an ingress policer.
  * Every interface must have a qdisc. The default qdisc (the pfifo_fast
    qdisc) is used when another qdisc is not explicitly attached to the
    interface.
  * One of the classful qdiscs added to an interface with no children
    classes typically only consumes CPU for no benefit.
  * Any newly created class contains a FIFO. This qdisc can be replaced
    explicitly with any other qdisc. The FIFO qdisc will be removed
    implicitly if a child class is attached to this class.
  * Classes directly attached to the root qdisc can be used to simulate
    virtual circuits.
  * A filter can be attached to classes or one of the classful qdiscs.
-----------------------------------------------------------------------------
8.2. Handling a link with a known bandwidth
HTB is an ideal qdisc to use on a link with a known bandwidth, because the
innermost (root-most) class can be set to the maximum bandwidth available on
a given link. Flows can be further subdivided into children classes, allowing
either guaranteed bandwidth to particular classes of traffic or allowing
preference to specific kinds of traffic.
-----------------------------------------------------------------------------
8.3. Handling a link with a variable (or unknown) bandwidth
In theory, the PRIO scheduler is an ideal match for links with variable
bandwidth, because it is a work-conserving qdisc (which means that it
provides no shaping). In the case of a link with an unknown or fluctuating
bandwidth, the PRIO scheduler simply prefers to dequeue any available packet
in the highest priority band first, then falling to the lower priority
queues.
-----------------------------------------------------------------------------
8.4. Sharing/splitting bandwidth based on flows
Of the many types of contention for network bandwidth, this is one of the
easier types of contention to address in general. By using the SFQ qdisc,
traffic in a particular queue can be separated into flows, each of which will
be serviced fairly (inside that queue). Well-behaved applications (and users)
will find that SFQ and ESFQ are sufficient for most sharing needs.
The Achilles heel of these fair queuing algorithms is a misbehaving user or
application which opens many connections simultaneously (e.g., eMule,
eDonkey, Kazaa). By creating a large number of individual flows, the
application can dominate slots in the fair queuing algorithm. Restated, the
fair queuing algorithm has no idea that a single application is generating
the majority of the flows, and cannot penalize the user. Other methods are
called for.
-----------------------------------------------------------------------------
8.5. Sharing/splitting bandwidth based on IP
For many administrators this is the ideal method of dividing bandwidth
amongst their users. Unfortunately, there is no easy solution, and it becomes
increasingly complex with the number of machines sharing a network link.
To divide bandwidth equitably between N IP addresses, there must be N
classes.
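Since one class per address quickly becomes tedious to type, the commands are usually generated. The Python sketch below builds tc command lines in the syntax of Examples 4 and 5. It is only a sketch under stated assumptions: the function per_ip_commands is hypothetical, an HTB qdisc and the named parent class are assumed to exist already, each address gets an equal rate with the full link speed as ceil, and traffic is matched on destination address (the direction toward the users).

```python
def per_ip_commands(dev, parent, link_kbit, addresses, first_minor=0x10):
    """Build tc commands giving each address an equal HTB class.

    Class minor numbers are written in hexadecimal, as tc expects.
    """
    if not addresses:
        raise ValueError("at least one address is required")
    major = parent.split(":")[0]
    share = link_kbit // len(addresses)
    commands = []
    for offset, address in enumerate(addresses):
        classid = f"{major}:{first_minor + offset:x}"
        commands.append(
            f"tc class add dev {dev} parent {parent} classid {classid} "
            f"htb rate {share}kbit ceil {link_kbit}kbit"
        )
        commands.append(
            f"tc filter add dev {dev} parent {major}:0 protocol ip prio 5 "
            f"u32 match ip dst {address}/32 flowid {classid}"
        )
    return commands
```

Printing the result of per_ip_commands("eth0", "1:1", 1024, ["192.168.0.2", "192.168.0.3"]) yields two class commands with a rate of 512kbit each and two matching filter commands. The output should be reviewed before it is fed to a shell.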
-----------------------------------------------------------------------------
9. Scripts for use with QoS/Traffic Control
-----------------------------------------------------------------------------
9.1. wondershaper
More to come, see [http://lartc.org/wondershaper/] wondershaper.
-----------------------------------------------------------------------------
9.2. ADSL Bandwidth HOWTO script (myshaper)
More to come, see [http://www.tldp.org/HOWTO/
ADSL-Bandwidth-Management-HOWTO/implementation.html] myshaper.
-----------------------------------------------------------------------------
9.3. htb.init
More to come, see htb.init.
-----------------------------------------------------------------------------
9.4. tcng.init
More to come, see tcng.init.
-----------------------------------------------------------------------------
9.5. cbq.init
More to come, see cbq.init.
-----------------------------------------------------------------------------
10. Diagram
-----------------------------------------------------------------------------
10.1. General diagram
Below is a general diagram of the relationships of the components of a
classful queuing discipline (HTB pictured). A larger version of the diagram
is [http://linux-ip.net/traffic-control/htb-class.png] available.
Example 11. An example HTB tcng configuration
/*
*
* possible mock up of diagram shown at
* http://linux-ip.net/traffic-control/htb-class.png
*
*/
$m_web = trTCM (
cir 512 kbps, /* committed information rate */
cbs 10 kB, /* burst for CIR */
pir 1024 kbps, /* peak information rate */
pbs 10 kB /* burst for PIR */
) ;
dev eth0 {
egress {
class ( <$web> ) if tcp_dport == PORT_HTTP && __trTCM_green( $m_web );
class ( <$bulk> ) if tcp_dport == PORT_HTTP && __trTCM_yellow( $m_web );
drop if __trTCM_red( $m_web );
class ( <$bulk> ) if tcp_dport == PORT_SSH ;
htb () { /* root qdisc */
class ( rate 1544kbps, ceil 1544kbps ) { /* root class */
$web = class ( rate 512kbps, ceil 512kbps ) { sfq ; } ;
$bulk = class ( rate 512kbps, ceil 1544kbps ) { sfq ; } ;
}
}
}
}
[htb-class]
-----------------------------------------------------------------------------
11. Annotated Traffic Control Links
This section identifies a number of links to documentation about traffic
control and Linux traffic control software. Each link will be listed with a
brief description of the content at that site.
  * HTB site, HTB user guide and HTB theory (Martin "devik" Devera)
Hierarchical Token Bucket, HTB, is a classful queuing discipline.
Widely used and supported, it is also fairly well documented in the user
guide and at [http://www.docum.org/] Stef Coene's site (see below).
  * General Quality of Service docs (Leonardo Balliache)
His site offers a good deal of understandable introductory documentation,
including some excellent overview material. See in particular the detailed
[http://opalsoft.net/qos/DS.htm] Linux QoS document, among others.
  * tcng (Traffic Control Next Generation) and tcng manual (Werner
Almesberger)
The tcng software includes a language and a set of tools for creating
and testing traffic control structures. In addition to generating tc
commands as output, it is also capable of providing output for non-Linux
applications. A key piece of the tcng suite which is ignored in this
documentation is the tcsim traffic control simulator.
The user manual provided with the tcng software has been converted to
HTML with latex2html. The distribution comes with the TeX documentation.
  * iproute2 and iproute2 manual (Alexey Kuznetsov)
This is the source code for the iproute2 suite, which includes the
essential tc binary. Note that, as of
iproute2-2.4.7-now-ss020116-try.tar.gz, the package did not support HTB,
so a patch available from the [http://luxik.cdi.cz/~devik/qos/htb/] HTB
site will be required.
The manual documents the entire suite of tools, although the tc utility
is not adequately documented here. The ambitious reader is recommended to
the LARTC HOWTO after consuming this introduction.
  * Documentation, graphs, scripts and guidelines to traffic control under
Linux (Stef Coene)
Stef Coene has been gathering statistics and test results, scripts and
tips for the use of QoS under Linux. There are some particularly useful
graphs and guidelines available for implementing traffic control at
Stef's site.
  * [http://lartc.org/howto/] LARTC HOWTO (bert hubert, et al.)
The Linux Advanced Routing and Traffic Control HOWTO is one of the key
sources of data about the sophisticated techniques which are available
for use under Linux. The Traffic Control Introduction HOWTO should
provide the reader with enough background in the language and concepts of
traffic control. The LARTC HOWTO is the next place the reader should look
for general traffic control information.
  * Guide to IP Networking with Linux (Martin A. Brown)
Not directly related to traffic control, this site includes articles
and general documentation on the behaviour of the Linux IP layer.
  * Werner Almesberger's Papers
Werner Almesberger is one of the main developers and champions of
traffic control under Linux (he's also the author of tcng, above). One of
the key documents describing the entire traffic control architecture of
the Linux kernel is his Linux Traffic Control - Implementation Overview
which is available in [http://www.almesberger.net/cv/papers/tcio8.pdf]
PDF or [http://www.almesberger.net/cv/papers/tcio8.ps.gz] PS format.
  * Linux DiffServ project
Mercilessly snipped from the main page of the DiffServ site...
Differentiated Services (short: Diffserv) is an architecture for
providing different types or levels of service for network traffic.
One key characteristic of Diffserv is that flows are aggregated in
the network, so that core routers only need to distinguish a
comparably small number of aggregated flows, even if those flows
contain thousands or millions of individual flows.
Notes
[1] See Section 5 for more details on the use or installation of a
particular traffic control mechanism, kernel or command line utility.
[2] This queueing model has long been used in civilized countries to
distribute scant food or provisions equitably. William Faulkner is
reputed to have walked to the front of the line to fetch his share
of ice, proving that not everybody likes the FIFO model, and providing
us a model for considering priority queuing.
[3] Similarly, the entire traffic control system appears as a queue or
scheduler to the higher layer which is enqueuing packets into this
layer.
[4] This smoothing effect is not always desirable, hence the HTB parameters
burst and cburst.
[5] A classful qdisc can only have children classes of its type. For
example, an HTB qdisc can only have HTB classes as children. A CBQ qdisc
cannot have HTB classes as children.
[6] In this case, you'll have a filter which uses a classifier to select the
packets you wish to drop. Then you'll use a policer with a drop
action like this: police rate 1bps burst 1 action drop/drop.
[7] I do not know the range nor base of these numbers. I believe they are
u32 hexadecimal, but need to confirm this.
[8] The options listed in this example are taken from a 2.4.20 kernel source
tree. The exact options may differ slightly from kernel release to
kernel release depending on patches and new schedulers and classifiers.
[9] HTB will report bandwidth usage in this scenario incorrectly. It will
calculate the bandwidth used by quantum instead of the real dequeued
packet size. This can skew results quickly.
[10] In fact, the Intermediate Queuing Device (IMQ) simulates an output
device onto which traffic control structures can be attached. This
clever solution allows a networking device to shape ingress traffic in
the same fashion as egress traffic. Despite the apparent contradiction
of the rule, IMQ appears as a device to the kernel. Thus, there has been
no violation of the rule, but rather a sneaky reinterpretation of that
rule.
ProxyARP Subnetting HOWTO
Bob Edwards
Robert.Edwards@anu.edu.au
v2.0, 27 August 2000
This HOWTO discusses using Proxy Address Resolution Protocol (ARP)
with subnetting in order to make a small network of machines visible
on another Internet Protocol (IP) subnet (I call it sub-subnetting).
This makes all the machines on the local network (network 0 from now
on) appear as if they are connected to the main network (network 1).
This is only relevant if all machines are connected by Ethernet or
ether devices (ie. it won't work for SLIP/PPP/CSLIP etc.)
_________________________________________________________________
Table of Contents
1. [1]Acknowledgements
2. [2]Why use Proxy ARP with subnetting?
3. [3]How Proxy ARP with subnetting works
4. [4]Setting up Proxy ARP with subnetting
5. [5]Other alternatives to Proxy ARP with subnetting
6. [6]Other Applications of Proxy ARP with subnetting
7. [7]Copying conditions
1. Acknowledgements
This document and my Proxy ARP implementation would not have been
possible without the help of:
* Andrew Tridgell, who implemented the subnetting options for arp in
Linux, and who personally assisted me in getting it working
* the Proxy-ARP mini-HOWTO, by Al Longyear
* the Multiple-Ethernet mini-HOWTO, by Don Becker
* the arp(8) source code and man page by Fred N. van Kempen and
Bernd Eckenfels
_________________________________________________________________
2. Why use Proxy ARP with subnetting?
The applications for using Proxy ARP with subnetting are fairly
specific.
In my case, I had a wireless Ethernet card that plugs into an 8-bit
ISA slot. I wanted to use this card to provide connectivity for a
number of machines at once. Being an ISA card, I could use it on a
Linux machine, after I had written an appropriate device driver for it
- this is the subject of another document. From here, it was only
necessary to add a second Ethernet interface to the Linux machine and
then use some mechanism to join the two networks together.
For the purposes of discussion, let network 0 be the local Ethernet
connected to the Linux box via an NE-2000 clone Ethernet interface on
eth0. Network 1 is the main network connected via the wireless
Ethernet card on eth1. Machine A is the Linux box with both
interfaces. Machine B is any TCP/IP machine on network 0 and machine C
is likewise on network 1.
Normally, to provide the connectivity, I would have done one of the
following:
* Used the IP-Bridge software (see the Bridge mini-HOWTO) to bridge
the traffic between the two network interfaces. Unfortunately, the
wireless Ethernet interface cannot be put into "Promiscuous" mode
(ie. it can't see all packets on network 1). This is mainly due to
the lower bandwidth of the wireless Ethernet (2MBit/sec) meaning
that we don't want to carry any traffic not specifically destined
to another wireless Ethernet machine - in our case machine A - or
broadcasts. Also, bridging is rather CPU intensive!
* Alternatively, use subnets and an IP-router to pass packets
between the two networks (see the IP-Subnetworking mini-HOWTO).
This is a protocol specific solution, where the Linux kernel can
handle the Internet Protocol (IP) packets, but other protocols
(such as AppleTalk) need extra software to route. This also
requires the allocation of a new IP subnet (network) number, which
is not always an option.
In my case, getting a new subnet (network) number was not an option,
so I wanted a solution that allowed all the machines on network 0 to
appear as if they were on network 1. This is where Proxy ARP comes in.
Other solutions are used to connect other (non-IP) protocols, such as
netatalk to provide AppleTalk routing.
_________________________________________________________________
3. How Proxy ARP with subnetting works
The Proxy ARP is actually only used to get packets from network 1 to
network 0. To get packets back the other way, the normal IP routing
functionality is employed.
In my case, network 1 has an 8-bit host part (subnet mask
255.255.255.0). I have chosen a 4-bit host part for network 0 (subnet
mask 255.255.255.240), allowing 14 IP nodes on network 0 (2 ^ 4 = 16,
less two for the all zeros and all ones cases). Note that any host
part size up to, but not including, that of the other network is
allowable here (eg. 2, 3, 4, 5, 6 or 7 bits in this case - for one
bit, just use normal Proxy ARP!)
All the IP numbers for network 0 (16 in total) appear in network 1 as
a subset. Note that it is very important, in this case, not to allow
any machine connected directly to network 1 to have an IP number in
this range! In my case, I have "reserved" the IP numbers of network 1
ending in 64 .. 79 for network 0. In this case, the IP numbers ending
in 64 and 79 can't actually be used by nodes - 64 is the network
number and 79 is the broadcast address for network 0.
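The address arithmetic above can be checked with a short Python sketch.
The 192.0.2 prefix is an assumption standing in for x.y.z (it is a
documentation-only range); only the last octet matters here.

```python
import ipaddress

# Assumption: 192.0.2 stands in for the x.y.z prefix used in this HOWTO.
net0 = ipaddress.ip_network("192.0.2.64/255.255.255.240")  # network 0
net1 = ipaddress.ip_network("192.0.2.0/255.255.255.0")     # network 1

hosts = list(net0.hosts())  # excludes the all-zeros and all-ones addresses

print(net0.num_addresses)         # total addresses reserved for network 0
print(net0.network_address)       # network number (ends in 64)
print(net0.broadcast_address)     # broadcast address (ends in 79)
print(hosts[0], "..", hosts[-1])  # usable node addresses
print(net0.subnet_of(net1))       # is network 0 a subset of network 1?
```

The same check is a quick way to confirm that a proposed range for
network 0 really falls on a subnet boundary: ip_network() raises
ValueError if the network number has host bits set.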
Machine A is allocated two IP numbers, one within the network 0 range
for its real Ethernet interface (eth0) and the other within the
network 1 range, but outside of the network 0 range, for the wireless
Ethernet interface (eth1).
Say machine C (on network 1) wants to send a packet to machine B (on
network 0). Because the IP number of machine B makes it look to
machine C as though it is on the same physical network, machine C will
use the Address Resolution Protocol (ARP) to send a broadcast message
on network 1 requesting the machine with the IP number of machine B to
respond with its hardware (Ethernet or MAC layer) address. Machine B
won't see this request, as it isn't actually on network 1, but machine
A, on both networks, will see it.
The first bit of magic now happens as the Linux kernel arp code on
machine A, with a properly configured Proxy ARP with subnetting entry,
determines that the ARP request has come in on the network 1 interface
(eth1) and that the IP number being ARP'd for is in the subnet range
for network 0. Machine A then sends its own hardware (Ethernet)
address back to machine C as an ARP response packet.
Machine C then updates its ARP cache with an entry for machine B, but
with the hardware (Ethernet) address of machine A (in this case, the
wireless Ethernet interface). Machine C can now send the packet for
machine B to this hardware (Ethernet) address, and machine A receives
it.
Machine A notices that the destination IP number in the packet is that
of machine B, not itself. Machine A's Linux kernel IP routing code
attempts to forward the packet to machine B by looking at its routing
tables to determine which interface contains the network number for
machine B. However, the IP number for machine B is valid for both the
network 0 interface (eth0), and for the network 1 interface (eth1).
At this point, something else clever happens. Because the subnet mask
for the network 0 interface has more 1 bits (it is more specific) than
the subnet mask for the network 1 interface, the Linux kernel routing
code will match the IP number for machine B to the network 0
interface, and not keep looking for the potential match with the
network 1 interface (the one the packet came in on).
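The "most specific mask wins" rule can be illustrated with a small
Python sketch. This is not the kernel's routing code, only a model of
longest-prefix matching; the 192.0.2 prefix stands in for x.y.z and the
function name is made up for the example.

```python
import ipaddress

# Assumption: 192.0.2 stands in for x.y.z; the table mirrors machine A's two
# directly connected routes.
ROUTES = [
    (ipaddress.ip_network("192.0.2.0/24"), "eth1"),   # network 1
    (ipaddress.ip_network("192.0.2.64/28"), "eth0"),  # network 0
]

def pick_interface(destination, routes=ROUTES):
    """Return the interface of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, dev) for net, dev in routes if addr in net]
    if not matches:
        return None
    # More 1 bits in the mask means a longer prefix, i.e. a more specific route.
    net, dev = max(matches, key=lambda m: m[0].prefixlen)
    return dev

print(pick_interface("192.0.2.67"))  # machine B, inside network 0
print(pick_interface("192.0.2.5"))   # machine C, only on network 1
```

An address inside the network 0 range matches both routes, and the
28-bit prefix is chosen over the 24-bit one, so the packet leaves via
eth0.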
Now machine A needs to find out the "real" hardware (Ethernet) address
for machine B (assuming that it doesn't already have it in the ARP
cache). Machine A uses an ARP request, but this time the Linux kernel
arp code notes that the request isn't coming from the network 1
interface (eth1), and so doesn't respond with the Proxy address of
eth1. Instead, it sends the ARP request on the network 0 interface
(eth0), where machine B will see it and respond with its own (real)
hardware (Ethernet) address. Now machine A can send the packet (from
machine C) onto machine B.
Machine B gets the packet from machine C (via machine A) and then
wants to send back a response. This time, machine B notices that
machine C is on a different subnet (machine B's subnet mask of
255.255.255.240 excludes all machines not in the network 0 IP address
range). Machine B is set up with a "default" route to machine A's
network 0 IP number and sends the packet to machine A. This time,
machine A's Linux kernel routing code determines the destination IP
number (of machine C) as being on network 1 and sends the packet onto
machine C via Ethernet interface eth1.
Similar (less complicated) things occur for packets originating from
and destined to machine A from other machines on either of the two
networks.
Similarly, it should be obvious that if another machine (D) on network
0 ARP's for machine B, machine A will receive the ARP request on its
network 0 interface (eth0) and won't respond to the request as it is
set up to only Proxy on its network 1 interface (eth1).
Note also that none of machines B, C (or D) is required to do
anything unusual, IP-wise. In my case, there is a mixture of Suns,
Macs and PC/Windoze 95 machines on network 0 all connecting through
Linux machine A to the rest of the world.
Finally, note that once the hardware (Ethernet) addresses are
discovered by each of machines A, B, C (and D), they are placed in the
ARP cache and subsequent packet transfers occur without the ARP
overhead. The ARP caches normally expire entries after 5 minutes of
non-activity.
_________________________________________________________________
4. Setting up Proxy ARP with subnetting
I set up Proxy ARP with subnetting on a Linux kernel version 2.0.30
machine, but I am told that the code works right back to some kernel
version in the 1.2.x era.
The first thing to note is that the ARP code is in two parts: the part
inside the kernel that sends and receives ARP requests and responses
and updates the ARP cache etc.; and the other part is the arp(8) command
that allows the super user to modify the ARP cache manually and anyone
to examine it.
The first problem I had was that the arp(8) command that came with my
Slackware 3.1 distribution was ancient (1994 era!!!) and didn't
communicate with the kernel arp code correctly at all (mainly
evidenced by the strange output that it gave for "arp -a").
The arp(8) command in "net-tools-1.33a" available from a variety of
places, including (from the README file that came with it)
[8]ftp.linux.org.uk:/pub/linux/Networking/base/ works properly and
includes new man pages that explain stuff a lot better than the older
arp(8) man page.
Armed with a decent arp(8) command, all the changes I made were in the
/etc/rc.d/rc.inet1 script (for Slackware - probably different for
other flavours). First of all, we need to change the broadcast
address, network number and netmask of eth0:
NETMASK=255.255.255.240 # for a 4-bit host part
NETWORK=x.y.z.64 # our new network number (replace x.y.z with your net)
BROADCAST=x.y.z.79 # in my case
Then a line needs to be added to configure the second Ethernet port
(after any module loading that might be required to load the driver
code):
/sbin/ifconfig eth1 (name on net 1) broadcast (x.y.z.255) netmask 255.255.255.0
Then we add a route for the new interface:
/sbin/route add -net (x.y.z.0) netmask 255.255.255.0
And you will probably need to change the default gateway to the one
for network 1.
At this point, it is appropriate to add the Proxy ARP entry:
/sbin/arp -i eth1 -Ds ${NETWORK} eth1 netmask ${NETMASK} pub
This tells ARP to add a static entry (the s) to the cache for network
${NETWORK}. The -D tells ARP to use the same hardware address as
interface eth1 (the second eth1), thus saving us from having to look
up the hardware address for eth1 and hardcoding it in. The netmask
option tells ARP that we want to use subnetting (ie. Proxy for all (IP
number) & ${NETMASK} == ${NETWORK} & ${NETMASK}). The pub option tells
ARP to publish this ARP entry, ie. it is a Proxy entry, so respond on
behalf of these IP numbers. The -i eth1 option tells ARP to only
respond to requests that come in on interface eth1.
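The matching rule described above can be written out as a small Python
sketch. It models only the decision, not the kernel's arp code; the
function name and the 192.0.2 addresses (standing in for x.y.z) are
assumptions made for the example.

```python
import ipaddress

def should_proxy(request_ip, in_iface, network, netmask, proxy_iface="eth1"):
    """True if a subnet proxy entry would answer an ARP request for request_ip."""
    ip = int(ipaddress.ip_address(request_ip))
    net = int(ipaddress.ip_address(network))
    mask = int(ipaddress.ip_address(netmask))
    # -i eth1: only requests arriving on the proxy interface are answered.
    if in_iface != proxy_iface:
        return False
    # netmask: (IP number) & NETMASK == NETWORK & NETMASK
    return (ip & mask) == (net & mask)

# Assumption: 192.0.2 stands in for x.y.z.
print(should_proxy("192.0.2.67", "eth1", "192.0.2.64", "255.255.255.240"))
print(should_proxy("192.0.2.5", "eth1", "192.0.2.64", "255.255.255.240"))
print(should_proxy("192.0.2.67", "eth0", "192.0.2.64", "255.255.255.240"))
```

The three calls cover a request from network 1 for a network 0 address,
a request for an address outside the range, and a request arriving on
eth0.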
Hopefully, at this point, when the machine is rebooted, all the
machines on network 0 will appear to be on network 1. You can check
that the Proxy ARP with subnetting entry has been correctly installed
on machine A. On my machine (names changed to protect the innocent) it
is:
bash$ /sbin/arp -an
Address    HWtype  HWaddress          Flags  Mask             Iface
x.y.z.1    ether   00:00:0C:13:6F:17  C      *                eth1
x.y.z.65   ether   00:40:05:49:77:01  C      *                eth0
x.y.z.67   ether   08:00:20:0B:79:47  C      *                eth0
x.y.z.5    ether   00:00:3B:80:18:E5  C      *                eth1
x.y.z.64   ether   00:40:96:20:CD:D2  CMP    255.255.255.240  eth1
Alternatively, you can examine the /proc/net/arp file with eg. cat(1).
The last line is the proxy entry for the subnet. The CMP flags
indicate that it is a static (Manually entered) entry and that it is
to be Published. The entry is only going to reply to ARP requests on
eth1 where the requested IP number, once masked, matches the network
number, also masked. Note that arp(8) has automatically determined the
hardware address of eth1 and inserted this for the address to use (the
-Ds option).
Likewise, it is probably prudent to check that the routing table has
been set up correctly. Here is mine (again, the names are changed to
protect the innocent):
# /bin/netstat -rn
Kernel routing table
Destination  Gateway  Genmask          Flags  Metric  Ref  Use  Iface
x.y.z.64     0.0.0.0  255.255.255.240  U      0       0    71   eth0
x.y.z.0      0.0.0.0  255.255.255.0    U      0       0    389  eth1
127.0.0.0    0.0.0.0  255.0.0.0        U      0       0    7    lo
0.0.0.0      x.y.z.1  0.0.0.0          UG     1       0    573  eth1
Alternatively, you can examine the /proc/net/route file with eg.
cat(1).
Note that the first entry is a proper subset of the second, but the
routing table has ranked them in netmask order, so the eth0 entry will
be checked before the eth1 entry.
_________________________________________________________________
5. Other alternatives to Proxy ARP with subnetting
There are several other alternatives to using Proxy ARP with
subnetting in this situation, apart from the ones mentioned above
(bridging and straight routing):
* IP-Masquerading (see the IP-Masquerade mini-HOWTO), in which
network 0 is "hidden" behind machine A from the rest of the
Internet. As machines on network 0 attempt to connect outside
through machine A, it re-addresses the source address and port
number of the packets and makes them look like they are coming
from itself, rather than from the machine on the hidden network 0.
This is an elegant solution, although it prevents any machine on
network 1 from initiating a connection to any machine on network
0, as the machines on network 0 effectively don't exist outside of
network 0. This effectively increases security of the machines on
network 0, but it also means that servers on network 1 cannot
check the identity of clients on network 0 using IP numbers (eg.
NFS servers use IP hostnames for access to mountable file
systems).
* Another option is IP in IP tunneling, which isn't supported on all
platforms (such as Macs and Windoze machines) so I opted not to go
this way.
* Use Proxy ARP without subnetting. This is certainly possible, it
just means that a separate entry needs to be created for each
machine on network 0, instead of a single entry for all machines
(current and future) on network 0.
* Possibly IP Aliasing might also be useful here, but I haven't
looked into this at all.
_________________________________________________________________
6. Other Applications of Proxy ARP with subnetting
There is only one other application that I know about that uses Proxy
ARP with subnetting, also here at the Australian National University.
It is the one that Andrew Tridgell originally wrote the subnetting
extensions to Proxy ARP for. However, Andrew reliably informs me that
there are, in fact, several other sites around the world using it as
well (I don't have any details).
The other A.N.U. application involves a teaching lab set up to teach
students how to configure machines to use TCP/IP, including setting up
the gateway. The network used is a Class C network, and Andrew needed
to "subnet" it for security, traffic control and the educational
reason mentioned above. He did this using Proxy ARP, and then decided
that a single entry in the ARP cache for the whole subnet would be
faster and cleaner than one for each host on the subnet. Voila...Proxy
ARP with subnetting!
_________________________________________________________________
7. Copying conditions
Copyright 1997 by Bob Edwards <[9]Robert.Edwards@anu.edu.au>
Voice: (+61) 2 6249 4090
Unless otherwise stated, Linux HOWTO documents are copyrighted by
their respective authors. Linux HOWTO documents may be reproduced and
distributed in whole or in part, in any medium physical or electronic,
as long as this copyright notice is retained on all copies. Commercial
redistribution is allowed and encouraged; however, the author would
like to be notified of any such distributions. All translations,
derivative works, or aggregate works incorporating any Linux HOWTO
documents must be covered under this copyright notice. That is, you
may not produce a derivative work from a HOWTO and impose additional
restrictions on its distribution. Exceptions to these rules may be
granted under certain conditions; please contact the Linux HOWTO
coordinator at the address given below. In short, we wish to promote
dissemination of this information through as many channels as
possible. However, we do wish to retain copyright on the HOWTO
documents, and would like to be notified of any plans to redistribute
the HOWTOs. If you have questions, please contact the Linux HOWTO
coordinator, at <[10]linux-howto@metalab.unc.edu> via email.
References
1. Proxy-ARP-Subnet.html#INTRO
2. Proxy-ARP-Subnet.html#WHY
3. Proxy-ARP-Subnet.html#HOW
4. Proxy-ARP-Subnet.html#SETUP
5. Proxy-ARP-Subnet.html#ALTERNATIVES
6. Proxy-ARP-Subnet.html#APPLICATIONS
7. Proxy-ARP-Subnet.html#COPYING
8. ftp://ftp.linux.org.uk/pub/linux/Networking/base/
9. mailto:Robert.Edwards@anu.edu.au
10. mailto:linux-howto@metalab.unc.edu
</sect1>
<sect1 id="NTP">
<title>NTP</title>
<para>
Time synchronisation is generally considered important in the computing
environment. There are a number of reasons for this: it makes
sure the scheduled cron tasks on your various servers run well together,
it makes log files from different machines easier to correlate when
troubleshooting problems, and synchronised, correct logs are also useful if
your servers are ever attacked by crackers (either to report the attempt
to organisations such as AusCERT or in court to use against the bad guys).
Users who have overclocked their machines might also use time synchronisation
techniques to bring the time on their machines back to an accurate figure
at regular intervals, say every 20 minutes or so. This section contains an
overview of time keeping under Linux and some information about NTP, a
protocol which can be used to accurately set the time across a computer
network.
</para>
2. How Linux Keeps Track of Time
2.1. Basic Strategies
<para>
A Linux system actually has two clocks: One is the battery powered
"Real Time Clock" (also known as the "RTC", "CMOS clock", or "Hardware
clock") which keeps track of time when the system is turned off but is
not used when the system is running. The other is the "system clock"
(sometimes called the "kernel clock" or "software clock") which is a
software counter based on the timer interrupt. It does not exist when
the system is not running, so it has to be initialized from the RTC
(or some other time source) at boot time. References to "the clock" in
the ntpd documentation refer to the system clock, not the RTC.
</para>
<para>
The two clocks will drift at different rates, so they will gradually
drift apart from each other, and also away from the "real" time. The
simplest way to keep them on time is to measure their drift rates and
apply correction factors in software. Since the RTC is only used when
the system is not running, the correction factor is applied when the
clock is read at boot time, using clock(8) or hwclock(8). The system
clock is corrected by adjusting the rate at which the system time is
advanced with each timer interrupt, using adjtimex(8).
</para>
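<para>
The idea of a drift correction factor can be sketched in a few lines of
Python. This is a simplified model and not what clock(8) or hwclock(8)
actually run; the function name and the figures are hypothetical.
</para>

```python
def corrected_rtc_time(rtc_now, last_set, drift_per_day):
    """Estimate true time from an RTC reading and a known drift rate.

    rtc_now       -- seconds value read from the RTC
    last_set      -- RTC seconds value when the clock was last set correctly
    drift_per_day -- seconds the RTC gains per day (negative if it loses)
    """
    elapsed_days = (rtc_now - last_set) / 86400.0
    return rtc_now - drift_per_day * elapsed_days

# Hypothetical figures: an RTC that gains 2 seconds a day, read 10 days
# after it was last set, so it shows 20 seconds too many.
estimate = corrected_rtc_time(rtc_now=864000 + 20, last_set=0, drift_per_day=2.0)
print(estimate)
```

<para>
Note that the correction depends on knowing when the RTC was last set.
If that moment is not recorded accurately, the elapsed time, and
therefore the correction, will be wrong.
</para>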
<para>
A crude alternative to adjtimex(8) is to have cron run clock(8) or
hwclock(8) periodically to sync the system time to the (corrected)
RTC. This was recommended in the clock(8) man page, and it works if
you do it often enough that you don't cause large "jumps" in the
system time, but adjtimex(8) is a more elegant solution. Some
applications may complain if the time jumps backwards.
</para>
<para>
The next step up in accuracy is to use a program like ntpd to read the
time periodically from a network time server or radio clock, and
continuously adjust the rate of the system clock so that the times
always match, without causing sudden "jumps" in the system time. If
you always have a network connection at boot time, you can ignore the
RTC completely and use ntpdate (which comes with the ntpd package) to
initialize the system clock from a time server-- either a local server
on a LAN, or a remote server on the internet. But if you sometimes
don't have a network connection, or if you need the time to be
accurate during the boot sequence before the network is active, then
you need to maintain the time in the RTC as well.
</para>
2.2. Potential Conflicts
<para>
It might seem obvious that if you're using a program like ntpd, you
would want to sync the RTC to the (corrected) system clock. But this
turns out to be a bad idea if the system is going to stay shut down
longer than a few minutes, because it interferes with the programs
that apply the correction factor to the RTC at boot time.
</para>
<para>
If the system runs 24/7 and is always rebooted immediately whenever
it's shut down, then you can just set the RTC from the system clock
right before you reboot. The RTC won't drift enough to make a
difference in the time it takes to reboot, so you don't need to know
its drift rate.
</para>
<para>
Of course the system may go down unexpectedly, so some versions of the
kernel sync the RTC to the system clock every 11 minutes if the system
clock has been adjusted by another program. The RTC won't drift enough
in 11 minutes to make any difference, but if the system is down long
enough for the RTC to drift significantly, then you have a problem:
the programs that apply the drift correction to the RTC need to know
*exactly* when it was last reset, and the kernel doesn't record that
information anywhere.
</para>
<para>
Some Unix "traditionalists" might wonder why anyone would run a Linux
system less than 24/7, but some of us run dual-boot systems with
another OS running some of the time, or run Linux on laptops that have
to be shut down to conserve battery power when they're not being used.
Other people just don't like to leave machines running unattended for
long periods of time (even though we've heard all the arguments in
favor of it). So the "every 11 minutes" feature becomes a bug.
</para>
<para>
This "feature/bug" appears to behave differently in different versions
of the kernel (and possibly in different versions of xntpd and ntpd as
well), so if you're running both ntpd and hwclock you may need to test
your system to see what it actually does. If you can't keep the kernel
from resetting the RTC, you might have to run without a correction
factor on the RTC.
</para>
<para>
The part of the kernel that controls this can be found in
/usr/src/linux-2.0.34/arch/i386/kernel/time.c (where the version
number in the path will be the version of the kernel you're running).
If the variable time_status is set to TIME_OK then the kernel will
write the system time to the RTC every 11 minutes, otherwise it leaves
the RTC alone. Calls to adjtimex(2) (as used by ntpd and timed, for
example) may turn this on. Calls to settimeofday(2) will set
time_status to TIME_UNSYNC, which tells the kernel not to adjust the
RTC. I have not found any real documentation on this.
</para>
<para>
I've heard reports that some versions of the kernel may have problems
with "sleep modes" that shut down the CPU to save energy. The best
solution is to keep your kernel up to date, and refer any problems to
the people who maintain the kernel.
</para>
<para>
If you get bizarre results from the RTC you may have a hardware
problem. Some RTC chips include a lithium battery that can run down,
and some motherboards have an option for an external battery (be sure
the jumper is set correctly). The same battery maintains the CMOS RAM,
but the clock takes more power and is likely to fail first. Bizarre
results from the system clock may mean there is a problem with
interrupts.
</para>
2.3. Should the RTC use Local Time or UTC, and What About DST?
<para>
The Linux "system clock" actually just counts the number of seconds
past Jan 1, 1970, and is always in UTC (or GMT, which is technically
different but close enough that casual users tend to use both terms
interchangeably). UTC does not change as DST comes and goes-- what
changes is the conversion between UTC and local time. The translation
to local time is done by library functions that are linked into the
application programs.
</para>
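<para>
The following Python sketch shows the same counter value rendered three
ways. The fixed offsets are an assumption standing in for a real zone's
standard and DST rules (UTC+10 and UTC+11 here); a real application
would get them from the time zone files described later.
</para>

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for a zone's standard and DST rules.
STANDARD = timezone(timedelta(hours=10), "standard")
DST = timezone(timedelta(hours=11), "DST")

system_clock = 0  # seconds past Jan 1, 1970 (UTC)

utc = datetime.fromtimestamp(system_clock, tz=timezone.utc)
print(utc.isoformat())

# The counter is the same; only the conversion to local time differs.
local_standard = datetime.fromtimestamp(system_clock, tz=STANDARD)
local_dst = datetime.fromtimestamp(system_clock, tz=DST)
print(local_standard.isoformat())
print(local_dst.isoformat())
```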
<para>
This has two consequences: First, any application that needs to know
the local time also needs to know what time zone you're in, and
whether DST is in effect or not (see the next section for more on time
zones). Second, there is no provision in the kernel to change either
the system clock or the RTC as DST comes and goes, because UTC doesn't
change. Therefore, machines that only run Linux should have the RTC
set to UTC, not local time.
</para>
<para>
However, many people run dual-boot systems with other OS's that expect
the RTC to contain the local time, so hwclock needs to know whether
your RTC is in local time or UTC, which it then converts to seconds
past Jan 1, 1970 (UTC). This still does not provide for seasonal
changes to the RTC, so the change must be made by the other OS (this
is the one exception to the rule against letting more than one program
change the time in the RTC).
</para>
<para>
Unfortunately, there are no flags in the RTC or the CMOS RAM to
indicate standard time vs DST, so each OS stores this information
someplace where the other OS's can't find it. This means that hwclock
must assume that the RTC always contains the correct local time, even
if the other OS has not been run since the most recent seasonal time
change.
</para>
<para>
If Linux is running when the seasonal time change occurs, the system
clock is unaffected and applications will make the correct conversion.
But if Linux has to be rebooted for any reason, the system clock will
be set to the time in the RTC, which will be off by one hour until the
other OS (usually Windows) has a chance to run.
</para>
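<para>
The one-hour error can be reproduced with a short Python model. The
date, the offsets and the function name are all hypothetical; the point
is only that the RTC reading is interpreted with the new offset while
it still holds the old local time.
</para>

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets model a zone before and after DST starts.
STANDARD = timedelta(hours=10)
DST = timedelta(hours=11)

def rtc_to_system_clock(rtc_wall_time, assumed_offset):
    """Convert a naive RTC reading in local time to seconds past 1970 (UTC)."""
    local = rtc_wall_time.replace(tzinfo=timezone(assumed_offset))
    return local.timestamp()

# The RTC still holds standard time because the other OS has not run yet.
true_instant = datetime(2000, 10, 29, 3, 0, tzinfo=timezone.utc)
rtc_reading = (true_instant + STANDARD).replace(tzinfo=None)

# At boot, the reading is taken as correct local time, with DST now in effect.
booted = rtc_to_system_clock(rtc_reading, DST)
error_seconds = booted - true_instant.timestamp()
print(error_seconds)
```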
<para>
There is no way around this, but Linux doesn't crash very often, so
the most likely reason to reboot on a dual-boot system is to run the
other OS anyway. But beware if you're one of those people who shuts
down Linux whenever you won't be using it for a while-- if you haven't
had a chance to run the other OS since the last time change, the RTC
will be off by an hour until you do.
</para>
<para>
Some other documents have stated that setting the RTC to UTC allows
Linux to take care of DST properly. This is not really wrong, but it
doesn't tell the whole story-- as long as you don't reboot, it does
not matter which time is in the RTC (or even if the RTC's battery
dies). Linux will maintain the correct time either way, until the next
reboot. In theory, if you only reboot once a year (which is not
unreasonable for Linux), DST could come and go and you'd never notice
that the RTC had been wrong for several months, because the system
clock would have stayed correct all along. But since you can't predict
when you'll want to reboot, it's better to have the RTC set to UTC if
you're not running another OS that requires local time.
</para>
<para>
The Dallas Semiconductor RTC chip (which is a drop-in replacement for
the Motorola chip used in the IBM AT and clones) actually has the
ability to do the DST conversion by itself, but this feature is not
used because the changeover dates are hard-wired into the chip and
can't be changed. Current versions change on the first Sunday in April
and the last Sunday in October, but earlier versions used different
dates (and obviously this doesn't work in countries that use other
dates). Also, the RTC is often integrated into the motherboard's
"chipset" (rather than being a separate chip) and I don't know if they
all have this ability.
</para>
2.4. How Linux keeps Track of Time Zones
<para>
You probably set your time zone correctly when you installed Linux.
But if you have to change it for some reason, or if the local laws
regarding DST have changed (as they do frequently in some countries),
then you'll need to know how to change it. If your system time is off
by some exact number of hours, you may have a time zone problem (or a
DST problem).
</para>
<para>
Time zone and DST information is stored in /usr/share/zoneinfo (or
/usr/lib/zoneinfo on older systems). The local time zone is
determined by a symbolic link from /etc/localtime to one of these
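files. The way to change your timezone is to change the link. If
your local DST dates have changed, you'll have to update the file
(zone files are compiled binaries, normally generated by zic(8) or
replaced by a newer time zone data package).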
</para>
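<para>
A small Python sketch of how the link can be inspected follows. It
builds a scratch directory rather than touching the real /etc, and the
function name, directory layout and zone name are assumptions made for
the demonstration.
</para>

```python
import os
import tempfile

def zone_from_link(localtime_path, zoneinfo_dir):
    """Return the zone name that localtime_path points at, or None."""
    if not os.path.islink(localtime_path):
        return None  # a copied file rather than a link
    target = os.path.realpath(localtime_path)
    root = os.path.realpath(zoneinfo_dir)
    if not target.startswith(root + os.sep):
        return None
    return os.path.relpath(target, root)

# Demonstration in a scratch directory, so the real /etc is left alone.
with tempfile.TemporaryDirectory() as scratch:
    zoneinfo = os.path.join(scratch, "zoneinfo")
    os.makedirs(os.path.join(zoneinfo, "Australia"))
    zone_file = os.path.join(zoneinfo, "Australia", "Sydney")
    open(zone_file, "wb").close()
    link = os.path.join(scratch, "localtime")
    os.symlink(zone_file, link)
    demo_zone = zone_from_link(link, zoneinfo)

print(demo_zone)
```

<para>
On a real system the equivalent call would be
zone_from_link("/etc/localtime", "/usr/share/zoneinfo"), which returns
None if /etc/localtime is a plain file instead of a link.
</para>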
<para>
You can also use the TZ environment variable to change the current
time zone, which is handy if you're logged in remotely to a machine in
another time zone. Also see the man pages for tzset and tzfile.
This is nicely summarized at
http://www.linuxsa.org.au/tips/time.html
</para>
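<para>
The effect of TZ on a single process can be seen with this Python
sketch. It assumes a Unix system (time.tzset() is not available
elsewhere) and uses POSIX-style TZ strings so that it does not depend
on any particular zone file being installed; the function name is made
up for the example.
</para>

```python
import os
import time

def local_hour_at_epoch(tz):
    """Return the local hour at system clock value 0 under the given TZ."""
    os.environ["TZ"] = tz
    time.tzset()  # Unix only: re-read TZ for this process
    return time.localtime(0).tm_hour

# POSIX-style TZ strings; "AEST-10" means a zone 10 hours ahead of UTC.
print(local_hour_at_epoch("UTC0"))
print(local_hour_at_epoch("AEST-10"))
```

<para>
Only the process that sets TZ (and its children) sees the change; the
system clock and /etc/localtime are untouched.
</para>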
2.5. The Bottom Line
<para>
If you don't need sub-second accuracy, hwclock(8) and adjtimex(8) may
be all you need. It's easy to get enthused about time servers and
radio clocks and so on, but I ran the old clock(8) program for years
with excellent results. On the other hand, if you have several
machines on a LAN it can be handy (and sometimes essential) to have
them automatically sync their clocks to each other. And the other
stuff can be fun to play with even if you don't really need it.
</para>
<para>
On machines that only run Linux, set the RTC to UTC (or GMT). On
dual-boot systems that require local time in the RTC, be aware that if
you have to reboot Linux after the seasonal time change, the clock may
be temporarily off by one hour, until you have a chance to run the
other OS. If you run more than two OS's, be sure only one of them is
trying to adjust for DST.
</para>
<para>
NTP is a standard method of synchronising time across a network of
computers: a client synchronises its clock from a remote server. NTP
clients are typically installed on servers.
Most business class ISPs provide NTP servers. Otherwise, there are a
number of free NTP servers in Australia:
</para>
<para>
The University of Melbourne ntp.cs.mu.oz.au
University of Adelaide ntp.saard.net
CSIRO Marine Labs, Tasmania ntp.ml.csiro.au
CSIRO National Measurements Laboratory, Sydney ntp.syd.dms.csiro.au
</para>
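<para>
A minimal client configuration using the servers listed above might look
like the file written by the sketch below. The server and driftfile
directives are standard ntpd configuration; the drift file location, the
helper name and the output path are assumptions, so check where your
distribution keeps these files.
</para>

```shell
#!/bin/sh
# Sketch only: write a small ntpd client configuration to the given path.
# write_ntp_conf is a hypothetical helper; the driftfile path is an assumption.
write_ntp_conf() {
    cat > "$1" <<'EOF'
# Time sources (Australian public servers named in the text)
server ntp.cs.mu.oz.au
server ntp.saard.net
server ntp.ml.csiro.au
server ntp.syd.dms.csiro.au
# Where ntpd records the measured clock drift between runs
driftfile /var/lib/ntp/ntp.drift
EOF
}

# Write to a scratch file first and inspect it before replacing /etc/ntp.conf.
write_ntp_conf "${TMPDIR:-/tmp}/ntp.conf.example"
```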
<para>
Xntpd (NTPv3) has been replaced by ntpd (NTPv4); the earlier version
is no longer being maintained.
</para>
<para>
Ntpd is the standard program for synchronizing clocks across a
network, and it comes with a list of public time servers you can
connect to. It can be a little more complicated to set up, but if
you're interested in this kind of thing I highly recommend that you
take a look at it.
</para>
<para>
The "home base" for information on ntpd is the NTP website at
<http://www.eecis.udel.edu/~ntp/> which also includes links to all
kinds of interesting time-related stuff (including software for other
OS's). Some Linux distributions include ntpd on the CD. There is a
list of public time servers at
<http://www.eecis.udel.edu/~mills/ntp/clock2.html>.
</para>
<para>
A relatively new feature in ntpd is a "burst mode" which is designed
for machines that have only intermittent dial-up access to the
internet.
</para>
<para>
Ntpd includes drivers for quite a few radio clocks (although some
appear to be better supported than others). Most radio clocks are
designed for commercial use and cost thousands of dollars, but there
are some cheaper alternatives (discussed in later sections). In the
past most were WWV or WWVB receivers, but now most of them seem to be
GPS receivers. NIST has a PDF file that lists manufacturers of radio
clocks on their website at
<http://www.boulder.nist.gov/timefreq/links.htm> (near the bottom of
the page). The NTP website also includes many links to manufacturers
of radio clocks at <http://www.eecis.udel.edu/~ntp/hardware.htm> and
<http://www.eecis.udel.edu/~mills/ntp/refclock.htm>. Either list may
or may not be up to date at any given time :-). The list of drivers
for ntpd is at
<http://www.eecis.udel.edu/~ntp/ntp_spool/html/refclock.htm>.
</para>
<para>
Ntpd also includes drivers for several dial-up time services. These
are all long-distance (toll) calls, so be sure to calculate the effect
on your phone bill before using them.
</para>
3.4. Chrony
<para>
Xntpd was originally written for machines that have a full-time
connection to a network time server or radio clock. In theory it can
also be used with machines that are only connected intermittently, but
Richard Curnow couldn't get it to work the way he wanted it to, so he
wrote "chrony" as an alternative for those of us who only have network
access when we're dialed in to an ISP (this is the same problem that
ntpd's new "burst mode" was designed to solve). The current version
of chrony includes drift correction for the RTC, for machines that are
turned off for long periods of time.
</para>
<para>
You can get more information from Richard Curnow's website at
<http://www.rrbcurnow.freeuk.com/chrony> or <http://go.to/chrony>.
There are also two chrony mailing lists, one for announcements and one
for discussion by users. For information send email to
chrony-users-subscribe@egroups.com or chrony-announce-subscribe@egroups.com
</para>
<para>
Chrony is normally distributed as source code only, but Debian has
been including a binary in their "unstable" collection. The source
file is also available at the usual Linux archive sites.
</para>
3.5. Clockspeed
<para>
Another option is the clockspeed program by DJ Bernstein. It gets the
time from a network time server and simply resets the system clock
every three seconds. It can also be used to synchronize several
machines on a LAN.
</para>
<para>
I've sometimes had trouble reaching his website at
<http://Cr.yp.to/clockspeed.html>, so if you get a DNS error try again
on another day. I'll try to update this section if I get some better
information.
</para>
<para>
Note
You must be logged in as "root" to run any program that affects
the RTC or the system time, which includes most of the programs
described here. If you normally use a graphical interface for
everything, you may also need to learn some basic unix shell
commands.
</para>
<para>
Note
If you run more than one OS on your machine, you should only let
one of them set the RTC, so they don't confuse each other. The
exception is the twice-a-year adjustment for Daylight Saving(s)
Time.
</para>
<para>
If you run a dual-boot system that spends a lot of time running
Windows, you may want to check out some of the clock software
available for that OS instead. Follow the links on the NTP website at
<http://www.eecis.udel.edu/~ntp/software.html>.
</para>
</sect1>
<sect1 id="Traffic-Control">
<title>Traffic-Control</title>
8.6. Traffic Shaping
<para>
The traffic shaper is a virtual network device that makes it possible
to limit the rate of outgoing data flow over another network device.
This is especially useful for ISPs and similar providers, where it is
desirable to control and enforce policies regarding how much bandwidth
is used by each client. An alternative (for web services only) may be
certain Apache modules which restrict the number of IP connections per
client or the bandwidth used.
</para>
<para>
Traffic control encompasses the sets of mechanisms and operations by which
packets are queued for transmission/reception on a network interface. The
operations include enqueuing, policing, classifying, scheduling, shaping and
dropping. This HOWTO provides an introduction and overview of the
capabilities and implementation of traffic control under Linux.
</para>
* the Linux DiffServ project
* HTB site (Martin "devik" Devera)
* Traffic Control Next Generation (tcng)
  TCNG manual (Werner Almesberger)
* iproute2 (Alexey Kuznetsov)
  iproute2 manual (Alexey Kuznetsov)
* Research and documentation on traffic control under Linux (Stef Coene)
* LARTC HOWTO (bert hubert, et al.)
* guide to IP networking with Linux (Martin A. Brown)
* http://metalab.unc.edu/mdw/HOWTO/NET3-4-HOWTO-6.html#ss6.15
* Traffic Control HOWTO
</sect1>
<sect1 id="Load-Balancing">
<title>Load-Balancing</title>
<para>
Demand for load balancing usually arises in database/web access when
many clients make simultaneous requests to a server. It would be
desirable to have multiple identical servers and redirect each request to
the least loaded server. This can be achieved through Network Address
Translation (NAT) techniques, of which IP masquerading is a subset.
Network administrators can replace a single server providing Web
services - or any other application - with a logical pool of servers
sharing a common IP address. Incoming connections are directed to a
particular server using a load-balancing algorithm. The virtual
server rewrites incoming and outgoing packets to give clients the
appearance that only one server exists.
</para>
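<para>
The text does not say which load-balancing algorithm is used.
Round-robin is one of the simplest, and the sketch below assumes it
purely for illustration; the function name and the server addresses are
hypothetical.
</para>

```shell
#!/bin/sh
# Sketch only: round-robin selection from a pool of identical servers.
# pick_server REQUEST_NUMBER SERVER...
pick_server() {
    n=$1
    shift
    count=$#
    # Skip (n mod count) servers; the next one handles this request.
    shift $(( n % count ))
    echo "$1"
}

# Requests 0..4 spread over three (hypothetical) back-end servers.
for req in 0 1 2 3 4; do
    echo "request $req -> $(pick_server "$req" 10.0.0.1 10.0.0.2 10.0.0.3)"
done
```

<para>
Round-robin ignores the actual load on each server; choosing the least
loaded server, as described above, needs feedback from the pool.
</para>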
<para>
Linux IP-NAT information may be found here
<http://www.csn.tu-chemnitz.de/HyperNews/get/linux-ip-nat.html>
</para>
</sect1>
<sect1 id="Bandwidth-Limiting">
<title>Bandwidth-Limiting</title>
<para>
This section describes how to set up your Linux server to limit download
bandwidth or incoming traffic and how to use your internet link more
efficiently. It is meant to provide an easy solution for limiting
incoming traffic, thus preventing our LAN users from consuming all the
bandwidth of our internet link. This is useful when our internet link
is slow or our LAN users download tons of mp3s and the newest Linux
distro's *.iso files.
</para>
* Bandwidth Limiting HOWTO
6. Miscellaneous
6.1. Useful resources
Squid Web Proxy Cache
<http://www.squid-cache.org>
Squid 2.4 Stable 1 Configuration manual
<http://www.visolve.com/squidman/Configuration%20Guide.html>
<http://www.visolve.com/squidman/Delaypool%20parameters.htm>
Squid FAQ
<http://www.squid-cache.org/Doc/FAQ/FAQ-19.html#ss19.8>
cbq-init script
<ftp://ftp.equinox.gu.net/pub/linux/cbq/>
Linux 2.4 Advanced Routing HOWTO
<http://www.linuxdoc.org/HOWTO/Adv-Routing-HOWTO.html>
Traffic control (in Polish)
<http://ceti.pl/~kravietz/cbq/>
Securing and Optimizing Linux Red Hat Edition - A Hands on Guide
<http://www.linuxdoc.org/guides.html>
IPTraf
<http://cebu.mozcom.com/riker/iptraf/>
IPCHAINS
<http://www.linuxdoc.org/HOWTO/IPCHAINS-HOWTO.html>
Nylon socks proxy server
<http://mesh.eecs.umich.edu/projects/nylon/>
Indonesian translation of this HOWTO by Rahmat Rafiudin mjl_id@yahoo.com
<http://raf.unisba.ac.id/resources/BandwidthLimitingHOWTO/index.html>
</sect1>
<sect1 id="Compressed-TCP">
<title>Compressed-TCP</title>
<para>
In the past, we used to compress files in order to save disk space.
Today, disk space is cheap - but bandwidth is limited. By compressing
data streams such as TCP/IP-Sessions using SSH-like tools, you achieve
two goals:
</para>
1) You save bandwidth/transferred volume (that is important if you have
to pay for traffic or if your network is loaded).
2) You speed up low-bandwidth connections (Modem, GSM, ISDN).
<para>
This HowTo explains how to save both bandwidth and connection time by
using tools like SSH1, SSH2, OpenSSH or LSH.
</para>
2. Compressing HTTP/FTP,...
<para>
My office is connected to the internet with a 64KBit ISDN line, so the
maximum transfer rate is about 7K/s. You can speed up the connection
by compressing it: when I download files, Netscape shows a transfer
rate of up to 40K/s (logfiles are compressible by a factor of 15). SSH is a
tool that is mainly designed to build secure connections over
unsecured networks. Furthermore, SSH is able to compress connections
and to do port forwarding (like rinetd or redir). So it is the
appropriate tool to compress any simple TCP/IP connection. "Simple"
means that only one TCP connection is opened. An FTP connection or
the connection between MS-Outlook and MS-Exchange is not simple, as
several connections are established. SSH uses the Lempel-Ziv (LZ77)
compression algorithm, so you will achieve the same high compression
rate as winzip/pkzip. In order to compress all HTTP connections from
my intranet to the internet, I just have to execute one command on my
dial-in machine:
</para>
<para>
<screen>
ssh -l <login ID> <hostname> -C -L8080:<proxy_at_ISP>:80 -f sleep 10000
</screen>
</para>
<para>
<screen>
<hostname> = host that is located at my ISP. SSH-access is required.
<login ID> = my login-ID on <hostname>
<proxy_at_ISP> = the web proxy of my ISP
</screen>
</para>
<para>
My browser is configured to use localhost:8080 as proxy. My laptop
connects to the same socket. The connection is compressed and
forwarded to the real proxy by SSH. The infrastructure looks like:
</para>
<para>
<screen>
                 64KBit ISDN
My PC--------------------------------A PC (Unix/Linux/Win-NT) at my ISP
SSH-Client        compressed         SSH-Server, Port 22
Port 8080                                 |
   |                                      |
   |                                      |
   |                                      |
   |10MBit Ethernet                       |100MBit
   |not compressed                        |not compressed
   |                                      |
   |                                      |
My second PC                         ISP's WWW-proxy
with Netscape,...                    Port 80
(Laptop)
</screen>
</para>
3. Compressing Email
3.1. Incoming Emails (POP3, IMAP4)
<para>
Most people fetch their email from the mailserver via POP3. POP3 is a
protocol with many disadvantages:
</para>
1. POP3 transfers passwords in clear text. (There are SSL
   implementations of POP/IMAP and a challenge/response
   authentication, defined in RFC-2095/2195.)
2. POP3 causes much protocol overhead: first the client requests a
   message, then the server sends the message. After that the client
   requests the transferred message to be deleted. The server confirms
   the deletion. After that the server is ready for the next
   transaction. So 4 transactions are needed for each email.
3. POP3 transfers the mails without compression although email is
   highly compressible (factor=3.5).
<para>
You could compress POP3 by forwarding localhost:110 through a
compressed connection to your ISP's POP3-socket. After that you have
to tell your mail client to connect to localhost:110 in order to
download mail. That secures and speeds up the connection -- but the
download time still suffers from the POP3-inherent protocol overhead.
</para>
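<para>
The forwarding described above could be set up along the following
lines. The helper only prints the ssh command so that it can be
inspected first; its name, the login and the host names are
placeholders, and the default local port 1110 is an assumption chosen
because binding port 110 itself requires root.
</para>

```shell
#!/bin/sh
# Sketch only: build (but do not run) an ssh command that forwards a local
# port to the POP3 port of the mail server over a compressed connection.
# pop3_tunnel_cmd LOGIN MAILSERVER [LOCAL_PORT]
pop3_tunnel_cmd() {
    login=$1
    server=$2
    port=${3:-1110}
    # -C compresses, -L forwards local $port to port 110 on the server,
    # -f sends ssh to the background just before "sleep" is executed.
    echo "ssh -C -f -l $login -L $port:$server:110 $server sleep 10000"
}

pop3_tunnel_cmd loginid mailserver
```

<para>
The mail client is then pointed at localhost and whichever local port
was chosen.
</para>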
<para>
It makes sense to replace POP3 with a more efficient protocol. The
idea is to download the entire mailbox at once without generating
protocol overhead. Furthermore it makes sense to compress the
connections. The appropriate tool which offers both features is SCP.
You can download your mail-file like this:
</para>
<para>
<screen>
scp -C loginid@mailserver:/var/spool/mail/loginid /tmp/newmail
</screen>
</para>
<para>
But there is a problem: what happens if a new email arrives at the
server during the download of your mailbox? The new mail would be
lost. Therefore it makes more sense to use the following commands:
</para>
<para>
<screen>
ssh -l loginid mailserver -f mv /var/spool/mail/loginid \
    /tmp/loginid_fetchme
scp -C loginid@mailserver:/tmp/loginid_fetchme /tmp/newmail
</screen>
</para>
<para>
A move (mv) within one filesystem is an atomic operation, so you won't
get into trouble if you receive new mail during the execution of the
commands. But if the mail server directories /tmp/ and /var/spool/mail
are not on the same filesystem, mv has to copy the file and you might get
problems. A solution is to create a lockfile on the server before you
execute the mv: touch /var/spool/mail/loginid.lock. Remove it
afterwards. A better solution is to move the file loginid within the same
directory:
</para>
<para>
<screen>
ssh -l loginid mailserver -f mv /var/spool/mail/loginid \
    /var/spool/mail/loginid_fetchme
</screen>
</para>
<para>
After that you can use formail together with procmail in order to filter
/tmp/newmail into the right folder(s); formail splits the mailbox into
single messages and hands each one to procmail:
</para>
<para>
<screen>
formail -s procmail < /tmp/newmail
</screen>
</para>
3.2. Outgoing Email (SMTP)
<para>
You can send email over compressed and encrypted SSH connections, in order
to:
</para>
* Save network traffic
* Secure the connection (This is of little use if the mail is
  later transported over untrusted networks.)
* Authenticate the sender. Many mail servers deny mail relaying in
  order to prevent abuse. If you send an email over an SSH
  connection, the remote mail server (e.g. sendmail or MS-Exchange)
  believes the connection is local.
<para>
If you have SSH-access on the mail server, you need the following
command:
</para>
<para>
<screen>
ssh -C -l loginid mailserver -L2525:mailserver:25
</screen>
</para>
<para>
If you don't have SSH access to the mail server, but do have it on a
server that is allowed to use your mail server as relay, the command is:
</para>
<para>
<screen>
ssh -C -l loginid other_server -L2525:mailserver:25
</screen>
</para>
<para>
After that you can configure your mail client (or mail server: see
"smarthost") to send out mails to localhost port 2525.
</para>
4. Thoughts about performance.
<para>
Of course compression/encryption takes CPU time. It turned out that an
old Pentium-133 is able to encrypt and compress about 1GB/hour --
that's quite a lot. If you compile SSH with the option "--with-none"
you can tell SSH to use no encryption. That saves a little CPU
time. Here is a comparison between several download methods
(during the test, an uncompressed 6MB file was transferred from a
133MHz Pentium-1 to a 233MHz Pentium-2 laptop over a 10MBit ethernet
without other load).
</para>
<para>
<screen>
+-------------------+--------+----------+-----------+----------------------+
| | FTP |encrypted |compressed |compressed & encrypted|
+-------------------+--------+----------+-----------+----------------------+
| Elapsed Time | 17.6s | 26s | 9s | 23s |
+-------------------+--------+----------+-----------+----------------------+
| Throughput | 790K/s | 232K/s | 320K/s | 264K/s |
+-------------------+--------+----------+-----------+----------------------+
|Compression Factor | 1 | 1 | 3.8 | 3.8 |
+-------------------+--------+----------+-----------+----------------------+
</screen>
</para>
</sect1>
<sect1 id="IP-Accounting">
<title>IP-Accounting</title>
<para>
This option of the Linux kernel keeps track of IP network traffic,
performs packet logging and produces some statistics. A series of
rules may be defined so when a packet matches a given pattern, some
action is performed: a counter is increased, it is accepted/rejected,
etc.
</para>
<para>
6.3. IP Accounting (for Linux-2.0)
The IP accounting features of the Linux kernel allow you to collect
and analyze some network usage data. The data collected comprises the
number of packets and the number of bytes accumulated since the
figures were last reset. You may specify a variety of rules to
categorize the figures to suit whatever purpose you may have. This
option was removed in kernel 2.1.102, because the old ipfwadm-based
firewalling was replaced by ``ipchains'' (IP Firewall Chains).
</para>
<para>
<screen>
Kernel Compile Options:
Networking options --->
[*] IP: accounting
</screen>
</para>
<para>
After you have compiled and installed the kernel you need to use the
ipfwadm command to configure IP accounting. There are many different
ways of breaking down the accounting information that you might
choose. I've picked a simple example of what might be useful; you
should read the ipfwadm man page for more information.
</para>
<para>
Scenario: You have an ethernet network that is linked to the internet
via a PPP link. On the ethernet you have a machine that offers a
number of services, and you want to know how much traffic is generated
by ftp and by the world wide web, as well as the total tcp and udp
traffic.
</para>
<para>
You might use a command set that looks like the following, which is
shown as a shell script:
</para>
<para>
<screen>
#!/bin/sh
#
# Flush the accounting rules
ipfwadm -A -f
#
# Set shortcuts
localnet=44.136.8.96/29
any=0/0
# Add rules for local ethernet segment
ipfwadm -A in -a -P tcp -D $localnet ftp-data
ipfwadm -A out -a -P tcp -S $localnet ftp-data
ipfwadm -A in -a -P tcp -D $localnet www
ipfwadm -A out -a -P tcp -S $localnet www
ipfwadm -A in -a -P tcp -D $localnet
ipfwadm -A out -a -P tcp -S $localnet
ipfwadm -A in -a -P udp -D $localnet
ipfwadm -A out -a -P udp -S $localnet
#
# Rules for default
ipfwadm -A in -a -P tcp -D $any ftp-data
ipfwadm -A out -a -P tcp -S $any ftp-data
ipfwadm -A in -a -P tcp -D $any www
ipfwadm -A out -a -P tcp -S $any www
ipfwadm -A in -a -P tcp -D $any
ipfwadm -A out -a -P tcp -S $any
ipfwadm -A in -a -P udp -D $any
ipfwadm -A out -a -P udp -S $any
#
# List the rules
ipfwadm -A -l -n
#
</screen>
</para>
<para>
The names ``ftp-data'' and ``www'' refer to lines in /etc/services.
The last command lists each of the Accounting rules and displays the
collected totals.
</para>
<para>
An important point to note when analyzing IP accounting is that the totals
for all rules that match a packet are incremented, so to obtain
differential figures you need to do the appropriate arithmetic. For
example, if I wanted to know how much data was neither ftp nor www I would
subtract the individual totals from the rule that matches all ports.
</para>
<para>
<screen>
root# ipfwadm -A -l -n
IP accounting rules
 pkts bytes dir prot source               destination          ports
    0     0 in  tcp  0.0.0.0/0            44.136.8.96/29       * -> 20
    0     0 out tcp  44.136.8.96/29       0.0.0.0/0            20 -> *
   10  1166 in  tcp  0.0.0.0/0            44.136.8.96/29       * -> 80
   10   572 out tcp  44.136.8.96/29       0.0.0.0/0            80 -> *
  252 10943 in  tcp  0.0.0.0/0            44.136.8.96/29       * -> *
  231 18831 out tcp  44.136.8.96/29       0.0.0.0/0            * -> *
    0     0 in  udp  0.0.0.0/0            44.136.8.96/29       * -> *
    0     0 out udp  44.136.8.96/29       0.0.0.0/0            * -> *
    0     0 in  tcp  0.0.0.0/0            0.0.0.0/0            * -> 20
    0     0 out tcp  0.0.0.0/0            0.0.0.0/0            20 -> *
   10  1166 in  tcp  0.0.0.0/0            0.0.0.0/0            * -> 80
   10   572 out tcp  0.0.0.0/0            0.0.0.0/0            80 -> *
  253 10983 in  tcp  0.0.0.0/0            0.0.0.0/0            * -> *
  231 18831 out tcp  0.0.0.0/0            0.0.0.0/0            * -> *
    0     0 in  udp  0.0.0.0/0            0.0.0.0/0            * -> *
    0     0 out udp  0.0.0.0/0            0.0.0.0/0            * -> *
</screen>
</para>
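<para>
The subtraction can be done in the shell. The sketch below uses the
incoming byte counts for the local segment from the sample listing above
(10943 bytes of TCP in total, 0 for ftp-data, 1166 for www); the
function name is made up for this example.
</para>

```shell
#!/bin/sh
# Sketch only: bytes matched by the catch-all rule minus the bytes already
# counted by the more specific rules.
# other_bytes TOTAL SPECIFIC...
other_bytes() {
    rest=$1
    shift
    for counted in "$@"; do
        rest=$(( rest - counted ))
    done
    echo "$rest"
}

# Incoming TCP to 44.136.8.96/29 that was neither ftp-data nor www.
other_bytes 10943 0 1166    # prints 9777
```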
<para>
6.4. IP Accounting (for Linux-2.2)
The new accounting code is accessed via ``IP Firewall Chains''. See
the IP chains home page for more information. Among other things,
you'll now need to use ipchains instead of ipfwadm to configure your
filters. (From Documentation/Changes in the latest kernel sources).
</para>
</sect1>
<sect1 id="IP-Aliasing">
<title>IP-Aliasing</title>
<para>
This is a cookbook recipe on how to set up and run IP aliasing on a Linux box
and how to set up the machine to receive e-mail on the aliased IP addresses.
</para>
<para>
This feature of the Linux kernel provides the possibility of setting
multiple network addresses on the same low-level network device driver
(e.g. two IP addresses on one Ethernet card). It is typically used for
services that act differently based on the address they listen on
(e.g. "multihosting", "virtual domains" or "virtual hosting
services").
</para>
<para>
There are some applications where being able to configure multiple IP
addresses on a single network device is useful. Internet Service
Providers often use this facility to give a `customized' feel to the
World Wide Web and ftp offerings they host for their customers. The
``IP-Alias mini-HOWTO'' has more information than you will find here.
</para>
<para>
Quickstart:
</para>
<para>
After compiling and installing your kernel with IP_Alias support,
configuration is very simple. The aliases are added to virtual network
devices associated with the actual network device. These devices follow
a simple naming convention, <devname>:<virtual dev num>,
e.g. eth0:0, ppp0:10 etc. Note that the ifname:number device can
only be configured after the main interface has been set up.
</para>
<para>
For example, assume you have an ethernet network that supports two
different IP subnetworks simultaneously and you wish your machine to
have direct access to both, you could use something like:
</para>
<para>
<screen>
root# ifconfig eth0 192.168.1.1 netmask 255.255.255.0 up
root# route add -net 192.168.1.0 netmask 255.255.255.0 eth0
root# ifconfig eth0:0 192.168.10.1 netmask 255.255.255.0 up
root# route add -net 192.168.10.0 netmask 255.255.255.0 eth0:0
</screen>
</para>
-----------------------------------------------------------------------------
<para>
1. My Setup
</para>
<para>
* IP Alias is standard in kernels 2.0.x and 2.2.x, and available as a
  compile-time option in 2.4.x (IP Alias has been deprecated in 2.4.x and
  replaced by a more powerful firewalling mechanism.)
* IP Alias compiled as a loadable module. When configuring the kernel
  with "make config", you would have indicated that you want IP Alias to
  be compiled as a (M)odule. Check the Modules HOW-TO (if that exists) or
  check the info in /usr/src/linux/Documentation/modules.txt.
* I have to support 2 additional IPs over and above the IP already
  allocated to me.
* A D-Link DE620 pocket adapter (not important, works with any Linux
  supported network adapter).
</para>
<para>
<screen>
Kernel Compile Options:
Networking options --->
....
[*] Network aliasing
....
<*> IP: aliasing support
</screen>
</para>
-----------------------------------------------------------------------------
<para>
2. Commands
</para>
<para>
1. Load the IP Alias module (you can skip this step if you compiled the
module into the kernel):
</para>
<para>
<screen>
/sbin/insmod /lib/modules/`uname -r`/ipv4/ip_alias.o
</screen>
</para>
<para>
2. Set up the loopback, eth0, and all the IP addresses, beginning with the
main IP address for the eth0 interface:
</para>
<para>
<screen>
/sbin/ifconfig lo 127.0.0.1
/sbin/ifconfig eth0 up
/sbin/ifconfig eth0 172.16.3.1
/sbin/ifconfig eth0:0 172.16.3.10
/sbin/ifconfig eth0:1 172.16.3.100
</screen>
</para>
<para>
172.16.3.1 is the main IP address, while .10 and .100 are the aliases.
The magic is the eth0:x where x=0,1,2,...n for the different IP
addresses. The main IP address does not need to be aliased.
</para>
<para>
3. Set up the routes. First route the loopback, then the net, and finally
the various IP addresses, starting with the default (originally allocated)
one:
</para>
<para>
<screen>
/sbin/route add -net 127.0.0.0
/sbin/route add -net 172.16.3.0 dev eth0
/sbin/route add -host 172.16.3.1 dev eth0
/sbin/route add -host 172.16.3.10 dev eth0:0
/sbin/route add -host 172.16.3.100 dev eth0:1
/sbin/route add default gw 172.16.3.200
</screen>
</para>
<para>
That's it.
</para>
<para>
The example IP addresses above are private IP addresses (RFC
1918), used for illustrative purposes. Substitute them with your own official or
private IP addresses.
</para>
<para>
The example shows only 3 IP addresses. The max is defined to be 256 in
/usr/include/linux/net_alias.h. 256 IP addresses on ONE card is a lot :-)!
</para>
<para>
Here's what my /sbin/ifconfig looks like:
</para>
<para>
<screen>
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Bcast:127.255.255.255  Mask:255.0.0.0
          UP BROADCAST LOOPBACK RUNNING  MTU:3584  Metric:1
          RX packets:5088 errors:0 dropped:0 overruns:0
          TX packets:5088 errors:0 dropped:0 overruns:0

eth0      Link encap:10Mbps Ethernet  HWaddr 00:8E:B8:83:19:20
          inet addr:172.16.3.1  Bcast:172.16.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:334036 errors:0 dropped:0 overruns:0
          TX packets:11605 errors:0 dropped:0 overruns:0
          Interrupt:7 Base address:0x378

eth0:0    Link encap:10Mbps Ethernet  HWaddr 00:8E:B8:83:19:20
          inet addr:172.16.3.10  Bcast:172.16.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0
          TX packets:0 errors:0 dropped:0 overruns:0

eth0:1    Link encap:10Mbps Ethernet  HWaddr 00:8E:B8:83:19:20
          inet addr:172.16.3.100  Bcast:172.16.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0
          TX packets:0 errors:0 dropped:0 overruns:0
</screen>
</para>
<para>
And /proc/net/aliases:
</para>
<para>
<screen>
device family address
eth0:0 2 172.16.3.10
eth0:1 2 172.16.3.100
</screen>
</para>
<para>
And /proc/net/alias_types:
</para>
<para>
<screen>
type name n_attach
2 ip 2
</screen>
</para>
<para>
Of course, the stuff in /proc/net was created by the ifconfig command and not
by hand!
</para>
-----------------------------------------------------------------------------
<para>
3. Troubleshooting: Questions and Answers
</para>
<para>
3.1. Question: How can I keep the settings through a reboot?
</para>
<para>
Answer: Whether you are using BSD-style or SysV-style (Red Hat, for example)
init, you can always include it in /etc/rc.d/rc.local. Here's what I have on
my SysV init system (Red Hat 3.0.3 and 4.0):
</para>
<para>
My /etc/rc.d/rc.local: (edited to show the relevant portions)
</para>
<para>
<screen>
#setting up IP alias interfaces
echo "Setting 172.16.3.1, 172.16.3.10, 172.16.3.100 IP Aliases ..."
/sbin/ifconfig lo 127.0.0.1
/sbin/ifconfig eth0 up
/sbin/ifconfig eth0 172.16.3.1
/sbin/ifconfig eth0:0 172.16.3.10
/sbin/ifconfig eth0:1 172.16.3.100
#setting up the routes
echo "Setting IP routes ..."
/sbin/route add -net 127.0.0.0
/sbin/route add -net 172.16.3.0 dev eth0
/sbin/route add -host 172.16.3.1 eth0
/sbin/route add -host 172.16.3.10 eth0:0
/sbin/route add -host 172.16.3.100 eth0:1
/sbin/route add default gw 172.16.3.200
#
</screen>
</para>
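<para>
With more than a couple of aliases the rc.local entries become
repetitive. One possible approach, sketched here with a hypothetical
helper name, is to generate the ifconfig and route lines in a loop. The
function only prints the commands, so they can be reviewed first or
piped to sh.
</para>

```shell
#!/bin/sh
# Sketch only: print the ifconfig/route commands for a list of alias
# addresses on one device, numbering the aliases from 0.
# alias_commands DEVICE ADDRESS...
alias_commands() {
    dev=$1
    shift
    num=0
    for addr in "$@"; do
        echo "/sbin/ifconfig $dev:$num $addr"
        echo "/sbin/route add -host $addr dev $dev:$num"
        num=$(( num + 1 ))
    done
}

alias_commands eth0 172.16.3.10 172.16.3.100
```

<para>
The main address and the default route are still configured once, by
hand, before the aliases.
</para>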
-----------------------------------------------------------------------------
<para>
3.2. Question: How do I set up the IP aliased machine to receive e-mail on
the various aliased IP addresses (on a machine using sendmail)?
</para>
<para>
Answer: Create (if it doesn't already exist) a file called
/etc/mynames.cw, for example. The file does not have to have this exact
name, nor does it have to be in the /etc directory.
</para>
<para>
In that file, place the official domain names of the aliased IP addresses. If
these aliased IP addresses do not have a domain name, then you can place the
IP address itself.
</para>
<para>
The /etc/mynames.cw might look like this:
</para>
<para>
<screen>
# /etc/mynames.cw - include all aliases for your machine here; # is a comment
domain.one.net
domain.two.com
domain.three.org
4.5.6.7
</screen>
</para>
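<para>
To see which names such a file actually contributes, the comment and
blank lines can be filtered out. This is only a convenience sketch; the
function name is made up, and sendmail itself does not need it.
</para>

```shell
#!/bin/sh
# Sketch only: list the host names in a .cw-style file, skipping blank
# lines and lines that start with '#'.
# Example: cw_names /etc/mynames.cw
cw_names() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}
```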
<para>
In your sendmail.cf file, where it defines a file class macro Fw, add the
following:
</para>
<para>
<screen>
##################
# local info #
##################
# file containing names of hosts for which we receive email
Fw/etc/mynames.cw
</screen>
</para>
<para>
That should do it. Test out the new setting by invoking sendmail in test
mode. The following is an example:
</para>
<para>
<screen>
ganymede$ /usr/lib/sendmail -bt
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter < ruleset> < address>
> 0 me@4.5.6.7
rewrite: ruleset 0 input: me @ 4 . 5 . 6 . 7
rewrite: ruleset 98 input: me @ 4 . 5 . 6 . 7
rewrite: ruleset 98 returns: me @ 4 . 5 . 6 . 7
rewrite: ruleset 97 input: me @ 4 . 5 . 6 . 7
rewrite: ruleset 3 input: me @ 4 . 5 . 6 . 7
rewrite: ruleset 96 input: me < @ 4 . 5 . 6 . 7 >
rewrite: ruleset 96 returns: me < @ 4 . 5 . 6 . 7 . >
rewrite: ruleset 3 returns: me < @ 4 . 5 . 6 . 7 . >
rewrite: ruleset 0 input: me < @ 4 . 5 . 6 . 7 . >
rewrite: ruleset 98 input: me < @ 4 . 5 . 6 . 7 . >
rewrite: ruleset 98 returns: me < @ 4 . 5 . 6 . 7 . >
rewrite: ruleset 0 returns: $# local $: me
rewrite: ruleset 97 returns: $# local $: me
rewrite: ruleset 0 returns: $# local $: me
> 0 me@4.5.6.8
rewrite: ruleset 0 input: me @ 4 . 5 . 6 . 8
rewrite: ruleset 98 input: me @ 4 . 5 . 6 . 8
rewrite: ruleset 98 returns: me @ 4 . 5 . 6 . 8
rewrite: ruleset 97 input: me @ 4 . 5 . 6 . 8
rewrite: ruleset 3 input: me @ 4 . 5 . 6 . 8
rewrite: ruleset 96 input: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 96 returns: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 3 returns: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 0 input: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 98 input: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 98 returns: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 95 input: < > me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 95 returns: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 0 returns: $# smtp $@ 4 . 5 . 6 . 8 $: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 97 returns: $# smtp $@ 4 . 5 . 6 . 8 $: me < @ 4 . 5 . 6 . 8 >
rewrite: ruleset 0 returns: $# smtp $@ 4 . 5 . 6 . 8 $: me < @ 4 . 5 . 6 . 8 >
>
</screen>
</para>
<para>
Notice when I tested me@4.5.6.7, it delivered the mail to the local machine,
while me@4.5.6.8 was handed off to the smtp mailer. That is the correct
response.
</para>
<para>
3.3. Question: How do I delete an alias?
</para>
<para>
Answer: To delete an alias, append a `-' to its name when you refer to
it. It is as simple as:
</para>
<para>
<screen>
root# ifconfig eth0:0- 0
</screen>
</para>
<para>
All routes associated with that alias will also be deleted
automatically.
</para>
<para>
You are all set now.
</para>
</sect1>
<sect1 id="Multicasting">
<title>Multicasting</title>
<para>
* Multicast HOWTO
A good page providing comparisons between reliable multicast protocols
is
<http://www.tascnets.com/mist/doc/mcpCompare.html>.
A very good and up-to-date site, with lots of interesting links
(Internet drafts, RFCs, papers, links to other sites) is:
<http://research.ivv.nasa.gov/RMP/links.html>.
<http://hill.lut.ac.uk/DS-Archive/MTP.html> is also a good source of
information on the subject.
Katia Obraczka's article "Multicast Transport Protocols: A Survey and
Taxonomy" gives a short description of each protocol and classifies
them by feature. It appeared in IEEE Communications Magazine,
January 1998, vol. 36, no. 1.
10. References.
10.1. RFCs.
o RFC 1112 "Host Extensions for IP Multicasting". Steve Deering.
August 1989.
o RFC 2236 "Internet Group Management Protocol, version 2". W.
Fenner. November 1997.
o RFC 1458 "Requirements for Multicast Protocols". Braudes, R and
Zabele, S. May 1993.
o RFC 1469 "IP Multicast over Token-Ring Local Area Networks". T.
Pusateri. June 1993.
o RFC 1390 "Transmission of IP and ARP over FDDI Networks". D. Katz.
January 1993.
o RFC 1583 "OSPF Version 2". John Moy. March 1994.
o RFC 1584 "Multicast Extensions to OSPF". John Moy. March 1994.
o RFC 1585 "MOSPF: Analysis and Experience". John Moy. March 1994.
o RFC 1812 "Requirements for IP version 4 Routers". Fred Baker,
Editor. June 1995.
o RFC 2117 "Protocol Independent Multicast-Sparse Mode (PIM-SM):
Protocol Specification". D. Estrin, D. Farinacci, A. Helmy, D.
Thaler; S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, and
L. Wei. July 1997.
o RFC 2189 "Core Based Trees (CBT version 2) Multicast Routing". A.
Ballardie. September 1997.
o RFC 2201 "Core Based Trees (CBT) Multicast Routing Architecture".
A. Ballardie. September 1997.
10.2. Internet Drafts.
o "Introduction to IP Multicast Routing".
draft-ietf-mboned-intro-multicast-03.txt. T. Maufer, C. Semeria.
July 1997.
o "Administratively Scoped IP Multicast".
draft-ietf-mboned-admin-ip-space-03.txt. D. Meyer. June 10, 1997.
10.3. Web pages.
o Linux Multicast Homepage.
<http://www.cs.virginia.edu/~mke2e/multicast.html>
o Linux Multicast FAQ.
<http://andrew.triumf.ca/pub/linux/multicast-FAQ>
o Multicast and MBONE on Linux.
<http://www.teksouth.com/linux/multicast/>
o Christian Daudt's MBONE-Linux Page.
<http://www.microplex.com/~csd/linux/mbone.html>
o Reliable Multicast Links
<http://research.ivv.nasa.gov/RMP/links.html>
o Multicast Transport Protocols
<http://hill.lut.ac.uk/DS-Archive/MTP.html>
10.4. Books.
o "TCP/IP Illustrated, Volume 1: The Protocols". Stevens, W. Richard.
Addison Wesley Publishing Company, Reading MA, 1994.
o "TCP/IP Illustrated, Volume 2: The Implementation". Wright, Gary
and W. Richard Stevens. Addison Wesley Publishing Company, Reading
MA, 1995.
o "UNIX Network Programming, Volume 1: Networking APIs: Sockets and
XTI". Stevens, W. Richard. Second Edition, Prentice Hall, Inc.
1998.
o "Internetworking with TCP/IP, Volume 1: Principles, Protocols, and
Architecture". Comer, Douglas E. Second Edition, Prentice Hall,
Inc. Englewood Cliffs, New Jersey, 1991.
</para>
</sect1 id="Multicast">
<sect1 id="Network-Management">
<title>Network-Management</title>
<para>
There is an impressive number of tools focused on network management
and remote administration under Linux. Some interesting remote administration
projects are linuxconf and webmin:
</para>
<para>
o Webmin <http://www.webmin.com/webmin/>
o Linuxconf <http://www.solucorp.qc.ca/linuxconf/>
</para>
<para>
Other tools include network traffic analysis tools, network security
tools, monitoring tools, configuration tools, etc. An archive of many
of these tools may be found at Metalab
<http://www.metalab.unc.edu/pub/Linux/system/network/>
</para>
9.2. SNMP
<para>
The Simple Network Management Protocol (SNMP) is a protocol for Internet
network management services. It allows for remote monitoring and
configuration of routers, bridges, network cards, switches, etc.
Many libraries, clients, daemons and SNMP-based monitoring programs
are available for Linux. A good page dealing with SNMP and Linux
software may be found at: http://linas.org/linux/NMS.html
</para>
10. Enterprise Linux Networking
<para>
In certain situations the networking infrastructure must include
mechanisms that keep the network available nearly 100% of the time.
Some related techniques are described in the following sections. Most
of the following material can be found at the excellent Linas website:
http://linas.org/linux/index.html and in the Linux High-Availability
HOWTO
<http://metalab.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html>
</para>
10.1. High Availability
<para>
Redundancy is used to prevent the overall IT system from having single
points of failure. A server with only one network card and a single
SCSI disk has two single points of failure. The objective is to mask
unplanned outages from users so that they can quickly resume their
work. High availability software is a set of scripts and tools
that automatically monitor and detect failures, taking the appropriate
steps to restore normal operation and notifying system
administrators.
</para>
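<para>
The monitor, detect and react cycle described above can be sketched as a
small polling loop. This is a minimal illustration under stated assumptions,
not the design of any particular high availability package: the names
service_is_up and monitor, the TCP connect probe, and the max_misses
threshold are all hypothetical choices made for the example.
</para>

```python
import socket
import time


def service_is_up(host, port, timeout=2.0):
    """Probe a service by attempting a TCP connection (assumed health check)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor(check, on_failure, max_misses=3, interval=5.0,
            rounds=None, sleep=time.sleep):
    """Call check() periodically; fire on_failure() after max_misses
    consecutive failed checks. rounds=None means run forever."""
    misses = 0
    done = 0
    while rounds is None or done != rounds:
        if check():
            misses = 0          # a good probe resets the failure counter
        else:
            misses += 1
            if misses == max_misses:
                # Fires once per outage: restore service, notify admins, etc.
                on_failure()
        done += 1
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical target: a mail server on the example address.
    monitor(lambda: service_is_up("4.5.6.7", 25),
            lambda: print("service down: starting recovery, notifying admin"),
            rounds=3, interval=1.0)
```

<para>
Requiring several consecutive misses before acting is a deliberate choice in
this sketch: a single lost probe should not trigger a takeover.
</para>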
</sect1 id="Networking-Management">
<sect1 id="Redundant-Networking">
<title>Redundant-Networking</title>
<para>
IP Address Takeover (IPAT): when a network adapter card fails, its IP
address should be taken over by a working network card in the same node
or in another node. MAC Address Takeover: when an IP takeover occurs,
all the nodes in the network must update their ARP caches (the mapping
between IP and MAC addresses).
</para>
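<para>
One common way to make neighbouring nodes refresh their ARP caches after a
takeover is to broadcast a gratuitous ARP, that is, an ARP packet in which
the sender and target IP are both the taken-over address. The text above
does not name a mechanism, so treat this as an assumption. The sketch below
only builds the 42-byte Ethernet frame; the function names and the example
MAC address are hypothetical, and send_frame needs Linux and root
privileges.
</para>

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("MAC address must be 6 bytes: %r" % mac)
    return raw


def build_gratuitous_arp(new_mac, taken_over_ip):
    """Build an Ethernet frame carrying a gratuitous ARP request that
    announces taken_over_ip as now living at new_mac."""
    sha = mac_to_bytes(new_mac)              # sender hardware address
    spa = socket.inet_aton(taken_over_ip)    # sender protocol address
    ethernet_header = BROADCAST_MAC + sha + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: ARP request
        sha,
        spa,
        b"\x00" * 6,  # target hardware address: unused in a request
        spa,          # target IP equals sender IP: that makes it gratuitous
    )
    return ethernet_header + arp_payload


def send_frame(frame, interface):
    """Send a raw frame on interface (Linux only, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((interface, 0))
        s.send(frame)


if __name__ == "__main__":
    frame = build_gratuitous_arp("00:11:22:33:44:55", "4.5.6.7")
    print(len(frame), frame.hex())
    # send_frame(frame, "eth0")  # uncomment on the node taking over the IP
```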
<para>
See the High-Availability HOWTO for more details:
http://metalab.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html
</para>
</sect1 id="Redundant-Networking">