Traffic Control HOWTO

Version 1.0.2

Martin A. Brown

[http://linux-ip.net/] linux-ip.net

<martin@linux-ip.net>

Oct 2006

Revision History
Revision 1.0.2 2006-10-28 Revised by: MAB
Add references to HFSC, alter author email addresses
Revision 1.0.1 2003-11-17 Revised by: MAB
Added link to Leonardo Balliache's documentation
Revision 1.0 2003-09-24 Revised by: MAB
reviewed and approved by TLDP
Revision 0.7 2003-09-14 Revised by: MAB
incremental revisions, proofreading, ready for TLDP
Revision 0.6 2003-09-09 Revised by: MAB
minor editing, corrections from Stef Coene
Revision 0.5 2003-09-01 Revised by: MAB
HTB section mostly complete, more diagrams, LARTC pre-release
Revision 0.4 2003-08-30 Revised by: MAB
added diagram
Revision 0.3 2003-08-29 Revised by: MAB
substantial completion of classless, software, rules, elements and components
sections
Revision 0.2 2003-08-23 Revised by: MAB
major work on overview, elements, components and software sections
Revision 0.1 2003-08-15 Revised by: MAB
initial revision (outline complete)

Traffic control encompasses the sets of mechanisms and operations by which
packets are queued for transmission/reception on a network interface. The
operations include enqueuing, policing, classifying, scheduling, shaping and
dropping. This HOWTO provides an introduction and overview of the
capabilities and implementation of traffic control under Linux.

© 2006, Martin A. Brown

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or any
later version published by the Free Software Foundation; with no
invariant sections, with no Front-Cover Texts, with no Back-Cover Text. A
copy of the license is located at [http://www.gnu.org/licenses/fdl.html]
http://www.gnu.org/licenses/fdl.html.

-----------------------------------------------------------------------------

Table of Contents

1. Introduction to Linux Traffic Control
    1.1. Target audience and assumptions about the reader
    1.2. Conventions
    1.3. Recommended approach
    1.4. Missing content, corrections and feedback

2. Overview of Concepts
    2.1. What is it?
    2.2. Why use it?
    2.3. Advantages
    2.4. Disadvantages
    2.5. Queues
    2.6. Flows
    2.7. Tokens and buckets
    2.8. Packets and frames

3. Traditional Elements of Traffic Control
    3.1. Shaping
    3.2. Scheduling
    3.3. Classifying
    3.4. Policing
    3.5. Dropping
    3.6. Marking

4. Components of Linux Traffic Control
    4.1. qdisc
    4.2. class
    4.3. filter
    4.4. classifier
    4.5. policer
    4.6. drop
    4.7. handle

5. Software and Tools
    5.1. Kernel requirements
    5.2. iproute2 tools (tc)
    5.3. tcng, Traffic Control Next Generation
    5.4. IMQ, Intermediate Queuing device

6. Classless Queuing Disciplines (qdiscs)
    6.1. FIFO, First-In First-Out (pfifo and bfifo)
    6.2. pfifo_fast, the default Linux qdisc
    6.3. SFQ, Stochastic Fair Queuing
    6.4. ESFQ, Extended Stochastic Fair Queuing
    6.5. GRED, Generic Random Early Drop
    6.6. TBF, Token Bucket Filter

7. Classful Queuing Disciplines (qdiscs)
    7.1. HTB, Hierarchical Token Bucket
    7.2. HFSC, Hierarchical Fair Service Curve
    7.3. PRIO, priority scheduler
    7.4. CBQ, Class Based Queuing

8. Rules, Guidelines and Approaches
    8.1. General Rules of Linux Traffic Control
    8.2. Handling a link with a known bandwidth
    8.3. Handling a link with a variable (or unknown) bandwidth
    8.4. Sharing/splitting bandwidth based on flows
    8.5. Sharing/splitting bandwidth based on IP

9. Scripts for use with QoS/Traffic Control
    9.1. wondershaper
    9.2. ADSL Bandwidth HOWTO script (myshaper)
    9.3. htb.init
    9.4. tcng.init
    9.5. cbq.init

10. Diagram
    10.1. General diagram

11. Annotated Traffic Control Links

1. Introduction to Linux Traffic Control

Linux offers a very rich set of tools for managing and manipulating the
transmission of packets. The larger Linux community is very familiar with the
tools available under Linux for packet mangling and firewalling (netfilter,
and before that, ipchains) as well as hundreds of network services which can
run on the operating system. Few inside the community and fewer outside the
Linux community are aware of the tremendous power of the traffic control
subsystem which has grown and matured under kernels 2.2 and 2.4.

This HOWTO purports to introduce the concepts of traffic control, the
traditional elements (in general), the components of the Linux traffic
control implementation and provide some guidelines. This HOWTO represents
the collection, amalgamation and synthesis of the [http://lartc.org/howto/]
LARTC HOWTO, documentation from individual projects and importantly the LARTC
mailing list over a period of study.

The impatient soul, who simply wishes to experiment right now, is
recommended to the [http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/]
Traffic Control using tcng and HTB HOWTO and [http://lartc.org/howto/] LARTC
HOWTO for immediate satisfaction.

-----------------------------------------------------------------------------

1.1. Target audience and assumptions about the reader

The target audience for this HOWTO is the network administrator or savvy
home user who desires an introduction to the field of traffic control and an
overview of the tools available under Linux for implementing traffic control.

I assume that the reader is comfortable with UNIX concepts and the command
line and has a basic knowledge of IP networking. Users who wish to implement
traffic control may require the ability to patch, compile and install a
kernel or software package [1]. For users with newer kernels (2.4.20+, see
also Section 5.1), however, the ability to install and use software may be
all that is required.

Broadly speaking, this HOWTO was written with a sophisticated user in mind,
perhaps one who has already had experience with traffic control under Linux.
Even so, I assume that the reader may have no prior traffic control
experience.
-----------------------------------------------------------------------------

1.2. Conventions

This text was written in [http://www.docbook.org/] DocBook
([http://www.docbook.org/xml/4.2/index.html] version 4.2) with vim. All
formatting has been applied by [http://xmlsoft.org/XSLT/] xsltproc based on
DocBook XSL and LDP XSL stylesheets. Typeface formatting and display
conventions are similar to most printed and electronically distributed
technical documentation.
-----------------------------------------------------------------------------

1.3. Recommended approach

I strongly recommend that the eager reader making a first foray into the
discipline of traffic control become only casually familiar with the tc
command line utility before concentrating on tcng. The tcng software package
defines an entire language for describing traffic control structures. At
first, this language may seem daunting, but mastery of these basics will
quickly provide the user with a much wider ability to employ (and deploy)
traffic control configurations than the direct use of tc would afford.

Where possible, I'll describe the behaviour of the Linux traffic control
system in an abstract manner, although in many cases I'll need to supply the
syntax of one or the other common systems for defining these structures. I
may not supply examples in both the tcng language and the tc command line, so
the wise user will have some familiarity with both.

-----------------------------------------------------------------------------

1.4. Missing content, corrections and feedback

There is content yet missing from this HOWTO. In particular, the following
items will be added at some point to this documentation.

  * A description and diagram of GRED, WRR, PRIO and CBQ.

  * A section of examples.

  * A section detailing the classifiers.

  * A section discussing the techniques for measuring traffic.

  * A section covering meters.

  * More details on tcng.

I welcome suggestions, corrections and feedback at <martin@linux-ip.net>.
All errors and omissions are strictly my fault. Although I have made every
effort to verify the factual correctness of the content presented herein, I
cannot accept any responsibility for actions taken under the influence of
this documentation.

-----------------------------------------------------------------------------

2. Overview of Concepts

This section will introduce traffic control and examine reasons for it,
identify a few advantages and disadvantages and introduce key concepts used
in traffic control.
-----------------------------------------------------------------------------

2.1. What is it?

Traffic control is the name given to the sets of queuing systems and
mechanisms by which packets are received and transmitted on a router. This
includes deciding which (and whether) packets to accept at what rate on the
input of an interface and determining which packets to transmit in what order
at what rate on the output of an interface.

In the overwhelming majority of situations, traffic control consists of a
single queue which collects entering packets and dequeues them as quickly as
the hardware (or underlying device) can accept them. This sort of queue is a
FIFO.

Note: The default qdisc under Linux is the pfifo_fast, which is slightly more
complex than the FIFO.

There are examples of queues in all sorts of software. The queue is a way
of organizing the pending tasks or data (see also Section 2.5). Because
network links typically carry data in a serialized fashion, a queue is
required to manage the outbound data packets.

In the case of a desktop machine and an efficient web server sharing the
same uplink to the Internet, the following contention for bandwidth may
occur. The web server may be able to fill up the output queue on the router
faster than the data can be transmitted across the link, at which point the
router starts to drop packets (its buffer is full!). Now, the desktop machine
(with an interactive application user) may be faced with packet loss and high
latency. Note that high latency sometimes leads to screaming users! By
separating the internal queues used to service these two different classes of
application, there can be better sharing of the network resource between the
two applications.

Traffic control is the set of tools which allows the user to have granular
control over these queues and the queuing mechanisms of a networked device.
The power to rearrange traffic flows and packets with these tools is
tremendous and can be complicated, but is no substitute for adequate
bandwidth.

The term Quality of Service (QoS) is often used as a synonym for traffic
control.
-----------------------------------------------------------------------------

2.2. Why use it?

Packet-switched networks differ from circuit-based networks in one very
important regard. A packet-switched network itself is stateless. A
circuit-based network (such as a telephone network) must hold state within
the network. IP networks are stateless and packet-switched networks by
design; in fact, this statelessness is one of the fundamental strengths of
IP.

The weakness of this statelessness is the lack of differentiation between
types of flows. In simplest terms, traffic control allows an administrator to
queue packets differently based on attributes of the packet. It can even be
used to simulate the behaviour of a circuit-based network. This introduces
statefulness into the stateless network.

There are many practical reasons to consider traffic control, and many
scenarios in which using traffic control makes sense. Below are some examples
of common problems which can be solved or at least ameliorated with these
tools.

The list below is not an exhaustive list of the sorts of solutions
available to users of traffic control, but introduces the types of problems
that can be solved by using traffic control to maximize the usability of a
network connection.

Common traffic control solutions

  * Limit total bandwidth to a known rate; TBF, HTB with child class(es).

  * Limit the bandwidth of a particular user, service or client; HTB
    classes and classifying traffic with a filter.

  * Maximize TCP throughput on an asymmetric link; prioritize transmission
    of ACK packets, wondershaper.

  * Reserve bandwidth for a particular application or user; HTB with
    children classes and classifying.

  * Prefer latency sensitive traffic; PRIO inside an HTB class.

  * Manage oversubscribed bandwidth; HTB with borrowing.

  * Allow equitable distribution of unreserved bandwidth; HTB with
    borrowing.

  * Ensure that a particular type of traffic is dropped; policer attached
    to a filter with a drop action.

Remember, too, that sometimes it is simply better to purchase more
bandwidth. Traffic control does not solve all problems!

-----------------------------------------------------------------------------

2.3. Advantages

When properly employed, traffic control should lead to more predictable
usage of network resources and less volatile contention for these resources.
The network then meets the goals of the traffic control configuration. Bulk
download traffic can be allocated a reasonable amount of bandwidth even as
higher priority interactive traffic is simultaneously serviced. Even low
priority data transfer such as mail can be allocated bandwidth without
tremendously affecting the other classes of traffic.

In a larger picture, if the traffic control configuration represents policy
which has been communicated to the users, then users (and, by extension,
applications) know what to expect from the network.

-----------------------------------------------------------------------------

2.4. Disadvantages

Complexity is easily one of the most significant disadvantages of using
traffic control. There are ways to become familiar with traffic control tools
which ease the learning curve about traffic control and its mechanisms, but
identifying a traffic control misconfiguration can be quite a challenge.

Traffic control when used appropriately can lead to more equitable
distribution of network resources. It can just as easily be installed in an
inappropriate manner leading to further and more divisive contention for
resources.

The computing resources required on a router to support a traffic control
scenario need to be capable of handling the increased cost of maintaining the
traffic control structures. Fortunately, this is a small incremental cost,
but can become more significant as the configuration grows in size and
complexity.

For personal use, there's no training cost associated with the use of
traffic control, but a company may find that purchasing more bandwidth is a
simpler solution than employing traffic control. Training employees and
ensuring depth of knowledge may be more costly than investing in more
bandwidth.

Although traffic control on packet-switched networks covers a larger
conceptual area, you can think of traffic control as a way to provide [some
of] the statefulness of a circuit-based network to a packet-switched network.
-----------------------------------------------------------------------------

2.5. Queues

Queues form the backdrop for all of traffic control and are the integral
concept behind scheduling. A queue is a location (or buffer) containing a
finite number of items waiting for an action or service. In networking, a
queue is the place where packets (our units) wait to be transmitted by the
hardware (the service). In the simplest model, packets are transmitted on a
first-come, first-served basis [2]. In the discipline of computer networking
(and more generally computer science), this sort of a queue is known as a
FIFO.

Without any other mechanisms, a queue doesn't offer any promise for traffic
control. There are only two interesting actions in a queue. Anything entering
a queue is enqueued into the queue. To remove an item from a queue is to
dequeue that item.

A queue becomes much more interesting when coupled with other mechanisms
which can delay packets, rearrange, drop and prioritize packets in multiple
queues. A queue can also use subqueues, which allow for complexity of
behaviour in a scheduling operation.

From the perspective of the higher layer software, a packet is simply
enqueued for transmission, and the manner and order in which the enqueued
packets are transmitted is immaterial to the higher layer. So, to the higher
layer, the entire traffic control system may appear as a single queue [3]. It
is only by examining the internals of this layer that the traffic control
structures become exposed and available.
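
The two queue actions described above, enqueue and dequeue, can be sketched
in a few lines of Python. This is only an illustration of a finite FIFO that
drops arrivals once it is full; the class name FifoQueue and its limit
parameter are made up for this example and do not correspond to any kernel
structure.

```python
from collections import deque


class FifoQueue:
    """A finite first-in first-out packet queue (illustrative only)."""

    def __init__(self, limit):
        self.limit = limit        # maximum number of packets held
        self.packets = deque()    # head of the queue is on the left
        self.dropped = 0          # arrivals discarded because the queue was full

    def enqueue(self, packet):
        """Add a packet at the tail; drop it if the queue is already full."""
        if len(self.packets) >= self.limit:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the packet at the head, or None if empty."""
        return self.packets.popleft() if self.packets else None


q = FifoQueue(limit=2)
results = [q.enqueue(p) for p in ("a", "b", "c")]  # third arrival finds it full
first_out = q.dequeue()                            # the oldest packet leaves first
```

With a limit of two, the third packet is dropped on arrival, which is the
"buffer is full" situation described in Section 2.1.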
-----------------------------------------------------------------------------

2.6. Flows

A flow is a distinct connection or conversation between two hosts. Any
unique set of packets between two hosts can be regarded as a flow. Under TCP
the concept of a connection with a source IP and port and destination IP and
port represents a flow. A UDP flow can be similarly defined.

Traffic control mechanisms frequently separate traffic into classes of
flows which can be aggregated and transmitted as an aggregated flow (consider
DiffServ). Alternate mechanisms may attempt to divide bandwidth equally based
on the individual flows.

Flows become important when attempting to divide bandwidth equally among a
set of competing flows, especially when some applications deliberately build
a large number of flows.
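
As a rough illustration of the flow concept, the sketch below groups packets
by source address and port plus destination address and port. Packets are
modelled as plain dictionaries, the field names are hypothetical, and adding
the protocol to the key is an assumption made here so that TCP and UDP
traffic between the same endpoints stays separate. Each direction of a
conversation is treated as its own flow for simplicity.

```python
from collections import defaultdict


def flow_key(pkt):
    """Build a key identifying the flow a packet belongs to (one direction)."""
    return (pkt["proto"], pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])


def group_by_flow(packets):
    """Return a dict mapping each flow key to the list of its packets."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


packets = [
    {"proto": "tcp", "src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.9", "dport": 80},
    {"proto": "tcp", "src": "10.0.0.1", "sport": 40001, "dst": "10.0.0.9", "dport": 80},
    {"proto": "tcp", "src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.9", "dport": 80},
]
flows = group_by_flow(packets)
```

Two connections from the same host to the same server differ only in source
port, yet count as two flows. This is why an application that opens many
connections can claim a larger share under per-flow fairness.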
-----------------------------------------------------------------------------

2.7. Tokens and buckets

Two of the key underpinnings of a shaping mechanism are the interrelated
concepts of tokens and buckets.

In order to control the rate of dequeuing, an implementation can count the
number of packets or bytes dequeued as each item is dequeued, although this
requires complex usage of timers and measurements to limit accurately.
Instead of calculating the current usage and time, one method, used widely in
traffic control, is to generate tokens at a desired rate, and only dequeue
packets or bytes if a token is available.

Consider the analogy of an amusement park ride with a queue of people
waiting to experience the ride. Let's imagine carts traversing a fixed
track. The carts arrive at the head of the queue at a fixed rate. In
order to enjoy the ride, each person must wait for an available cart. The
cart is analogous to a token and the person is analogous to a packet. Again,
this mechanism is a rate-limiting or shaping mechanism. Only a certain number
of people can experience the ride in a particular period.

To extend the analogy, imagine an empty line for the amusement park ride
and a large number of carts sitting on the track ready to carry people. If a
large number of people entered the line together many (maybe all) of them
could experience the ride because of the carts available and waiting. The
number of carts available is a concept analogous to the bucket. A bucket
contains a number of tokens and can use all of the tokens in the bucket
without regard for passage of time.

And to complete the analogy, the carts on the amusement park ride (our
tokens) arrive at a fixed rate and are only kept available up to the size of
the bucket. So, the bucket is filled with tokens according to the rate, and
if the tokens are not used, the bucket can fill up. If tokens are used the
bucket will not fill up. Buckets are a key concept in supporting bursty
traffic such as HTTP.

The TBF qdisc is a classical example of a shaper (the section on TBF
includes a diagram which may help to visualize the token and bucket
concepts). The TBF generates tokens at the configured rate and only transmits
packets when a token is available. Tokens are a generic shaping concept.

In the case that a queue does not need tokens immediately, the tokens can
be collected until they are needed. To collect tokens indefinitely would
negate any benefit of shaping so tokens are collected until a certain number
of tokens has been reached. Now, the queue has tokens available for a large
number of packets or bytes which need to be dequeued. These intangible tokens
are stored in an intangible bucket, and the number of tokens that can be
stored depends on the size of the bucket.

This also means that a bucket full of tokens may be available at any
instant. Very predictable regular traffic can be handled by small buckets.
Larger buckets may be required for burstier traffic, unless one of the
desired goals is to reduce the burstiness of the flows.

In summary, tokens are generated at rate, and a maximum of a bucket's worth
of tokens may be collected. This allows bursty traffic to be handled, while
smoothing and shaping the transmitted traffic.

The concepts of tokens and buckets are closely interrelated and are used in
both TBF (one of the classless qdiscs) and HTB (one of the classful qdiscs).
Within the tcng language, the use of two- and three-color meters is
indubitably a token and bucket concept.
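
The summary above (tokens arrive at a rate, and at most one bucket's worth is
kept) can be expressed as a small simulation. This is a minimal sketch and
not the kernel's TBF or HTB code: the class name TokenBucket and the
parameter names rate and burst are chosen for this example, and time is
passed in explicitly as a simulated clock so the behaviour is repeatable.

```python
class TokenBucket:
    """Toy token bucket driven by a caller-supplied clock (seconds)."""

    def __init__(self, rate, burst):
        self.rate = rate          # tokens generated per second
        self.burst = burst        # bucket size: most tokens that can be saved
        self.tokens = burst       # assumption: the bucket starts full
        self.last = 0.0           # time of the previous refill

    def _refill(self, now):
        # Add the tokens generated since the last call, capped at the bucket size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def consume(self, now, amount=1):
        """Take `amount` tokens if available; return whether it succeeded."""
        self._refill(now)
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


bucket = TokenBucket(rate=1, burst=3)
burst_results = [bucket.consume(0.0) for _ in range(4)]   # bucket holds only 3
later_results = [bucket.consume(2.0) for _ in range(3)]   # 2 s later: 2 new tokens
bucket.consume(100.0, amount=0)                           # long idle period
saved_after_idle = bucket.tokens                          # capped at the bucket size
```

The first three requests succeed at once because the bucket is full, which is
the burst allowance. After that, requests succeed only as fast as tokens are
generated, and a long idle period never saves more than the bucket size.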
-----------------------------------------------------------------------------

2.8. Packets and frames

The term for data sent across a network changes depending on the layer the
user is examining. This document will rather impolitely (and incorrectly)
gloss over the technical distinction between packets and frames although they
are outlined here.

The word frame is typically used to describe a layer 2 (data link) unit of
data to be forwarded to the next recipient. Ethernet interfaces, PPP
interfaces, and T1 interfaces all name their layer 2 data unit a frame. The
frame is actually the unit on which traffic control is performed.

A packet, on the other hand, is a higher layer concept, representing layer
3 (network) units. The term packet is preferred in this documentation,
although it is slightly inaccurate.
-----------------------------------------------------------------------------

3. Traditional Elements of Traffic Control

-----------------------------------------------------------------------------

3.1. Shaping

Shapers delay packets to meet a desired rate.

Shaping is the mechanism by which packets are delayed before transmission
in an output queue to meet a desired output rate. This is one of the most
common desires of users seeking bandwidth control solutions. The act of
delaying a packet as part of a traffic control solution makes every shaping
mechanism into a non-work-conserving mechanism, meaning roughly: "Work is
required in order to delay packets." More precisely, a non-work-conserving
mechanism may hold packets back even while the link is idle.

Viewed in reverse, a non-work-conserving queuing mechanism is performing a
shaping function. A work-conserving queuing mechanism (see PRIO) would not be
capable of delaying a packet.

Shapers attempt to limit or ration traffic to meet but not exceed a
configured rate (frequently measured in packets per second or bits/bytes per
second). As a side effect, shapers can smooth out bursty traffic [4]. One of
the advantages of shaping bandwidth is the ability to control latency of
packets. The underlying mechanism for shaping to a rate is typically a token
and bucket mechanism. See also Section 2.7 for further detail on tokens and
buckets.
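
To make the idea of delaying packets concrete, the following sketch computes
when each packet would leave a shaper built on a token bucket measured in
bytes. It is a simplified model and not how any particular qdisc is
implemented: the function name shape is invented for this example, packets
leave in arrival order, the bucket is assumed to start full, and no packet is
assumed to be larger than the bucket.

```python
def shape(arrivals, rate, burst):
    """Return departure times for a list of (arrival_time, size) packets.

    rate  -- tokens (bytes) generated per second
    burst -- bucket size in bytes
    """
    tokens = burst      # assumption: start with a full bucket
    clock = 0.0         # time at which the previous packet departed
    departures = []
    for arrival, size in arrivals:
        start = max(arrival, clock)                     # no overtaking
        tokens = min(burst, tokens + (start - clock) * rate)
        if tokens < size:
            # Not enough tokens yet: delay the packet until they accumulate.
            start += (size - tokens) / rate
            tokens = size
        tokens -= size
        clock = start
        departures.append(start)
    return departures


# Three 100-byte packets arrive together; the shaper allows 100 bytes/second.
departures = shape([(0.0, 100), (0.0, 100), (0.0, 100)], rate=100, burst=100)
```

The burst of three packets is spread out to one packet per second. The link
sits idle between departures even though packets are waiting, which is the
non-work-conserving behaviour described above.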
-----------------------------------------------------------------------------

3.2. Scheduling

Schedulers arrange and/or rearrange packets for output.

Scheduling is the mechanism by which packets are arranged (or rearranged)
between input and output of a particular queue. By far the most common
scheduler is the FIFO (first-in first-out) scheduler. From a larger
perspective, any set of traffic control mechanisms on an output queue can be
regarded as a scheduler, because packets are arranged for output.

Other generic scheduling mechanisms attempt to compensate for various
networking conditions. A fair queuing algorithm (see SFQ) attempts to prevent
any single client or flow from dominating the network usage. A round-robin
algorithm (see WRR) gives each flow or client a turn to dequeue packets.
Other sophisticated scheduling algorithms attempt to prevent backbone
overload (see GRED) or refine other scheduling mechanisms (see ESFQ).
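
The round-robin idea mentioned above can be sketched as follows. This is
plain, unweighted round-robin over per-flow queues, written only to show how
a scheduler rearranges packets; it is not the WRR qdisc, and the function
name round_robin is invented for this example.

```python
from collections import deque


def round_robin(queues):
    """Yield packets by giving each non-empty queue one turn at a time."""
    active = deque(deque(q) for q in queues if q)
    while active:
        q = active.popleft()
        yield q.popleft()       # this queue's turn: send one packet
        if q:
            active.append(q)    # still has packets: back of the line


# Flow "a" has queued three packets, but cannot starve flows "b" and "c".
order = list(round_robin([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]))
```

A FIFO would have sent a1, a2 and a3 first. Here the busy flow only gets
ahead once the other flows have nothing left to send.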
-----------------------------------------------------------------------------

3.3. Classifying

Classifiers sort or separate traffic into queues.

Classifying is the mechanism by which packets are separated for different
treatment, possibly different output queues. During the process of accepting,
routing and transmitting a packet, a networking device can classify the
packet in a number of different ways. Classification can include marking the
packet, which usually happens on the boundary of a network under a single
administrative control, or classification can occur on each hop individually.

The Linux model (see Section 4.3) allows for a packet to cascade across a
series of classifiers in a traffic control structure and to be classified in
conjunction with policers (see also Section 4.5).
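
One way to picture a packet cascading across a series of classifiers is an
ordered list of tests where the first match decides the class. The sketch
below is a deliberately simplified model and not the tc filter API: packets
are dictionaries, and the class names and port numbers are examples chosen
for illustration.

```python
def classify(packet, filters, default="default"):
    """Return the class of the first filter whose predicate matches."""
    for predicate, class_name in filters:
        if predicate(packet):
            return class_name
    return default      # nothing matched: fall through to a default class


# Filters are tried in order, so more specific rules should come first.
filters = [
    (lambda p: p.get("dport") == 22, "interactive"),
    (lambda p: p.get("dport") in (80, 443), "web"),
]

ssh_class = classify({"dport": 22}, filters)
web_class = classify({"dport": 443}, filters)
other_class = classify({"dport": 25}, filters)
```

Each resulting class name would then select a queue, or other treatment, for
the packet.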
-----------------------------------------------------------------------------

3.4. Policing

Policers measure and limit traffic in a particular queue.

Policing, as an element of traffic control, is simply a mechanism by which
traffic can be limited. Policing is most frequently used on the network
border to ensure that a peer is not consuming more than its allocated
bandwidth. A policer will accept traffic up to a certain rate, and then
perform an action on traffic exceeding this rate. A rather harsh solution is
to drop the traffic, although the traffic could be reclassified instead of
being dropped.

A policer is a yes/no question about the rate at which traffic is entering
a queue. If the packet is about to enter a queue below a given rate, take one
action (allow the enqueuing). If the packet is about to enter a queue above a
given rate, take another action. Although the policer uses a token bucket
mechanism internally, it does not have the capability to delay a packet as a
shaping mechanism does.
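
The yes/no nature of a policer can be sketched with the same token bucket
arithmetic used for shaping, with one difference: a packet that finds too few
tokens is never delayed, it simply receives the "exceed" action. The class
name Policer, the action strings and the byte-based rate are assumptions made
for this illustration.

```python
class Policer:
    """Toy policer: decides immediately, never queues or delays a packet."""

    def __init__(self, rate, burst, exceed_action="drop"):
        self.rate = rate                    # bytes of credit per second
        self.burst = burst                  # most credit that can be saved
        self.exceed_action = exceed_action  # e.g. "drop" or "reclassify"
        self.tokens = burst                 # assumption: start with full credit
        self.last = 0.0

    def check(self, now, size):
        """Return "accept" if the packet is within the rate, else the exceed action."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return "accept"
        return self.exceed_action           # tokens are left untouched


policer = Policer(rate=100, burst=200)
verdicts = [
    policer.check(0.0, 150),   # within the burst allowance
    policer.check(0.0, 100),   # only 50 bytes of credit left
    policer.check(1.0, 100),   # one second later: 150 bytes of credit
]
```

Compare this with a shaper, which would have held the second packet for half
a second instead of discarding it.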
-----------------------------------------------------------------------------

3.5. Dropping

Dropping discards an entire packet, flow or classification.

Dropping a packet is a mechanism by which a packet is discarded.

-----------------------------------------------------------------------------

3.6. Marking

Marking is a mechanism by which the packet is altered.

Note: This is not fwmark. The iptables target MARK and the ipchains --mark
are used to modify packet metadata, not the packet itself.

Traffic control marking mechanisms install a DSCP on the packet itself,
which is then used and respected by other routers inside an administrative
domain (usually for DiffServ).
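
As a small worked example of what installing a DSCP means at the bit level:
the DSCP occupies the upper six bits of the byte that IPv4 originally called
the TOS field, and the lower two bits are left alone (they are used for ECN).
The helper names below are invented for this illustration, and the sketch
operates on a single integer, not on a real packet.

```python
def set_dscp(tos_byte, dscp):
    """Return the TOS/DS byte with its upper six bits replaced by `dscp`."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return (dscp << 2) | (tos_byte & 0x03)   # keep the two low-order bits


def get_dscp(tos_byte):
    """Extract the DSCP from a TOS/DS byte."""
    return tos_byte >> 2


EF = 46                          # Expedited Forwarding code point
marked = set_dscp(0x00, EF)      # 0xB8
marked_keep_low = set_dscp(0x03, EF)
```

Because the value is written into the packet header itself, routers further
along the path can read it, unlike fwmark, which exists only as metadata on
the local machine.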
-----------------------------------------------------------------------------

4. Components of Linux Traffic Control

Table 1. Correlation between traffic control elements and Linux components
+-------------------+-------------------------------------------------------+
|traditional element|Linux component                                        |
+-------------------+-------------------------------------------------------+
|shaping            |The class offers shaping capabilities.                 |
+-------------------+-------------------------------------------------------+
|scheduling         |A qdisc is a scheduler. Schedulers can be simple such  |
|                   |as the FIFO or complex, containing classes and other   |
|                   |qdiscs, such as HTB.                                   |
+-------------------+-------------------------------------------------------+
|classifying        |The filter object performs the classification through  |
|                   |the agency of a classifier object. Strictly speaking,  |
|                   |Linux classifiers cannot exist outside of a filter.    |
+-------------------+-------------------------------------------------------+
|policing           |A policer exists in the Linux traffic control          |
|                   |implementation only as part of a filter.               |
+-------------------+-------------------------------------------------------+
|dropping           |To drop traffic requires a filter with a policer which |
|                   |uses "drop" as an action.                              |
+-------------------+-------------------------------------------------------+
|marking            |The dsmark qdisc is used for marking.                  |
+-------------------+-------------------------------------------------------+
-----------------------------------------------------------------------------

4.1. qdisc

Simply put, a qdisc is a scheduler (Section 3.2). Every output interface
needs a scheduler of some kind, and the default scheduler is a FIFO. Other
qdiscs available under Linux will rearrange the packets entering the
scheduler's queue in accordance with that scheduler's rules.

The qdisc, also called a queuing discipline, is the major building block on
which all of Linux traffic control is built.

The classful qdiscs can contain classes, and provide a handle to which to
attach filters. There is no prohibition on using a classful qdisc without
child classes, although this will usually consume cycles and other system
resources for no benefit.

The classless qdiscs can contain no classes, nor is it possible to attach a
filter to a classless qdisc. Because a classless qdisc contains no children
of any kind, there is no utility to classifying.

A source of terminology confusion is the usage of the terms root qdisc and
ingress qdisc. These are not really queuing disciplines, but rather locations
onto which traffic control structures can be attached for egress (outbound
traffic) and ingress (inbound traffic).

Each interface contains both. The primary and more common is the egress
qdisc, known as the root qdisc. It can contain any of the queuing disciplines
(qdiscs) with potential classes and class structures. The overwhelming
majority of documentation applies to the root qdisc and its children. Traffic
transmitted on an interface traverses the egress or root qdisc.

For traffic accepted on an interface, the ingress qdisc is traversed. With
its limited utility, it allows no child class to be created, and only exists
as an object onto which a filter can be attached. For practical purposes, the
ingress qdisc is merely a convenient object onto which to attach a policer to
limit the amount of traffic accepted on a network interface.

In short, you can do much more with an egress qdisc because it contains a
real qdisc and the full power of the traffic control system. An ingress qdisc
can only support a policer. The remainder of the documentation will concern
itself with traffic control structures attached to the root qdisc unless
otherwise specified.
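
The two attachment points described above can be sketched with tc (see
Section 5.2). This is a minimal illustration, not a recommended
configuration: the device eth0, the 1mbit rate and the handle 1: are
assumptions chosen for the example. The script only prints the commands
unless it is run as root with TC=tc.

```shell
# Dry run by default: the commands are printed, not executed.
# Run with TC=tc (as root) to apply them; DEV and RATE are illustrative.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"
RATE="${RATE:-1mbit}"

attach_qdiscs() {
    # Egress: attach a real queuing discipline at the root location.
    $TC qdisc add dev "$DEV" root handle 1: htb
    # Ingress: attach the ingress qdisc, which uses the handle ffff:.
    $TC qdisc add dev "$DEV" handle ffff: ingress
    # The ingress qdisc holds only filters; this one polices all inbound
    # IP traffic to RATE and drops whatever exceeds it.
    $TC filter add dev "$DEV" parent ffff: protocol ip prio 50 \
        u32 match ip src 0.0.0.0/0 \
        police rate "$RATE" burst 10k drop flowid :1
}

attach_qdiscs
```

Note that the root location receives a full qdisc (HTB here), while the
ingress location receives nothing more than a filter with a policer.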
-----------------------------------------------------------------------------

4.2. class

Classes only exist inside a classful qdisc (e.g., HTB and CBQ). Classes are
immensely flexible and can always contain either multiple child classes or
a single child qdisc [5]. There is no prohibition against a class containing
a classful qdisc itself, which facilitates tremendously complex traffic
control scenarios.

Any class can also have an arbitrary number of filters attached to it,
which allows the selection of a child class or the use of a filter to
reclassify or drop traffic entering a particular class.

A leaf class is a terminal class in a qdisc. It contains a qdisc (default
FIFO) and will never contain a child class. Any class which contains a child
class is an inner class (or root class) and not a leaf class.
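
The difference between inner and leaf classes may be easier to see in a
small tree. The sketch below assumes an HTB qdisc on eth0; the class
numbers and rates are invented for illustration. It prints the commands
unless run as root with TC=tc.

```shell
# Dry run by default; set TC=tc (as root) to apply. DEV is illustrative.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"

build_tree() {
    # Classful root qdisc; unclassified traffic goes to class 1:20.
    $TC qdisc add dev "$DEV" root handle 1: htb default 20
    # Inner class 1:1: it has child classes, so it is not a leaf.
    $TC class add dev "$DEV" parent 1: classid 1:1 htb rate 1mbit
    # Leaf classes 1:10 and 1:20: children of 1:1 with no children.
    $TC class add dev "$DEV" parent 1:1 classid 1:10 htb rate 512kbit ceil 1mbit
    $TC class add dev "$DEV" parent 1:1 classid 1:20 htb rate 512kbit ceil 1mbit
    # Replace the default FIFO inside leaf 1:10 with a single child qdisc.
    $TC qdisc add dev "$DEV" parent 1:10 handle 10: sfq perturb 10
}

build_tree
```

Here 1:1 is an inner class, 1:10 and 1:20 are leaf classes, and only a leaf
class carries a qdisc of its own.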
-----------------------------------------------------------------------------

4.3. filter

The filter is the most complex component in the Linux traffic control
system. The filter provides a convenient mechanism for gluing together
several of the key elements of traffic control. The simplest and most obvious
role of the filter is to classify (see Section 3.3) packets. Linux filters
allow the user to classify packets into an output queue with either several
different filters or a single filter.

  * A filter must contain a classifier phrase.

  * A filter may contain a policer phrase.

Filters can be attached either to classful qdiscs or to classes; however,
the enqueued packet always enters the root qdisc first. After the filter
attached to the root qdisc has been traversed, the packet may be directed to
any subclasses (which can have their own filters) where the packet may
undergo further classification.
-----------------------------------------------------------------------------

4.4. classifier

Filter objects, which can be manipulated using tc, can use several
different classifying mechanisms, the most common of which is the u32
classifier. The u32 classifier allows the user to select packets based on
attributes of the packet.

The classifiers are tools which can be used as part of a filter to identify
characteristics of a packet or a packet's metadata. The Linux classifier
object is a direct analogue to the basic operation and elemental mechanism of
traffic control classifying.
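
As a rough illustration of the two kinds of selection mentioned above, the
sketch below attaches one filter that uses the u32 classifier (packet
contents) and one that uses the fw classifier (packet metadata, here a
firewall mark). The device, the handle 1:, the target classes 1:10 and 1:20,
and the mark value 6 are all assumptions; the classes are presumed to exist
already. Commands are printed unless TC=tc is set.

```shell
# Dry run by default; set TC=tc (as root) to apply. DEV is illustrative.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"

add_filters() {
    # u32: select on packet contents, here IP destination port 22.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 1 \
        u32 match ip dport 22 0xffff flowid 1:10
    # fw: select on packet metadata, here firewall mark 6, which must be
    # set elsewhere (for example by a packet filter rule, not shown).
    $TC filter add dev "$DEV" parent 1: protocol ip prio 2 \
        handle 6 fw flowid 1:20
}

add_filters
```

In both cases the classifier name (u32 or fw) appears inside a tc filter
command; neither can be created on its own.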
-----------------------------------------------------------------------------

4.5. policer

This elemental mechanism is only used in Linux traffic control as part of a
filter. A policer calls one action above and another action below the
specified rate. Clever use of policers can simulate a three-color meter. See
also Section 10.

Although both policing and shaping are basic elements of traffic control
for limiting bandwidth usage, a policer will never delay traffic. It can only
perform an action based on specified criteria. See also Example 5.
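
The "one action above, another below" behaviour can be modelled with a
simple token bucket. The following is a toy model written for this
explanation, not the kernel's implementation: the function name police, the
millisecond timestamps and the sample packets are all invented. It decides
immediately for each packet and never holds one back, which is the point of
difference from a shaper.

```shell
# police RATE BURST: read "TIME_MS SIZE" pairs from standard input and
# print a decision per packet. RATE is bytes per second, BURST is bytes.
police() {
    rate=$1
    burst=$2
    tokens=$burst      # the bucket starts full
    last=0             # arrival time of the previous packet, in ms
    while read -r now size; do
        # Refill for the elapsed time, never beyond the bucket size.
        tokens=$(( tokens + rate * (now - last) / 1000 ))
        if [ "$tokens" -gt "$burst" ]; then
            tokens=$burst
        fi
        last=$now
        if [ "$size" -le "$tokens" ]; then
            tokens=$(( tokens - size ))
            echo "$now conform"     # below the rate: e.g. "continue"
        else
            echo "$now exceed"      # above the rate: e.g. "drop"
        fi
    done
}

# 32000 bytes/s with a 3000 byte bucket and four 1500 byte packets.
printf '%s\n' "0 1500" "1 1500" "2 1500" "100 1500" | police 32000 3000
```

The first two packets drain the bucket, the third arrives before it has
refilled and exceeds the rate, and the fourth arrives late enough to conform
again. No packet is queued at any point.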
-----------------------------------------------------------------------------

4.6. drop

This basic traffic control mechanism is only used in Linux traffic control
as part of a policer. Any policer attached to any filter could have a drop
action.

Note  The only place in the Linux traffic control system where a packet can
      be explicitly dropped is a policer. A policer can limit packets
      enqueued at a specific rate, or it can be configured to drop all
      traffic matching a particular pattern [6].

There are, however, places within the traffic control system where a packet
may be dropped as a side effect. For example, a packet will be dropped if the
scheduler in use controls flows by dropping packets, as GRED does.

Also, a shaper or scheduler which runs out of its allocated buffer space
may have to drop a packet during a particularly bursty or overloaded period.
-----------------------------------------------------------------------------

4.7. handle

Every class and classful qdisc (see also Section 7) requires a unique
identifier within the traffic control structure. This unique identifier is
known as a handle and has two constituent members, a major number and a minor
number. These numbers can be assigned arbitrarily by the user in accordance
with the following rules [7].

The numbering of handles for classes and qdiscs

major
    This parameter is completely free of meaning to the kernel. The user
    may use an arbitrary numbering scheme; however, all objects in the
    traffic control structure with the same parent must share a major handle
    number. Conventional numbering schemes start at 1 for objects attached
    directly to the root qdisc.

minor
    This parameter unambiguously identifies the object as a qdisc if minor
    is 0. Any other value identifies the object as a class. All classes
    sharing a parent must have unique minor numbers.

The special handle ffff:0 is reserved for the ingress qdisc.

The handle is used as the target in classid and flowid phrases of tc filter
statements. These handles are external identifiers for the objects, usable by
userland applications. The kernel maintains internal identifiers for each
object.
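
The major:minor rules can be checked mechanically. The helper functions
below are hypothetical (they are not part of tc) and rest on two points
worth verifying against your own iproute2: that tc reads both halves of a
handle as hexadecimal, and that the pair is packed into one 32-bit value
with the major number in the upper 16 bits. The argument must contain a
colon.

```shell
# handle_to_int MAJOR:MINOR: print the packed 32-bit value of a handle.
# Both halves are read as hexadecimal; an omitted half counts as zero.
handle_to_int() {
    major=${1%%:*}
    minor=${1#*:}
    echo $(( (0x${major:-0} << 16) | 0x${minor:-0} ))
}

# is_qdisc MAJOR:MINOR: succeed if the handle names a qdisc (minor is 0).
is_qdisc() {
    minor=${1#*:}
    [ $(( 0x${minor:-0} )) -eq 0 ]
}

handle_to_int 1:6      # a class: major 1, minor 6
handle_to_int 10:      # a qdisc; "10" is hexadecimal, so major is sixteen
if is_qdisc ffff:0; then
    echo "ffff:0 is a qdisc handle"
fi
```

One practical consequence of the hexadecimal reading is that a handle
written as 10: and one written as a: are different objects, and 10: is not
"the tenth" of anything.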
-----------------------------------------------------------------------------

5. Software and Tools

-----------------------------------------------------------------------------

5.1. Kernel requirements

Many distributions provide kernels with modular or monolithic support for
traffic control (Quality of Service). Custom kernels may not already provide
support (modular or not) for the required features. If not, the following is
a very brief listing of the required kernel options.

The user who has little or no experience compiling a kernel is referred to
the Kernel HOWTO. Experienced kernel compilers should be able to determine
which of the below options apply to the desired configuration, after reading
a bit more about traffic control and planning.

Example 1. Kernel compilation options [8]
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_CSZ=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y

A kernel compiled with the above set of options will provide modular
support for almost everything discussed in this documentation. The user may
need to load the relevant module with modprobe before using a given feature.
Again, the confused user is referred to the Kernel HOWTO, as this document
cannot adequately address questions about the use of the Linux kernel.
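
Before rebuilding anything, it may be worth checking what the running
kernel already provides. The sketch below is one way to do so; the helper
names has_option and check_config are invented, the short option list is
only a sample of Example 1, and the path /boot/config-<release> is an
assumption (some systems expose the configuration as /proc/config.gz or not
at all).

```shell
# has_option FILE OPTION: succeed if OPTION is built in (=y) or modular
# (=m) in the kernel configuration FILE.
has_option() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# check_config FILE: report a sample of the options from Example 1.
check_config() {
    for opt in CONFIG_NET_SCHED CONFIG_NET_SCH_HTB CONFIG_NET_SCH_SFQ \
               CONFIG_NET_SCH_TBF CONFIG_NET_SCH_INGRESS CONFIG_NET_CLS_U32; do
        if has_option "$1" "$opt"; then
            echo "$opt present"
        else
            echo "$opt missing"
        fi
    done
}

# Assumed location of the running kernel's configuration file.
CONFIG_FILE="${CONFIG_FILE:-/boot/config-$(uname -r)}"
if [ -r "$CONFIG_FILE" ]; then
    check_config "$CONFIG_FILE"
fi
```

An option reported as modular can then be loaded by hand, for example with
modprobe sch_htb for the HTB qdisc or modprobe cls_u32 for the u32
classifier.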
-----------------------------------------------------------------------------

5.2. iproute2 tools (tc)

iproute2 is a suite of command line utilities which manipulate kernel
structures for IP networking configuration on a machine. For technical
documentation on these tools, see the iproute2 documentation and for a more
expository discussion, the documentation at [http://linux-ip.net/]
linux-ip.net. Of the tools in the iproute2 package, the binary tc is the only
one used for traffic control. This HOWTO will ignore the other tools in the
suite.

Because it interacts with the kernel to direct the creation, deletion and
modification of traffic control structures, the tc binary needs to be
compiled with support for all of the qdiscs you wish to use. In particular,
the HTB qdisc is not supported yet in the upstream iproute2 package. See
Section 7.1 for more information.

The tc tool performs all of the configuration of the kernel structures
required to support traffic control. As a result of its many uses, the
command syntax can be described (at best) as arcane. The utility takes as its
first non-option argument one of three Linux traffic control components:
qdisc, class or filter.

Example 2. tc command usage
[root@leander]# tc
Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
where  OBJECT := { qdisc | class | filter }
       OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] }

Each object accepts further, object-specific options, which are described
and documented only incompletely below. The hints in the examples below are
designed to introduce the vagaries of tc command line syntax. For more
examples, consult the [http://lartc.org/howto/] LARTC HOWTO. For even better
understanding, consult the kernel and iproute2 code.

Example 3. tc qdisc
[root@leander]# tc qdisc add    \ (1)
>                  dev eth0     \ (2)
>                  root         \ (3)
>                  handle 1:0   \ (4)
>                  htb            (5)

(1) Add a queuing discipline. The verb could also be del.
(2) Specify the device onto which we are attaching the new queuing
    discipline.
(3) This means "egress" to tc. The word root must be used, however. Another
    qdisc with limited functionality, the ingress qdisc, can be attached to
    the same device.
(4) The handle is a user-specified number of the form major:minor. The
    minor number for any queueing discipline handle must always be zero (0).
    An acceptable shorthand for a qdisc handle is the syntax "1:", where the
    minor number is assumed to be zero (0) if not specified.
(5) This is the queuing discipline to attach, HTB in this example. Queuing
    discipline specific parameters will follow this. In the example here, we
    add no qdisc-specific parameters.

Above was the simplest use of the tc utility for adding a queuing
discipline to a device. Here's an example of the use of tc to add a class to
an existing parent class.

Example 4. tc class
[root@leander]# tc class add    \ (1)
>                  dev eth0     \ (2)
>                  parent 1:1   \ (3)
>                  classid 1:6  \ (4)
>                  htb          \ (5)
>                  rate 256kbit \ (6)
>                  ceil 512kbit   (7)

(1) Add a class. The verb could also be del.
(2) Specify the device onto which we are attaching the new class.
(3) Specify the parent handle to which we are attaching the new class.
(4) This is a unique handle (major:minor) identifying this class. The minor
    number may be any number other than zero (0).
(5) Both of the classful qdiscs require that any children classes be
    classes of the same type as the parent. Thus an HTB qdisc will contain
    HTB classes.
(6) (7)
    These are class-specific parameters. Consult Section 7.1 for more detail
    on these parameters.

Example 5. tc filter
[root@leander]# tc filter add               \ (1)
>                  dev eth0                 \ (2)
>                  parent 1:0               \ (3)
>                  protocol ip              \ (4)
>                  prio 5                   \ (5)
>                  u32                      \ (6)
>                  match ip port 22 0xffff  \ (7)
>                  match ip tos 0x10 0xff   \ (8)
>                  flowid 1:6               \ (9)
>                  police                   \ (10)
>                  rate 32000bps            \ (11)
>                  burst 10240              \ (12)
>                  mpu 0                    \ (13)
>                  action drop/continue       (14)

(1) Add a filter. The verb could also be del.
(2) Specify the device onto which we are attaching the new filter.
(3) Specify the parent handle to which we are attaching the new filter.
(4) This parameter is required. It names the network protocol (here, IP)
    whose packets the filter examines.
(5) The prio parameter allows a given filter to be preferred above another.
    The keyword pref is a synonym.
(6) This is a classifier, and is a required phrase in every tc filter
    command.
(7) (8)
    These are parameters to the classifier. In this case, packets with a
    type of service flag (indicating interactive usage) and matching port 22
    will be selected by this statement.
(9) The flowid specifies the handle of the target class (or qdisc) to which
    a matching filter should send its selected packets.
(10)
    This is the policer, and is an optional phrase in every tc filter
    command.
(11) The policer will perform one action above this rate, and another action
    below (see action parameter).
(12) The burst is an exact analog to burst in HTB (burst is a bucket
    concept).
(13) The minimum policed unit. To count all traffic, use an mpu of zero (0).
(14) The action indicates what should be done depending on whether the
    traffic exceeds the rate, given the attributes of the policer. The first
    word specifies the action to take if the policer has been exceeded. The
    second word specifies the action to take otherwise.

As evidenced above, the tc command line utility has an arcane and complex
syntax, even for simple operations such as these examples show. It should
come as no surprise to the reader that there exists an easier way to
configure Linux traffic control. See the next section, Section 5.3.
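
Because of the callout markers, Examples 3 to 5 cannot be pasted into a
shell as printed. The sketch below collects them into one sequence. Two
things in it are assumptions rather than part of the original examples: the
parent class 1:1 (with a placeholder rate), which Example 4 takes for
granted, and the use of "match ip dport" on the guess that the destination
port was intended. The commands are printed unless run as root with TC=tc.

```shell
# Dry run by default; set TC=tc (as root) to apply. DEV is illustrative.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"

configure() {
    # Example 3: root HTB qdisc with handle 1:0.
    $TC qdisc add dev "$DEV" root handle 1:0 htb
    # Assumed parent class 1:1 (placeholder rate), needed by Example 4.
    $TC class add dev "$DEV" parent 1:0 classid 1:1 htb rate 512kbit
    # Example 4: class 1:6 under 1:1.
    $TC class add dev "$DEV" parent 1:1 classid 1:6 htb \
        rate 256kbit ceil 512kbit
    # Example 5: classify interactive port 22 traffic into 1:6 and police it.
    $TC filter add dev "$DEV" parent 1:0 protocol ip prio 5 \
        u32 match ip dport 22 0xffff match ip tos 0x10 0xff flowid 1:6 \
        police rate 32000bps burst 10240 mpu 0 action drop/continue
}

configure
```

The order matters: the qdisc must exist before its classes, and the target
class must exist before a filter can usefully point at it.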
-----------------------------------------------------------------------------

5.3. tcng, Traffic Control Next Generation

FIXME; sing the praises of tcng. See also
[http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/] Traffic Control
using tcng and HTB HOWTO and tcng documentation.

Traffic control next generation (hereafter, tcng) provides all of the power
of traffic control under Linux with twenty percent of the headache.

-----------------------------------------------------------------------------

5.4. IMQ, Intermediate Queuing device

FIXME; must discuss IMQ. See also Patrick McHardy's website on
[http://trash.net/~kaber/imq/] IMQ.
-----------------------------------------------------------------------------

6. Classless Queuing Disciplines (qdiscs)

Each of these queuing disciplines can be used as the primary qdisc on an
interface, or can be used inside a leaf class of a classful qdisc. These are
the fundamental schedulers used under Linux. Note that the default scheduler
is the pfifo_fast.

-----------------------------------------------------------------------------

6.1. FIFO, First-In First-Out (pfifo and bfifo)

Note  This is not the default qdisc on Linux interfaces. Be certain to see
      Section 6.2 for the full details on the default (pfifo_fast) qdisc.

The FIFO algorithm forms the basis for the default qdisc on all Linux
network interfaces (pfifo_fast). It performs no shaping or rearranging of
packets. It simply transmits packets as soon as it can after receiving and
queuing them. This is also the qdisc used inside all newly created classes
until another qdisc or a class replaces the FIFO.

[fifo-qdisc]

A real FIFO qdisc must, however, have a size limit (a buffer size) to
prevent it from overflowing in case it is unable to dequeue packets as
quickly as it receives them. Linux implements two basic FIFO qdiscs, one
based on bytes, and one on packets. Regardless of the type of FIFO used, the
size of the queue is defined by the parameter limit. For a pfifo the unit is
understood to be packets and for a bfifo the unit is understood to be bytes.

Example 6. Specifying a limit for a packet or byte FIFO
[root@leander]# cat bfifo.tcc
/*
 * make a FIFO on eth0 with 10kbyte queue size
 *
 */

dev eth0 {
    egress {
        fifo (limit 10kB );
    }
}
[root@leander]# tcc < bfifo.tcc
# ================================ Device eth0 ================================

tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 bfifo limit 10240
[root@leander]# cat pfifo.tcc
/*
 * make a FIFO on eth0 with 30 packet queue size
 *
 */

dev eth0 {
    egress {
        fifo (limit 30p );
    }
}
[root@leander]# tcc < pfifo.tcc
# ================================ Device eth0 ================================

tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 pfifo limit 30
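
The tcng output above wraps each FIFO in a dsmark qdisc. For comparison,
the same two FIFOs could probably be attached directly at the root with
plain tc, as sketched below. The device name is an assumption, and the use
of "replace" for the second command is a choice made here so that the
sequence can be run in one go; commands are printed unless TC=tc is set.

```shell
# Dry run by default; set TC=tc (as root) to apply. DEV is illustrative.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"

fifo_examples() {
    # A byte-limited FIFO: at most 10240 bytes may wait in the queue.
    $TC qdisc add dev "$DEV" root bfifo limit 10240
    # A packet-limited FIFO holding at most 30 packets; "replace" swaps
    # out the root qdisc added above instead of failing.
    $TC qdisc replace dev "$DEV" root pfifo limit 30
    # Show what is attached, with statistics (the -s option of Example 2).
    $TC -s qdisc show dev "$DEV"
}

fifo_examples
```

The only difference between the two qdiscs is the unit in which limit is
counted.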
-----------------------------------------------------------------------------

6.2. pfifo_fast, the default Linux qdisc

The pfifo_fast qdisc is the default qdisc for all interfaces under Linux.
Based on a conventional FIFO qdisc, this qdisc also provides some
prioritization. It provides three different bands (individual FIFOs) for
separating traffic. The highest priority traffic (interactive flows) is
placed into band 0 and is always serviced first. Similarly, band 1 is always
emptied of pending packets before band 2 is dequeued.

[pfifo_fast-qdisc]

There is nothing configurable to the end user about the pfifo_fast qdisc.
For exact details on the priomap and use of the ToS bits, see the pfifo-fast
section of the LARTC HOWTO.
-----------------------------------------------------------------------------

6.3. SFQ, Stochastic Fair Queuing

The SFQ qdisc attempts to fairly distribute opportunity to transmit data to
the network among an arbitrary number of flows. It accomplishes this by using
a hash function to separate the traffic into separate (internally maintained)
FIFOs which are dequeued in a round-robin fashion. Because there is the
possibility for unfairness to manifest in the choice of hash function, this
function is altered periodically. Perturbation (the parameter perturb) sets
this periodicity.

[sfq-qdisc]

Example 7. Creating an SFQ
[root@leander]# cat sfq.tcc
/*
 * make an SFQ on eth0 with a 10 second perturbation
 *
 */

dev eth0 {
    egress {
        sfq( perturb 10s );
    }
}
[root@leander]# tcc < sfq.tcc
# ================================ Device eth0 ================================

tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 sfq perturb 10

Unfortunately, some clever software (e.g. Kazaa and eMule among others)
obliterates the benefit of this attempt at fair queuing by opening as many
TCP sessions (flows) as can be sustained. In many networks, with well-behaved
users, SFQ can adequately distribute the network resources to the contending
flows, but other measures may be called for when obnoxious applications have
invaded the network.

See also Section 6.4 for an SFQ qdisc with more exposed parameters for the
user to manipulate.
-----------------------------------------------------------------------------

6.4. ESFQ, Extended Stochastic Fair Queuing

Conceptually, this qdisc is no different from SFQ although it allows the
user to control more parameters than its simpler cousin. This qdisc was
conceived to overcome the shortcoming of SFQ identified above. By allowing
the user to control which hashing algorithm is used for distributing access
to network bandwidth, it is possible for the user to reach a fairer real
distribution of bandwidth.

Example 8. ESFQ usage
Usage: ... esfq [ perturb SECS ] [ quantum BYTES ] [ depth FLOWS ]
        [ divisor HASHBITS ] [ limit PKTS ] [ hash HASHTYPE]

Where:
HASHTYPE := { classic | src | dst }

FIXME; need practical experience and/or attestation here.

-----------------------------------------------------------------------------

6.5. GRED, Generic Random Early Drop

FIXME; I have never used this. Need practical experience or attestation.

Theory declares that a RED algorithm is useful on a backbone or core
network, but not as useful near the end-user. See the section on flows for
a general discussion of the thirstiness of TCP.
-----------------------------------------------------------------------------

6.6. TBF, Token Bucket Filter

This qdisc is built on tokens and buckets. It simply shapes traffic
transmitted on an interface. To limit the speed at which packets will be
dequeued from a particular interface, the TBF qdisc is the perfect solution.
It simply slows down transmitted traffic to the specified rate.

Packets are only transmitted if there are sufficient tokens available.
Otherwise, packets are deferred. Delaying packets in this fashion will
introduce an artificial latency into the packet's round trip time.

[tbf-qdisc]

Example 9. Creating a 256kbit/s TBF
[root@leander]# cat tbf.tcc
/*
 * make a 256kbit/s TBF on eth0
 *
 */

dev eth0 {
    egress {
        tbf( rate 256 kbps, burst 20 kB, limit 20 kB, mtu 1514 B );
    }
}
[root@leander]# tcc < tbf.tcc
# ================================ Device eth0 ================================

tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 tbf burst 20480 limit 20480 mtu 1514 rate 32000bps
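
The units in Example 9 are easy to misread: the tcng input asks for 256
kilobits per second, while the generated tc command says 32000bps, which tc
reads as bytes per second. The arithmetic can be checked as below. The
helper names are invented for this sketch, and the final tc line (printed
unless TC=tc is set) is an assumed direct equivalent without the dsmark
wrapper.

```shell
# kbit_to_bytes KBIT: convert kilobits per second to bytes per second.
kbit_to_bytes() {
    echo $(( $1 * 1000 / 8 ))
}

# burst_ms RATE BURST: milliseconds needed to refill an empty bucket of
# BURST bytes at RATE bytes per second.
burst_ms() {
    echo $(( $2 * 1000 / $1 ))
}

rate=$(kbit_to_bytes 256)        # bytes per second for 256 kbit/s
fill=$(burst_ms "$rate" 20480)   # refill time for the 20480 byte bucket
echo "rate=${rate} bytes/s, bucket refill=${fill} ms"

# Assumed direct equivalent at the root; dry run unless TC=tc.
TC="${TC:-echo tc}"
$TC qdisc add dev "${DEV:-eth0}" root tbf rate 256kbit burst 20480 limit 20480
```

With these figures an idle interface can send a burst of about 20 kilobytes
at line speed, after which traffic is held to the configured rate until the
bucket refills.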
-----------------------------------------------------------------------------

7. Classful Queuing Disciplines (qdiscs)

The flexibility and control of Linux traffic control can be unleashed
through the agency of the classful qdiscs. Remember that the classful queuing
disciplines can have filters attached to them, allowing packets to be
directed to particular classes and subqueues.

There are several common terms to describe classes directly attached to the
root qdisc and terminal classes. Classes attached to the root qdisc are
known as root classes, and more generically inner classes. Any terminal class
in a particular queuing discipline is known as a leaf class by analogy to the
tree structure of the classes. Besides the use of figurative language
depicting the structure as a tree, the language of family relationships is
also quite common.

-----------------------------------------------------------------------------

7.1. HTB, Hierarchical Token Bucket

HTB uses the concepts of tokens and buckets along with the class-based
system and filters to allow for complex and granular control over traffic.
With a complex borrowing model, HTB can perform a variety of sophisticated
traffic control techniques. One of the easiest ways to use HTB immediately is
that of shaping.

For anyone who understands tokens and buckets or grasps the function of
TBF, HTB should be merely a logical step. This queuing discipline allows the
user to define the characteristics of the tokens and bucket used and allows
the user to nest these buckets in an arbitrary fashion. When coupled with a
classifying scheme, traffic can be controlled in a very granular fashion.

Below is example output of the syntax for HTB on the command line with the
tc tool. Although the syntax for tcng is a language of its own, the rules for
HTB are the same.

Example 10. tc usage for HTB
Usage: ... qdisc add ... htb [default N] [r2q N]
 default  minor id of class to which unclassified packets are sent {0}
 r2q      DRR quantums are computed as rate in Bps/r2q {10}
 debug    string of 16 numbers each 0-3 {0}

... class add ... htb rate R1 burst B1 [prio P] [slot S] [pslot PS]
                      [ceil R2] [cburst B2] [mtu MTU] [quantum Q]
 rate     rate allocated to this class (class can still borrow)
 burst    max bytes burst which can be accumulated during idle period {computed}
 ceil     definite upper class rate (no borrows) {rate}
 cburst   burst but for ceil {computed}
 mtu      max packet size we create rate map for {1600}
 prio     priority of leaf; lower are served first {0}
 quantum  how much bytes to serve from leaf at once {use r2q}

TC HTB version 3.3
-----------------------------------------------------------------------------

7.1.1. Software requirements

Unlike almost all of the other software discussed, HTB is a newer queuing
discipline and your distribution may not have all of the tools and capability
you need to use HTB. The kernel must support HTB; kernel version 2.4.20 and
later support it in the stock distribution, although earlier kernel versions
require patching. To enable userland support for HTB, see
[http://luxik.cdi.cz/~devik/qos/htb/] HTB for an iproute2 patch to tc.

-----------------------------------------------------------------------------

7.1.2. Shaping

One of the most common applications of HTB involves shaping transmitted
traffic to a specific rate.

All shaping occurs in leaf classes. No shaping occurs in inner or root
classes as they only exist to suggest how the borrowing model should
distribute available tokens.
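
The simplest shaping setup is a root HTB qdisc with a single leaf class,
as sketched below. The device, the class number 1:10 and the 256kbit rate
are assumptions for illustration; commands are printed unless the script is
run as root with TC=tc.

```shell
# Dry run by default; set TC=tc (as root) to apply. DEV and RATE are
# illustrative values.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"
RATE="${RATE:-256kbit}"

shape() {
    # Root HTB qdisc; anything unclassified lands in class 1:10.
    $TC qdisc add dev "$DEV" root handle 1: htb default 10
    # A single leaf class with ceil equal to rate, so that it has
    # nothing to borrow and all traffic leaving DEV is held to RATE.
    $TC class add dev "$DEV" parent 1: classid 1:10 htb \
        rate "$RATE" ceil "$RATE"
}

shape
```

Since 1:10 has no children, it is a leaf class, and it is there that the
shaping takes place.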
-----------------------------------------------------------------------------

7.1.3. Borrowing

A fundamental part of the HTB qdisc is the borrowing mechanism. Children
classes borrow tokens from their parents once they have exceeded rate. A
child class will continue to attempt to borrow until it reaches ceil, at
which point it will begin to queue packets for transmission until more
tokens/ctokens are available. As there are only two primary types of classes
which can be created with HTB, the following table and diagram identify the
various possible states and the behaviour of the borrowing mechanisms.

Table 2. HTB class states and potential actions taken
+------+-----+--------------+-----------------------------------------------+
|type  |class|HTB internal  |action taken                                   |
|of    |state|state         |                                               |
|class |     |              |                                               |
+------+-----+--------------+-----------------------------------------------+
|leaf  |<    |HTB_CAN_SEND  |Leaf class will dequeue queued bytes up to     |
|      |rate |              |available tokens (no more than burst packets)  |
+------+-----+--------------+-----------------------------------------------+
|leaf  |>    |HTB_MAY_BORROW|Leaf class will attempt to borrow tokens/      |
|      |rate,|              |ctokens from parent class. If tokens are       |
|      |<    |              |available, they will be lent in quantum        |
|      |ceil |              |increments and the leaf class will dequeue up  |
|      |     |              |to cburst bytes                                |
+------+-----+--------------+-----------------------------------------------+
|leaf  |>    |HTB_CANT_SEND |No packets will be dequeued. This will cause   |
|      |ceil |              |packet delay and will increase latency to meet |
|      |     |              |the desired rate.                              |
+------+-----+--------------+-----------------------------------------------+
|inner,|<    |HTB_CAN_SEND  |Inner class will lend tokens to children.      |
|root  |rate |              |                                               |
+------+-----+--------------+-----------------------------------------------+
|inner,|>    |HTB_MAY_BORROW|Inner class will attempt to borrow tokens/     |
|root  |rate,|              |ctokens from parent class, lending them to     |
|      |<    |              |competing children in quantum increments per   |
|      |ceil |              |request.                                       |
+------+-----+--------------+-----------------------------------------------+
|inner,|>    |HTB_CANT_SEND |Inner class will not attempt to borrow from its|
|root  |ceil |              |parent and will not lend tokens/ctokens to     |
|      |     |              |children classes.                              |
+------+-----+--------------+-----------------------------------------------+

The diagram below identifies the flow of borrowed tokens and the manner in
which tokens are charged to parent classes. In order for the borrowing model
to work, each class must have an accurate count of the number of tokens used
by itself and all of its children. For this reason, any token used in a child
or leaf class is charged to each parent class until the root class is
reached.

Any child class which wishes to borrow a token will request a token from
its parent class, which, if it is also over its rate, will request to borrow
from its parent class until either a token is located or the root class is
reached. So the borrowing of tokens flows toward the leaf classes and the
charging of the usage of tokens flows toward the root class.

[htb-borrow]

Note in this diagram that there are several HTB root classes. Each of these
root classes can simulate a virtual circuit.
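
The three states in Table 2 amount to a comparison of a class's current
usage against its rate and ceil. The function below is a hypothetical
helper written for this explanation, not code from HTB, which tracks tokens
rather than a measured rate. How exact equality is treated is not stated in
the table; the choice made here (equal to rate counts as borrowing, equal to
ceil counts as unable to send) is an assumption.

```shell
# htb_state CURRENT RATE CEIL: print the Table 2 state for a class that
# is currently sending at CURRENT, given its rate and ceil. All three
# arguments are integers in the same unit (for example kbit/s).
htb_state() {
    if [ "$1" -lt "$2" ]; then
        echo HTB_CAN_SEND      # below rate: the class uses its own tokens
    elif [ "$1" -lt "$3" ]; then
        echo HTB_MAY_BORROW    # above rate, below ceil: ask the parent
    else
        echo HTB_CANT_SEND     # at or above ceil: wait for tokens/ctokens
    fi
}

# A class with rate 256 and ceil 512 at three different loads.
htb_state 100 256 512
htb_state 300 256 512
htb_state 600 256 512
```

The same comparison applies to leaf and inner classes; what differs is the
action taken, as the table shows.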
-----------------------------------------------------------------------------

7.1.4. HTB class parameters

default
    An optional parameter with every HTB qdisc object. The default value of
    default is 0, which causes any unclassified traffic to be dequeued at
    hardware speed, completely bypassing any of the classes attached to the
    root qdisc.

rate
    Used to set the minimum desired speed to which to limit transmitted
    traffic. This can be considered the equivalent of a committed information
    rate (CIR), or the guaranteed bandwidth for a given leaf class.

ceil
    Used to set the maximum desired speed to which to limit the transmitted
    traffic. The borrowing model should illustrate how this parameter is
    used. This can be considered the equivalent of "burstable bandwidth".

burst
    This is the size of the rate bucket (see Tokens and buckets). HTB will
    dequeue burst bytes before awaiting the arrival of more tokens.

cburst
    This is the size of the ceil bucket (see Tokens and buckets). HTB will
    dequeue cburst bytes before awaiting the arrival of more ctokens.

quantum
    This is a key parameter used by HTB to control borrowing. Normally, the
    correct quantum is calculated by HTB, not specified by the user. Tweaking
    this parameter can have tremendous effects on borrowing and shaping under
    contention, because it is used both to split traffic between children
    classes over rate (but below ceil) and to transmit packets from these
    same classes.

r2q
    Also usually calculated for the user, r2q is a hint to HTB to help
    determine the optimal quantum for a particular class.

mtu

prio
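
As a rough illustration of where these parameters appear on a tc command
line, and of how HTB derives quantum when it is not given, the sketch below
prints a small set of commands instead of running them. The interface name
eth0, the handles, the rates and the burst sizes are hypothetical. The
quantum arithmetic (rate in bytes per second divided by r2q, with r2q
defaulting to 10) follows the tc-htb manual page rather than anything stated
in this section.

```shell
#!/bin/sh
# Dry-run sketch: the function only prints tc commands. Review the output,
# then pipe it to "sh" as root to apply it.
DEV=eth0          # hypothetical interface name
RATE_KBIT=512     # guaranteed rate of the leaf class
CEIL_KBIT=1544    # ceiling the class may reach by borrowing
R2Q=10            # rate-to-quantum divisor (10 is the usual default)

# quantum (bytes) = rate in bytes per second / r2q
QUANTUM=$(( RATE_KBIT * 1000 / 8 / R2Q ))

htb_class_params() {
    # "default 20" sends unclassified traffic to class 1:20
    echo "tc qdisc add dev $DEV root handle 1: htb default 20 r2q $R2Q"
    echo "tc class add dev $DEV parent 1: classid 1:1 htb rate ${CEIL_KBIT}kbit ceil ${CEIL_KBIT}kbit"
    echo "tc class add dev $DEV parent 1:1 classid 1:20 htb rate ${RATE_KBIT}kbit ceil ${CEIL_KBIT}kbit burst 15k cburst 15k prio 1"
}

echo "# quantum HTB would derive for class 1:20: $QUANTUM bytes"
htb_class_params
```

With these numbers the derived quantum is well above a 1500-byte Ethernet
MTU. For very low rates the derived value can fall below the MTU, which is
the situation in which adjusting r2q or setting quantum explicitly becomes
relevant.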
-----------------------------------------------------------------------------

7.1.5. Rules

Below are some general guidelines to using HTB culled from
[http://docum.org/] http://docum.org/ and the LARTC mailing list. These rules
are simply a recommendation for beginners to maximize the benefit of HTB until
gaining a better understanding of the practical application of HTB.

  * Shaping with HTB occurs only in leaf classes. See also Section 7.1.2.

  * Because HTB does not shape in any class except the leaf class, the sum
    of the rates of leaf classes should not exceed the ceil of a parent
    class. Ideally, the sum of the rates of the children classes would match
    the rate of the parent class, allowing the parent class to distribute
    leftover bandwidth (ceil - rate) among the children classes.

    This key concept in employing HTB bears repeating. Only leaf classes
    actually shape packets; packets are only delayed in these leaf classes.
    The inner classes (all the way up to the root class) exist to define how
    borrowing/lending occurs (see also Section 7.1.3).

  * The quantum is only used when a class is over rate but below ceil.

  * The quantum should be set at MTU or higher. HTB will dequeue at least a
    single packet per service opportunity even if quantum is too small. In
    such a case, it will not be able to calculate the real bandwidth
    consumed accurately [9].

  * Parent classes lend tokens to children in increments of quantum, so for
    maximum granularity and most instantaneously evenly distributed
    bandwidth, quantum should be as low as possible while still no less than
    MTU.

  * A distinction between tokens and ctokens is only meaningful in a leaf
    class, because non-leaf classes only lend tokens to child classes.

  * HTB borrowing could more accurately be described as "using".
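
The rule about the sum of leaf rates is easy to check before any class is
created. The sketch below is only bookkeeping arithmetic over a hypothetical
plan (one parent with a ceil of 1544kbit and three leaf classes); the
variable names are invented for the illustration and none of this is part of
tc itself.

```shell
#!/bin/sh
# Hypothetical plan, all values in kbit/s.
PARENT_CEIL=1544
LEAF_RATES="512 512 256"

# Add up the guaranteed rates of the leaf classes.
SUM=0
for r in $LEAF_RATES; do
    SUM=$(( SUM + r ))
done

# The sum of leaf rates should not exceed the ceil of the parent.
if [ "$SUM" -gt "$PARENT_CEIL" ]; then
    VERDICT="oversubscribed"
else
    VERDICT="ok"
fi

# Whatever is not promised to a leaf remains available for borrowing.
LEFTOVER=$(( PARENT_CEIL - SUM ))

echo "sum of leaf rates: ${SUM}kbit, parent ceil: ${PARENT_CEIL}kbit ($VERDICT)"
echo "not guaranteed to any leaf: ${LEFTOVER}kbit"
```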
-----------------------------------------------------------------------------

7.2. HFSC, Hierarchical Fair Service Curve

The HFSC classful qdisc balances delay-sensitive traffic against
throughput-sensitive traffic. In a congested or backlogged state, the HFSC
queuing discipline interleaves the delay-sensitive traffic when required
according to service curve definitions. Read about the Linux implementation
in German, HFSC Scheduling mit Linux, or read a translation into English,
HFSC Scheduling with Linux. The original research article, A Hierarchical
Fair Service Curve Algorithm For Link-Sharing, Real-Time and Priority
Services, also remains available.

This section will be completed at a later date.
-----------------------------------------------------------------------------

7.3. PRIO, priority scheduler

The PRIO classful qdisc works on a very simple precept. When it is ready to
dequeue a packet, the first class is checked for a packet. If there's a
packet, it gets dequeued. If there's no packet, then the next class is
checked, until the queuing mechanism has no more classes to check.

This section will be completed at a later date.
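
A minimal sketch of setting up such a qdisc follows. It prints the tc
commands rather than running them. The interface name eth0, the handle
numbers and the pfifo limit of 100 packets are assumptions made for the
illustration. The PRIO qdisc creates its classes (1:1, 1:2 and 1:3 here)
automatically; the loop merely attaches an explicit pfifo to each one.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands; pipe the output to "sh" as root to apply.
DEV=eth0     # hypothetical interface name
BANDS=3      # number of priority bands (3 is the PRIO default)

prio_bands() {
    echo "tc qdisc add dev $DEV root handle 1: prio bands $BANDS"
    band=1
    while [ "$band" -le "$BANDS" ]; do
        # Class 1:1 is always checked first when dequeuing, then 1:2, then 1:3.
        echo "tc qdisc add dev $DEV parent 1:$band handle ${band}0: pfifo limit 100"
        band=$(( band + 1 ))
    done
}

prio_bands
```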
-----------------------------------------------------------------------------

7.4. CBQ, Class Based Queuing

CBQ is the classic (some would say venerable) implementation of a traffic
control system. This section will be completed at a later date.
-----------------------------------------------------------------------------

8. Rules, Guidelines and Approaches

-----------------------------------------------------------------------------

8.1. General Rules of Linux Traffic Control

There are a few general rules which ease the study of Linux traffic
control. Traffic control structures under Linux are the same whether the
initial configuration has been done with tcng or with tc.

  * Any router performing a shaping function should be the bottleneck on
    the link, and should be shaping slightly below the maximum available link
    bandwidth. This prevents queues from forming in other routers, affording
    maximum control of packet latency/deferral to the shaping device.

  * A device can only shape traffic it transmits [10]. Traffic arriving on
    an input interface has already been received, so it cannot be shaped
    there. A traditional solution to this problem is an ingress policer.

  * Every interface must have a qdisc. The default qdisc (the pfifo_fast
    qdisc) is used when another qdisc is not explicitly attached to the
    interface.

  * One of the classful qdiscs added to an interface with no children
    classes typically only consumes CPU for no benefit.

  * Any newly created class contains a FIFO. This qdisc can be replaced
    explicitly with any other qdisc. The FIFO qdisc will be removed
    implicitly if a child class is attached to this class.

  * Classes directly attached to the root qdisc can be used to simulate
    virtual circuits.

  * A filter can be attached to classes or one of the classful qdiscs.
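
The ingress policer mentioned in the rules above can be sketched as follows.
The commands are printed, not executed. The interface name eth0, the
downlink speed and the choice of 90 percent as "slightly below" the link
bandwidth are assumptions for the illustration; the command form is the one
commonly seen in scripts such as wondershaper.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands; pipe the output to "sh" as root to apply.
DEV=eth0              # hypothetical interface name
DOWNLINK_KBIT=1000    # hypothetical downstream link speed

# Police slightly below the link speed so that this device stays the bottleneck.
POLICE_KBIT=$(( DOWNLINK_KBIT * 90 / 100 ))

ingress_policer() {
    # Attach the ingress qdisc, then drop whatever arrives faster than the rate.
    echo "tc qdisc add dev $DEV handle ffff: ingress"
    echo "tc filter add dev $DEV parent ffff: protocol ip prio 50 u32 match ip src 0.0.0.0/0 police rate ${POLICE_KBIT}kbit burst 10k drop flowid :1"
}

ingress_policer
```

A policer only drops packets that exceed the rate; it never delays them,
which is why it is not a substitute for shaping.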
-----------------------------------------------------------------------------

8.2. Handling a link with a known bandwidth

HTB is an ideal qdisc to use on a link with a known bandwidth, because the
top-level (root-most) class can be set to the maximum bandwidth available on
a given link. Flows can then be subdivided into children classes, either
guaranteeing bandwidth to particular classes of traffic or giving preference
to specific kinds of traffic.
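
A sketch of such a hierarchy, expressed as tc commands that are printed
rather than executed, might look like the following. The rates mirror those
used in Example 11 (assuming the kbps figures there mean kilobits per
second); the interface name eth0, the class and handle numbers, and the
choice of the HTTP port as the classifier are assumptions for the
illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands; pipe the output to "sh" as root to apply.
DEV=eth0     # hypothetical interface name
LINK=1544    # known link bandwidth in kbit/s

htb_known_link() {
    # Root qdisc; traffic matching no filter goes to class 1:20.
    echo "tc qdisc add dev $DEV root handle 1: htb default 20"
    # Top-level class set to the full link bandwidth.
    echo "tc class add dev $DEV parent 1: classid 1:1 htb rate ${LINK}kbit ceil ${LINK}kbit"
    # Leaf classes: 1:10 is capped at its rate, 1:20 may borrow up to the link.
    echo "tc class add dev $DEV parent 1:1 classid 1:10 htb rate 512kbit ceil 512kbit"
    echo "tc class add dev $DEV parent 1:1 classid 1:20 htb rate 512kbit ceil ${LINK}kbit"
    # Replace the implicit FIFO in each leaf with SFQ.
    echo "tc qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10"
    echo "tc qdisc add dev $DEV parent 1:20 handle 20: sfq perturb 10"
    # Send traffic destined for port 80 to class 1:10.
    echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip dport 80 0xffff flowid 1:10"
}

htb_known_link
```

Once such a structure is in place, tc -s class show dev eth0 lists per-class
counters, including how much each class has lent and borrowed.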
-----------------------------------------------------------------------------

8.3. Handling a link with a variable (or unknown) bandwidth

In theory, the PRIO scheduler is an ideal match for links with variable
bandwidth, because it is a work-conserving qdisc (which means that it
provides no shaping). In the case of a link with an unknown or fluctuating
bandwidth, the PRIO scheduler simply prefers to dequeue any available packet
in the highest priority band first, then falls back to the lower priority
queues.
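
As an illustration only, the sketch below prints commands that place
interactive traffic in the highest priority band without stating any rate at
all. The interface name eth0 and the choice of SSH and HTTP as the traffic
to prefer are assumptions. Packets matching neither filter are assigned to a
band by the qdisc's priomap.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands; pipe the output to "sh" as root to apply.
DEV=eth0     # hypothetical interface name

prio_filters() {
    # No rate appears anywhere: PRIO only orders packets, it does not shape.
    echo "tc qdisc add dev $DEV root handle 1: prio"
    # Traffic to the SSH port goes to the first (highest priority) band.
    echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip dport 22 0xffff flowid 1:1"
    # Traffic to the HTTP port goes to the second band.
    echo "tc filter add dev $DEV parent 1: protocol ip prio 2 u32 match ip dport 80 0xffff flowid 1:2"
}

prio_filters
```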
-----------------------------------------------------------------------------

8.4. Sharing/splitting bandwidth based on flows

Of the many types of contention for network bandwidth, this is one of the
easier types of contention to address in general. By using the SFQ qdisc,
traffic in a particular queue can be separated into flows, each of which will
be serviced fairly (inside that queue). Well-behaved applications (and users)
will find that SFQ or ESFQ is sufficient for most sharing needs.

The Achilles heel of these fair queuing algorithms is a misbehaving user or
application which opens many connections simultaneously (e.g., eMule,
eDonkey, Kazaa). By creating a large number of individual flows, the
application can dominate slots in the fair queuing algorithm. Restated, the
fair queuing algorithm has no idea that a single application is generating
the majority of the flows, and cannot penalize the user. Other methods are
called for.
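
The effect is easy to put in numbers. The sketch below prints one command
that attaches SFQ to an existing class (the interface eth0 and class 1:20
are hypothetical), and then estimates the share of the queue won by a host,
under the simplifying assumptions that every flow is backlogged and that
each flow gets an equal turn. The flow_share helper is invented for the
illustration and uses integer arithmetic, so results are rounded down.

```shell
#!/bin/sh
# Dry-run sketch: prints a tc command and some illustrative arithmetic.
DEV=eth0     # hypothetical interface name

sfq_leaf() {
    # Attach SFQ to an (assumed existing) leaf class 1:20.
    echo "tc qdisc add dev $DEV parent 1:20 handle 20: sfq perturb 10"
}

# Percentage of the queue won by one host when every flow gets an equal turn.
# $1 = flows opened by that host, $2 = flows opened by everyone else
flow_share() {
    echo $(( 100 * $1 / ($1 + $2) ))
}

sfq_leaf
GREEDY=$(flow_share 50 5)    # one host with 50 of the 55 flows
POLITE=$(flow_share 1 54)    # one host with 1 of the 55 flows
echo "# host with 50 of 55 flows: about ${GREEDY}% of the queue"
echo "# host with 1 of 55 flows: about ${POLITE}% of the queue"
```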
-----------------------------------------------------------------------------

8.5. Sharing/splitting bandwidth based on IP

For many administrators this is the ideal method of dividing bandwidth
amongst their users. Unfortunately, there is no easy solution, and it becomes
increasingly complex with the number of machines sharing a network link.

To divide bandwidth equitably between N IP addresses, there must be N
classes.
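
One way to generate the N classes is with a small loop, sketched below for
HTB. The commands are printed rather than executed. The interface name eth0
(assumed to be the interface facing the hosts), the link speed, the
addresses and the class numbering are all hypothetical, and an even split
with borrowing up to the full link is only one possible policy.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands; pipe the output to "sh" as root to apply.
DEV=eth0                 # hypothetical interface facing the hosts
LINK=1000                # link bandwidth in kbit/s
HOSTS="192.168.1.10 192.168.1.11 192.168.1.12 192.168.1.13"

per_ip_classes() {
    # Count the hosts and give each an equal guaranteed share.
    n=0
    for ip in $HOSTS; do
        n=$(( n + 1 ))
    done
    share=$(( LINK / n ))

    echo "tc qdisc add dev $DEV root handle 1: htb"
    echo "tc class add dev $DEV parent 1: classid 1:1 htb rate ${LINK}kbit ceil ${LINK}kbit"

    # One class and one filter per IP address.
    id=10
    for ip in $HOSTS; do
        echo "tc class add dev $DEV parent 1:1 classid 1:$id htb rate ${share}kbit ceil ${LINK}kbit"
        echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip dst $ip/32 flowid 1:$id"
        id=$(( id + 1 ))
    done
}

per_ip_classes
```

The number of classes and filters grows with every address added, which is
the complexity referred to above.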
-----------------------------------------------------------------------------

9. Scripts for use with QoS/Traffic Control

-----------------------------------------------------------------------------

9.1. wondershaper

More to come, see [http://lartc.org/wondershaper/] wondershaper.
-----------------------------------------------------------------------------

9.2. ADSL Bandwidth HOWTO script (myshaper)

More to come, see
[http://www.tldp.org/HOWTO/ADSL-Bandwidth-Management-HOWTO/implementation.html]
myshaper.
-----------------------------------------------------------------------------

9.3. htb.init

More to come, see htb.init.
-----------------------------------------------------------------------------

9.4. tcng.init

More to come, see tcng.init.
-----------------------------------------------------------------------------

9.5. cbq.init

More to come, see cbq.init.
-----------------------------------------------------------------------------

10. Diagram

-----------------------------------------------------------------------------

10.1. General diagram

Below is a general diagram of the relationships of the components of a
classful queuing discipline (HTB pictured). A larger version of the diagram
is [http://linux-ip.net/traffic-control/htb-class.png] available.

Example 11. An example HTB tcng configuration

/*
 *
 *  possible mock up of diagram shown at
 *  http://linux-ip.net/traffic-control/htb-class.png
 *
 */

$m_web = trTCM (
                 cir 512  kbps,  /* committed information rate */
                 cbs 10   kB,    /* burst for CIR */
                 pir 1024 kbps,  /* peak information rate */
                 pbs 10   kB     /* burst for PIR */
               ) ;

dev eth0 {
    egress {

        class ( <$web> )  if tcp_dport == PORT_HTTP && __trTCM_green( $m_web );
        class ( <$bulk> ) if tcp_dport == PORT_HTTP && __trTCM_yellow( $m_web );
        drop              if __trTCM_red( $m_web );
        class ( <$bulk> ) if tcp_dport == PORT_SSH ;

        htb () {  /* root qdisc */

            class ( rate 1544kbps, ceil 1544kbps ) {  /* root class */

                $web  = class ( rate 512kbps, ceil  512kbps ) { sfq ; } ;
                $bulk = class ( rate 512kbps, ceil 1544kbps ) { sfq ; } ;

            }
        }
    }
}

[htb-class]
-----------------------------------------------------------------------------

11. Annotated Traffic Control Links

This section identifies a number of links to documentation about traffic
control and Linux traffic control software. Each link will be listed with a
brief description of the content at that site.

  * HTB site, HTB user guide and HTB theory (Martin "devik" Devera)

    Hierarchical Token Bucket, HTB, is a classful queuing discipline.
    Widely used and supported, it is also fairly well documented in the user
    guide and at [http://www.docum.org/] Stef Coene's site (see below).

  * General Quality of Service docs (Leonardo Balliache)

    His site offers a good deal of approachable introductory documentation,
    including some excellent overview material. See, in particular, the
    detailed [http://opalsoft.net/qos/DS.htm] Linux QoS document.

  * tcng (Traffic Control Next Generation) and tcng manual (Werner
    Almesberger)

    The tcng software includes a language and a set of tools for creating
    and testing traffic control structures. In addition to generating tc
    commands as output, it is also capable of providing output for non-Linux
    applications. A key piece of the tcng suite which is ignored in this
    documentation is the tcsim traffic control simulator.

    The user manual provided with the tcng software has been converted to
    HTML with latex2html. The distribution comes with the TeX documentation.

  * iproute2 and iproute2 manual (Alexey Kuznetsov)

    This is the source code for the iproute2 suite, which includes the
    essential tc binary. Note that as of
    iproute2-2.4.7-now-ss020116-try.tar.gz, the package did not support HTB,
    so a patch available from the [http://luxik.cdi.cz/~devik/qos/htb/] HTB
    site will be required.

    The manual documents the entire suite of tools, although the tc utility
    is not adequately documented here. The ambitious reader is referred to
    the LARTC HOWTO after consuming this introduction.

  * Documentation, graphs, scripts and guidelines to traffic control under
    Linux (Stef Coene)

    Stef Coene has been gathering statistics and test results, scripts and
    tips for the use of QoS under Linux. There are some particularly useful
    graphs and guidelines available for implementing traffic control at
    Stef's site.

  * [http://lartc.org/howto/] LARTC HOWTO (bert hubert, et al.)

    The Linux Advanced Routing and Traffic Control HOWTO is one of the key
    sources of data about the sophisticated techniques which are available
    for use under Linux. The Traffic Control Introduction HOWTO should
    provide the reader with enough background in the language and concepts of
    traffic control. The LARTC HOWTO is the next place the reader should look
    for general traffic control information.

  * Guide to IP Networking with Linux (Martin A. Brown)

    Not directly related to traffic control, this site includes articles
    and general documentation on the behaviour of the Linux IP layer.

  * Werner Almesberger's Papers

    Werner Almesberger is one of the main developers and champions of
    traffic control under Linux (he's also the author of tcng, above). One of
    the key documents describing the entire traffic control architecture of
    the Linux kernel is his Linux Traffic Control - Implementation Overview
    which is available in [http://www.almesberger.net/cv/papers/tcio8.pdf]
    PDF or [http://www.almesberger.net/cv/papers/tcio8.ps.gz] PS format.

  * Linux DiffServ project

    Mercilessly snipped from the main page of the DiffServ site...

        Differentiated Services (short: Diffserv) is an architecture for
        providing different types or levels of service for network traffic.
        One key characteristic of Diffserv is that flows are aggregated in
        the network, so that core routers only need to distinguish a
        comparably small number of aggregated flows, even if those flows
        contain thousands or millions of individual flows.

Notes

[1]  See Section 5 for more details on the use or installation of a
     particular traffic control mechanism, kernel or command line utility.
[2]  This queueing model has long been used in civilized countries to
     distribute scant food or provisions equitably. William Faulkner is
     reputed to have walked to the front of the line to fetch his share
     of ice, proving that not everybody likes the FIFO model, and providing
     us a model for considering priority queuing.
[3]  Similarly, the entire traffic control system appears as a queue or
     scheduler to the higher layer which is enqueuing packets into this
     layer.
[4]  This smoothing effect is not always desirable, hence the HTB parameters
     burst and cburst.
[5]  A classful qdisc can only have children classes of its type. For
     example, an HTB qdisc can only have HTB classes as children. A CBQ qdisc
     cannot have HTB classes as children.
[6]  In this case, you'll have a filter which uses a classifier to select the
     packets you wish to drop. Then you'll use a policer with a drop action,
     like this: police rate 1bps burst 1 action drop/drop.
[7]  I do not know the range nor base of these numbers. I believe they are
     u32 hexadecimal, but need to confirm this.
[8]  The options listed in this example are taken from a 2.4.20 kernel source
     tree. The exact options may differ slightly from kernel release to
     kernel release depending on patches and new schedulers and classifiers.
[9]  HTB will report bandwidth usage in this scenario incorrectly. It will
     calculate the bandwidth used by quantum instead of the real dequeued
     packet size. This can skew results quickly.
[10] In fact, the Intermediate Queuing Device (IMQ) simulates an output
     device onto which traffic control structures can be attached. This
     clever solution allows a networking device to shape ingress traffic in
     the same fashion as egress traffic. Despite the apparent contradiction
     of the rule, IMQ appears as a device to the kernel. Thus, there has been
     no violation of the rule, but rather a sneaky reinterpretation of that
     rule.