<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>Traffic Control HOWTO</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"></HEAD
><BODY
CLASS="article"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="ARTICLE"
><DIV
CLASS="TITLEPAGE"
><H1
CLASS="title"
><A
NAME="AEN2"
></A
>Traffic Control HOWTO</H1
><H2
CLASS="subtitle"
>Version 1.0.2</H2
><H3
CLASS="author"
><A
NAME="AEN5"
>Martin A. Brown</A
></H3
><DIV
CLASS="affiliation"
><SPAN
CLASS="orgname"
>&#13; <A
HREF="http://linux-ip.net/"
TARGET="_top"
>linux-ip.net</A
>
<BR></SPAN
><DIV
CLASS="address"
><P
CLASS="address"
><TT
CLASS="email"
>&#60;<A
HREF="mailto:martin@linux-ip.net"
>martin@linux-ip.net</A
>&#62;</TT
></P
></DIV
></DIV
><P
CLASS="pubdate"
>"Oct 2006"
<BR></P
><DIV
CLASS="revhistory"
><TABLE
WIDTH="100%"
BORDER="0"
><TR
><TH
ALIGN="LEFT"
VALIGN="TOP"
COLSPAN="3"
><B
>Revision History</B
></TH
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 1.0.2</TD
><TD
ALIGN="LEFT"
>2006-10-28</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>Add references to HFSC, alter author email addresses</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 1.0.1</TD
><TD
ALIGN="LEFT"
>2003-11-17</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>Added link to Leonardo Balliache's documentation</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 1.0</TD
><TD
ALIGN="LEFT"
>2003-09-24</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>reviewed and approved by TLDP</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.7</TD
><TD
ALIGN="LEFT"
>2003-09-14</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>incremental revisions, proofreading, ready for TLDP</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.6</TD
><TD
ALIGN="LEFT"
>2003-09-09</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>minor editing, corrections from Stef Coene</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.5</TD
><TD
ALIGN="LEFT"
>2003-09-01</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>HTB section mostly complete, more diagrams, LARTC pre-release</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.4</TD
><TD
ALIGN="LEFT"
>2003-08-30</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>added diagram</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.3</TD
><TD
ALIGN="LEFT"
>2003-08-29</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>substantial completion of classless, software, rules,
elements and components sections</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.2</TD
><TD
ALIGN="LEFT"
>2003-08-23</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>major work on overview, elements, components and
software sections</TD
></TR
><TR
><TD
ALIGN="LEFT"
>Revision 0.1</TD
><TD
ALIGN="LEFT"
>2003-08-15</TD
><TD
ALIGN="LEFT"
>Revised by: MAB</TD
></TR
><TR
><TD
ALIGN="LEFT"
COLSPAN="3"
>initial revision (outline complete)</TD
></TR
></TABLE
></DIV
><DIV
><DIV
CLASS="abstract"
><A
NAME="AEN71"
></A
><P
></P
><P
>&#13; Traffic control encompasses the sets of mechanisms and operations by
which packets are queued for transmission/reception on a network
interface. The operations include enqueuing, policing, classifying,
scheduling, shaping and dropping. This HOWTO provides an introduction
and overview of the capabilities and implementation of traffic control
under Linux.
</P
><P
></P
></DIV
></DIV
><DIV
CLASS="legalnotice"
><A
NAME="legalnotice"
></A
><P
></P
><P
>&#169; 2006, Martin A. Brown</P
><A
NAME="AEN17"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
>&#13; Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no invariant sections, with no Front-Cover Texts,
with no Back-Cover Text. A copy of the license is located at
<A
HREF="http://www.gnu.org/licenses/fdl.html"
TARGET="_top"
>http://www.gnu.org/licenses/fdl.html</A
>.
</P
></BLOCKQUOTE
><P
></P
></DIV
><HR></DIV
><DIV
CLASS="TOC"
><DL
><DT
><B
>Table of Contents</B
></DT
><DT
>1. <A
HREF="#intro"
>Introduction to Linux Traffic Control</A
></DT
><DD
><DL
><DT
>1.1. <A
HREF="#i-assumptions"
>Target audience and assumptions about the reader</A
></DT
><DT
>1.2. <A
HREF="#i-conventions"
>Conventions</A
></DT
><DT
>1.3. <A
HREF="#i-recommendation"
>Recommended approach</A
></DT
><DT
>1.4. <A
HREF="#i-missing"
>Missing content, corrections and feedback</A
></DT
></DL
></DD
><DT
>2. <A
HREF="#overview"
>Overview of Concepts</A
></DT
><DD
><DL
><DT
>2.1. <A
HREF="#o-what-is"
>What is it?</A
></DT
><DT
>2.2. <A
HREF="#o-why-use"
>Why use it?</A
></DT
><DT
>2.3. <A
HREF="#o-advantages"
>Advantages</A
></DT
><DT
>2.4. <A
HREF="#o-disadvantages"
>Disadvantages</A
></DT
><DT
>2.5. <A
HREF="#o-queues"
>Queues</A
></DT
><DT
>2.6. <A
HREF="#o-flows"
>Flows</A
></DT
><DT
>2.7. <A
HREF="#o-tokens"
>Tokens and buckets</A
></DT
><DT
>2.8. <A
HREF="#o-packets"
>Packets and frames</A
></DT
></DL
></DD
><DT
>3. <A
HREF="#elements"
>Traditional Elements of Traffic Control</A
></DT
><DD
><DL
><DT
>3.1. <A
HREF="#e-shaping"
>Shaping</A
></DT
><DT
>3.2. <A
HREF="#e-scheduling"
>Scheduling</A
></DT
><DT
>3.3. <A
HREF="#e-classifying"
>Classifying</A
></DT
><DT
>3.4. <A
HREF="#e-policing"
>Policing</A
></DT
><DT
>3.5. <A
HREF="#e-dropping"
>Dropping</A
></DT
><DT
>3.6. <A
HREF="#e-marking"
>Marking</A
></DT
></DL
></DD
><DT
>4. <A
HREF="#components"
>Components of Linux Traffic Control</A
></DT
><DD
><DL
><DT
>4.1. <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
></DT
><DT
>4.2. <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
></DT
><DT
>4.3. <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
></DT
><DT
>4.4. <A
HREF="#c-classifier"
>classifier</A
></DT
><DT
>4.5. <A
HREF="#c-police"
>policer</A
></DT
><DT
>4.6. <A
HREF="#c-drop"
><TT
CLASS="constant"
>drop</TT
></A
></DT
><DT
>4.7. <A
HREF="#c-handle"
><TT
CLASS="constant"
>handle</TT
></A
></DT
></DL
></DD
><DT
>5. <A
HREF="#software"
>Software and Tools</A
></DT
><DD
><DL
><DT
>5.1. <A
HREF="#s-kernel"
>Kernel requirements</A
></DT
><DT
>5.2. <A
HREF="#s-iproute2"
><B
CLASS="command"
>iproute2</B
> tools (<B
CLASS="command"
>tc</B
>)</A
></DT
><DT
>5.3. <A
HREF="#s-tcng"
><B
CLASS="command"
>tcng</B
>, Traffic Control Next Generation</A
></DT
><DT
>5.4. <A
HREF="#s-imq"
>IMQ, Intermediate Queuing device</A
></DT
></DL
></DD
><DT
>6. <A
HREF="#classless-qdiscs"
>Classless Queuing Disciplines (<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s)</A
></DT
><DD
><DL
><DT
>6.1. <A
HREF="#qs-fifo"
>FIFO, First-In First-Out (<TT
CLASS="constant"
>pfifo</TT
> and <TT
CLASS="constant"
>bfifo</TT
>)</A
></DT
><DT
>6.2. <A
HREF="#qs-pfifo_fast"
><TT
CLASS="constant"
>pfifo_fast</TT
>, the default Linux qdisc</A
></DT
><DT
>6.3. <A
HREF="#qs-sfq"
>SFQ, Stochastic Fair Queuing</A
></DT
><DT
>6.4. <A
HREF="#qs-esfq"
>ESFQ, Extended Stochastic Fair Queuing</A
></DT
><DT
>6.5. <A
HREF="#qs-gred"
>GRED, Generic Random Early Drop</A
></DT
><DT
>6.6. <A
HREF="#qs-tbf"
>TBF, Token Bucket Filter</A
></DT
></DL
></DD
><DT
>7. <A
HREF="#classful-qdiscs"
>Classful Queuing Disciplines (<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s)</A
></DT
><DD
><DL
><DT
>7.1. <A
HREF="#qc-htb"
>HTB, Hierarchical Token Bucket</A
></DT
><DT
>7.2. <A
HREF="#qc-hfsc"
>HFSC, Hierarchical Fair Service Curve</A
></DT
><DT
>7.3. <A
HREF="#qc-prio"
>PRIO, priority scheduler</A
></DT
><DT
>7.4. <A
HREF="#qc-cbq"
>CBQ, Class Based Queuing</A
></DT
></DL
></DD
><DT
>8. <A
HREF="#rules"
>Rules, Guidelines and Approaches</A
></DT
><DD
><DL
><DT
>8.1. <A
HREF="#r-general"
>General Rules of Linux Traffic Control</A
></DT
><DT
>8.2. <A
HREF="#r-known-bandwidth"
>Handling a link with a known bandwidth</A
></DT
><DT
>8.3. <A
HREF="#r-unknown-bandwidth"
>Handling a link with a variable (or unknown) bandwidth</A
></DT
><DT
>8.4. <A
HREF="#r-sharing-flows"
>Sharing/splitting bandwidth based on flows</A
></DT
><DT
>8.5. <A
HREF="#r-sharing-ips"
>Sharing/splitting bandwidth based on IP</A
></DT
></DL
></DD
><DT
>9. <A
HREF="#scripts"
>Scripts for use with QoS/Traffic Control</A
></DT
><DD
><DL
><DT
>9.1. <A
HREF="#sc-wondershaper"
>wondershaper</A
></DT
><DT
>9.2. <A
HREF="#sc-myshaper"
>ADSL Bandwidth HOWTO script (<TT
CLASS="filename"
>myshaper</TT
>)</A
></DT
><DT
>9.3. <A
HREF="#sc-htb.init"
><TT
CLASS="filename"
>htb.init</TT
></A
></DT
><DT
>9.4. <A
HREF="#sc-tcng.init"
><TT
CLASS="filename"
>tcng.init</TT
></A
></DT
><DT
>9.5. <A
HREF="#sc-cbq.init"
><TT
CLASS="filename"
>cbq.init</TT
></A
></DT
></DL
></DD
><DT
>10. <A
HREF="#diagram"
>Diagram</A
></DT
><DD
><DL
><DT
>10.1. <A
HREF="#d-general"
>General diagram</A
></DT
></DL
></DD
><DT
>11. <A
HREF="#links"
>Annotated Traffic Control Links</A
></DT
></DL
></DIV
><DIV
CLASS="section"
><H1
CLASS="section"
><A
NAME="intro"
></A
>1. Introduction to Linux Traffic Control</H1
><P
>&#13; Linux offers a very rich set of tools for managing and manipulating the
transmission of packets. The larger Linux community is very familiar with
the tools available under Linux for packet mangling and firewalling
(netfilter, and before that, ipchains) as well as hundreds of network
services which can run on the operating system. Few inside the community
and fewer outside the Linux community are aware of the tremendous power of
the traffic control subsystem which has grown and matured under kernels
2.2 and 2.4.
</P
><P
> This HOWTO aims to introduce the
<A
HREF="#overview"
>concepts of traffic control</A
>,
<A
HREF="#elements"
>the traditional elements (in general)</A
>,
<A
HREF="#components"
>the components of the Linux traffic control
implementation</A
> and provide some
<A
HREF="#rules"
>guidelines</A
>
.
This HOWTO represents the collection, amalgamation and synthesis of the
<A
HREF="http://lartc.org/howto/"
TARGET="_top"
>LARTC HOWTO</A
>, documentation from individual projects and, importantly,
the <A
HREF="http://mailman.ds9a.nl/mailman/listinfo/lartc/"
TARGET="_top"
>LARTC
mailing list</A
> over a period of study.
</P
><P
> The impatient soul who simply wishes to experiment right now is
referred to the <A
HREF="http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/"
TARGET="_top"
>&#13; Traffic Control using tcng and HTB HOWTO</A
> and <A
HREF="http://lartc.org/howto/"
TARGET="_top"
>LARTC HOWTO</A
> for
immediate satisfaction.
</P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="i-assumptions"
></A
>1.1. Target audience and assumptions about the reader</H2
><P
>&#13; The target audience for this HOWTO is the network administrator or savvy
home user who desires an introduction to the field of traffic control
and an overview of the tools available under Linux for implementing
traffic control.
</P
><P
>&#13; I assume that the reader is comfortable with UNIX concepts and the
command line and has a basic knowledge of IP networking. Users who wish
to implement traffic control may require the ability to patch, compile
and install a kernel or software package
<A
NAME="AEN91"
HREF="#FTN.AEN91"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
>. For users with newer kernels
(2.4.20+, see also
<A
HREF="#s-kernel"
>Section 5.1</A
>), however, the ability to install and use
software may be all that is required.
</P
><P
> Broadly speaking, this HOWTO was written with a sophisticated user in
mind, perhaps one who has already had experience with traffic control
under Linux. Even so, I assume that the reader may have
no prior traffic control experience.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="i-conventions"
></A
>1.2. Conventions</H2
><P
>&#13; This text was written in
<A
HREF="http://www.docbook.org/"
TARGET="_top"
>DocBook</A
>
(<A
HREF="http://www.docbook.org/xml/4.2/index.html"
TARGET="_top"
>version 4.2</A
>)
with
<A
HREF="http://vim.sourceforge.net/"
TARGET="_top"
><B
CLASS="command"
>vim</B
></A
>.
All formatting has been applied by
<A
HREF="http://xmlsoft.org/XSLT/"
TARGET="_top"
>xsltproc</A
> based on
<A
HREF="http://docbook.sourceforge.net/projects/xsl/"
TARGET="_top"
>DocBook
XSL</A
> and
<A
HREF="http://www.tldp.org/LDP/LDP-Author-Guide/usingldpxsl.html"
TARGET="_top"
>LDP
XSL</A
> stylesheets. Typeface formatting and display conventions
are similar to most printed and electronically distributed technical
documentation.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="i-recommendation"
></A
>1.3. Recommended approach</H2
><P
> I strongly recommend that the eager reader making a first foray into the
discipline of traffic control become only casually familiar with the
<A
HREF="#s-iproute2-tc"
><B
CLASS="command"
>tc</B
></A
> command line utility, before concentrating on <A
HREF="#s-tcng"
><B
CLASS="command"
>tcng</B
></A
>. The
<B
CLASS="command"
>tcng</B
> software package defines an entire language for describing
traffic control structures.
At first, this language may seem daunting, but mastery of these basics
will quickly provide the user with a much wider ability to employ (and
deploy) traffic control configurations than the direct use of <B
CLASS="command"
>tc</B
>
would afford.
</P
><P
> Where possible, I'll describe the behaviour of
the Linux traffic control system in an abstract manner, although in
many cases I'll need to supply the syntax of one or the other common
systems for defining these structures. I may not supply examples in
both the <B
CLASS="command"
>tcng</B
> language and the <B
CLASS="command"
>tc</B
> command line, so the wise user
will have some familiarity with both.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="i-missing"
></A
>1.4. Missing content, corrections and feedback</H2
><P
>&#13; There is content yet missing from this HOWTO. In particular, the
following items will be added at some point to this documentation.
</P
><P
></P
><UL
><LI
><P
>&#13; A description and diagram of GRED, WRR, PRIO
and CBQ.
</P
></LI
><LI
><P
>&#13; A section of examples.
</P
></LI
><LI
><P
>&#13; A section detailing the classifiers.
</P
></LI
><LI
><P
>&#13; A section discussing the techniques for measuring traffic.
</P
></LI
><LI
><P
>&#13; A section covering meters.
</P
></LI
><LI
><P
>&#13; More details on <B
CLASS="command"
>tcng</B
>.
</P
></LI
></UL
><P
>&#13; I welcome suggestions, corrections and feedback at <TT
CLASS="email"
>&#60;<A
HREF="mailto:martin@linux-ip.net"
>martin@linux-ip.net</A
>&#62;</TT
>. All errors
and omissions are strictly my fault. Although I have made every effort
to verify the factual correctness of the content presented herein, I
cannot accept any responsibility for actions taken under the influence
of this documentation.
</P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="overview"
></A
>2. Overview of Concepts</H1
><P
>&#13; This section will
<A
HREF="#o-what-is"
>introduce traffic control</A
> and
<A
HREF="#o-why-use"
>examine reasons for it</A
>,
identify a few
<A
HREF="#o-advantages"
>advantages</A
> and
<A
HREF="#o-disadvantages"
>disadvantages</A
> and
introduce key concepts used in traffic control.
</P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-what-is"
></A
>2.1. What is it?</H2
><P
>&#13; Traffic control is the name given to the sets of queuing systems and
mechanisms by which packets are received and transmitted on a router.
This includes deciding which (and whether) packets to accept at what
rate on the input of an interface and determining which packets to
transmit in what order at what rate on the output of an interface.
</P
><P
>&#13; In the overwhelming majority of situations, traffic control consists of
a single queue which collects entering packets and dequeues them as
quickly as the hardware (or underlying device) can accept them. This
sort of queue is a FIFO.
</P
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
>The default qdisc under Linux is the <A
HREF="#qs-pfifo_fast"
><TT
CLASS="constant"
>pfifo_fast</TT
></A
>, which is
slightly more complex than the <A
HREF="#qs-fifo"
>FIFO</A
>.
</TD
></TR
></TABLE
></DIV
><P
>&#13; There are examples of queues in all sorts of software. The queue is a
way of organizing the pending tasks or data (see also
<A
HREF="#o-queues"
>Section 2.5</A
>). Because network links typically carry data
in a serialized fashion, a queue is required to manage the outbound
data packets.
</P
><P
> In the case of a desktop machine and an efficient web server sharing the
same uplink to the Internet, the following contention for bandwidth may
occur. The web server may be able to fill up the output queue on the
router faster than the data can be transmitted across the link, at which
point the router starts to drop packets (its buffer is full!). Now, the
desktop machine (with an interactive application user) may be faced with
packet loss and high latency. Note that high latency sometimes leads to
screaming users! By separating the internal queues used to service
these two different classes of application, there can be better sharing
of the network resource between the two applications.
</P
><P
>&#13; Traffic control is the set of tools which allows the user to have
granular control over these queues and the queuing mechanisms of a
networked device. The power to rearrange traffic flows and packets with
these tools is tremendous and can be complicated, but is no substitute
for adequate bandwidth.
</P
><P
>&#13; The term Quality of Service (QoS) is often used as a synonym for traffic
control.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-why-use"
></A
>2.2. Why use it?</H2
><P
>&#13; Packet-switched networks differ from circuit based networks in one very
important regard. A packet-switched network itself is stateless. A
circuit-based network (such as a telephone network) must hold state
within the network. IP networks are stateless, packet-switched
networks by design; in fact, this statelessness is one of the
fundamental strengths of IP.
</P
><P
>&#13; The weakness of this statelessness is the lack of differentiation
between types of flows. In simplest terms, traffic control allows an
administrator to queue packets differently based on attributes of the
packet. It can even be used to simulate the behaviour of a
circuit-based network. This introduces statefulness into the stateless
network.
</P
><P
>&#13; There are many practical reasons to consider traffic control, and many
scenarios in which using traffic control makes sense. Below are some
examples of common problems which can be solved or at least ameliorated
with these tools.
</P
><P
>&#13; The list below is not an exhaustive list of the sorts of solutions
available to users of traffic control, but introduces the
types of problems that can be solved by using traffic control to
maximize the usability of a network connection.
</P
><P
></P
><P
><B
>Common traffic control solutions</B
></P
><UL
><LI
><P
>&#13; Limit total bandwidth to a known rate; <A
HREF="#qs-tbf"
>TBF</A
>,
<A
HREF="#qc-htb"
>HTB</A
> with child class(es).
</P
></LI
><LI
><P
>&#13; Limit the bandwidth of a particular user, service or client;
<A
HREF="#qc-htb"
>HTB</A
> classes and <A
HREF="#e-classifying"
>classifying</A
> with a
<A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>.
</P
></LI
><LI
><P
>&#13; Maximize TCP throughput on an asymmetric link; prioritize
transmission of ACK packets, <A
HREF="#sc-wondershaper"
>wondershaper</A
>.
</P
></LI
><LI
><P
>&#13; Reserve bandwidth for a particular application or user;
<A
HREF="#qc-htb"
>HTB</A
> with child classes and <A
HREF="#e-classifying"
>classifying</A
>.
</P
></LI
><LI
><P
>&#13; Prefer latency sensitive traffic; <A
HREF="#qc-prio"
>PRIO</A
> inside an
<A
HREF="#qc-htb"
>HTB</A
> class.
</P
></LI
><LI
><P
> Manage oversubscribed bandwidth; <A
HREF="#qc-htb"
>HTB</A
> with borrowing.
</P
></LI
><LI
><P
>&#13; Allow equitable distribution of unreserved bandwidth; <A
HREF="#qc-htb"
>HTB</A
>
with borrowing.
</P
></LI
><LI
><P
>&#13; Ensure that a particular type of traffic is dropped; <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
>
attached to a <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> with a <A
HREF="#c-drop"
><TT
CLASS="constant"
>drop</TT
></A
> action.
</P
></LI
></UL
><P
> Remember, too, that sometimes it is simply better to purchase more
bandwidth. Traffic control does not solve all problems!
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-advantages"
></A
>2.3. Advantages</H2
><P
>&#13; When properly employed, traffic control should lead to more predictable
usage of network resources and less volatile contention for these
resources. The network then meets the goals of the traffic control
configuration. Bulk download traffic can be allocated a reasonable
amount of bandwidth even as higher priority interactive traffic is
simultaneously
serviced. Even low priority data transfer such as mail can
be allocated bandwidth without tremendously affecting the other classes
of traffic.
</P
><P
>&#13; In a larger picture, if the traffic control configuration represents
policy which has been communicated to the users, then users (and,
by extension, applications) know what to expect from the network.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-disadvantages"
></A
>2.4. Disadvantages</H2
><P
>&#13; Complexity is easily one of the most significant disadvantages of using
traffic control. There are ways to become familiar with traffic control
tools which ease the learning curve about traffic control and its
mechanisms, but identifying a traffic control misconfiguration can be
quite a challenge.
</P
><P
>&#13; Traffic control when used appropriately can lead to more equitable
distribution of network resources. It can just as easily be installed
in an inappropriate manner leading to further and more divisive
contention for resources.
</P
><P
>&#13; The computing resources required on a router to support a traffic
control scenario need to be capable of handling the increased cost of
maintaining the traffic control structures. Fortunately, this is a
small incremental cost, but can become more significant as the
configuration grows in size and complexity.
</P
><P
>&#13; For personal use, there's no training cost associated with the use of
traffic control, but a company may find that purchasing more bandwidth
is a simpler solution than employing traffic control. Training
employees and ensuring depth of knowledge may be more costly than
investing in more bandwidth.
</P
><P
>&#13; Although
traffic control on packet-switched networks covers a larger conceptual
area, you can think of traffic control as a way to provide [some of] the
statefulness of a circuit-based network to a packet-switched network.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-queues"
></A
>2.5. Queues</H2
><P
>&#13; Queues form the backdrop for all of traffic control and are the integral
concept behind scheduling. A queue is a location (or buffer) containing
a finite number of items waiting for an action or service. In
networking, a queue is the place where packets (our units) wait to be
transmitted by the hardware (the service). In the simplest model,
packets are transmitted on a first-come, first-served basis
<A
NAME="AEN220"
HREF="#FTN.AEN220"
><SPAN
CLASS="footnote"
>[2]</SPAN
></A
>.
In the discipline of computer networking (and more generally
computer science), this sort of a queue is known as a FIFO.
</P
><P
>&#13; Without any other mechanisms, a queue doesn't offer any promise for
traffic control. There are only two interesting actions in a queue.
Anything entering a queue is enqueued into the queue. To remove an item
from a queue is to dequeue that item.
</P
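><P
> As a rough illustration of these two operations, the following Python
sketch models a finite FIFO which drops new arrivals once it is full.
The class name <TT
CLASS="constant"
>Fifo</TT
> and its <TT
CLASS="parameter"
><I
>limit</I
></TT
> parameter are invented for this example; this is a conceptual
model only, not the kernel implementation.
</P
><PRE
CLASS="programlisting"
>
```python
from collections import deque


class Fifo:
    """A finite first-in first-out queue that tail-drops when full."""

    def __init__(self, limit):
        self.limit = limit      # maximum number of packets held
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Add a packet at the tail; drop it if the queue is full."""
        if len(self.packets) >= self.limit:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the packet at the head, or None if empty."""
        if not self.packets:
            return None
        return self.packets.popleft()


queue = Fifo(limit=3)
for name in ["a", "b", "c", "d"]:
    queue.enqueue(name)          # "d" arrives at a full queue and is dropped

order = [queue.dequeue() for _ in range(3)]
print(order, queue.dropped)      # ['a', 'b', 'c'] 1
```
</PRE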
><P
>&#13; A queue becomes much more interesting when coupled with other mechanisms
which can delay packets, rearrange, drop and prioritize packets in
multiple queues. A queue can also use subqueues, which allow for
complexity of behaviour in a scheduling operation.
</P
><P
>&#13; From the perspective of the higher layer software, a packet is simply
enqueued for transmission, and the manner and order in which the
enqueued packets are transmitted is immaterial to the higher layer. So,
to the higher layer, the entire traffic control system may appear as a
single queue
<A
NAME="AEN225"
HREF="#FTN.AEN225"
><SPAN
CLASS="footnote"
>[3]</SPAN
></A
>.
It is only by examining the internals of this layer that
the traffic control structures become exposed and available.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-flows"
></A
>2.6. Flows</H2
><P
>&#13; A flow is a distinct connection or conversation between two hosts. Any
unique set of packets between two hosts can be regarded as a flow.
Under TCP the concept of a connection with a source IP and port and
destination IP and port represents a flow. A UDP flow can be similarly
defined.
</P
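><P
> One way to picture a flow is as a lookup key built from the packet
headers. The Python sketch below groups a few made-up packets by such a
key. The helper <TT
CLASS="constant"
>flow_key</TT
>, the dictionary layout and the addresses are assumptions made for
illustration, and including the protocol in the key is a common
convention rather than something this HOWTO prescribes.
</P
><PRE
CLASS="programlisting"
>
```python
from collections import defaultdict


def flow_key(packet):
    """Identify a flow by protocol, source and destination address and port."""
    return (packet["proto"], packet["src"], packet["sport"],
            packet["dst"], packet["dport"])


# Hypothetical packets; the addresses come from documentation ranges.
packets = [
    {"proto": "tcp", "src": "192.0.2.10", "sport": 40000,
     "dst": "198.51.100.5", "dport": 80, "size": 1500},
    {"proto": "tcp", "src": "192.0.2.10", "sport": 40001,
     "dst": "198.51.100.5", "dport": 80, "size": 1500},
    {"proto": "tcp", "src": "192.0.2.10", "sport": 40000,
     "dst": "198.51.100.5", "dport": 80, "size": 500},
]

bytes_per_flow = defaultdict(int)
for packet in packets:
    bytes_per_flow[flow_key(packet)] += packet["size"]

# Two flows between the same pair of hosts: they differ only in source port.
for key, total in bytes_per_flow.items():
    print(key, total)
```
</PRE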
><P
>&#13; Traffic control mechanisms frequently separate traffic into classes of
flows which can be aggregated and transmitted as an aggregated flow
(consider DiffServ). Alternate mechanisms may attempt to divide
bandwidth equally based on the individual flows.
</P
><P
>&#13; Flows become important when attempting to divide bandwidth equally among
a set of competing flows, especially when some applications deliberately
build a large number of flows.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-tokens"
></A
>2.7. Tokens and buckets</H2
><A
NAME="o-buckets"
></A
><P
> Two of the key underpinnings of <A
HREF="#e-shaping"
>shaping</A
> mechanisms are
the interrelated concepts of tokens and buckets.
</P
><P
>&#13; In order to control the rate of dequeuing, an implementation can count
the number of packets or bytes dequeued as each item is dequeued,
although this requires complex usage of timers and measurements to limit
accurately. Instead of calculating the current usage and time, one
method, used widely in traffic control, is to generate tokens at a
desired rate, and only dequeue packets or bytes if a token is available.
</P
><P
>&#13; Consider the analogy of an amusement park ride with a queue of people
waiting to experience the ride. Let's imagine carts
traversing a fixed track. The carts arrive at the head of the queue at a
fixed rate. In order to enjoy the ride, each person must wait for an
available cart. The cart is analogous to a token and the person is
analogous to a packet. Again, this mechanism is a rate-limiting or
<A
HREF="#e-shaping"
>shaping</A
> mechanism. Only a certain number of people can
experience the ride in a particular period.
</P
><P
>&#13; To extend the analogy, imagine an empty line for the amusement park
ride and a large number of carts sitting on the track ready to carry
people. If a large number of people entered the line together many
(maybe all) of them could experience the ride because of the carts
available and waiting. The number of carts available is a concept
analogous to the bucket. A bucket contains a number of tokens and can
use all of the tokens in the bucket without regard for the passage of time.
</P
><P
>&#13; And to complete the analogy, the carts on the amusement park ride (our
tokens) arrive at a fixed rate and are only kept available up to the
size of the bucket. So, the bucket is filled with tokens according to
the rate, and if the tokens are not used, the bucket can fill up. If
tokens are used the bucket will not fill up. Buckets are a key concept
in supporting bursty traffic such as HTTP.
</P
><P
>&#13; The <A
HREF="#qs-tbf"
>TBF</A
> qdisc is a classical example of a shaper (the section
on TBF includes a diagram which may help to visualize the token
and bucket concepts). The TBF generates tokens at <TT
CLASS="parameter"
><I
>rate</I
></TT
> and
only transmits packets when a token is available. Tokens are a generic
shaping concept.
</P
><P
>&#13; In the case that a queue does not need tokens immediately, the tokens
can be collected until they are needed. To collect tokens indefinitely
would negate any benefit of shaping so tokens are collected until a
certain number of tokens has been reached. Now, the queue has tokens
available for a large number of packets or bytes which need to be
dequeued. These intangible tokens are stored in an intangible bucket,
and the number of tokens that can be stored depends on the size of the
bucket.
</P
><P
>&#13; This also means that a bucket full of tokens may be available at any
instant. Very predictable regular traffic can be handled by small
buckets. Larger buckets may be required for burstier traffic, unless
one of the desired goals is to reduce the burstiness of the
<A
HREF="#o-flows"
>flows</A
>.
</P
><P
> In summary, tokens are generated at the configured rate, and a maximum of a bucket's
worth of tokens may be collected. This allows bursty traffic to be
handled, while smoothing and shaping the transmitted traffic.
</P
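><P
> The interplay of rate and bucket size can be tried out with a small
simulation. The Python sketch below is a simplified model, assuming
discrete time ticks and a cost of one token per packet; the names <TT
CLASS="constant"
>TokenBucket</TT
> and <TT
CLASS="constant"
>shape</TT
> are invented for this example and do not correspond to any kernel
or <B
CLASS="command"
>tc</B
> interface.
</P
><PRE
CLASS="programlisting"
>
```python
class TokenBucket:
    """Tokens accumulate at `rate` per tick, up to `size` tokens."""

    def __init__(self, rate, size):
        self.rate = rate
        self.size = size
        self.tokens = size       # start with a full bucket

    def tick(self):
        """Advance time by one tick; surplus tokens are discarded."""
        self.tokens = min(self.size, self.tokens + self.rate)

    def take(self, count=1):
        """Spend `count` tokens if they are available."""
        if self.tokens >= count:
            self.tokens -= count
            return True
        return False


def shape(arrivals, rate, size):
    """Return how many packets leave during each tick.

    arrivals[i] is the number of packets that arrive during tick i;
    each packet costs one token.
    """
    bucket = TokenBucket(rate, size)
    backlog = 0
    sent_per_tick = []
    for arriving in arrivals:
        backlog += arriving
        sent = 0
        while backlog > 0 and bucket.take():
            backlog -= 1
            sent += 1
        sent_per_tick.append(sent)
        bucket.tick()
    return sent_per_tick


# A burst of 8 packets, then silence: rate 2 per tick, bucket of 5.
print(shape([8, 0, 0, 0], rate=2, size=5))   # [5, 2, 1, 0]
```
</PRE
><P
> The full bucket lets the first five packets of the burst leave at
once; the remainder drain only as fast as new tokens are generated.
</P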
><P
>&#13; The concepts of tokens and buckets are closely interrelated and are used
in both <A
HREF="#qs-tbf"
>TBF</A
> (one of the <A
HREF="#classless-qdiscs"
>classless qdiscs</A
>) and
<A
HREF="#qc-htb"
>HTB</A
> (one of the <A
HREF="#classful-qdiscs"
>classful qdiscs</A
>).
Within the <B
CLASS="command"
>tcng</B
> language, the use of two- and three-color meters is
indubitably a token and bucket concept.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="o-packets"
></A
>2.8. Packets and frames</H2
><A
NAME="o-frames"
></A
><P
> The term for data sent across a network changes depending on the layer
the user is examining. This document will rather impolitely (and
incorrectly) gloss over the technical distinction between
packets and frames although they are outlined here.
</P
><P
>&#13; The word frame is typically used to describe a layer 2 (data link) unit
of data to be forwarded to the next recipient. Ethernet interfaces, PPP
interfaces, and T1 interfaces all name their layer 2 data unit a frame.
The frame is actually the unit on which traffic control is performed.
</P
><P
>&#13; A packet, on the other hand, is a higher layer concept, representing
layer 3 (network) units. The term packet is preferred in this
documentation, although it is slightly inaccurate.
</P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="elements"
></A
>3. Traditional Elements of Traffic Control</H1
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-shaping"
></A
>3.1. Shaping</H2
><P
>&#13; Shapers delay packets to meet a desired rate.
</P
><P
>&#13; Shaping is the mechanism by which packets are delayed before
transmission in an output queue to meet a desired output rate. This is
one of the most common desires of users seeking bandwidth control
solutions. The act of delaying a packet as part of a traffic control
solution makes every shaping mechanism into a non-work-conserving
mechanism, meaning roughly that the link may be left idle even though
packets are waiting in the queue.
</P
><P
>&#13; Viewed in reverse, a non-work-conserving queuing mechanism is performing
a shaping function. A work-conserving queuing mechanism (see
<A
HREF="#qc-prio"
>PRIO</A
>) would not be capable of delaying a packet.
</P
><P
>&#13; Shapers attempt to limit or ration traffic to meet but not exceed a
configured rate (frequently measured in packets per second or bits/bytes
per second). As a side effect, shapers can smooth out bursty traffic
<A
NAME="AEN271"
HREF="#FTN.AEN271"
><SPAN
CLASS="footnote"
>[4]</SPAN
></A
>.
One of the advantages of shaping is the ability to control the
latency of packets. The underlying mechanism for shaping to a rate is
typically a token bucket mechanism. See also
<A
HREF="#o-tokens"
>Section 2.7</A
> for further detail on tokens and buckets.
</P
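><P
>&#13; The relationship between tokens, packet size and delay can be sketched
with a little shell arithmetic. This is an illustration only and not the
way the kernel implements shaping; the function name
shaper_delay_ms and its three arguments are invented for this example.
</P
><PRE
CLASS="programlisting"
>
```shell
# shaper_delay_ms TOKENS SIZE RATE   (hypothetical helper, not part of tc)
#   TOKENS: bytes of credit currently in the bucket
#   SIZE:   size of the packet in bytes
#   RATE:   token refill rate in bytes per second
# Prints the number of milliseconds the packet must wait before sending.
shaper_delay_ms() {
    tokens=$1
    size=$2
    rate=$3
    if [ "$tokens" -ge "$size" ]; then
        echo 0                          # enough credit: send immediately
    else
        deficit=$(( size - tokens ))    # bytes of credit still missing
        # Round up so that the packet is never released early.
        echo $(( (deficit * 1000 + rate - 1) / rate ))
    fi
}

shaper_delay_ms 1500 1500 32000    # prints 0
shaper_delay_ms 0 1500 32000       # prints 47
```
</PRE
><P
>&#13; A shaper holds the packet for the computed time instead of discarding
it, which is exactly what distinguishes it from a policer.
</P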
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-scheduling"
></A
>3.2. Scheduling</H2
><P
>&#13; Schedulers arrange and/or rearrange packets for output.
</P
><P
>&#13; Scheduling is the mechanism by which packets are arranged (or
rearranged) between input and output of a particular queue. By far
the most common scheduler is the FIFO (first-in first-out)
scheduler. From a larger perspective, any set of traffic control
mechanisms on an output queue can be regarded as a scheduler, because
packets are arranged for output.
</P
><P
>&#13; Other generic scheduling mechanisms attempt to compensate for various
networking conditions. A fair queuing algorithm (see <A
HREF="#qs-sfq"
>SFQ</A
>)
attempts to prevent any single client or flow from dominating the
network usage. A round-robin algorithm (see WRR) gives each
flow or client a turn to dequeue packets. Other sophisticated
scheduling algorithms attempt to prevent backbone overload (see
<A
HREF="#qs-gred"
>GRED</A
>) or refine other scheduling mechanisms (see
<A
HREF="#qs-esfq"
>ESFQ</A
>).
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-classifying"
></A
>3.3. Classifying</H2
><P
>&#13; Classifiers sort or separate traffic into queues.
</P
><P
>&#13; Classifying is the mechanism by which packets are separated for
different treatment, possibly different output queues. During the
process of accepting, routing and transmitting a packet, a networking
device can classify the packet in a number of different ways.
Classification can include
<A
HREF="#e-marking"
>marking</A
> the packet, which usually
happens at the boundary of a network under a single administrative
control, or classification can occur on each hop individually.
</P
><P
>&#13; The Linux model (see
<A
HREF="#c-filter"
>Section 4.3</A
>) allows for a packet to cascade across a
series of classifiers in a traffic control structure and to be
classified in conjunction with
<A
HREF="#e-policing"
>policers</A
> (see also
<A
HREF="#c-police"
>Section 4.5</A
>).
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-policing"
></A
>3.4. Policing</H2
><P
>&#13; Policers measure and limit traffic in a particular queue.
</P
><P
>&#13; Policing, as an element of traffic control, is simply
a mechanism by which traffic can be limited. Policing is most
frequently used on the network border to ensure that a peer is not
consuming more than its allocated bandwidth. A policer will accept
traffic up to a certain rate, and then perform an action on traffic
exceeding this rate. A rather harsh solution is to
<A
HREF="#e-dropping"
>drop</A
> the traffic, although the
traffic could be
<A
HREF="#e-classifying"
>reclassified</A
> instead of being
dropped.
</P
><P
>&#13; A policer is a yes/no question about the rate at which traffic is
entering a queue. If the packet is about to enter a queue below a given
rate, take one action (allow the enqueuing). If the packet is about to
enter a queue above a given rate, take another action. Although the
policer uses a token bucket mechanism internally, it does not have the
capability to delay a packet as a <A
HREF="#e-shaping"
>shaping</A
> mechanism does.
</P
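><P
>&#13; The yes/no nature of a policer can be sketched as follows. This is a
simplified model under stated assumptions, not the kernel's policer: the
variable and function names are invented, and the rate and burst values
are merely examples.
</P
><PRE
CLASS="programlisting"
>
```shell
# Hypothetical policer state: a bucket holding byte credits.
POLICE_RATE=32000       # refill rate in bytes per second (assumed value)
POLICE_BURST=10240      # bucket depth in bytes (assumed value)
police_tokens=$POLICE_BURST
police_verdict=

# police_packet ELAPSED_MS SIZE
# Refill the bucket for ELAPSED_MS milliseconds, then answer the yes/no
# question for a packet of SIZE bytes. The answer is left in
# police_verdict. The packet is never delayed.
police_packet() {
    police_tokens=$(( police_tokens + POLICE_RATE * $1 / 1000 ))
    if [ "$police_tokens" -gt "$POLICE_BURST" ]; then
        police_tokens=$POLICE_BURST      # the bucket never overfills
    fi
    if [ "$police_tokens" -ge "$2" ]; then
        police_tokens=$(( police_tokens - $2 ))
        police_verdict=conform           # below the rate: allow enqueuing
    else
        police_verdict=exceed            # above the rate: drop or reclassify
    fi
}

police_packet 0 9000;   echo "$police_verdict"    # conform
police_packet 0 1500;   echo "$police_verdict"    # exceed
police_packet 100 1500; echo "$police_verdict"    # conform
```
</PRE
><P
>&#13; The second packet exceeds the rate because only 1240 bytes of credit
remain. After 100 milliseconds the bucket has gained 3200 bytes, so the
third packet conforms.
</P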
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-dropping"
></A
>3.5. Dropping</H2
><P
>&#13; Dropping discards an entire packet, flow or classification.
</P
><P
>&#13; Dropping a packet is a mechanism by which a packet is discarded.
</P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="e-marking"
></A
>3.6. Marking</H2
><P
>&#13; Marking is a mechanism by which the packet is altered.
</P
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
>This is not <TT
CLASS="constant"
>fwmark</TT
>. The <B
CLASS="command"
>iptables</B
> target <TT
CLASS="constant"
>MARK</TT
> and the
<B
CLASS="command"
>ipchains</B
> <TT
CLASS="option"
>--mark</TT
> option are used to modify packet metadata, not the packet
itself.
</TD
></TR
></TABLE
></DIV
><P
>&#13; Traffic control marking mechanisms set the DSCP (Differentiated
Services Code Point) in the packet's IP header, which is then used and
respected by other routers inside an administrative domain (usually
for DiffServ).
</P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="components"
></A
>4. Components of Linux Traffic Control</H1
><P
>&#13; </P
><P
>&#13; </P
><P
>&#13; </P
><DIV
CLASS="table"
><A
NAME="tb-c-components-correlation"
></A
><P
><B
>Table 1. Correlation between traffic control elements and Linux
components</B
></P
><TABLE
BORDER="1"
CLASS="CALSTABLE"
><THEAD
><TR
><TH
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>traditional element</TH
><TH
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>Linux component</TH
></TR
></THEAD
><TBODY
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-shaping"
>shaping</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>The <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
> offers shaping capabilities.</TD
></TR
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-scheduling"
>scheduling</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>A <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> is a scheduler. Schedulers
can be simple such as the FIFO or
complex, containing classes and other
qdiscs, such as HTB.</TD
></TR
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-classifying"
>classifying</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>The <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> object performs the
classification through the agency of a
<A
HREF="#c-classifier"
><TT
CLASS="constant"
>classifier</TT
></A
> object. Strictly speaking,
Linux classifiers cannot exist outside
of a filter.</TD
></TR
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-policing"
>policing</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>A <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
> exists in the Linux traffic
control implementation only as part of a
<A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>.</TD
></TR
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-dropping"
>dropping</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>To <A
HREF="#c-drop"
><TT
CLASS="constant"
>drop</TT
></A
> traffic requires a
<A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> with a <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
> which
uses <SPAN
CLASS="QUOTE"
>"drop"</SPAN
> as an action.</TD
></TR
><TR
><TD
WIDTH="25%"
ALIGN="LEFT"
VALIGN="MIDDLE"
><A
HREF="#e-marking"
>marking</A
></TD
><TD
WIDTH="75%"
ALIGN="LEFT"
VALIGN="MIDDLE"
>The <TT
CLASS="constant"
>dsmark</TT
> <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> is used for
marking.</TD
></TR
></TBODY
></TABLE
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-qdisc"
></A
>4.1. <TT
CLASS="constant"
>qdisc</TT
></H2
><P
>&#13; Simply put, a qdisc is a scheduler
(<A
HREF="#e-scheduling"
>Section 3.2</A
>). Every output interface needs a
scheduler of some kind, and the default scheduler is a FIFO.
Other qdiscs available under Linux will rearrange the packets entering
the scheduler's queue in accordance with that scheduler's rules.
</P
><P
>&#13; The qdisc is the major building block on which all of Linux traffic
control is built, and is also called a queuing discipline.
</P
><P
>&#13; The <A
HREF="#classful-qdiscs"
>classful qdiscs</A
> can contain <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
>es, and provide a handle
to which to attach <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>s. There is no prohibition on using a
classful qdisc without child classes, although this will usually consume
cycles and other system resources for no benefit.
</P
><P
>&#13; The <A
HREF="#classless-qdiscs"
>classless qdiscs</A
> can contain no classes, nor is it possible to
attach a filter to a classless qdisc. Because a classless qdisc contains
no children of any kind, there is no utility to <A
HREF="#e-classifying"
>classifying</A
>.
This means that no filter can be attached to a classless qdisc.
</P
><P
>&#13; A source of terminology confusion is the usage of the terms
<TT
CLASS="constant"
>root</TT
> qdisc and <TT
CLASS="constant"
>ingress</TT
> qdisc. These are not
really queuing disciplines, but rather locations onto which traffic
control structures can be attached for egress (outbound traffic) and
ingress (inbound traffic).
</P
><P
>&#13; Each interface contains both. The primary and more common is the
egress qdisc, known as the <TT
CLASS="constant"
>root</TT
> qdisc. It can contain any
of the queuing disciplines (<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s) with potential
<A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
>es and class structures. The overwhelming majority of
documentation applies to the <TT
CLASS="constant"
>root</TT
> qdisc and its children. Traffic
transmitted on an interface traverses the egress or <TT
CLASS="constant"
>root</TT
> qdisc.
</P
><P
>&#13; For traffic accepted on an interface, the <TT
CLASS="constant"
>ingress</TT
> qdisc is traversed.
With its limited utility, it allows no child <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
> to be
created, and only exists as an object onto which a <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> can be
attached. For practical purposes, the <TT
CLASS="constant"
>ingress</TT
> qdisc is merely a
convenient object onto which to attach a <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
> to limit the
amount of traffic accepted on a network interface.
</P
><P
>&#13; In short, you can do much more with an egress qdisc because it contains
a real qdisc and the full power of the traffic control system. An
<TT
CLASS="constant"
>ingress</TT
> qdisc can only support a policer. The remainder of the
documentation will concern itself with traffic control structures
attached to the <TT
CLASS="constant"
>root</TT
> qdisc unless otherwise specified.
</P
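><P
>&#13; The two attachment points might be exercised as in the sketch below.
It is a dry run: the commands are only printed unless the variable TC
is set to tc by a root user. The variable names, the interface name
eth0 and the policing rate are assumptions made for this illustration.
</P
><PRE
CLASS="programlisting"
>
```shell
# Dry run by default: each command is printed instead of executed.
# Run with TC=tc (as root) to apply the commands for real.
TC=${TC:-echo tc}
DEV=${DEV:-eth0}        # assumed interface name

# Egress: attach a classful HTB qdisc at the root, with handle 1:0.
$TC qdisc add dev "$DEV" root handle 1: htb

# Ingress: attach the ingress qdisc, which uses the handle ffff:0.
$TC qdisc add dev "$DEV" handle ffff: ingress

# The ingress qdisc only serves as a hook for a filter with a policer.
# Here all IP traffic arriving faster than 1mbit (example rate) is dropped.
$TC filter add dev "$DEV" parent ffff: protocol ip prio 50 u32 \
    match ip src 0.0.0.0/0 \
    police rate 1mbit burst 10k drop flowid :1

# List the qdiscs attached to the device.
$TC qdisc show dev "$DEV"
```
</PRE
><P
>&#13; Printing the commands first makes it easy to review them before
changing the configuration of a live interface.
</P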
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-class"
></A
>4.2. <TT
CLASS="constant"
>class</TT
></H2
><P
>&#13; Classes only exist inside a classful <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> (<I
CLASS="foreignphrase"
>e.g.</I
>, <A
HREF="#qc-htb"
>HTB</A
>
and <A
HREF="#qc-cbq"
>CBQ</A
>). Classes are immensely flexible and can always
contain either multiple child classes or a single child qdisc
<A
NAME="AEN422"
HREF="#FTN.AEN422"
><SPAN
CLASS="footnote"
>[5]</SPAN
></A
>.
There is no prohibition against a class containing a classful qdisc
itself, which facilitates tremendously complex traffic control
scenarios.
</P
><P
>&#13; Any class can also have an arbitrary number of <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>s attached
to it, which allows the selection of a child class or the use of a
filter to reclassify or drop traffic entering a particular class.
</P
><P
>&#13; A leaf class is a terminal class in a qdisc. It contains a qdisc
(default <A
HREF="#qs-fifo"
>FIFO</A
>) and will never contain a child class. Any
class which contains a child class is an inner class (or root class) and
not a leaf class.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-filter"
></A
>4.3. <TT
CLASS="constant"
>filter</TT
></H2
><P
>&#13; The filter is the most complex component in the Linux
traffic control system. The filter provides a convenient mechanism for
gluing together several of the key elements of traffic control. The
simplest and most obvious role of the filter is to classify
(see <A
HREF="#e-classifying"
>Section 3.3</A
>) packets. Linux filters allow the
user to classify packets into an output queue with either several
different filters or a single filter.
</P
><P
></P
><UL
><LI
><P
>&#13; A filter must contain a <A
HREF="#c-classifier"
><TT
CLASS="constant"
>classifier</TT
></A
> phrase.
</P
></LI
><LI
><P
>&#13; A filter may contain a <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
> phrase.
</P
></LI
></UL
><P
>&#13; Filters can be attached either to classful <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s or to
<A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
>es; however, the enqueued packet always enters the root
qdisc first. After the filter attached to the root qdisc has been
traversed, the packet may be directed to any subclasses (which can have
their own filters) where the packet may undergo further classification.
</P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-classifier"
></A
>4.4. classifier</H2
><P
>&#13; Filter objects, which can be manipulated using <A
HREF="#s-iproute2-tc"
><B
CLASS="command"
>tc</B
></A
>, can use several
different classifying mechanisms, the most common of which is the
<TT
CLASS="constant"
>u32</TT
> classifier. The <TT
CLASS="constant"
>u32</TT
> classifier allows the user to
select packets based on attributes of the packet.
</P
><P
>&#13; The classifiers are tools which can be used as part of a <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>
to identify characteristics of a packet or a packet's metadata. The
Linux classifier object is a direct analogue to the basic operation and
elemental mechanism of traffic control <A
HREF="#e-classifying"
>classifying</A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-police"
></A
>4.5. policer</H2
><P
>&#13; This elemental mechanism is only used in Linux traffic control as part
of a <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>. A policer calls one action above and another
action below the specified rate. Clever use of policers can simulate
a three-color meter. See also
<A
HREF="#diagram"
>Section 10</A
>.
</P
><P
>&#13; Although both <A
HREF="#e-policing"
>policing</A
> and <A
HREF="#e-shaping"
>shaping</A
> are basic
elements of traffic control for limiting bandwidth usage, a policer will
never delay traffic. It can only perform an action based on specified
criteria. See also
<A
HREF="#ex-s-iproute2-tc-filter"
>Example 5</A
>.
</P
><P
>&#13; </P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-drop"
></A
>4.6. <TT
CLASS="constant"
>drop</TT
></H2
><P
>&#13; This basic traffic control mechanism is only used in Linux traffic
control as part of a <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
>. Any policer attached to
any <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> could have a <A
HREF="#c-drop"
><TT
CLASS="constant"
>drop</TT
></A
> action.
</P
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
>The only place in the Linux traffic control system where a packet can be
explicitly dropped is a policer. A policer can limit packets enqueued
at a specific rate, or it can be configured to drop all traffic matching
a particular pattern
<A
NAME="AEN483"
HREF="#FTN.AEN483"
><SPAN
CLASS="footnote"
>[6]</SPAN
></A
>.
</TD
></TR
></TABLE
></DIV
><P
>&#13; There are, however, places within the traffic control system where a
packet may be dropped as a side effect. For example, a packet will be
dropped if the scheduler employed uses this method to control flows, as
<A
HREF="#qs-gred"
>GRED</A
> does.
</P
><P
>&#13; Also, a shaper or scheduler which runs out of its allocated buffer space
may have to drop a packet during a particularly bursty or overloaded
period.
</P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="c-handle"
></A
>4.7. <TT
CLASS="constant"
>handle</TT
></H2
><P
>&#13; Every <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
> and classful <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> (see also
<A
HREF="#classful-qdiscs"
>Section 7</A
>) requires a unique identifier within
the traffic control structure. This unique identifier is known as a
handle and has two constituent members, a major number and a minor
number. These numbers can be assigned arbitrarily by the user in
accordance with the following rules
<A
NAME="AEN505"
HREF="#FTN.AEN505"
><SPAN
CLASS="footnote"
>[7]</SPAN
></A
>.
</P
><P
>&#13; </P
><P
></P
><DIV
CLASS="variablelist"
><P
><B
>The numbering of handles for classes and qdiscs</B
></P
><DL
><DT
><TT
CLASS="parameter"
><I
>major</I
></TT
></DT
><DD
><P
>&#13; This parameter is completely free of meaning to the kernel. The
user may use an arbitrary numbering scheme; however, all objects in
the traffic control structure with the same parent must share a
<TT
CLASS="parameter"
><I
>major</I
></TT
> handle number. Conventional
numbering schemes start at 1 for objects attached directly to the
<TT
CLASS="constant"
>root</TT
> qdisc.
</P
></DD
><DT
><TT
CLASS="parameter"
><I
>minor</I
></TT
></DT
><DD
><P
>&#13; This parameter unambiguously identifies the object as a qdisc if
<TT
CLASS="parameter"
><I
>minor</I
></TT
> is 0. Any other value identifies the
object as a class. All classes sharing a parent must have unique
<TT
CLASS="parameter"
><I
>minor</I
></TT
> numbers.
</P
></DD
></DL
></DIV
><P
>&#13; The special handle ffff:0 is reserved for the <TT
CLASS="constant"
>ingress</TT
> qdisc.
</P
><P
>&#13; The handle is used as the target in <TT
CLASS="parameter"
><I
>classid</I
></TT
> and
<TT
CLASS="parameter"
><I
>flowid</I
></TT
> phrases of <B
CLASS="command"
>tc</B
> <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> statements.
These handles are external identifiers for the objects, usable by
userland applications. The kernel maintains internal identifiers for
each object.
</P
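><P
>&#13; The numbering rules can be illustrated with a small shell function.
The function name handle_kind is invented for this sketch, and it
assumes that its argument contains a colon. Note that tc reads both
parts of a handle as hexadecimal numbers, which is why ffff:0 is a
valid handle.
</P
><PRE
CLASS="programlisting"
>
```shell
# handle_kind HANDLE   (hypothetical helper, not part of tc)
# Reports whether HANDLE names a qdisc (minor is 0 or omitted) or a class.
# Assumes HANDLE has the form major:minor or the shorthand major:
handle_kind() {
    major=${1%%:*}          # text before the colon
    minor=${1#*:}           # text after the colon, empty for "1:"
    minor=${minor:-0}       # an omitted minor number means zero
    # Both numbers are hexadecimal, so convert them for display.
    if [ $(( 0x$minor )) -eq 0 ]; then
        echo "qdisc major=$(( 0x$major ))"
    else
        echo "class major=$(( 0x$major )) minor=$(( 0x$minor ))"
    fi
}

handle_kind 1:0       # qdisc major=1
handle_kind 1:        # qdisc major=1
handle_kind 1:6       # class major=1 minor=6
handle_kind ffff:0    # qdisc major=65535
handle_kind 1:a       # class major=1 minor=10
```
</PRE
><P
>&#13; The last call shows a common surprise: a class written as 1:10 is
class number sixteen, not ten.
</P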
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="software"
></A
>5. Software and Tools</H1
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="s-kernel"
></A
>5.1. Kernel requirements</H2
><P
>&#13; Many distributions provide kernels with modular or monolithic support
for traffic control (Quality of Service). Custom kernels may not
already provide support (modular or not) for the required features. If
not, this is a very brief listing of the required kernel options.
</P
><P
>&#13; The user who has little or no experience compiling a kernel
should consult the <A
HREF="http://tldp.org/HOWTO/Kernel-HOWTO/"
TARGET="_top"
>Kernel
HOWTO</A
>. Experienced kernel compilers should
be able to determine which of the below options apply to the desired
configuration, after reading a bit more about traffic control and
planning.
</P
><DIV
CLASS="example"
><A
NAME="ex-s-kernel-options"
></A
><P
><B
>Example 1. Kernel compilation options
<A
NAME="AEN542"
HREF="#FTN.AEN542"
><SPAN
CLASS="footnote"
>[8]</SPAN
></A
>
</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_CSZ=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
><P
>&#13; A kernel compiled with the above set of options will provide modular
support for almost everything discussed in this documentation. The user
may need to <B
CLASS="command"
>modprobe
<TT
CLASS="replaceable"
><I
>module</I
></TT
></B
> before using a given
feature. Again, the confused user is referred to the
<A
HREF="http://tldp.org/HOWTO/Kernel-HOWTO/"
TARGET="_top"
>Kernel
HOWTO</A
>, as this document cannot adequately address questions
about the use of the Linux kernel.
</P
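><P
>&#13; Before rebuilding a kernel, it may be worth checking which of these
options an existing configuration file already enables. The function
below is a sketch: its name is invented, it tests only a handful of the
options listed above, and the location of the configuration file varies
between distributions.
</P
><PRE
CLASS="programlisting"
>
```shell
# check_tc_config FILE   (hypothetical helper)
# FILE is a kernel configuration file in the usual OPTION=value format.
# Reports, for a few traffic control options, whether each is built in
# (y) or modular (m). Anything else is reported as missing.
check_tc_config() {
    for opt in CONFIG_NET_SCHED CONFIG_NET_SCH_HTB CONFIG_NET_SCH_SFQ \
               CONFIG_NET_SCH_TBF CONFIG_NET_SCH_INGRESS CONFIG_NET_CLS_U32
    do
        if grep -q "^$opt=[ym]" "$1"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}
```
</PRE
><P
>&#13; On a distribution which installs the configuration under /boot, the
function might be called as check_tc_config /boot/config-$(uname -r).
Option names change between kernel versions, so a missing option does
not always mean that the feature is unavailable.
</P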
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="s-iproute2"
></A
>5.2. <B
CLASS="command"
>iproute2</B
> tools (<B
CLASS="command"
>tc</B
>)</H2
><P
>&#13; <B
CLASS="command"
>iproute2</B
> is a suite of command line utilities which
manipulate kernel structures for IP networking
configuration on a machine. For technical documentation on these tools,
see the <A
HREF="http://linux-ip.net/gl/ip-cref/"
TARGET="_top"
>iproute2
documentation</A
> and for a more expository discussion, the
documentation at <A
HREF="http://linux-ip.net/"
TARGET="_top"
>linux-ip.net</A
>. Of the tools in the <B
CLASS="command"
>iproute2</B
>
package, the binary <B
CLASS="command"
>tc</B
> is the only one used for traffic control. This
HOWTO will ignore the other tools in the suite.
</P
><A
NAME="s-iproute2-tc"
></A
><P
>&#13; Because it interacts with the kernel to direct the creation, deletion
and modification of traffic control structures, the <B
CLASS="command"
>tc</B
> binary needs to
be compiled with support for all of the <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s you wish to use.
In particular, the HTB qdisc is not supported yet in the upstream
<B
CLASS="command"
>iproute2</B
> package. See
<A
HREF="#qc-htb"
>Section 7.1</A
> for more information.
</P
><P
>&#13; The <B
CLASS="command"
>tc</B
> tool performs all of the configuration of the kernel structures
required to support traffic control. As a result of its many uses, the
command syntax can be described (at best) as arcane. The utility takes
as its first non-option argument one of three Linux traffic control
components, <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>, <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
> or <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>.
</P
><DIV
CLASS="example"
><A
NAME="ex-s-iproute2-tc"
></A
><P
><B
>Example 2. <B
CLASS="command"
>tc</B
> command usage</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tc</B
></TT
>
<TT
CLASS="computeroutput"
>Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
where OBJECT := { qdisc | class | filter }
OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] }</TT
>
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
><P
>&#13; Each object accepts further and different options, and will be
incompletely described and documented below. The hints in the examples
below are designed to introduce the vagaries of <B
CLASS="command"
>tc</B
> command line
syntax. For more examples, consult the <A
HREF="http://lartc.org/howto/"
TARGET="_top"
>LARTC HOWTO</A
>. For even
better understanding, consult the kernel and <B
CLASS="command"
>iproute2</B
> code.
</P
><DIV
CLASS="example"
><A
NAME="ex-s-iproute2-tc-qdisc"
></A
><P
><B
>Example 3. <B
CLASS="command"
>tc</B
> <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
></B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tc qdisc add \</B
></TT
> <A
NAME="ex-s-itcq-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> dev eth0 \</B
></TT
> <A
NAME="ex-s-itcq-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> root \</B
></TT
> <A
NAME="ex-s-itcq-root"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> handle 1:0 \</B
></TT
> <A
NAME="ex-s-itcq-handle"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> htb</B
></TT
> <A
NAME="ex-s-itcq-qdisc"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
>
</PRE
></FONT
></TD
></TR
></TABLE
><DIV
CLASS="calloutlist"
><DL
COMPACT="COMPACT"
><DT
><A
HREF="#ex-s-itcq-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
></DT
><DD
>&#13; Add a queuing discipline. The verb could also be
<TT
CLASS="constant"
>del</TT
>.
</DD
><DT
><A
HREF="#ex-s-itcq-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
></DT
><DD
>&#13; Specify the device onto which we are attaching the new queuing
discipline.
</DD
><DT
><A
HREF="#ex-s-itcq-root"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
></DT
><DD
>&#13; This means <SPAN
CLASS="QUOTE"
>"egress"</SPAN
> to <B
CLASS="command"
>tc</B
>. The word
<TT
CLASS="constant"
>root</TT
> must be used, however. Another
qdisc with limited functionality, the <TT
CLASS="constant"
>ingress</TT
> qdisc can be
attached to the same device.
</DD
><DT
><A
HREF="#ex-s-itcq-handle"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
></DT
><DD
>&#13; The <A
HREF="#c-handle"
><TT
CLASS="constant"
>handle</TT
></A
> is a user-specified number of the form
<TT
CLASS="replaceable"
><I
>major</I
></TT
>:<TT
CLASS="replaceable"
><I
>minor</I
></TT
>.
The minor number for any queuing discipline handle must always be
zero (0). An acceptable shorthand for a <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> handle is
the syntax "1:", where the minor number is assumed to be zero (0)
if not specified.
</DD
><DT
><A
HREF="#ex-s-itcq-qdisc"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
></DT
><DD
>&#13; This is the queuing discipline to attach, HTB in this
example. Queuing discipline specific parameters will follow this.
In the example here, we add no qdisc-specific parameters.
</DD
></DL
></DIV
></DIV
><P
>&#13; Above was the simplest use of the <B
CLASS="command"
>tc</B
> utility for adding a queuing
discipline to a device. Here's an example of the use of <B
CLASS="command"
>tc</B
> to add a
class to an existing parent class.
</P
><DIV
CLASS="example"
><A
NAME="ex-s-iproute2-tc-class"
></A
><P
><B
>Example 4. <B
CLASS="command"
>tc</B
> <A
HREF="#c-class"
><TT
CLASS="constant"
>class</TT
></A
></B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tc class add \</B
></TT
> <A
NAME="ex-s-itcc-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> dev eth0 \</B
></TT
> <A
NAME="ex-s-itcc-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> parent 1:1 \</B
></TT
> <A
NAME="ex-s-itcc-parent"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> classid 1:6 \</B
></TT
> <A
NAME="ex-s-itcc-classid"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> htb \</B
></TT
> <A
NAME="ex-s-itcc-classtype"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> rate 256kbit \</B
></TT
> <A
NAME="ex-s-itcc-htb-rate"
><IMG
SRC="../images/callouts/6.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(6)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> ceil 512kbit</B
></TT
> <A
NAME="ex-s-itcc-htb-ceil"
><IMG
SRC="../images/callouts/7.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(7)"></A
>
</PRE
></FONT
></TD
></TR
></TABLE
><DIV
CLASS="calloutlist"
><DL
COMPACT="COMPACT"
><DT
><A
HREF="#ex-s-itcc-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
></DT
><DD
>&#13; Add a class. The verb could also be <TT
CLASS="constant"
>del</TT
>.
</DD
><DT
><A
HREF="#ex-s-itcc-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
></DT
><DD
>&#13; Specify the device onto which we are attaching the new class.
</DD
><DT
><A
HREF="#ex-s-itcc-parent"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
></DT
><DD
>&#13; Specify the parent <A
HREF="#c-handle"
><TT
CLASS="constant"
>handle</TT
></A
> to which we are attaching the new class.
</DD
><DT
><A
HREF="#ex-s-itcc-classid"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
></DT
><DD
>&#13; This is a unique <A
HREF="#c-handle"
><TT
CLASS="constant"
>handle</TT
></A
>
(<TT
CLASS="replaceable"
><I
>major</I
></TT
>:<TT
CLASS="replaceable"
><I
>minor</I
></TT
>)
identifying this class. The minor number must be
non-zero.
</DD
><DT
><A
HREF="#ex-s-itcc-classtype"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
></DT
><DD
>&#13; Both of the <A
HREF="#classful-qdiscs"
>classful qdiscs</A
> require that any child classes be
classes of the same type as the parent. Thus an HTB qdisc
will contain HTB classes.
</DD
><DT
><A
HREF="#ex-s-itcc-htb-rate"
><IMG
SRC="../images/callouts/6.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(6)"></A
><A
HREF="#ex-s-itcc-htb-ceil"
><IMG
SRC="../images/callouts/7.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(7)"></A
></DT
><DD
>&#13; These are class-specific parameters. Consult
<A
HREF="#qc-htb"
>Section 7.1</A
> for more detail on these parameters.
</DD
></DL
></DIV
></DIV
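><P
>&#13; The class in the example above is attached to a parent class 1:1,
which must already exist. A possible sequence which builds the whole
structure is sketched below as a dry run. The variable names, the
interface name eth0 and the 512kbit rate of the parent class are
assumptions made for this illustration.
</P
><PRE
CLASS="programlisting"
>
```shell
# Dry run by default: each command is printed instead of executed.
# Run with TC=tc (as root) to apply the commands for real.
TC=${TC:-echo tc}
DEV=${DEV:-eth0}        # assumed interface name

# 1. The root qdisc, handle 1:0.
$TC qdisc add dev "$DEV" root handle 1: htb

# 2. The parent class 1:1, attached directly to the qdisc (assumed rate).
$TC class add dev "$DEV" parent 1: classid 1:1 htb rate 512kbit

# 3. The child class 1:6 from the example, attached to class 1:1.
$TC class add dev "$DEV" parent 1:1 classid 1:6 htb rate 256kbit ceil 512kbit

# 4. Display the resulting classes.
$TC class show dev "$DEV"
```
</PRE
><P
>&#13; The order matters: each object names its parent, so the parent has to
be created first.
</P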
><P
>&#13; </P
><DIV
CLASS="example"
><A
NAME="ex-s-iproute2-tc-filter"
></A
><P
><B
>Example 5. <B
CLASS="command"
>tc</B
> <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
></B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tc filter add \</B
></TT
> <A
NAME="ex-s-itcf-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> dev eth0 \</B
></TT
> <A
NAME="ex-s-itcf-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> parent 1:0 \</B
></TT
> <A
NAME="ex-s-itcf-parent"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> protocol ip \</B
></TT
> <A
NAME="ex-s-itcf-protocol"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> prio 5 \</B
></TT
> <A
NAME="ex-s-itcf-prio"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> u32 \</B
></TT
> <A
NAME="ex-s-itcf-classifier"
><IMG
SRC="../images/callouts/6.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(6)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> match ip dport 22 0xffff \</B
></TT
> <A
NAME="ex-s-itcf-match-port"
><IMG
SRC="../images/callouts/7.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(7)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> match ip tos 0x10 0xff \</B
></TT
> <A
NAME="ex-s-itcf-match-tos"
><IMG
SRC="../images/callouts/8.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(8)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> flowid 1:6 \</B
></TT
> <A
NAME="ex-s-itcf-flowid"
><IMG
SRC="../images/callouts/9.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(9)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> police \</B
></TT
> <A
NAME="ex-s-itcf-police"
><IMG
SRC="../images/callouts/10.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(10)"></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> rate 32000bps \</B
></TT
> <A
NAME="ex-s-itcf-prate"
><B
>(11)</B
></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> burst 10240 \</B
></TT
> <A
NAME="ex-s-itcf-burst"
><B
>(12)</B
></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> mpu 0 \</B
></TT
> <A
NAME="ex-s-itcf-mpu"
><B
>(13)</B
></A
>
<TT
CLASS="prompt"
>&#62; </TT
><TT
CLASS="userinput"
><B
> action drop/continue</B
></TT
> <A
NAME="ex-s-itcf-action"
><B
>(14)</B
></A
>
</PRE
></FONT
></TD
></TR
></TABLE
><DIV
CLASS="calloutlist"
><DL
COMPACT="COMPACT"
><DT
><A
HREF="#ex-s-itcf-tc"
><IMG
SRC="../images/callouts/1.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(1)"></A
></DT
><DD
>&#13; Add a filter. The verb could also be <TT
CLASS="constant"
>del</TT
>.
</DD
><DT
><A
HREF="#ex-s-itcf-dev"
><IMG
SRC="../images/callouts/2.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(2)"></A
></DT
><DD
>&#13; Specify the device onto which we are attaching the new filter.
</DD
><DT
><A
HREF="#ex-s-itcf-parent"
><IMG
SRC="../images/callouts/3.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(3)"></A
></DT
><DD
>&#13; Specify the parent handle to which we are attaching the new
filter.
</DD
><DT
><A
HREF="#ex-s-itcf-protocol"
><IMG
SRC="../images/callouts/4.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(4)"></A
></DT
><DD
>&#13; This parameter is required. Its use should be obvious: it names
the protocol (here, IP) of the packets the filter examines.
</DD
><DT
><A
HREF="#ex-s-itcf-prio"
><IMG
SRC="../images/callouts/5.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(5)"></A
></DT
><DD
>&#13; The <TT
CLASS="parameter"
><I
>prio</I
></TT
> parameter allows a given filter to
be preferred above another. The <TT
CLASS="parameter"
><I
>pref</I
></TT
> is a
synonym.
</DD
><DT
><A
HREF="#ex-s-itcf-classifier"
><IMG
SRC="../images/callouts/6.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(6)"></A
></DT
><DD
>&#13; This is a <A
HREF="#c-classifier"
><TT
CLASS="constant"
>classifier</TT
></A
>, and is a required phrase in every
<B
CLASS="command"
>tc</B
> <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> command.
</DD
><DT
><A
HREF="#ex-s-itcf-match-port"
><IMG
SRC="../images/callouts/7.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(7)"></A
><A
HREF="#ex-s-itcf-match-tos"
><IMG
SRC="../images/callouts/8.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(8)"></A
></DT
><DD
>&#13; These are parameters to the classifier. In this case, packets
with a type of service flag (indicating interactive usage) and
matching port 22 will be selected by this statement.
</DD
><DT
><A
HREF="#ex-s-itcf-flowid"
><IMG
SRC="../images/callouts/9.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(9)"></A
></DT
><DD
>&#13; The <TT
CLASS="parameter"
><I
>flowid</I
></TT
> specifies the <A
HREF="#c-handle"
><TT
CLASS="constant"
>handle</TT
></A
> of
the target class (or qdisc) to which a matching filter should send
its selected packets.
</DD
><DT
><A
HREF="#ex-s-itcf-police"
><IMG
SRC="../images/callouts/10.gif"
HSPACE="0"
VSPACE="0"
BORDER="0"
ALT="(10)"></A
></DT
><DD
>&#13; This is the <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
>, and is an optional phrase in every
<B
CLASS="command"
>tc</B
> <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> command.
</DD
><DT
><A
HREF="#ex-s-itcf-prate"
><B
>(11)</B
></A
></DT
><DD
>&#13; The policer will perform one action above this rate, and another
action below (see
<A
HREF="#ex-s-itcf-action-text"
>action parameter</A
>).
</DD
><DT
><A
HREF="#ex-s-itcf-burst"
><B
>(12)</B
></A
></DT
><DD
>&#13; The <TT
CLASS="parameter"
><I
>burst</I
></TT
> is an exact analog to <TT
CLASS="parameter"
><I
>burst</I
></TT
> in
<A
HREF="#qc-htb"
>HTB</A
> (<TT
CLASS="parameter"
><I
>burst</I
></TT
> is a <A
HREF="#o-buckets"
>buckets</A
> concept).
</DD
><DT
><A
HREF="#ex-s-itcf-mpu"
><B
>(13)</B
></A
></DT
><DD
>&#13; The minimum policed unit. To count all traffic, use an
<TT
CLASS="parameter"
><I
>mpu</I
></TT
> of zero (0).
</DD
><DT
><A
HREF="#ex-s-itcf-action"
><B
>(14)</B
></A
></DT
><DD
>&#13; The <TT
CLASS="parameter"
><I
>action</I
></TT
> indicates what should be done when
traffic is measured against the <TT
CLASS="parameter"
><I
>rate</I
></TT
> and other attributes of the policer. The
first word specifies the action to take if the policer has been
exceeded. The second word specifies the action to take otherwise.
</DD
></DL
></DIV
></DIV
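><P
>&#13; For comparison, here is a minimal sketch of a complete filter
command that can be entered as written. It assumes an HTB qdisc with
handle 1:0 and a class 1:6 on eth0, both hypothetical and created here
only so that the filter has something to attach to. It omits the optional
policer and uses the u32 selector dport, which matches the destination
port only.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# Hypothetical setup: an HTB root qdisc (1:0) with one class (1:6) on eth0.
tc qdisc add dev eth0 root handle 1:0 htb
tc class add dev eth0 parent 1:0 classid 1:6 htb rate 256kbit

# Send IP packets with destination port 22 to class 1:6.
tc filter add dev eth0 parent 1:0 protocol ip prio 5 \
    u32 match ip dport 22 0xffff flowid 1:6

# List the filters attached to 1:0 to confirm the result.
tc filter show dev eth0 parent 1:0
</PRE
></FONT
></TD
></TR
></TABLE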
><P
>&#13; As evidenced above, the <B
CLASS="command"
>tc</B
> command line utility has an arcane and
complex syntax, even for simple operations such as those in these examples.
It should come as no surprise to the reader that there exists an easier
way to configure Linux traffic control. See the next section,
<A
HREF="#s-tcng"
>Section 5.3</A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="s-tcng"
></A
>5.3. <B
CLASS="command"
>tcng</B
>, Traffic Control Next Generation</H2
><P
>&#13; FIXME; sing the praises of tcng. See also <A
HREF="http://tldp.org/HOWTO/Traffic-Control-tcng-HTB-HOWTO/"
TARGET="_top"
>&#13; Traffic Control using tcng and HTB HOWTO</A
> and
<A
HREF="http://linux-ip.net/gl/tcng/"
TARGET="_top"
>tcng
documentation</A
>.
</P
><P
>&#13; Traffic control next generation (hereafter, <B
CLASS="command"
>tcng</B
>) provides all of the
power of traffic control under Linux with twenty percent of the
headache.
</P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="s-imq"
></A
>5.4. IMQ, Intermediate Queuing device</H2
><P
>&#13; </P
><P
>&#13; FIXME; must discuss IMQ. See also Patrick McHardy's website on
<A
HREF="http://trash.net/~kaber/imq/"
TARGET="_top"
>IMQ</A
>.
</P
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="classless-qdiscs"
></A
>6. Classless Queuing Disciplines (<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s)</H1
><P
>&#13; Each of these queuing disciplines can be used as the primary qdisc on an
interface, or can be used inside a leaf class of a <A
HREF="#classful-qdiscs"
>classful qdisc</A
>.
These are the fundamental schedulers used under Linux. Note that the
default scheduler is the <A
HREF="#qs-pfifo_fast"
><TT
CLASS="constant"
>pfifo_fast</TT
></A
>.
</P
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-fifo"
></A
>6.1. FIFO, First-In First-Out (<TT
CLASS="constant"
>pfifo</TT
> and <TT
CLASS="constant"
>bfifo</TT
>)</H2
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
>This is not the default qdisc on Linux interfaces. Be certain to see
<A
HREF="#qs-pfifo_fast"
>Section 6.2</A
> for the full details on the default
(<TT
CLASS="constant"
>pfifo_fast</TT
>) qdisc.
</TD
></TR
></TABLE
></DIV
><P
>&#13; The FIFO algorithm forms the basis for the default qdisc on all Linux
network interfaces (<A
HREF="#qs-pfifo_fast"
><TT
CLASS="constant"
>pfifo_fast</TT
></A
>). It performs no shaping or
rearranging of packets. It simply transmits packets as soon as it can
after receiving and queuing them. This is also the qdisc used inside
all newly created classes until another qdisc or a class replaces the
FIFO.
</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/fifo-qdisc.png"></P
></DIV
><P
>&#13; A real FIFO qdisc must, however, have a size limit (a buffer size) to
prevent it from overflowing in case it is unable to dequeue packets as
quickly as it receives them. Linux implements two basic FIFO
<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s, one based on bytes, and one on packets. Regardless of
the type of FIFO used, the size of the queue is defined by the parameter
<TT
CLASS="parameter"
><I
>limit</I
></TT
>. For a <TT
CLASS="constant"
>pfifo</TT
> the unit is understood
to be packets and for a <TT
CLASS="constant"
>bfifo</TT
> the unit is understood to be bytes.
</P
><DIV
CLASS="example"
><A
NAME="ex-qs-fifo-limit"
></A
><P
><B
>Example 6. Specifying a <TT
CLASS="parameter"
><I
>limit</I
></TT
> for a packet
or byte FIFO</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>cat bfifo.tcc</B
></TT
>
<TT
CLASS="computeroutput"
>/*
* make a FIFO on eth0 with 10kbyte queue size
*
*/
dev eth0 {
egress {
fifo (limit 10kB );
}
}</TT
>
<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tcc &#60; bfifo.tcc</B
></TT
>
<TT
CLASS="computeroutput"
># ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 bfifo limit 10240</TT
>
<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>cat pfifo.tcc</B
></TT
>
<TT
CLASS="computeroutput"
>/*
* make a FIFO on eth0 with 30 packet queue size
*
*/
dev eth0 {
egress {
fifo (limit 30p );
}
}</TT
>
<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tcc &#60; pfifo.tcc</B
></TT
>
<TT
CLASS="computeroutput"
># ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 pfifo limit 30</TT
>
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
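><P
>&#13; The same FIFOs can also be created with tc directly. The following
is a minimal sketch, assuming eth0 is the interface and that the FIFO is
attached at the root rather than below the dsmark qdisc which tcc
generates.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# Packet-based FIFO holding at most 30 packets.
tc qdisc add dev eth0 root pfifo limit 30

# Replace it with a byte-based FIFO holding at most 10240 bytes.
tc qdisc replace dev eth0 root bfifo limit 10240

# Show the active qdisc together with its counters.
tc -s qdisc show dev eth0

# Remove the FIFO again, restoring the default qdisc.
tc qdisc del dev eth0 root
</PRE
></FONT
></TD
></TR
></TABLE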
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-pfifo_fast"
></A
>6.2. <TT
CLASS="constant"
>pfifo_fast</TT
>, the default Linux qdisc</H2
><P
>&#13; The <TT
CLASS="constant"
>pfifo_fast</TT
> qdisc is the default qdisc for all interfaces under
Linux. Based on a conventional <A
HREF="#qs-fifo"
>FIFO</A
> qdisc, this qdisc also
provides some prioritization. It provides three different bands
(individual FIFOs) for separating traffic. The highest priority traffic
(interactive flows) is placed into band 0 and is always serviced
first. Similarly, band 1 is always emptied of pending packets before
band 2 is dequeued.
</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/pfifo_fast-qdisc.png"></P
></DIV
><P
>&#13; There is nothing configurable to the end user about the <TT
CLASS="constant"
>pfifo_fast</TT
>
qdisc. For exact details on the <TT
CLASS="constant"
>priomap</TT
> and use of
the ToS bits, see the <A
HREF="http://lartc.org/howto/lartc.qdisc.classless.html"
TARGET="_top"
>pfifo-fast
section of the LARTC HOWTO</A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-sfq"
></A
>6.3. SFQ, Stochastic Fair Queuing</H2
><P
>&#13; The SFQ qdisc attempts to fairly distribute opportunity to
transmit data to the network among an arbitrary number of
<A
HREF="#o-flows"
>flows</A
>. It accomplishes this by using a hash function to
separate the traffic into separate (internally maintained) FIFOs
which are dequeued in a round-robin fashion. Because there is the
possibility for unfairness to manifest in the choice of hash function,
this function is altered periodically. Perturbation (the parameter
<TT
CLASS="parameter"
><I
>perturb</I
></TT
>) sets this periodicity.
</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/sfq-qdisc.png"></P
></DIV
><DIV
CLASS="example"
><A
NAME="ex-qs-sfq"
></A
><P
><B
>Example 7. Creating an SFQ</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>cat sfq.tcc</B
></TT
>
<TT
CLASS="computeroutput"
>/*
* make an SFQ on eth0 with a 10 second perturbation
*
*/
dev eth0 {
egress {
sfq( perturb 10s );
}
}</TT
>
<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tcc &#60; sfq.tcc</B
></TT
>
<TT
CLASS="computeroutput"
># ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 sfq perturb 10</TT
>
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
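><P
>&#13; SFQ can also be attached with tc directly, either as the root qdisc
or inside a leaf class. The two commands below are alternatives, not a
sequence. The class 1:10 in the second command is hypothetical and would
have to exist already under a classful qdisc.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# Alternative 1: SFQ as the root qdisc on eth0, rehashing every 10 seconds.
tc qdisc add dev eth0 root sfq perturb 10

# Alternative 2: SFQ inside a leaf class of a classful qdisc.
# Class 1:10 is hypothetical and must already exist.
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
</PRE
></FONT
></TD
></TR
></TABLE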
><P
>&#13; Unfortunately, some clever software (<I
CLASS="foreignphrase"
>e.g.</I
> Kazaa and eMule among others)
obliterates the benefit of this attempt at fair queuing by opening as
many TCP sessions (<A
HREF="#o-flows"
>flows</A
>) as can be sustained. In many
networks, with well-behaved users, SFQ can adequately distribute
the network resources to the contending flows, but other measures may be
called for when obnoxious applications have invaded the network.
</P
><P
>&#13; See also
<A
HREF="#qs-esfq"
>Section 6.4</A
> for an SFQ qdisc with more exposed
parameters for the user to manipulate.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-esfq"
></A
>6.4. ESFQ, Extended Stochastic Fair Queuing</H2
><P
>&#13; Conceptually, this qdisc is no different from SFQ, although it
allows the user to control more parameters than its simpler cousin.
This qdisc was conceived to overcome the shortcoming of SFQ
identified above. By allowing the user to control which hashing
algorithm is used for distributing access to network bandwidth, it
is possible for the user to reach a fairer real distribution of
bandwidth.
</P
><DIV
CLASS="example"
><A
NAME="ex-qs-esfq-usage"
></A
><P
><B
>Example 8. ESFQ usage</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;Usage: ... esfq [ perturb SECS ] [ quantum BYTES ] [ depth FLOWS ]
[ divisor HASHBITS ] [ limit PKTS ] [ hash HASHTYPE]
Where:
HASHTYPE := { classic | src | dst }
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
><P
>&#13; FIXME; need practical experience and/or attestation here.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-gred"
></A
>6.5. GRED, Generic Random Early Drop</H2
><P
>&#13; FIXME; I have never used this. Need practical experience or
attestation.
</P
><P
>&#13; Theory declares that a RED algorithm is useful on a backbone or core
network, but not as useful near the end-user. See the section on
<A
HREF="#o-flows"
>flows</A
> to see a general discussion of the thirstiness of TCP.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qs-tbf"
></A
>6.6. TBF, Token Bucket Filter</H2
><P
>&#13; This qdisc is built on <A
HREF="#o-tokens"
>tokens</A
> and <A
HREF="#o-buckets"
>buckets</A
>. It
simply shapes traffic transmitted on an interface. To limit the speed
at which packets will be dequeued from a particular interface, the
TBF qdisc is the perfect solution. It simply slows down
transmitted traffic to the specified rate.
</P
><P
>&#13; Packets are only transmitted if there are sufficient tokens available.
Otherwise, packets are deferred. Delaying packets in this fashion will
introduce an artificial latency into the packet's round trip time.
</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/tbf-qdisc.png"></P
></DIV
><DIV
CLASS="example"
><A
NAME="ex-qs-tbf"
></A
><P
><B
>Example 9. Creating a 256kbit/s TBF</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>cat tbf.tcc</B
></TT
>
<TT
CLASS="computeroutput"
>/*
* make a 256kbit/s TBF on eth0
*
*/
dev eth0 {
egress {
tbf( rate 256 kbps, burst 20 kB, limit 20 kB, mtu 1514 B );
}
}</TT
>
<TT
CLASS="prompt"
>[root@leander]# </TT
><TT
CLASS="userinput"
><B
>tcc &#60; tbf.tcc</B
></TT
>
<TT
CLASS="computeroutput"
># ================================ Device eth0 ================================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 tbf burst 20480 limit 20480 mtu 1514 rate 32000bps</TT
>
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
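><P
>&#13; The rate in the generated command may look surprising: the
configuration asks for 256 kbit/s, yet tc is given 32000bps. In tc, the
suffix bps means bytes per second, and kbit means 1000 bits per second.
The conversion can be sketched in shell. The function name is
illustrative only and is not part of tc or tcng.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# Convert a rate in kbit/s to bytes per second (the unit tc calls "bps").
# 1 kbit is 1000 bits and 1 byte is 8 bits.
kbit_to_bytes_per_sec() {
    echo $(( $1 * 1000 / 8 ))
}

kbit_to_bytes_per_sec 256    # prints 32000
</PRE
></FONT
></TD
></TR
></TABLE
><P
>&#13; Given directly to tc, the same rate could therefore be written
either as 256kbit or as 32000bps.
</P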
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="classful-qdiscs"
></A
>7. Classful Queuing Disciplines (<A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>s)</H1
><P
>&#13; The flexibility and control of Linux traffic control can be unleashed
through the agency of the classful qdiscs. Remember that the classful
queuing disciplines can have filters attached to them, allowing packets to
be directed to particular classes and subqueues.
</P
><P
>&#13; There are several common terms to describe classes directly attached to
the <TT
CLASS="constant"
>root</TT
> qdisc and terminal classes. Classes attached to the
<TT
CLASS="constant"
>root</TT
> qdisc are known as root classes, and more generically inner
classes. Any terminal class in a particular queuing discipline is known
as a leaf class by analogy to the tree structure of the classes. Besides
the use of figurative language depicting the structure as a tree, the
language of family relationships is also quite common.
</P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qc-htb"
></A
>7.1. HTB, Hierarchical Token Bucket</H2
><P
>&#13; HTB uses the concepts of tokens and buckets
along with the class-based system and <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
>s to allow for
complex and granular control over traffic. With a complex
<A
HREF="#qc-htb-borrowing"
>borrowing model</A
>, HTB can perform a variety of sophisticated
traffic control techniques. One of the easiest ways to use HTB
immediately is that of <A
HREF="#qc-htb-borrowing"
>shaping</A
>.
</P
><P
>&#13; By understanding <A
HREF="#o-tokens"
>tokens</A
> and <A
HREF="#o-buckets"
>buckets</A
> or by grasping
the function of <A
HREF="#qs-tbf"
>TBF</A
>, HTB should be merely a logical
step. This queuing discipline allows the user to define the
characteristics of the tokens and bucket used and allows the user to
nest these buckets in an arbitrary fashion. When coupled with a
<A
HREF="#e-classifying"
>classifying</A
> scheme, traffic can be controlled in a very
granular fashion.
</P
><P
>&#13; </P
><P
>&#13; Below is example output of the syntax for HTB on the command line
with the <A
HREF="#s-iproute2-tc"
><B
CLASS="command"
>tc</B
></A
> tool. Although the syntax for <A
HREF="#s-tcng"
><B
CLASS="command"
>tcng</B
></A
> is a
language of its own, the rules for HTB are the same.
</P
><DIV
CLASS="example"
><A
NAME="ex-qc-htb-usage"
></A
><P
><B
>Example 10. <B
CLASS="command"
>tc</B
> usage for HTB</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;Usage: ... qdisc add ... htb [default N] [r2q N]
default minor id of class to which unclassified packets are sent {0}
r2q DRR quantums are computed as rate in Bps/r2q {10}
debug string of 16 numbers each 0-3 {0}
... class add ... htb rate R1 burst B1 [prio P] [slot S] [pslot PS]
[ceil R2] [cburst B2] [mtu MTU] [quantum Q]
rate rate allocated to this class (class can still borrow)
burst max bytes burst which can be accumulated during idle period {computed}
ceil definite upper class rate (no borrows) {rate}
cburst burst but for ceil {computed}
mtu max packet size we create rate map for {1600}
prio priority of leaf; lower are served first {0}
quantum how much bytes to serve from leaf at once {use r2q}
TC HTB version 3.3
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H3
CLASS="section"
><A
NAME="qc-htb-software"
></A
>7.1.1. Software requirements</H3
><P
>&#13; Unlike almost all of the other software discussed, HTB is a
newer queuing discipline and your distribution may not have all of the
tools and capability you need to use HTB. The kernel must
support HTB; kernel version 2.4.20 and later support it in the
stock distribution, although earlier kernel versions require patching.
To enable userland support for HTB, see <A
HREF="http://luxik.cdi.cz/~devik/qos/htb/"
TARGET="_top"
>HTB</A
> for an
<B
CLASS="command"
>iproute2</B
> patch to <B
CLASS="command"
>tc</B
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H3
CLASS="section"
><A
NAME="qc-htb-shaping"
></A
>7.1.2. Shaping</H3
><P
>&#13; One of the most common applications of HTB involves shaping
transmitted traffic to a specific rate.
</P
><P
>&#13; All shaping occurs in leaf classes. No shaping occurs in inner or
root classes as they only exist to suggest how the
<A
HREF="#qc-htb-borrowing"
>borrowing model</A
> should distribute available tokens.
</P
><P
>&#13; </P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H3
CLASS="section"
><A
NAME="qc-htb-borrowing"
></A
>7.1.3. Borrowing</H3
><P
>&#13; A fundamental part of the HTB qdisc is the borrowing mechanism.
Children classes borrow tokens from their parents once they have
exceeded <A
HREF="#vl-qc-htb-params-rate"
><TT
CLASS="parameter"
><I
>rate</I
></TT
></A
>. A child class will continue to
attempt to borrow until it reaches <A
HREF="#vl-qc-htb-params-ceil"
><TT
CLASS="parameter"
><I
>ceil</I
></TT
></A
>, at which
point it will begin to queue packets for transmission until more
tokens/ctokens are available. As there are only two primary types of
classes which can be created with HTB the following table and
diagram identify the various possible states and the behaviour of the
borrowing mechanisms.
</P
><P
>&#13; </P
><DIV
CLASS="table"
><A
NAME="tb-qc-htb-borrowing"
></A
><P
><B
>Table 2. HTB class states and potential actions taken</B
></P
><TABLE
BORDER="1"
CLASS="CALSTABLE"
><THEAD
><TR
><TH
ALIGN="LEFT"
VALIGN="MIDDLE"
>type of class</TH
><TH
ALIGN="LEFT"
VALIGN="MIDDLE"
>class state</TH
><TH
ALIGN="LEFT"
VALIGN="MIDDLE"
>HTB internal state</TH
><TH
ALIGN="LEFT"
VALIGN="MIDDLE"
>action taken</TH
></TR
></THEAD
><TBODY
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>leaf</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#60; <TT
CLASS="parameter"
><I
>rate</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_CAN_SEND</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; Leaf class will dequeue queued bytes up
to available tokens (no more than burst bytes)
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>leaf</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#62; <TT
CLASS="parameter"
><I
>rate</I
></TT
>, &#60; <TT
CLASS="parameter"
><I
>ceil</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_MAY_BORROW</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; Leaf class will attempt to borrow tokens/ctokens from
parent class. If tokens are available, they will be lent in
<TT
CLASS="parameter"
><I
>quantum</I
></TT
> increments and the leaf class will dequeue up
to <TT
CLASS="parameter"
><I
>cburst</I
></TT
> bytes
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>leaf</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#62; <TT
CLASS="parameter"
><I
>ceil</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_CANT_SEND</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; No packets will be dequeued. This will cause packet
delay and will increase latency to meet the desired
rate.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>inner, root</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#60; <TT
CLASS="parameter"
><I
>rate</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_CAN_SEND</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; Inner class will lend tokens to children.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>inner, root</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#62; <TT
CLASS="parameter"
><I
>rate</I
></TT
>, &#60; <TT
CLASS="parameter"
><I
>ceil</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_MAY_BORROW</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; Inner class will attempt to borrow tokens/ctokens from
parent class, lending them to competing children in
<TT
CLASS="parameter"
><I
>quantum</I
></TT
> increments per request.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>inner, root</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#62; <TT
CLASS="parameter"
><I
>ceil</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><TT
CLASS="parameter"
><I
>HTB_CANT_SEND</I
></TT
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>&#13; Inner class will not attempt to borrow from its parent
and will not lend tokens/ctokens to children classes.
</TD
></TR
></TBODY
></TABLE
></DIV
><P
>&#13; This diagram identifies the flow of borrowed tokens and the manner in
which tokens are charged to parent classes. In order for the
borrowing model to work, each class must have an accurate count of the
number of tokens used by itself and all of its children. For this
reason, any token used in a child or leaf class is charged to each
parent class until the root class is reached.
</P
><P
>&#13; Any child class which wishes to borrow a token will request a token
from its parent class, which if it is also over its <TT
CLASS="parameter"
><I
>rate</I
></TT
> will
request to borrow from its parent class until either a token is
located or the root class is reached. So the borrowing of tokens
flows toward the leaf classes and the charging of the usage of tokens
flows toward the root class.
</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/htb-borrow.png"></P
></DIV
><P
>&#13; Note in this diagram that there are several HTB root classes.
Each of these root classes can simulate a virtual circuit.
</P
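><P
>&#13; A small hierarchy makes the borrowing model concrete. The following
is a minimal sketch, assuming eth0 and class handles chosen only for
illustration. One root class holds the total bandwidth, and two leaf
classes each receive a guaranteed rate and may borrow from the root
class up to their ceil.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# HTB root qdisc; unclassified traffic is sent to class 1:20.
tc qdisc add dev eth0 root handle 1: htb default 20

# Root class: 1mbit in total to distribute among its children.
tc class add dev eth0 parent 1: classid 1:1 htb rate 1mbit ceil 1mbit

# Leaf classes: the rates sum to the rate of the parent, and each
# leaf may borrow unused bandwidth up to its ceil.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 600kbit ceil 1mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 400kbit ceil 1mbit

# Show the classes together with their counters.
tc -s class show dev eth0
</PRE
></FONT
></TD
></TR
></TABLE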
></DIV
><DIV
CLASS="section"
><HR><H3
CLASS="section"
><A
NAME="qc-htb-params"
></A
>7.1.4. HTB class parameters</H3
><P
>&#13; </P
><P
></P
><DIV
CLASS="variablelist"
><DL
><DT
><A
NAME="vl-qc-htb-params-default"
></A
><TT
CLASS="parameter"
><I
>default</I
></TT
></DT
><DD
><P
>&#13; An optional parameter with every HTB <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> object,
the default <TT
CLASS="parameter"
><I
>default</I
></TT
> is 0, which causes any unclassified
traffic to be dequeued at hardware speed, completely bypassing
any of the classes attached to the <TT
CLASS="constant"
>root</TT
> qdisc.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-rate"
></A
><TT
CLASS="parameter"
><I
>rate</I
></TT
></DT
><DD
><P
>&#13; Used to set the minimum desired speed to which to limit
transmitted traffic. This can be considered the equivalent of a
committed information rate (<SPAN
CLASS="acronym"
>CIR</SPAN
>), or the
guaranteed bandwidth for a given leaf class.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-ceil"
></A
><TT
CLASS="parameter"
><I
>ceil</I
></TT
></DT
><DD
><P
>&#13; Used to set the maximum desired speed to which to limit the
transmitted traffic. The borrowing model should illustrate how
this parameter is used. This can be considered the equivalent
of <SPAN
CLASS="QUOTE"
>"burstable bandwidth"</SPAN
>.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-burst"
></A
><TT
CLASS="parameter"
><I
>burst</I
></TT
></DT
><DD
><P
>&#13; This is the size of the <A
HREF="#vl-qc-htb-params-rate"
><TT
CLASS="parameter"
><I
>rate</I
></TT
></A
> bucket (see
<A
HREF="#o-buckets"
>Tokens and buckets</A
>). HTB will dequeue
<TT
CLASS="parameter"
><I
>burst</I
></TT
> bytes before awaiting the arrival of more
tokens.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-cburst"
></A
><TT
CLASS="parameter"
><I
>cburst</I
></TT
></DT
><DD
><P
>&#13; This is the size of the <A
HREF="#vl-qc-htb-params-ceil"
><TT
CLASS="parameter"
><I
>ceil</I
></TT
></A
> bucket (see
<A
HREF="#o-buckets"
>Tokens and buckets</A
>). HTB will dequeue
<TT
CLASS="parameter"
><I
>cburst</I
></TT
> bytes before awaiting the arrival of more
ctokens.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-quantum"
></A
><TT
CLASS="parameter"
><I
>quantum</I
></TT
></DT
><DD
><P
>&#13; This is a key parameter used by HTB to control borrowing.
Normally, the correct <TT
CLASS="parameter"
><I
>quantum</I
></TT
> is calculated by
HTB, not specified by the user. Tweaking this parameter
can have tremendous effects on borrowing and shaping under
contention, because it is used both to split traffic between
children classes over <A
HREF="#vl-qc-htb-params-rate"
><TT
CLASS="parameter"
><I
>rate</I
></TT
></A
> (but below
<A
HREF="#vl-qc-htb-params-ceil"
><TT
CLASS="parameter"
><I
>ceil</I
></TT
></A
>) and to transmit packets from these same
classes.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-r2q"
></A
><TT
CLASS="parameter"
><I
>r2q</I
></TT
></DT
><DD
><P
>&#13; Also usually calculated for the user, <TT
CLASS="parameter"
><I
>r2q</I
></TT
> is a hint to
HTB to help determine the optimal <A
HREF="#vl-qc-htb-params-quantum"
><TT
CLASS="parameter"
><I
>quantum</I
></TT
></A
>
for a particular class.
</P
></DD
><DT
><A
NAME="vl-qc-htb-params-mtu"
></A
><TT
CLASS="parameter"
><I
>mtu</I
></TT
></DT
><DD
><P
>&#13; </P
></DD
><DT
><A
NAME="vl-qc-htb-params-prio"
></A
><TT
CLASS="parameter"
><I
>prio</I
></TT
></DT
><DD
><P
>&#13; </P
></DD
></DL
></DIV
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H3
CLASS="section"
><A
NAME="qc-htb-rules"
></A
>7.1.5. Rules</H3
><P
>&#13; Below are some general guidelines to using HTB culled from
<A
HREF="http://docum.org/"
TARGET="_top"
>http://docum.org/</A
> and the <A
HREF="http://mailman.ds9a.nl/mailman/listinfo/lartc/"
TARGET="_top"
>LARTC
mailing list</A
>. These rules are
simply a recommendation for beginners to maximize the benefit of
HTB until gaining a better understanding of the practical
application of HTB.
</P
><P
>&#13; </P
><P
></P
><UL
><LI
><P
>&#13; Shaping with HTB occurs only in leaf classes. See also
<A
HREF="#qc-htb-shaping"
>Section 7.1.2</A
>.
</P
></LI
><LI
><P
>&#13; Because HTB does not shape in any class except the leaf
class, the sum of the <TT
CLASS="parameter"
><I
>rate</I
></TT
>s of leaf classes should not
exceed the <TT
CLASS="parameter"
><I
>ceil</I
></TT
> of a parent class. Ideally, the sum of
the <TT
CLASS="parameter"
><I
>rate</I
></TT
>s of the children classes would match the
<TT
CLASS="parameter"
><I
>rate</I
></TT
> of the parent class, allowing the parent class to
distribute leftover bandwidth (<TT
CLASS="parameter"
><I
>ceil</I
></TT
> - <TT
CLASS="parameter"
><I
>rate</I
></TT
>) among
the children classes.
</P
><P
>&#13; This key concept in employing HTB bears repeating. Only
leaf classes actually shape packets; packets are only delayed in
these leaf classes. The inner classes (all the way up to the root
class) exist to define how borrowing/lending occurs (see also
<A
HREF="#qc-htb-borrowing"
>Section 7.1.3</A
>).
</P
></LI
><LI
><P
>&#13; The <TT
CLASS="parameter"
><I
>quantum</I
></TT
> is only used when a class is over
<TT
CLASS="parameter"
><I
>rate</I
></TT
> but below <TT
CLASS="parameter"
><I
>ceil</I
></TT
>.
</P
></LI
><LI
><P
>&#13; The <TT
CLASS="parameter"
><I
>quantum</I
></TT
> should be set at MTU or higher. HTB
will dequeue at least a single packet per service opportunity even
if <TT
CLASS="parameter"
><I
>quantum</I
></TT
> is too small. In such a case, it will not be
able to calculate accurately the real bandwidth consumed
<A
NAME="AEN1146"
HREF="#FTN.AEN1146"
><SPAN
CLASS="footnote"
>[9]</SPAN
></A
>.
</P
></LI
><LI
><P
>&#13; Parent classes lend tokens to children in increments of
<TT
CLASS="parameter"
><I
>quantum</I
></TT
>, so for maximum granularity and most
instantaneously evenly distributed bandwidth, <TT
CLASS="parameter"
><I
>quantum</I
></TT
>
should be as low as possible while still no less than MTU.
</P
></LI
><LI
><P
>&#13; A distinction between tokens and ctokens is only meaningful in a
leaf class, because non-leaf classes only lend tokens to child
classes.
</P
></LI
><LI
><P
>&#13; HTB borrowing could more accurately be described as
<SPAN
CLASS="QUOTE"
>"using"</SPAN
>.
</P
></LI
></UL
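><P
>&#13; The rule that quantum should be no less than MTU can be checked with
a little arithmetic. According to the tc usage text for HTB, quantum is
computed as the rate in bytes per second divided by r2q, which defaults
to 10. The sketch below applies that formula. The function names are
illustrative only, and the MTU of 1500 bytes is an assumption for a
typical Ethernet interface.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;# quantum = rate (bytes per second) / r2q, per the tc usage text for HTB.
htb_quantum() {
    rate=$1
    r2q=${2:-10}      # 10 is the default shown in the usage text
    echo $(( $rate / $r2q ))
}

# Report whether a quantum is at least one MTU.
quantum_ok() {
    quantum=$1
    mtu=${2:-1500}    # assumed Ethernet MTU
    if [ "$quantum" -ge "$mtu" ]; then
        echo ok
    else
        echo too-small
    fi
}

quantum_ok "$(htb_quantum 32000)"    # quantum 3200: prints ok
quantum_ok "$(htb_quantum 4000)"     # quantum 400: prints too-small
</PRE
></FONT
></TD
></TR
></TABLE
><P
>&#13; For a slow class such as the second one, either a smaller r2q on the
qdisc or an explicit quantum on the class would bring the value back up
to MTU.
</P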
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qc-hfsc"
></A
>7.2. HFSC, Hierarchical Fair Service Curve</H2
><P
>&#13; The HFSC classful qdisc balances delay-sensitive traffic against
throughput-sensitive traffic. In a congested or backlogged state, the
HFSC queuing discipline interleaves the delay-sensitive traffic when
required according to service curve definitions. Read about the Linux
implementation in German, <A
HREF="http://klaus.geekserver.net/hfsc/hfsc.html"
TARGET="_top"
>HFSC
Scheduling mit Linux</A
> or read a
translation into English, <A
HREF="http://linux-ip.net/tc/hfsc.en/"
TARGET="_top"
>HFSC Scheduling
with Linux</A
>. The original
research article, <A
HREF="http://acm.org/sigcomm/sigcomm97/program.html#ab011"
TARGET="_top"
>A
Hierarchical Fair Service Curve Algorithm For Link-Sharing, Real-Time
and Priority Services</A
>, also remains available.
</P
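><P
>The balance between delay-sensitive and throughput-sensitive traffic can
be sketched with <B
CLASS="command"
>tc</B
> as follows. This is only a rough illustration, assuming a kernel built
with HFSC support; the device name (eth0), the handles, the rates and the
20ms delay figure are all hypothetical values.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
># Illustrative values only; adjust the device, handles and rates.
tc qdisc add dev eth0 root handle 1: hfsc default 20

# Parent class: link-sharing (ls) curve with an upper limit (ul)
# at the assumed link rate.
tc class add dev eth0 parent 1: classid 1:1 hfsc ls rate 1000kbit ul rate 1000kbit

# Delay-sensitive class: a real-time (rt) curve that allows 400kbit for
# the first 20ms of a backlog period and 200kbit afterwards.
tc class add dev eth0 parent 1:1 classid 1:10 hfsc rt m1 400kbit d 20ms m2 200kbit ls rate 200kbit

# Throughput-sensitive (default) class: link-sharing only.
tc class add dev eth0 parent 1:1 classid 1:20 hfsc ls rate 800kbit ul rate 1000kbit
</PRE
></FONT
></TD
></TR
></TABLE
><P
>The two-slope real-time curve on class 1:10 is what lets short bursts of
delay-sensitive packets be interleaved ahead of bulk traffic without
granting that class a permanently higher rate.
</P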
><P
>&#13; This section will be completed at a later date.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qc-prio"
></A
>7.3. PRIO, priority scheduler</H2
><P
>&#13; The PRIO classful qdisc works on a very simple precept. When it
is ready to dequeue a packet, the first class is checked for a packet.
If there's a packet, it gets dequeued. If there's no packet, then the
next class is checked, until the queuing mechanism has no more classes
to check.
</P
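><P
>The dequeuing order described above can be observed with a small <B
CLASS="command"
>tc</B
> setup such as the one below. It is a sketch under stated assumptions:
the device name (eth0), the handle and the choice of SSH as the preferred
traffic are illustrative only.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
># Illustrative values only; adjust the device and handle.
# A PRIO qdisc with three bands creates classes 1:1, 1:2 and 1:3.
tc qdisc add dev eth0 root handle 1: prio bands 3

# Send TCP traffic destined for port 22 to the first class (1:1),
# which is always checked first at dequeue time.
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip protocol 6 0xff match ip dport 22 0xffff flowid 1:1

# Per-band statistics show which classes are being served.
tc -s class show dev eth0
</PRE
></FONT
></TD
></TR
></TABLE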
><P
>&#13; This section will be completed at a later date.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="qc-cbq"
></A
>7.4. CBQ, Class Based Queuing</H2
><P
>&#13; CBQ is the classic implementation (also called venerable) of a traffic
control system. This section will be completed at a later date.
</P
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="rules"
></A
>8. Rules, Guidelines and Approaches</H1
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="r-general"
></A
>8.1. General Rules of Linux Traffic Control</H2
><P
>&#13; There are a few general rules which ease the study of Linux traffic
control.
Traffic control structures under Linux are the same whether the initial
configuration has been done with <A
HREF="#s-tcng"
><B
CLASS="command"
>tcng</B
></A
> or with <A
HREF="#s-iproute2-tc"
><B
CLASS="command"
>tc</B
></A
>.
</P
><P
></P
><UL
><LI
><P
>&#13; Any router performing a shaping function should be the bottleneck on
the link, and should be shaping slightly below the maximum available
link bandwidth. This prevents queues from forming in other routers,
affording maximum control of packet latency/deferral to the shaping
device.
</P
></LI
><LI
><P
>&#13; A device can only shape traffic it transmits
<A
NAME="AEN1189"
HREF="#FTN.AEN1189"
><SPAN
CLASS="footnote"
>[10]</SPAN
></A
>. Traffic arriving on an input interface has already been
received and therefore cannot be shaped. A traditional
solution to this problem is an ingress policer.
</P
></LI
><LI
><P
>&#13; Every interface must have a <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
>. The default qdisc
(the <A
HREF="#qs-pfifo_fast"
><TT
CLASS="constant"
>pfifo_fast</TT
></A
> qdisc) is used when another qdisc is not
explicitly attached to the interface.
</P
></LI
><LI
><P
>&#13; A <A
HREF="#classful-qdiscs"
>classful qdisc</A
> added to an interface with no children
classes typically consumes CPU for no benefit.
</P
></LI
><LI
><P
>&#13; Any newly created class contains a <A
HREF="#qs-fifo"
>FIFO</A
>.
This qdisc can be replaced explicitly with any other qdisc. The
FIFO qdisc will be removed implicitly if a child class is
attached to this class.
</P
></LI
><LI
><P
>&#13; Classes directly attached to the <TT
CLASS="constant"
>root</TT
> qdisc can be used to
simulate virtual circuits.
</P
></LI
><LI
><P
>&#13; A <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> can be attached to classes or one of the
<A
HREF="#classful-qdiscs"
>classful qdiscs</A
>.
</P
></LI
></UL
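><P
>The ingress policer mentioned in the rules above can be sketched with <B
CLASS="command"
>tc</B
> as shown below. The device name (eth0), the 1mbit rate and the burst
size are assumptions for illustration; the policer drops excess inbound
packets rather than delaying them.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
># Illustrative values only; adjust the device, rate and burst.
# Show the qdisc currently attached (the default if none was configured).
tc qdisc show dev eth0

# Attach the special ingress qdisc to the interface.
tc qdisc add dev eth0 handle ffff: ingress

# Police all inbound IP traffic to roughly 1mbit; packets above the
# rate are dropped.
tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 \
    match ip src 0.0.0.0/0 police rate 1mbit burst 10k drop flowid :1
</PRE
></FONT
></TD
></TR
></TABLE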
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="r-known-bandwidth"
></A
>8.2. Handling a link with a known bandwidth</H2
><P
>&#13; HTB is an ideal <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> to use on a link with a known
bandwidth, because the innermost (root-most) class can be set to the
maximum bandwidth available on a given link. Flows can be further
subdivided into children classes, allowing either guaranteed bandwidth
to particular classes of traffic or allowing preference to specific
kinds of traffic.
</P
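><P
>A minimal sketch of such a hierarchy, written as <B
CLASS="command"
>tc</B
> commands, might look like the following. The device name (eth0), the
handles, the 1544kbit link rate and the choice of HTTP as the guaranteed
traffic are assumptions for illustration. In practice the root-most rate
would be set slightly below the real link bandwidth.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
># Illustrative values only; adjust the device, handles and rates.
tc qdisc add dev eth0 root handle 1: htb default 20

# Root-most class: set to the known link bandwidth.
tc class add dev eth0 parent 1: classid 1:1 htb rate 1544kbit ceil 1544kbit

# Children: a capped web class and a bulk class that may borrow.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 512kbit ceil 512kbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 512kbit ceil 1544kbit

# Replace the implicit FIFO in each leaf with SFQ.
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10

# Classify TCP traffic to port 80 into the web class; everything
# else falls into the default class 1:20.
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip protocol 6 0xff match ip dport 80 0xffff flowid 1:10
</PRE
></FONT
></TD
></TR
></TABLE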
><P
>&#13; </P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="r-unknown-bandwidth"
></A
>8.3. Handling a link with a variable (or unknown) bandwidth</H2
><P
>&#13; In theory, the PRIO scheduler is an ideal match for links with
variable bandwidth, because it is a work-conserving <A
HREF="#c-qdisc"
><TT
CLASS="constant"
>qdisc</TT
></A
> (which
means that it provides no <A
HREF="#e-shaping"
>shaping</A
>). In the case of a link
with an unknown or fluctuating bandwidth, the PRIO scheduler
simply dequeues any available packet in the highest priority
band first, then falls back to the lower priority bands.
</P
><P
>&#13; </P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="r-sharing-flows"
></A
>8.4. Sharing/splitting bandwidth based on flows</H2
><P
>&#13; Of the many types of contention for network bandwidth, this is one of
the easier types of contention to address in general. By using the
SFQ qdisc, traffic in a particular queue can be separated into
flows, each of which will be serviced fairly (inside that queue).
Well-behaved applications (and users) will find that SFQ and
ESFQ are sufficient for most sharing needs.
</P
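><P
>As a minimal sketch, SFQ can be attached directly as the root qdisc of an
interface. The device name (eth0) and the ten-second perturbation
interval are assumptions for illustration.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
># Illustrative values only; adjust the device name.
# Hash packets into per-flow queues and rehash every 10 seconds.
tc qdisc add dev eth0 root handle 1: sfq perturb 10

# Show the qdisc and its counters.
tc -s qdisc show dev eth0
</PRE
></FONT
></TD
></TR
></TABLE
><P
>SFQ does no shaping of its own, so it only has an effect where a queue
actually builds, which is why it is often used as the leaf qdisc inside a
shaping class instead.
</P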
><P
>&#13; The Achilles heel of these fair queuing algorithms is a misbehaving user
or application which opens many connections simultaneously (e.g., eMule,
eDonkey, Kazaa). By creating a large number of individual flows, the
application can dominate slots in the fair queuing algorithm. Restated,
the fair queuing algorithm has no idea that a single application is
generating the majority of the flows, and cannot penalize the user.
Other methods are called for.
</P
><P
>&#13; </P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="r-sharing-ips"
></A
>8.5. Sharing/splitting bandwidth based on IP</H2
><P
>&#13; For many administrators this is the ideal method of dividing bandwidth
amongst their users. Unfortunately, there is no easy solution, and it
becomes increasingly complex as the number of machines sharing a
network link grows.
</P
><P
>&#13; To divide bandwidth equitably between <TT
CLASS="parameter"
><I
>N</I
></TT
> IP
addresses, there must be <TT
CLASS="parameter"
><I
>N</I
></TT
> classes.
</P
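><P
>One way to sketch the one-class-per-address approach is a small shell
loop around <B
CLASS="command"
>tc</B
>. Everything named here is hypothetical: the device (eth0, assumed to
face the hosts being shaped), the 1500kbit link rate and the three
example addresses. Matching on the destination address means this sketch
only divides traffic sent toward those hosts.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>#!/bin/sh
# Illustrative values only; adjust the device, rate and addresses.
DEV=eth0
LINK=1500                                       # link rate in kbit
HOSTS="192.168.1.10 192.168.1.11 192.168.1.12"  # one class per address

# Count the addresses and compute an equal guaranteed share for each.
set -- $HOSTS
N=$#
SHARE=$((LINK / N))

tc qdisc add dev $DEV root handle 1: htb
tc class add dev $DEV parent 1: classid 1:1 htb rate ${LINK}kbit ceil ${LINK}kbit

# One HTB class and one u32 filter per address.
id=10
for ip in $HOSTS; do
    tc class add dev $DEV parent 1:1 classid 1:$id htb \
        rate ${SHARE}kbit ceil ${LINK}kbit
    tc filter add dev $DEV parent 1: protocol ip prio 1 u32 \
        match ip dst $ip/32 flowid 1:$id
    id=$((id + 1))
done
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Each class is guaranteed an equal share but may borrow up to the full
link rate when the other addresses are idle. The number of classes and
filters grows linearly with the number of addresses.
</P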
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="scripts"
></A
>9. Scripts for use with QoS/Traffic Control</H1
><P
>&#13; </P
><P
>&#13; </P
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="sc-wondershaper"
></A
>9.1. wondershaper</H2
><P
>&#13; More to come, see <A
HREF="http://lartc.org/wondershaper/"
TARGET="_top"
>wondershaper</A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="sc-myshaper"
></A
>9.2. ADSL Bandwidth HOWTO script (<TT
CLASS="filename"
>myshaper</TT
>)</H2
><P
>&#13; More to come, see <A
HREF="http://www.tldp.org/HOWTO/ADSL-Bandwidth-Management-HOWTO/implementation.html"
TARGET="_top"
>myshaper</A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="sc-htb.init"
></A
>9.3. <TT
CLASS="filename"
>htb.init</TT
></H2
><P
>&#13; More to come, see <A
HREF="http://sourceforge.net/projects/htbinit/"
TARGET="_top"
><TT
CLASS="filename"
>htb.init</TT
></A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="sc-tcng.init"
></A
>9.4. <TT
CLASS="filename"
>tcng.init</TT
></H2
><P
>&#13; More to come, see <A
HREF="http://linux-ip.net/code/tcng/tcng.init"
TARGET="_top"
><TT
CLASS="filename"
>tcng.init</TT
></A
>.
</P
></DIV
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="sc-cbq.init"
></A
>9.5. <TT
CLASS="filename"
>cbq.init</TT
></H2
><P
>&#13; More to come, see <A
HREF="http://sourceforge.net/projects/cbqinit/"
TARGET="_top"
><TT
CLASS="filename"
>cbq.init</TT
></A
>.
</P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="diagram"
></A
>10. Diagram</H1
><P
>&#13; </P
><P
>&#13; </P
><DIV
CLASS="section"
><HR><H2
CLASS="section"
><A
NAME="d-general"
></A
>10.1. General diagram</H2
><P
>&#13; Below is a general diagram of the relationships of the components of a
classful queuing discipline (HTB pictured). A larger version of
the diagram is
<A
HREF="http://linux-ip.net/traffic-control/htb-class.png"
TARGET="_top"
>available</A
>.
</P
><P
>&#13; </P
><DIV
CLASS="example"
><A
NAME="d-tcng-config"
></A
><P
><B
>Example 11. An example HTB <B
CLASS="command"
>tcng</B
> configuration</B
></P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>&#13;/*
 *
 *  possible mock up of diagram shown at
 *  http://linux-ip.net/traffic-control/htb-class.png
 *
 */

$m_web = trTCM (
                 cir 512  kbps,  /* committed information rate */
                 cbs 10   kB,    /* burst for CIR */
                 pir 1024 kbps,  /* peak information rate */
                 pbs 10   kB     /* burst for PIR */
               ) ;

dev eth0 {
    egress {

        class ( &#60;$web&#62; )  if tcp_dport == PORT_HTTP &#38;&#38; __trTCM_green( $m_web );
        class ( &#60;$bulk&#62; ) if tcp_dport == PORT_HTTP &#38;&#38; __trTCM_yellow( $m_web );
        drop              if __trTCM_red( $m_web );
        class ( &#60;$bulk&#62; ) if tcp_dport == PORT_SSH ;

        htb () {  /* root qdisc */
            class ( rate 1544kbps, ceil 1544kbps ) {  /* root class */
                $web  = class ( rate 512kbps, ceil  512kbps ) { sfq ; } ;
                $bulk = class ( rate 512kbps, ceil 1544kbps ) { sfq ; } ;
            }
        }
    }
}
</PRE
></FONT
></TD
></TR
></TABLE
></DIV
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="images/htb-class.png"></P
></DIV
><P
>&#13; </P
></DIV
></DIV
><DIV
CLASS="section"
><HR><H1
CLASS="section"
><A
NAME="links"
></A
>11. Annotated Traffic Control Links</H1
><P
>&#13; This section identifies a number of links to documentation
about traffic control and Linux traffic control software. Each link will
be listed with a brief description of the content at that site.
</P
><P
></P
><UL
><LI
><P
>&#13; <A
HREF="http://luxik.cdi.cz/~devik/qos/htb/"
TARGET="_top"
>HTB
site</A
>,
<A
HREF="http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm"
TARGET="_top"
>HTB
user guide</A
> and
<A
HREF="http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm"
TARGET="_top"
>HTB
theory</A
>
(<EM
>Martin <SPAN
CLASS="QUOTE"
>"devik"</SPAN
> Devera</EM
>)
</P
><P
>&#13; Hierarchical Token Bucket, <A
HREF="#qc-htb"
>HTB</A
>, is a classful queuing
discipline. Widely used and supported, it is also fairly well
documented in the user guide and at
<A
HREF="http://www.docum.org/"
TARGET="_top"
>Stef Coene's site</A
>
(see below).
</P
></LI
><LI
><P
>&#13; <A
HREF="http://opalsoft.net/qos/"
TARGET="_top"
>General Quality of
Service docs</A
> (<EM
>Leonardo Balliache</EM
>)
</P
><P
>&#13; His site offers a good deal of understandable, introductory
documentation, including some excellent overview material.
See, in particular, the detailed
<A
HREF="http://opalsoft.net/qos/DS.htm"
TARGET="_top"
>Linux QoS</A
> document
among others.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://tcng.sourceforge.net/"
TARGET="_top"
><B
CLASS="command"
>tcng</B
> (Traffic Control
Next Generation)</A
> and
<A
HREF="http://linux-ip.net/gl/tcng/"
TARGET="_top"
><B
CLASS="command"
>tcng</B
> manual</A
>
(<EM
>Werner Almesberger</EM
>)
</P
><P
>&#13; The <B
CLASS="command"
>tcng</B
> software includes a language and a set of tools for
creating and testing traffic control structures. In addition to
generating <B
CLASS="command"
>tc</B
> commands as output, it is also capable of providing
output for non-Linux applications. A key piece of the <B
CLASS="command"
>tcng</B
> suite
which is ignored in this documentation is the <B
CLASS="command"
>tcsim</B
>
traffic control simulator.
</P
><P
>&#13; The user manual provided with the <B
CLASS="command"
>tcng</B
> software has been converted
to HTML with <B
CLASS="command"
>latex2html</B
>. The distribution comes
with the TeX documentation.
</P
></LI
><LI
><P
>&#13; <A
HREF="ftp://ftp.inr.ac.ru/ip-routing/"
TARGET="_top"
><B
CLASS="command"
>iproute2</B
></A
> and
<A
HREF="http://linux-ip.net/gl/ip-cref/"
TARGET="_top"
><B
CLASS="command"
>iproute2</B
> manual</A
>
(<EM
>Alexey Kuznetsov</EM
>)
</P
><P
>&#13; This is the source code for the <B
CLASS="command"
>iproute2</B
> suite, which includes the
essential <B
CLASS="command"
>tc</B
> binary. Note that as of
iproute2-2.4.7-now-ss020116-try.tar.gz, the package did not support
HTB, so a patch available from the <A
HREF="http://luxik.cdi.cz/~devik/qos/htb/"
TARGET="_top"
>HTB</A
> site will be
required.
</P
><P
>&#13; The manual documents the entire suite of tools, although the <B
CLASS="command"
>tc</B
>
utility is not adequately documented here. The ambitious reader is
referred to the LARTC HOWTO after finishing this introduction.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://www.docum.org/"
TARGET="_top"
>Documentation, graphs, scripts and
guidelines to traffic control under Linux</A
>
(<EM
>Stef Coene</EM
>)
</P
><P
>&#13; Stef Coene has been gathering statistics and test results, scripts and
tips for the use of QoS under Linux. There are some particularly
useful graphs and guidelines available for implementing traffic
control at Stef's site.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://lartc.org/howto/"
TARGET="_top"
>LARTC HOWTO</A
>
(<EM
>bert hubert, et al.</EM
>)
</P
><P
>&#13; The Linux Advanced Routing and Traffic Control HOWTO is one of the key
sources of data about the sophisticated techniques which are available
for use under Linux. The Traffic Control Introduction HOWTO should
provide the reader with enough background in the language and concepts
of traffic control. The LARTC HOWTO is the next place the reader
should look for general traffic control information.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://linux-ip.net/"
TARGET="_top"
>Guide to IP Networking with
Linux</A
> (<EM
>Martin A. Brown</EM
>)
</P
><P
>&#13; Not directly related to traffic control, this site includes articles
and general documentation on the behaviour of the Linux IP layer.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://www.almesberger.net/cv/papers.html"
TARGET="_top"
>Werner
Almesberger's Papers</A
>
</P
><P
>&#13; Werner Almesberger is one of the main developers and champions of
traffic control under Linux (he's also the author of <B
CLASS="command"
>tcng</B
>, above).
One of the key documents describing the entire traffic control
architecture of the Linux kernel is his Linux Traffic Control -
Implementation Overview which is available in
<A
HREF="http://www.almesberger.net/cv/papers/tcio8.pdf"
TARGET="_top"
>PDF</A
>
or
<A
HREF="http://www.almesberger.net/cv/papers/tcio8.ps.gz"
TARGET="_top"
>PS</A
>
format.
</P
></LI
><LI
><P
>&#13; <A
HREF="http://diffserv.sourceforge.net/"
TARGET="_top"
>Linux DiffServ
project</A
>
</P
><P
>&#13; Mercilessly snipped from the main page of the DiffServ site...
</P
><A
NAME="AEN1370"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
>Differentiated Services (short: Diffserv) is an architecture for
providing different types or levels of service for network traffic.
One key characteristic of Diffserv is that flows are aggregated in
the network, so that core routers only need to distinguish a
comparably small number of aggregated flows, even if those flows
contain thousands or millions of individual flows.
</BLOCKQUOTE
></LI
></UL
></DIV
></DIV
><H3
CLASS="FOOTNOTES"
>Notes</H3
><TABLE
BORDER="0"
CLASS="FOOTNOTES"
WIDTH="100%"
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN91"
HREF="#AEN91"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; See <A
HREF="#software"
>Section 5</A
> for more details on the use or
installation of a particular traffic control mechanism, kernel or
command line utility.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN220"
HREF="#AEN220"
><SPAN
CLASS="footnote"
>[2]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; This queueing model has long been used in civilized countries to
distribute scant food or provisions equitably. William Faulkner is
reputed to have walked to the front of the line to fetch his
share of ice, proving that not everybody likes the FIFO model, and
providing us a model for considering priority queuing.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN225"
HREF="#AEN225"
><SPAN
CLASS="footnote"
>[3]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; Similarly, the entire traffic control system appears as a queue or
scheduler to the higher layer which is enqueuing packets into this
layer.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN271"
HREF="#AEN271"
><SPAN
CLASS="footnote"
>[4]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; This smoothing effect is not always desirable, hence the HTB
parameters burst and cburst.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN422"
HREF="#AEN422"
><SPAN
CLASS="footnote"
>[5]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; A classful qdisc can only have children classes of its type. For
example, an HTB qdisc can only have HTB classes as children. A CBQ
qdisc cannot have HTB classes as children.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN483"
HREF="#AEN483"
><SPAN
CLASS="footnote"
>[6]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; In this case, you'll have a <A
HREF="#c-filter"
><TT
CLASS="constant"
>filter</TT
></A
> which uses a
<A
HREF="#c-classifier"
><TT
CLASS="constant"
>classifier</TT
></A
> to select the packets you wish to drop. Then
you'll use a <A
HREF="#c-police"
><TT
CLASS="constant"
>policer</TT
></A
> with a drop action like this
<B
CLASS="command"
>police rate 1bps burst 1 action drop/drop</B
>.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN505"
HREF="#AEN505"
><SPAN
CLASS="footnote"
>[7]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; I do not know the range nor base of these numbers. I believe they
are u32 hexadecimal, but need to confirm this.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN542"
HREF="#AEN542"
><SPAN
CLASS="footnote"
>[8]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; The options listed in this example are taken from a 2.4.20 kernel
source tree. The exact options may differ slightly from kernel
release to kernel release depending on patches and new schedulers
and classifiers.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN1146"
HREF="#AEN1146"
><SPAN
CLASS="footnote"
>[9]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; HTB will report bandwidth usage in this scenario
incorrectly. It will calculate the bandwidth used by
<TT
CLASS="parameter"
><I
>quantum</I
></TT
> instead of the real dequeued packet size.
This can skew results quickly.
</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN1189"
HREF="#AEN1189"
><SPAN
CLASS="footnote"
>[10]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>&#13; In fact, the
<A
HREF="#s-imq"
>Intermediate Queuing Device
(IMQ)</A
> simulates an output device onto which traffic
control structures can be attached. This clever solution allows
a networking device to shape ingress traffic in the same fashion
as egress traffic. Despite the apparent contradiction of the
rule, IMQ appears as a device to the kernel. Thus, there has
been no violation of the rule, but rather a sneaky
reinterpretation of that rule.
</P
></TD
></TR
></TABLE
></BODY
></HTML
>