<para>Mark any "small" packets as 0x02. Outbound ACK packets from inbound downloads should be sent promptly
to ensure efficient downloads. This is possible using the iptables length module (see the example following this list).</para>
</listitem>
</orderedlist>
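<para>As a minimal sketch of that last rule (the interface name, 64-byte threshold, and mark value here are
assumptions to adapt to your own rule set), marking small outbound packets with the iptables length match
might look like this:</para>
<programlisting># Hypothetical example: mark outbound TCP packets of 64 bytes or less
# (mostly ACKs) with 0x02 using the length match.  eth0, the 64-byte
# threshold, and the 0x02 mark are assumptions; adjust to your setup.
iptables -t mangle -A POSTROUTING -o eth0 -p tcp -m length --length 0:64 -j MARK --set-mark 0x02</programlisting>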
<para>Obviously, this can be customized to fit your needs.</para>
</sect2>
<sect2>
<title>A few more tweaks...</title>
<para>There are two more things that you can do to improve your latency. First, you can set the Maximum Transmission
Unit (MTU) to be lower than the default of 1500 bytes. Lowering this number reduces the average time a priority
packet has to wait when a full-sized low-priority packet is already being transmitted. Lowering this
number will also slightly decrease your throughput, because the fixed 40 bytes of IP and
TCP header overhead becomes a larger fraction of every packet.</para>
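<para>To make the trade-off concrete: on a 128 kbit/s upstream, a full 1500-byte packet takes roughly 94 ms to
serialize, while an 800-byte packet takes about 50 ms. As a rough sketch (eth0 and the value 800 are assumptions,
not recommendations), lowering the MTU on the interface facing the modem is a one-line change:</para>
<programlisting># Hypothetical example: lower the MTU on the interface facing the DSL modem.
# eth0 and the value 800 are assumptions; pick a value suited to your line.
ifconfig eth0 mtu 800</programlisting>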
<para>The other thing you can do to improve latency, even on your low-priority traffic, is to lower your interface
transmit queue length (txqueuelen) from the default of 100 packets, which on a 128 kbit/s ADSL upstream could take
as much as 10 seconds to empty with a 1500-byte MTU.</para>
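<para>The transmit queue length can also be changed with ifconfig. A rough sketch (eth0 and the value 30 are
assumptions, not tuned figures):</para>
<programlisting># Hypothetical example: shorten the interface transmit queue so low-priority
# packets cannot back up for many seconds.  eth0 and 30 are assumptions.
ifconfig eth0 txqueuelen 30</programlisting>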
</sect2>
<sect2>
<title>Attempting to Throttle Inbound Traffic</title>
<para>By using the Intermediate Queuing Device (IMQ), we can run all incoming packets through a queue in the same
way that we queue outbound packets. Packet priority is much simpler in this case. Since we can only (attempt to)
control inbound TCP traffic, we'll put all non-TCP traffic in the 0x00 class, and all TCP traffic in the 0x01 class.
We'll also place "small" TCP packets in the 0x00 class since these are most likely ACK packets for outbound data that
has already been sent. We'll set up a standard FIFO queue on the 0x00 class, and we'll set up a Random Early Drop (RED)
queue on the 0x01 class. RED is better than a FIFO (tail-drop) queue at controlling TCP because it will drop packets
before the queue overflows in an attempt to slow down transfers that look like they're about to get out of control.
We'll also rate-limit both classes to some maximum inbound rate which is less than your true inbound speed over the
ADSL modem.</para>
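<para>As a minimal sketch, assuming a marking scheme in which bulk TCP gets firewall mark 1 and everything else
is left unmarked (the device names, rates, and RED parameters below are illustrative assumptions; the complete,
tested version is the myshaper script later in this document), the inbound side might look like this:</para>
<programlisting># Hypothetical sketch of inbound shaping with IMQ.  Rates, marks, and RED
# parameters are assumptions; see the myshaper script for a complete setup.
ip link set imq0 up

# HTB root on imq0, limited below the true inbound rate; unmarked traffic
# (non-TCP and small TCP such as ACKs) falls into the high-priority class 1:20.
tc qdisc add dev imq0 handle 1: root htb default 20
tc class add dev imq0 parent 1: classid 1:1 htb rate 700kbit
tc class add dev imq0 parent 1:1 classid 1:20 htb rate 525kbit ceil 700kbit prio 0   # non-TCP + small TCP
tc class add dev imq0 parent 1:1 classid 1:21 htb rate 175kbit ceil 700kbit prio 1   # bulk TCP

# FIFO for the high-priority class, RED for bulk TCP.
tc qdisc add dev imq0 parent 1:20 handle 20: pfifo
tc qdisc add dev imq0 parent 1:21 handle 21: red limit 1000000 min 5000 max 100000 avpkt 1000 burst 50

# Classify by firewall mark: mark 1 (bulk TCP) goes to the RED class.
tc filter add dev imq0 parent 1: protocol ip prio 1 handle 1 fw flowid 1:21

# Mark large inbound TCP packets as bulk, then push all inbound traffic through imq0.
iptables -t mangle -A PREROUTING -i eth0 -p tcp -m length --length 64:1500 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -i eth0 -j IMQ --todev 0</programlisting>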
<sect3>
<title>Why Inbound Traffic Limiting isn't all That Good</title>
<para>We want to limit our inbound traffic to avoid filling up the queue at the ISP, which can sometimes buffer
as much as 5 seconds' worth of data. The problem is that currently the only way to limit inbound TCP traffic
is to drop perfectly good packets. These packets have already taken up some share of bandwidth on the ADSL modem
only to be dropped by the Linux box in an effort to slow down future packets. These dropped packets will eventually
be retransmitted, consuming more bandwidth. When we limit traffic, we are limiting the rate of packets which we
will accept into our network. Since the <emphasis>actual</emphasis> inbound data rate is somewhere above this because
of the packets we drop, we'll actually have to limit our downstream to <emphasis>much</emphasis> lower than the
actual rate of the ADSL modem in order to ensure low latency. In practice I had to limit my 1.5 Mbit/s downstream
ADSL to 700 kbit/s in order to keep the latency acceptable with 5 concurrent downloads. The more TCP sessions
you have, the more bandwidth you'll waste with dropped packets, and the lower you'll have to set your limit rate.</para>
<para>A much better way to control inbound TCP traffic would be TCP window manipulation, but as of this writing
there exists no (free) implementation of it for Linux (that I know of...).</para>
</sect3>
</sect2>
</sect1>
<sect1 id="implementation">
<title>Implementation</title>
<para>Now, with all of the explanation out of the way, it's time to implement bandwidth management with Linux.</para>
<sect2>
<title>Caveats</title>
<para>
Limiting the actual rate of data sent to the DSL modem is not as simple as it may seem. Most DSL modems
are really just Ethernet bridges that pass data back and forth between your Linux box and the gateway
at your ISP, and most of them use ATM as the link layer. ATM sends data in cells that are always
53 bytes long; 5 of those bytes are header information, leaving 48 bytes available for data. Even if you
are sending 1 byte of data, an entire 53 bytes of bandwidth is consumed, since ATM cells are always
53 bytes long. Consider a typical TCP ACK packet, which consists of 0 bytes of data + 20 bytes of TCP header
+ 20 bytes of IP header + 18 bytes of Ethernet framing (a 14-byte header plus a 4-byte CRC). Even though
this Ethernet frame carries only 40 bytes of payload (the TCP and IP headers), the minimum payload for an
Ethernet frame is 46 bytes, so the remaining 6 bytes are padded with nulls. That makes the actual length
of the Ethernet frame plus framing 18 + 46 = 64 bytes. To send 64 bytes over ATM, you have to send two
ATM cells, which consume 106 bytes of bandwidth, so for every TCP ACK packet you're wasting 42 bytes of
bandwidth. This would be okay if Linux accounted for the encapsulation that the DSL modem uses, but
instead Linux accounts only for the TCP header, the IP header, and 14 bytes of Ethernet header (Linux
doesn't count the 4-byte CRC, since this is handled at the hardware level). Linux doesn't take into
account the minimum Ethernet payload of 46 bytes, nor the fixed ATM cell size.
</para>
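<para>To make the arithmetic above concrete, here is a small illustrative helper (the function name is made up
for this example) that computes how many 53-byte cells, and therefore how many bytes of ATM bandwidth, a given
Ethernet frame really consumes:</para>
<programlisting># Hypothetical helper illustrating the ATM overhead calculation above.
# Each ATM cell carries 48 bytes of payload and costs 53 bytes on the wire.
atm_bytes() {
    local frame=$1                       # Ethernet frame length in bytes (header + payload + CRC)
    local cells=$(( (frame + 47) / 48 )) # round up to whole cells
    echo $(( cells * 53 ))
}

atm_bytes 64    # a 40-byte TCP ACK padded to a 64-byte frame -> 106 bytes of ATM bandwidth</programlisting>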
<para>
What all of this means is that you'll have to limit your outbound bandwidth to somewhat less than your
true capacity (until we can figure out a packet scheduler that can account for the various types of
encapsulation being used). You may find that you've figured out a good number to limit your bandwidth to, but
then you download a big file and the latency starts to shoot up over 3 seconds. This is most likely
because the bandwidth those small ACK packets consume is being miscalculated by Linux.
</para>
<para>
I have been working on a solution to this problem for a few months and have nearly settled on one
that I will soon release to the public for further testing. The solution involves using a user-space
queue instead of Linux's QoS to rate-limit packets; I've basically implemented a simple HTB queue
in user space. So far this solution has been able to regulate outbound traffic <emphasis>so well</emphasis>
that even during a massive bulk download (several streams) and bulk upload (gnutella, several streams),
the latency <emphasis>peaks</emphasis> at 400 ms over my nominal no-traffic latency of about 15 ms. For more
information on this QoS method, subscribe to the email list for updates or check back for updates to this HOWTO.
</para>
</sect2>
<sect2>
<title>Script: myshaper</title>
<para>The following is a listing of the script which I use to control bandwidth on my Linux router. It uses
several of the concepts covered in the document. Outbound traffic is placed into one of 7 queues depending
on type. Inbound traffic is placed into two queues with TCP packets being dropped first (lowest priority) if
the inbound data is over-rate. The rates given in this script seem to work OK for my setup but your results may vary.</para>
<sidebar><para>This script was originally based on the ADSL WonderShaper as seen at the <ulink url="http://www.lartc.org">LARTC website</ulink>.</para></sidebar>
<programlisting>#!/bin/bash
#
# myshaper - DSL/Cable modem outbound traffic shaper and prioritizer.
# Based on the ADSL/Cable wondershaper (www.lartc.org)
#
# Written by Dan Singletary (8/7/02)
#
# NOTE!! - This script assumes your kernel has been patched with the
# appropriate HTB queue and IMQ patches available here:
# (subnote: future kernels may not require patching)
#
# http://luxik.cdi.cz/~devik/qos/htb/
# http://luxik.cdi.cz/~patrick/imq/
#
# Configuration options for myshaper:
# DEV - set to ethX that connects to DSL/Cable Modem
# RATEUP - set this to slightly lower than your
# outbound bandwidth on the DSL/Cable Modem.
# I have a 1500/128 DSL line and setting
# RATEUP=90 works well for my 128kbps upstream.
# However, your mileage may vary.
# RATEDN - set this to slightly lower than your
# inbound bandwidth on the DSL/Cable Modem.
#
#
# Theory on using imq to "shape" inbound traffic:
#
# It's impossible to directly limit the rate of data that will
# be sent to you by other hosts on the internet. In order to shape
# the inbound traffic rate, we have to rely on the congestion avoidance
# algorithms in TCP. Because of this, WE CAN ONLY ATTEMPT TO SHAPE
# INBOUND TRAFFIC ON TCP CONNECTIONS. This means that any traffic that
# is not tcp should be placed in the high-prio class, since dropping
# a non-tcp packet will most likely result in a retransmit which will
# do nothing but unnecessarily consume bandwidth.
# We attempt to shape inbound TCP traffic by dropping tcp packets
# when they overflow the HTB queue which will only pass them on at
# a certain rate (RATEDN) which is slightly lower than the actual
# capability of the inbound device. By dropping TCP packets that
# are over-rate, we are simulating the same packets getting dropped
# due to a queue-overflow on our ISP's side. The advantage of this
# is that our ISP's queue will never fill, because TCP will slow its
# transmission rate in response to the dropped packets on the assumption
# that it has filled the ISP's queue, when in reality it has not.
# The advantage of using a priority-based queuing discipline is
# that we can specifically choose NOT to drop certain types of packets
# that we place in the higher priority buckets (ssh, telnet, etc). This
# is because packets are always dequeued from the lowest-numbered (highest-priority)
# class first, with the stipulation that packets will still be dequeued from every
# class fairly at a minimum rate (in this script, each bucket will deliver
# at least its fair share of 1/7 of the bandwidth).
#
# Reiterating main points:
# * Dropping a tcp packet on a connection will lead to a slower rate
# of reception for that connection due to the congestion avoidance algorithm.
# * We gain nothing from dropping non-TCP packets. In fact, if they
# were important they would probably be retransmitted anyway, so we want to
# try to never drop these packets. This means that saturated TCP connections
# will not negatively affect protocols that, unlike TCP, lack built-in retransmission.
# * Slowing down incoming TCP connections such that the total inbound rate is less
# than the true capability of the device (ADSL/Cable Modem) SHOULD result in little
# to no packets being queued on the ISP's side (DSLAM, cable concentrator, etc). Since
# these ISP queues have been observed to queue 4 seconds of data at 1500Kbps or 6 megabits
# of data, having no packets queued there will mean lower latency.
#
# Caveats (questions posed before testing):
# * Will limiting inbound traffic in this fashion result in poor bulk TCP performance?
# - Preliminary answer is no! Seems that by prioritizing ACK packets (small <64b)
# we maximize throughput by not wasting bandwidth on retransmitted packets
# that we already have.
#
# NOTE: The following configuration works well for my
# setup: 1.5M/128K ADSL via Pacific Bell Internet (SBC Global Services)
DEV=eth0
RATEUP=90
RATEDN=700 # Note that this is significantly lower than the capacity of 1500.
# Because of this, you may not want to bother limiting inbound traffic
# until a better implementation such as TCP window manipulation can be used.