ESFQ for Linux 2.6

ESFQ is the Enhanced Stochastic Fairness Queueing discipline.

For current information on ESFQ, go to http://fatooh.org/esfq-2.6/

The ESFQ packages are named according to the current version of the Linux
kernel. They may or may not work with earlier versions; I only test using the
latest. If you find that the most recent ESFQ package available does not work
with the newest kernel, please send me an email. My address is at the bottom of
the web page listed above.


=============INSTALLATION=============

1.  Download the ESFQ tar file from http://fatooh.org/esfq-2.6/
    current: http://fatooh.org/esfq-2.6/esfq-2.6.13.tar.gz

2.  Download iproute2 from http://linux-net.osdl.org/index.php/Iproute2
    current: http://developer.osdl.org/dev/iproute2/download/iproute2-051007.tar.gz

3.  Download Linux 2.6 from http://www.kernel.org/
    current: http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.13.4.tar.bz2

4.  Untar the files you downloaded, somewhere. Commands in the following
    instructions are examples; you'll have to modify them depending on where
    you untarred the files.

5.  Patch the kernel with ESFQ.
    $ cd /usr/src/linux
    $ patch -p1 --dry-run < ~/esfq-2.6.13/esfq-kernel.patch
    (and, if it works...)
    $ patch -p1 < ~/esfq-2.6.13/esfq-kernel.patch

6.  Configure the kernel to include ESFQ.
    a. The easiest way is use an old config file, if you have it. The config
       you use doesn't have to be from the same kernel version, but if it isn't
       you'll probably be asked other questions besides ESFQ.
      $ cd /usr/src/linux
      $ cp /boot/config-2.6.13 .config
      $ make oldconfig
      When you get to ESFQ, choose y, or m to use it as a module.
      
    b. If you want to use 'make menuconfig' instead, ESFQ is buried in:
       Networking --->
        Networking options --->
         QoS and/or fair queueing --->
          < >   ESFQ queue

    c. In order for the fwmark and fwmark_direct hashes to work, you must enable
       support for "Network Packet Filtering".

7.  Compile and install the kernel as usual. You can reboot to the new kernel
    now if you want to. ESFQ will not work, though, until you compile a patched
    version of tc (in iproute2).
    
8.  Patch the iproute2 package so tc can use ESFQ.
    cd /usr/local/src/iproute2-051007/
    patch -p1 --dry-run < ~/esfq-2.6.13/esfq-iproute2.patch
    (and, if it works...)
    patch -p1 < ~/esfq-2.6.13/esfq-iproute2.patch
   
9.  Compile iproute2.
    $ make
    a. You don't necessarily have to install the entire patched version of
       iproute2 if you don't want to (perhaps you would rather keep your
       distribution's package). All you need is the tc binary.
       # cp -p tc/tc /sbin/tc-esfq
       # chown root:root /sbin/tc-esfq
       Just modify your scripts to use /sbin/tc-esfq instead of /sbin/tc.

    b. Otherwise, go ahead and install iproute2.
       $ make install

10. Reboot to your new kernel if you haven't already, and have fun!



=============USAGE==============

... esfq [ perturb SECS ] [ quantum BYTES ] [ limit PKTS ]
         [ depth FLOWS ] [ divisor HASHBITS ] [ hash HASHTYPE] 
Where:
HASHTYPE := { classic | src | dst }

Options:

perturb SECS
   Default: 0 (no perturbation)
   Recommended: 10
   
   SFQ and ESFQ don't actually allocate traffic to queues in a one-to-one
   manner; instead, they divide traffic among a large (but limited) number of
   queues. When two "flows" of traffic share a queue, they have to divide up
   the bandwidth that queue is allocated. Setting perturb to some value causes
   the flows to be redistributed, and flows that have to share a queue don't
   need to for long.

   The _direct hash types don't use perturbation, and this parameter is ignored.


quantum BYTES
   Default: 1 MTU-sized packet
   Recommended: default

   Amount of bytes a stream is allowed to dequeue before the next queue gets a
   turn. Do not set below the MTU!


limit PKTS
   Default: 128
   Recommended: default
   
   The total number of packets that will be queued by this ESFQ before packets
   start getting dropped. In practice, it is best to have ESFQ as a leaf
   qdisc below a classful qdisc such as HTB. In that case, HTB can control what
   packets get dropped when, and this parameter is irrelevant. Limit must be
   less than or equal to depth.


depth FLOWS
   Default: 128
   Recommended: default


divisor HASHBITS
   Default: 10
   Maximum: 14

   Divisor sets the number of bits to use for the hash table, and results in
   2^HASHBITS possible slots. 10 means 1024 slots, 11 means 2048 slots, etc..
   ESFQ's hash table will consume memory in proportion to the number of slots:
   hash_table_memory = 4 * num_slots
   --or--
   hash_table_memory = 4 * 2^HASHBITS

   A larger hash table decreases the likelihood of collisions. For the direct
   hash types (see below), the hash table should be large enough to encompass
   the range of possible values to be hashed.
   

hash HASHTYPE
   Default: classic

   The hash option determines the basis upon which ESFQ separates traffic into
   queues. HASHTYPE can be either classic, src, or dst. The classic type is how
   the original SFQ works: traffic is separated by flow (TCP connection, UDP
   stream, etc. Src and dst divide traffic based upon t.he source or destination,
   respectively, of each packet.


=============HASH TYPES==============

classic
   This is original SFQ hash. Traffic is separated by flow (TCP connection, UDP
   stream, etc.).

src, dst, fwmark
   Hash by the packet's source IP, destination IP, or iptables mark,
   respectively. The hashing algorithm is well suited for a large range of
   diverse input values. Collisions will happen occasionally; see the "perturb"
   parameter. If the range of values to be hashed is less than 2^14 (16384),
   consider using the direct hash methods listed below. 

src_direct, dst_direct, fwmark_direct
   These hash types are like the three listed above, but use a more simple and
   direct hashing algorithm. This works very well when you can control and/or
   anticipate the range of input values values -- if this range is smaller than
   the hash table, there will be no collisions and the queueing will be
   entirely fair.

   Do not blithely use these hash types and assume collisions will not occur.
   Since the direct hashes learn and adapt to the observed range of input
   values, a single rogue value could make ESFQ expect a range larger than the
   hash table. If this happens, collisions may occur. ESFQ prints a warning
   message to the kernel log:
   
   ESFQ: (direct hash) Input range <range> is larger than hash table.
   
   The direct hashes don't use any perturbation, so you must take a few steps to
   ensure they only receive data you expect them to. See the EXAMPLES section
   below.


==============EXAMPLES=============

# Example 1: Set up ESFQ just like SFQ, as root for eth0. Note that "hash
# classic" is the default.
tc qdisc add dev eth0 root esfq perturb 10


# Example 2: Attach ESFQ to a parent class (such as HTB) and hash by destination
# IP. Very useful if eth0 is on a LAN with bandwidth-consuming clients.
# This example assumes you already have a classful qdisc set up.
tc qdisc add dev eth0 parent 1:12 handle 12: esfq perturb 10 hash dst


# Example 3: Like the example above, but for the WAN interface (eth1 instead of
# eth0, and src instead of dst.
tc qdisc add dev eth1 parent 1:12 handle 12: esfq perturb 10 hash src


# Example 4: Hash by iptables mark, using the direct hash. The default divisor
# is 10, giving 2^10 or 1024 hash slots. As long as the range of marks you use
# is less than 1024, there will be no collisions. 1 to 1023, is ok, 2000 to 3023
# is ok, etc. If you want to use a larger range, increase the divisor value.
tc qdisc add dev eth0 root esfq hash fwmark_direct


# Example 5: Hash by destination IP like Example 2, but use a direct hash
# instead. In this example, the range of IP addresses to hash is 10.0.0.0/18.
# 18 bits for the network leaves 14 bits for the IP addresses, so we must use
# a 14 bit divisor as well (instead of the default 10 bits).
tc qdisc add dev eth0 parent 1:12 handle 12: esfq hash dst_direct divisor 14
# If you have packets for other IP addresses going out the same interface, you
# must make sure they don't interfere. Set the default for your classful qdisc
# to be a separate class, and then use a filter to catch only the IPs you want
# this instance of ESFQ to handle.
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 10.0.0.0/18 flowid 1:12
