I know, this might sound like blasphemy. After all, why on earth would you want to limit bandwidth, and serve your clients slower, when you are using Varnish to accelerate HTTP serving?
Sounds silly. And it is, but if your Varnish servers are saturating your internet wires, and you have other services running, you might want to look into bandwidth limiting. On an installation of Varnish here at Opera, I'm constantly seeing nodes hitting 950+ Mbit/s peak, and 600-700 Mbit/s average. That's some damn fast bit pushing, close to the theoretical capacity of 1 Gbit/s for a single gigabit ethernet card.
Now, back to the problem. How do you bandwidth-limit Varnish?
I think I know the Varnish project well enough to be sure that something like bandwidth limiting will never be implemented inside Varnish. Asking on the mailing list, I got the tip to start looking into tc and its companion howto.
I'd never heard of tc before, so I started searching around. Turns out that half of the internet is heavily outdated :) I found lots of old stale material, and very few working examples. After a bit of research, I came up with a semi-working solution.
tc is a Linux command that is, I think, installed by default almost everywhere. Its purpose is to control network traffic. It's not very simple to use, and I don't want to pretend I know it. I just wish there were more examples of it that were easier to understand and recently updated.
The best example I found is archived at http://blog.edseek.com/~jasonb/articles/traffic_shaping/scenarios.html ("8.2 Guaranteeing rate"):
    #!/bin/bash

    RATE=8000

    if [ x$1 = 'xstop' ]
    then
        tc qdisc del dev eth0 root >/dev/null 2>&1
    fi

    tc qdisc add dev eth0 root handle 1: htb default 90
    tc class add dev eth0 parent 1: classid 1:1 htb rate ${RATE}kbit ceil ${RATE}kbit

    tc class add dev eth0 parent 1:1 classid 1:10 htb rate 6000kbit ceil ${RATE}kbit
    tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1000kbit ceil ${RATE}kbit
    tc class add dev eth0 parent 1:1 classid 1:50 htb rate 500kbit ceil ${RATE}kbit
    tc class add dev eth0 parent 1:1 classid 1:90 htb rate 500kbit ceil 500kbit

    tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
    tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10
    tc qdisc add dev eth0 parent 1:50 handle 50: sfq perturb 10
    tc qdisc add dev eth0 parent 1:90 handle 90: sfq perturb 10

    tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 80 0xffff classid 1:10
    tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 22 0xffff classid 1:20
    tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 25 0xffff classid 1:50
    tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 110 0xffff classid 1:5
These commands create a tree of queueing disciplines ("qdiscs") and classes, the control structure that tc uses to decide how to shape or rate-limit the network traffic. You can do pretty much anything with it, but the code above creates one "htb" bucket for all the traffic. "htb" stands for "Hierarchical Token Bucket": a bucket that is supposed to contain other "buckets", each of which can define its own arbitrary bandwidth limit.
Simplifying: the main "htb" bucket (named "1:1") corresponds to the full pipe bandwidth, say 8Mbit/s. Under this one, we create four more buckets, named "1:10", "1:20", "1:50" and "1:90", of 6Mbit/s, 1Mbit/s, 500kbit/s and 500kbit/s respectively. Each of them is managed through "sfq", "Stochastic Fairness Queueing". Read: everyone gets their fair piece of the pie :)
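A quick way to sanity-check such a layout is to add up what the child buckets reserve and compare it with the parent. The sketch below is only arithmetic on the numbers from the example above; `sum_rates` is a hypothetical helper, and the check assumes the usual HTB guidance that the guaranteed rates of the children should not add up to more than the parent's rate.

```shell
#!/bin/sh
# Rates in kbit/s, copied from the example script above.
PARENT_RATE=8000                  # class 1:1
CHILD_RATES="6000 1000 500 500"   # classes 1:10, 1:20, 1:50, 1:90

# Hypothetical helper: print the sum of all rates passed as arguments.
sum_rates() {
    total=0
    for r in "$@"; do
        total=$((total + r))
    done
    echo "$total"
}

# Unquoted on purpose so that the list is split into separate arguments.
TOTAL=$(sum_rates $CHILD_RATES)

if [ "$TOTAL" -le "$PARENT_RATE" ]; then
    echo "ok: children reserve ${TOTAL}kbit of ${PARENT_RATE}kbit"
else
    echo "oversubscribed: children reserve ${TOTAL}kbit of ${PARENT_RATE}kbit"
fi
```

With the numbers from the example, the four children reserve exactly the 8000kbit of the parent, so nothing is left unassigned and nothing is promised twice.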
So we have these 4 different pipes: 6Mbit/s, 1Mbit/s, 500kbit/s and 500kbit/s. After that, in the last block, we decide which pipe each kind of traffic should go through. sport 80 means source port 80: packets leaving your server from the port where your HTTP server is supposedly listening (that is, the responses going out to your clients) will get the big 6Mbit/s slice, sport 22 (ssh) will get 1Mbit/s, and so on…
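To make the port-to-pipe mapping easier to read, here is a small dry-run sketch that only prints the filter commands instead of running them. `filter_cmd` and the `DEV` variable are hypothetical names introduced for illustration; the printed command lines follow the same form as the ones in the example above. The `0xffff` part is a 16-bit mask, meaning that all the bits of the port number have to match.

```shell
#!/bin/sh
# Dry run: nothing here touches the network configuration.
DEV=eth0   # assumed interface name, as in the example above

# Hypothetical helper: print the u32 filter that sends packets with the
# given source port ($1) to the given class ($2).
filter_cmd() {
    port=$1
    classid=$2
    echo "tc filter add dev $DEV parent 1:0 protocol ip u32 match ip sport $port 0xffff classid $classid"
}

filter_cmd 80 1:10   # HTTP responses -> 6Mbit/s bucket
filter_cmd 22 1:20   # ssh            -> 1Mbit/s bucket
filter_cmd 25 1:50   # smtp           -> 500kbit/s bucket
```

Anything that matches none of the filters ends up in the default class, which is 1:90 in that example ("htb default 90").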
Now this did work on my test machine: I could set the bandwidth limit, download a file with wget and see that the speed exactly matched the desired one, while other connections were unlimited. However, when I tried to put this in production on the actual Varnish machines, the same script and settings didn't work.
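When comparing numbers, keep in mind that tc rates are in bits while download tools usually report bytes. The sketch below converts a tc rate in kbit/s (1 kbit = 1000 bits) into the KB/s figure you would expect to see during a download, assuming the tool counts 1 KB as 1024 bytes (which, as far as I know, is what wget does). `kbit_to_kib_per_sec` is a hypothetical helper name, and the result is rounded down.

```shell
#!/bin/sh
# Hypothetical helper: convert a rate in kbit/s to KB/s (1 KB = 1024 bytes).
kbit_to_kib_per_sec() {
    # kbit/s * 1000 bits per kbit / 8 bits per byte / 1024 bytes per KB
    echo $(( $1 * 1000 / 8 / 1024 ))
}

kbit_to_kib_per_sec 6000   # prints 732
kbit_to_kib_per_sec 1000   # prints 122
```

So a download through the 6Mbit/s bucket should level off at roughly 730 KB/s, not at 6000.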
I figured I had to bandwidth limit the whole "htb" bucket, instead of limiting just the HTTP traffic. Which sucks, I guess. But nevertheless, it works. So, I'll copy/paste the entire magic here for whoever might be interested. And maybe someone can explain to me why this doesn't work exactly like in my tests. Traffic measured with iptraf and iftop shows consistent results.
    #!/bin/sh
    #
    # Set up bandwidth limiting for an interface / service.
    # Based on 'tc'.
    # Defaults can be overridden by /etc/default/traffic-shaper
    # Cosimo, 2010/07/13
    #
    TC=/sbin/tc

    test -f /etc/default/traffic-shaper && . /etc/default/traffic-shaper

    IF=${IF:-eth0}
    RATE=${RATE:-100Mbit}
    HTTP_RATE=${HTTP_RATE:-50Mbit}
    HTTP_PORT=${HTTP_PORT:-80}
    SSH_RATE=${SSH_RATE:-500kbit}

    echo "[$IF] HTTP (:$HTTP_PORT) rate=$HTTP_RATE/$RATE"
    echo "[$IF] SSH (:22) rate=$SSH_RATE"
    #exit

    if [ "x$1" = "xstop" ]; then
        echo 'Stopping traffic shaper...'
        $TC qdisc del dev $IF root >/dev/null 2>&1 && echo 'Done'
        exit
    elif [ "x$1" = "xshow" ]; then
        $TC qdisc show dev $IF
        exit
    elif [ "x$1" = "xstats" ]; then
        $TC -d -s qdisc show dev $IF
        exit
    fi

    echo "Traffic shaping setup ($HTTP_RATE/$RATE) on port $HTTP_PORT."
    echo "Reserving $SSH_RATE for interactive sessions."

    # Root htb qdisc; unclassified traffic goes to class 1:10
    $TC qdisc add dev $IF root handle 1: htb default 10

    # I should be using this line, but I had to replace it with the following
    ### $TC class add dev $IF parent 1: classid 1:1 htb rate ${RATE} ceil ${RATE}
    $TC class add dev $IF parent 1: classid 1:1 htb rate ${HTTP_RATE} ceil ${RATE}

    # Doesn't seem to have any effect (?)
    $TC class add dev $IF parent 1:1 classid 1:10 htb rate ${HTTP_RATE} ceil ${RATE}
    $TC class add dev $IF parent 1:1 classid 1:90 htb rate ${SSH_RATE} ceil ${RATE}

    $TC qdisc add dev $IF parent 1:10 handle 10: sfq perturb 10
    $TC qdisc add dev $IF parent 1:90 handle 90: sfq perturb 10

    $TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport $HTTP_PORT 0xffff classid 1:10
    # Note: two matches in the same u32 filter are ANDed, so this one only
    # catches packets with both source and destination port 22.
    $TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport 22 0xffff match ip dport 22 0xffff classid 1:90
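Since the script sources /etc/default/traffic-shaper before applying its defaults, that file only needs plain shell variable assignments. Here is a minimal sketch of what it could contain. The variable names are the ones the script reads, while the values are made up for illustration and have to be adapted to your own interface and link.

```shell
# /etc/default/traffic-shaper
# Example values only, not a recommendation.
IF=eth1            # interface to shape
RATE=200Mbit       # ceiling for the classes
HTTP_RATE=100Mbit  # rate for HTTP traffic
HTTP_PORT=80       # port where Varnish is listening
SSH_RATE=1Mbit     # rate reserved for interactive sessions
```

Assuming the script is saved as traffic-shaper (a hypothetical name), running it without arguments sets up the shaping, while the "show", "stats" and "stop" arguments display the current qdiscs, display them with statistics, and remove the whole setup respectively.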
Have fun, but don't try this at home :)