Monthly Archives: July 2010

Bandwidth limiting a varnish server

I know, this might sound like blasphemy. After all, why on earth should one want to limit bandwidth, and serve your clients slower, when using Varnish to accelerate HTTP serving?

Sounds silly. And it is, but if your Varnish servers are saturating your internet links and you have other services running, you might want to look into ways to limit bandwidth. On an installation of Varnish here at Opera, I'm constantly seeing nodes hitting 950+ Mbit/s peak, and 600-700 Mbit/s average. That's some damn fast bit pushing, close to the theoretical capacity of 1 Gbit/s for a single gigabit Ethernet card.

Now, back to the problem. How do you bandwidth-limit Varnish?

I think I know the Varnish project well enough to be sure that something like bandwidth limiting will never be implemented inside Varnish. Asking on the mailing list, I got the tip to start looking into tc and its companion howto.

I'd never heard of tc before, so I started searching around. Turns out that half of the internet is heavily outdated :) I found lots of old stale material, and very few working examples. After a bit of research, I came up with a semi-working solution.

tc is a Linux command that is installed by default almost everywhere, I think. Its purpose is to control network traffic. It's not very simple to use, and I don't want to pretend I know it. I just wish there were more examples of it that were easier to understand and recently updated.

The best example I found is archived at ("8.2 Guaranteeing rate"):



# Total link rate in kbit/s (8 Mbit/s, as in the explanation below)
RATE=8000

if [ "x$1" = 'xstop' ]
then
        tc qdisc del dev eth0 root >/dev/null 2>&1
        exit 0
fi

tc qdisc add dev eth0 root handle 1: htb default 90
tc class add dev eth0 parent 1: classid 1:1 htb rate ${RATE}kbit ceil ${RATE}kbit

tc class add dev eth0 parent 1:1 classid 1:10 htb rate 6000kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1000kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:50 htb rate 500kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:90 htb rate 500kbit ceil 500kbit

tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev eth0 parent 1:50 handle 50: sfq perturb 10
tc qdisc add dev eth0 parent 1:90 handle 90: sfq perturb 10

tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 80 0xffff classid 1:10
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 22 0xffff classid 1:20
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 25 0xffff classid 1:50
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 110 0xffff classid 1:50

These commands create a queueing discipline ("qdisc") with a tree of classes under it. This is the control structure that tc uses to decide how to shape or rate-limit network traffic. You can do pretty much anything with it, but the previous code creates one "htb" bucket for all the traffic. "htb" means "Hierarchical Token Bucket". It is meant to contain other "buckets", each of which can define its own arbitrary bandwidth limit.

Simplifying: the main "htb" bucket (named "1:1") corresponds to the full pipe bandwidth, say 8Mbit/s. Under this one, we create four other buckets, named "1:10", "1:20", "1:50" and "1:90", of 6Mbit/s, 1Mbit/s, 500kbit/s and 500kbit/s respectively. Each of them is managed through "sfq", "Stochastic Fairness Queueing". Read: everyone gets their fair piece of the pie :)
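
One way to sanity-check a layout like this is to add up the guaranteed rates of the child classes and compare the total against the parent. Here is a minimal sketch in plain shell. The function name check_htb_budget is made up for illustration and is not part of tc, and the "children should fit in the parent" rule is the usual HTB recommendation rather than something tc enforces:

```shell
#!/bin/sh
# Hypothetical helper (not part of tc): checks that the guaranteed rates of
# the child classes fit within the parent class rate. All values in kbit/s.
check_htb_budget() {
    parent=$1
    shift
    total=0
    for rate in "$@"; do
        total=$((total + rate))
    done
    if [ "$total" -le "$parent" ]; then
        echo "ok: children use ${total} of ${parent} kbit/s"
    else
        echo "over: children use ${total} of ${parent} kbit/s"
        return 1
    fi
}

# Parent 1:1 at 8000 kbit/s; children 1:10, 1:20, 1:50 and 1:90
check_htb_budget 8000 6000 1000 500 500
```

With the numbers from the example, the four children add up to exactly the 8000 kbit/s of the parent.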

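So we have these four different pipes: 6Mbit/s, 1Mbit/s, 500kbit/s and 500kbit/s. These are guaranteed rates: the first three classes have their "ceil" set to the full rate, so they can borrow spare bandwidth, while "1:90" is capped at 500kbit/s. After that, in the last block, we decide which pipe each kind of traffic should go through.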

sport 80 matches packets whose source port is 80. This is outgoing traffic, so the source is your server and the destination is your clients. Port 80, where your HTTP server is supposedly listening, gets the big 6Mbit/s slice, sport 22 (ssh) gets 1Mbit/s, and so on…
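
To make the mapping explicit, here is a small shell sketch that mimics the decision the filters make. It is only an illustration: classify_sport is a made-up name, the real matching happens inside the kernel, and I'm assuming that port 110 is meant to share class "1:50" with port 25.

```shell
#!/bin/sh
# Illustration only: mirrors the u32 filters above in plain shell.
# classify_sport is a made-up name; tc does this matching in the kernel.
classify_sport() {
    case "$1" in
        80)     echo "1:10" ;;  # HTTP
        22)     echo "1:20" ;;  # SSH
        25|110) echo "1:50" ;;  # SMTP and POP3 (assumed to share a class)
        *)      echo "1:90" ;;  # "default 90" on the root qdisc
    esac
}

for port in 80 22 25 110 443; do
    echo "sport $port -> class $(classify_sport "$port")"
done
```

Anything that no filter matches, like port 443 here, falls into "1:90" because of the "default 90" on the root qdisc.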

Now, this did work on my test machine: I could set the bandwidth limit, download a file with wget, and see the speed match the desired rate exactly, while other connections were unlimited. However, when I tried to put this in production on the actual Varnish machines, the same script and settings didn't work.
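
When comparing numbers, keep in mind that tc rates are in bits while download tools report bytes. A rough conversion sketch, assuming tc's "kbit" means 1000 bits per second (the function name is made up for illustration):

```shell
#!/bin/sh
# Convert a tc rate in kbit/s to bytes per second.
# Assumes tc's "kbit" means 1000 bits per second.
kbit_to_bytes_per_sec() {
    echo $(( $1 * 1000 / 8 ))
}

kbit_to_bytes_per_sec 6000   # the 6000kbit HTTP class
kbit_to_bytes_per_sec 500    # the 500kbit default class
```

So a download through the 6000kbit class should level off at around 750000 bytes per second, a bit over 700 KB/s in wget's display.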

I figured I had to bandwidth-limit the whole "htb" bucket, instead of limiting just the HTTP traffic. Which sucks, I guess. But nevertheless, it works. So, I'll copy/paste the entire magic here for whoever might be interested. And maybe someone can explain to me why this doesn't work exactly like in my tests. Traffic measurements with iptraf and iftop show consistent results.

# Set up bandwidth limiting for an interface / service. Based on 'tc'.
# Defaults can be overridden by /etc/default/traffic-shaper
# Cosimo, 2010/07/13

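test -f /etc/default/traffic-shaper && . /etc/default/traffic-shaper

# Fallback values used when the file above does not set them.
# These particular values are assumptions; adjust them for your setup.
TC=${TC:-/sbin/tc}
IF=${IF:-eth0}
RATE=${RATE:-100mbit}
HTTP_RATE=${HTTP_RATE:-50mbit}
HTTP_PORT=${HTTP_PORT:-80}
SSH_RATE=${SSH_RATE:-500kbit}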

echo "[$IF] HTTP (:$HTTP_PORT) rate=$HTTP_RATE/$RATE"
echo "[$IF] SSH  (:22) rate=$SSH_RATE"


if [ "x$1" = "xstop" ]; then
        echo 'Stopping traffic shaper...'
        $TC qdisc del dev $IF root >/dev/null 2>&1 && echo 'Done'
        exit 0
elif [ "x$1" = "xshow" ]; then
        $TC qdisc show dev $IF
        exit 0
elif [ "x$1" = "xstats" ]; then
        $TC -d -s qdisc show dev $IF
        exit 0
fi

echo "Traffic shaping setup ($HTTP_RATE/$RATE) on port $HTTP_PORT."
echo "Reserving $SSH_RATE for interactive sessions."

$TC qdisc add dev $IF root handle 1: htb default 10

# I should be using this line, but I had to replace it with the following
### $TC class add dev $IF parent 1: classid 1:1 htb rate ${RATE} ceil ${RATE}
$TC class add dev $IF parent 1: classid 1:1 htb rate ${HTTP_RATE} ceil ${RATE}

# Doesn't seem to have any effect (?)
$TC class add dev $IF parent 1:1 classid 1:10 htb rate ${HTTP_RATE} ceil ${RATE}
$TC class add dev $IF parent 1:1 classid 1:90 htb rate ${SSH_RATE} ceil ${RATE}

$TC qdisc add dev $IF parent 1:10 handle 10: sfq perturb 10
$TC qdisc add dev $IF parent 1:90 handle 90: sfq perturb 10

$TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport $HTTP_PORT 0xffff classid 1:10
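# Two separate filters: selectors within a single u32 filter are ANDed,
# so "sport 22" and "dport 22" together would match almost nothing.
$TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport 22 0xffff classid 1:90
$TC filter add dev $IF parent 1:0 protocol ip u32 match ip dport 22 0xffff classid 1:90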
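
The script sources /etc/default/traffic-shaper, so that file is just a list of shell variable assignments. A possible version could look like this. The variable names are the ones the script uses, but every value here is an example I picked for illustration, not the settings from the production machines:

```shell
# /etc/default/traffic-shaper
# Example values only (assumptions), not the production settings.
TC=/sbin/tc        # path to the tc binary
IF=eth0            # interface to shape
RATE=100mbit       # ceiling for the whole htb bucket
HTTP_RATE=50mbit   # rate for HTTP traffic
HTTP_PORT=80       # port the HTTP service listens on
SSH_RATE=500kbit   # rate reserved for interactive sessions
```

Rates use the usual tc units, such as kbit and mbit.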

Have fun, but don't try this at home :)

Perl 6 LWP::Simple gets chunked transfers support

With this one I think we're basically done!

Perl 6 LWP::Simple gets chunked transfers support. It's probably not excellent or universally working, but for the examples I could try and test, it's totally fine. If you find some URLs where it's broken, please tell me.

I also threw in the getstore() method to save URLs locally.

So, LWP::Simple for Perl 6 is here and it's working. It's not yet "complete" compared to the Perl 5 version, but now that I've got the hard bits working, and the internals can parse a full HTTP response, I'll try to reach 100% API compatibility with the Perl 5 one (where it makes sense).

Try it and let me know. Have fun!

“DebPAN”, a production-grade Debian CPAN repository

The problem

This is a proposal I came up with after talking to Gabor Szabo about his Perl Ecosystem Development proposal.

One of the major "problems" we face while developing and deploying production Perl-based systems with Debian is that the Debian CPAN modules are depressingly outdated. As an example, we're using Catalyst in Lenny, and that dates back to 2008 for the most part.

This is just not enough.

A solution: maintain your own APT repository

Our current solution is to manually package every bit and maintain our own internal Opera APT repository that our servers and applications depend on. That's not optimal for two reasons:

  • we have to package lots of modules, due to interdependencies, dedicating a fair amount of time to this activity that is not exactly "productive"
  • we can't trust our systems to be Debian anymore, since we're updating bits and pieces with the bold assumption that everything will work fine

Of course, 99.999% will work fine, since it's Perl, but the problem is that one day this could fall down on our heads.

So, given the problem, what are the solutions?

  • Continuing to manually keep an apt repository. Downside: Some(tm) waste of time
  • Use a different packaging/deployment system, like PAR::Repository. I would personally like this, but it doesn't eliminate the need to maintain your own repository. You just don't use dh-make-perl and friends, which are, IMHO, nice to have and useful
  • The "DebPAN"

The DebPAN

I know Jeremiah Foster, and at a couple of Perl events I heard him talking to other CPAN/Perl developers about these issues. His answer would probably be to file a request for packaging for the modules we're interested in.

The reason why that doesn't work is that the lead time for a given RFP to land on Debian stable is unacceptably long, and I realize that is for a good reason. After all, it's supposed to be stable, right?

What about a "DebPAN" repository?

That could be a 3rd party APT repository, something like

  • maintained by a close group of Perl/Debian/CPAN developers
  • guaranteed to have a selection of the most important modules (more on that…) in a reasonably recent version
  • targeted at Debian stable, and maybe other distributions? I think Ubuntu 10.04 suffers from this same problem, but much less than Debian Lenny, just to pick two versions
  • maybe even with patches applied? That might be way too much, actually

The can of worms

Of course, lots of problems can arise. However, if we think this is a good idea, then we should try to have something even minimal up and running. Then we'll worry about all the problems…

However, who gets to decide the most useful modules? That should go by popular demand I guess. Even looking at Debian requests for packaging stats, maybe? I can also imagine that bigger companies using Perl would be interested in this to potentially save lots of "infrastructural work".

I'd be really interested to know other people's opinions on this, especially from those who use Debian stable, Debian developers, or the Debian-Perl group itself.

Cache::Memcached::Mock, instant in-process memcached mock

A week ago, I wrote a Cache::Memcached mock module for some complicated unit tests in this project I'm working on.

A few people asked to upload it to CPAN, so here it is:

Cache::Memcached::Mock v0.01 is on CPAN.

I didn't spend that much time polishing it and making documentation, so it's a bit rough around the edges, but you get the idea.

You can use it as a drop-in replacement for Cache::Memcached when you don't want, or can't afford, to run your own memcached daemon.

I've already got a feature request from a colleague: making sure set() fails if you try to store a value bigger than 1 MB.