
Surge 2010 scalability conference in Baltimore, USA – DAY 1

This was the first year of the Surge conference, held in Baltimore, USA, and organized by OmniTI.

30" summary (TL;DR)

The conference was amazing. Main topic was scalability. Met a lot of people. 2 days, 2 tracks and 20+ speakers. Several interesting new products and technologies to evaluate.

The long story

The conference topics were scalability, databases and web operations. It took place over two days filled with high-level talks about experiences, failures, and advice on scaling web sites.

The only downside is that, being there alone, I had to miss half of the talks :). The good thing is that all videos and slides will be up on the conference website Soon™.

There are lots of things worth mentioning, but I'll try to summarize what happened on Day 1.

John Allspaw – Web Engineering

The first keynote session was by John Allspaw, former Flickr dev, now at Etsy.com.

Summary: Web engineering (aka Web Operations) is still a young field. We must set out to achieve much higher goals, be more scientific. We don't need to invent anything. We should be able to get inspiration and prior art from other fields like aerospace, civil engineering, etc…

He had lots of examples in his slides. I want to go through this talk again. Really inspiring.

Theo Schlossnagle – Scalable Design Patterns

Theo's message was clear: tools can work no matter the technology. Bend technologies to your needs. You don't need the shiniest/awesomest/webscalest. Monitoring is key. Tie metrics to your business. Be relevant to your business people.

Ronald Bradford – Most common MySQL scalability mistakes

If you're starting with MySQL, or don't have much experience with it, then you definitely want to listen to Ronald's talk. It will save you a few years of frustration. :)

Companion website: monitoring-mysql.com.

Ruslan Belkin – Scaling LinkedIn

Ruslan was very well prepared and technical, but maybe I expected a slightly different type of content. I must read the slides again when they're up. LinkedIn is mostly ("99%") Java, and uses Lucene as its main search tier. Very interesting: they mentioned that since 2005-2006 they have been using several specific services (friends, groups, profiles, etc…) instead of one big database. This allows them to scale better and more predictably.

They also seem to use a really vast array of different technologies, like Voldemort, and many others whose names I don't remember right now.

Robert Treat – Database scalability patterns

Robert is without doubt a very experienced DBA. He talked about the different types of MySQL configurations available to developers who need to scale their apps, explaining each one and providing examples: horizontal/vertical partitioning, horizontal/vertical scaling, etc…

I was late for this talk so I only got the final part.

Tom Cook – A day in the life of Facebook operations

I listened to the first 10-15 minutes of this talk, and I had the impression that this was probably the third time I'd heard the same talk: how big Facebook is, upload numbers, status updates, etc…, without going into specific details. All of that is of course very impressive, but it's the low-level stuff that's more interesting, at least for me.

The last time I had attended this talk was at Fosdem in Brussels. I was a bit disappointed, so I left early. According to some later tweets, the last part was the most interesting. I'll have to go back to this one and watch the video. Well… at least I got to listen to the last part of…

Artur Bergman – Scaling Wikia

Lots of Varnish knowledge (and more) in this talk!

I had gone through some of Artur's earlier talks, always about Varnish, and I have learned a lot from him.
I strongly suggest going through his talks if you're interested in Varnish.

They "abused" Urchin tracker (Google Analytics) javascript code to measure their own statistics about server errors and client-side page loading times. Another cool trick is the use of a custom made-up X-Vary-URL HTTP header to keep all linked URLs (view/edit/etc.. regarding a single wiki page) in one varnish hash slot. In this case, with a single purge command you can get rid of all relevant pages linked to the same content.

They use SSDs extensively. A typical Wikia server (Varnish and/or DB) has:

  • 2 × 6-core Westmere processors
  • 6 × Intel X25 SSDs (~ $2000)
  • 2 × spinning drives for transaction logs (DB)

"SSD allows you JOINs with no performance degradation."

Peak speeds reached (and this is random, not sequential: amazing!):

  • 500 Mbyte/s random read with avg latency of 0.2 ms
  • 220 Mbyte/s random writes

They run their own CDN based on Dynect (a Dyn Inc. service, I think; see below).
They still use Akamai for a minor part of their static content.

Wikia is looking into using Riak, and a Riak-based filesystem to hook up directly to Varnish for really fast file serving.

Mike Malone – SimpleGeo

SimpleGeo implemented a geographic database on top of Apache Cassandra, able to answer spatial queries. They looked into using PostGIS (the well-known Postgres-based GIS extension), but it wasn't as flexible as they needed (I don't remember exactly why).

TODO: look into "Distributed indexes over DHT". He indicated it as prior art for their system.
This talk was a bit complicated for me to follow, so I'll have to watch it again.

Closing day 1

At the end of the day there was a SQL vs NoSQL panel, which I skipped entirely. Maybe it was interesting :) The after-hours event that closed Day 1 was organized by Dyn Inc., and it was fantastic: lots of good beer, martinis, and good food. I went to bed early, since I was still jetlagged. Day 2 started at 9 AM.

Time for a break :)

And then on to Day 2:

http://my.opera.com/cstrep/blog/2010/10/07/surge-2010-scalability-conference-in-baltimore-usa-day-2

Survival guide to UTF-8

Please, future me, and please, you cool programmer who, one way or another, one day or another, has struggled to understand UTF-8 (in Perl or not), do yourself a really big favor and read the following links:

After reading these few articles, you will be a much better human being. I promise. In the meantime, Perl programmer, remember that:

  • use utf8; is only for the source code, not for the encoding of your data. Let's say you define a scalar variable like:
    
    my $username = 'ネオ';
    

    Ok. Now, whether or not you have use utf8 inside your script, there will be no difference whatsoever in the actual content of that scalar variable. Exactly, no difference. Except there's one: the variable itself (the $username box) will be flagged as containing UTF-8 characters (if you used utf8, of course). Clear, right?

  • For the rest, open your filehandles declaring the encoding (open my $fh, '<:utf8', $file;), or explicitly use Encode::(en|de)code_utf8($data). A sketch of both approaches follows this list.
  • You can make sure the strings you define in your source code are UTF-8 encoded by opening and then writing to your source code file with an editor that supports UTF-8 encoding, for example vim has a :set encoding=utf8 command.
  • Also, make sure your terminal, if you're using one, is set to UTF-8 encoding, otherwise you will see gibberish instead of your beloved Unicode characters. You can do that with any terminal on this planet, bar the Windows cmd.exe shell… If anyone knows how to, please tell me.
  • And finally, use a font with Unicode characters in it, like Bitstream Vera Sans Mono (the default Linux font), Envy Code R, plain Courier, etc… or you will just see the broken-UTF8-character-of-doom. Yes, this one → � :-)
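Here's a minimal sketch of both filehandle approaches (the file names are illustrative):

use strict;
use warnings;
use Encode;

# Read a UTF-8 encoded file: the :utf8 layer decodes octets into characters
open my $in, '<:utf8', 'input.txt' or die $!;
my $line = <$in>;

# Write characters back out, encoded as UTF-8 octets
open my $out, '>:utf8', 'output.txt' or die $!;
print {$out} $line;

# Or do the conversions by hand with Encode
my $octets = Encode::encode_utf8($line);   # characters -> octets
my $chars  = Encode::decode_utf8($octets); # octets -> characters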

There's an additional problem, and that's when you need to feed strings to a Digest module like Digest::SHA1 to obtain a hash. In that case, I presume the SHA1 algorithm, like MD5 and others, doesn't really work on Unicode characters or UTF-8-encoded characters: it just works on bytes, or octets.

So, if you try something like:


use utf8;
use Digest::SHA1;

my $string = "ログインメールアドレス";
my $sha1 = Digest::SHA1->new();
$sha1->add($string);

print $sha1->hexdigest();

it will miserably fail ("Wide character in subroutine entry at line 6") because $string is marked as containing "wide" characters, so it must first be turned into octets, by doing:


use utf8;
use Encode;
use Digest::SHA1;

my $string = "ログインメールアドレス";
my $sha1 = Digest::SHA1->new();
$sha1->add( Encode::encode_utf8($string) );

print $sha1->hexdigest();

I need to remind myself all the time that:

  • Encode::encode_utf8($string) wants a string with Unicode characters and will give you a string converted to UTF-8 octets, with the UTF8 flag *turned off*. Basically bytes. You can then do anything with them, print, put in a file, calculate a hash, etc…
  • Encode::decode_utf8($octets) wants a string of (possibly UTF-8) octets, and will give you a string of Unicode characters, with the UTF8 flag *turned on*, so for example lowercasing (lc) an "Å" will result in an "å" character. A quick round-trip sketch follows this list.
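The promised round-trip, showing both functions and the flag's effect:

use utf8;
use Encode;

binmode STDOUT, ':utf8';   # so the terminal gets proper UTF-8 octets

my $chars  = "Å";                           # one Unicode character
my $octets = Encode::encode_utf8($chars);   # two octets, UTF8 flag off
my $back   = Encode::decode_utf8($octets);  # one character again, UTF8 flag on

print length($chars), "\n";    # 1
print length($octets), "\n";   # 2
print lc($back), "\n";         # å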

So, there you go! Now you are a 1st level UTF-8 wizard. Go and do your UTF-8 magic!

Epilogue: now I'm sure: in a couple of weeks I will come back to this post, and think that I still don't understand how UTF-8 works in Perl… :-)

Another Ubiquity for Opera update, DuckDuckGo search

My small Ubiquity for Opera experiment gets another quick update.

This time I added one of my favorite search engines, DuckDuckGo. Despite being a young project, I think it's really interesting, and its results are highly relevant and up-to-date. I like it! Plus, it's a Perl project.

So that's it, I just added the duckduckgo command.

This new version also fixes an annoying problem with a couple of Google-related search commands that were showing just one result instead of the default 20. There's so much more that could be improved, but I rarely find the time to work on it…

As always, the updated code is available on the Ubiquity for Opera project page, where you will also find the minified version (~40 KB instead of ~70).

Enjoy!

Running Ubiquity on Google Chrome

It's been a while since I started working on Ubiquity for Opera. It's my limited, but for me totally awesome, port of Mozilla's Ubiquity project, originally only for Firefox, to the Opera browser.

I've had several people ask me through my blog or by email to write a version for Google Chrome. And, by popular demand, here it is! To my surprise, it took much less time than I had originally thought. I did hit a few small problems though, from event handlers (which work differently from Opera's UserJS model) to style attributes on dynamically created elements, and other minor things as well.

It's still lacking Ajax/XMLHttpRequest support, but that shouldn't be a huge problem.

I uploaded it to userscripts.org too. You can see it here: http://userscripts.org/scripts/show/85550.

The code, as usual, is up on GitHub:

http://github.com/cosimo/ubiquity-chrome/.

If you try it, I would be interested to know what you think. Have fun!

If Text::Hunspell never worked for you, now it’s time to try it again!

If you don't know it, Hunspell is the spell checker engine of OpenOffice.org, and it's also included in the Opera and Mozilla browsers.

We were trying to use it from Perl, using the old Text::Hunspell module, version 1.3, but we had problems with it. Big problems. Like segfaults and tests that wouldn't run.

A bright hacker from Italy :) was then called in to fix the problem, with the promise of a fantastic prize he hasn't seen yet… [ping?] :-)

During the process, I found out I know absolutely nothing about dictionary files and stuff, and my fixes were – I would say – definitely horrible.

But! There's a bright side, of course: the module works just fine now, at least on Debian/Ubuntu systems. Before using Text::Hunspell, you'll want to install the following packages:

  • hunspell
  • libhunspell-dev

The example in the POD documentation (and in the examples dir) uses the standard US English dictionary. If you don't have that, you will need to change the script slightly. But the code is tested and should work without problems. If you try it out and have feedback, by all means let me know. Thanks!
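For reference, here's a minimal usage sketch. The dictionary paths are the ones the Debian/Ubuntu hunspell-en-us package installs; adjust them for your system:

use strict;
use warnings;
use Text::Hunspell;

# Affix and dictionary files for US English
my $speller = Text::Hunspell->new(
    '/usr/share/hunspell/en_US.aff',
    '/usr/share/hunspell/en_US.dic',
);

print $speller->check('color') ? "ok\n" : "misspelled\n";
print join(', ', $speller->suggest('colr')), "\n";   # suggestions for a typo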

Source code available on GitHub at:
http://github.com/cosimo/perl5-text-hunspell/

The module, tagged as 2.00 because it's cool :), will be up on CPAN shortly at this address:
http://search.cpan.org/dist/Text-Hunspell/

Self-contained instant functional test suites

I first learned about functional test suites when I started working on the My Opera code. Two years and some time later, I found myself slightly hating the My Opera functional test suite.

The main reasons:

  • It's too slow. Currently, it takes anywhere between 20 and 30 minutes to complete. Sure, it's thousands of tests, divided into hundreds of test scripts, grouped by functional area, like login, blogs, albums, etc…

    Even considering all of this, I think we should aim to have a single functional test run complete in 5-10 minutes.

    Most of the time is wasted communicating with the test server and creating and destroying the database.

  • It's unreliable, sometimes cumbersome to manage. In our setup, CruiseControl fires a functional test suite run after every commit into the source code repository.

    That initiates an ssh connection to the test server, where a shell script takes care of restarting the running apache, dropping and creating the test database, updating required packages, etc…

    This implies that you need an apache instance running on a specific port, a mysql instance running on another port, etc… In the long run, this proved to be an approach that doesn't scale very well. It's too error prone, and not very reliable. Maybe your test run will fail because mysql has mysteriously crashed, or some other random bad thing.

A functional test suite that sucks less

Given the motivation, I came up with this idea of a functional test suite that is:

  1. instant: check out the source code from the repository, and you're ready to go.
  2. self-contained: it shouldn't have any external servers that need to be managed or even running.
  3. reusable: it doesn't have to be a functional test suite. The same concept can be reused for unit test suites, or anything really.

A few months after the first idea, plus a couple of weeks of work on it, and we were ready with the first prototype. Here's how we use it:

  • Check out the source code from the repository in ~/src/myproject
  • cd ~/src/myproject
  • ./bin/run-functional-test-suite
  • Private instances of Apache, MySQL, and the main application are created and started
  • A custom WWW::Mechanize-based client runs the functional test cases against the Apache instance
  • A TAP stream from the test run is produced and collected
  • All custom instances are destroyed
  • ???
  • Profit!

The temporary test run directory is left there untouched, so you can inspect it in case of problems. This is priceless, because that folder contains all the configuration and log files that the test run generated.

The main ideas

  • The entire functional test suite should run within a unique temporary directory created on the fly. You can run as many instances as you want, with your username or a different one. They won't conflict with each other.
  • Use the same configuration files as the other environments as much as possible. If you already have development, staging and production, then functional-test is just another one of your environments. If possible, avoid creating special config files just for the functional test suite.
  • The Apache, MySQL, and Application configuration files are simple templates where you need to fill in the apache hostname, apache port, mysql port, username, password, etc…
  • Apache is started up from the temporary directory using the full command invocation, as:

    /usr/sbin/apache2 -k start -f /tmp/{project}_{user}_testsuite_{pid}/conf/apache.conf
    

    and all paths in config files are absolute, to avoid any relative references that would lead to files not being found.
    The configuration template files have to take this into account, and put a $prefix-style variable
    everywhere, like this Apache config, for example (a sketch of how such a template gets filled in follows this list):

    Timeout 10
    KeepAlive On
    HostnameLookups Off
    ...
    Listen [% suite.apache.port %]
    ...
    

  • The MySQL instances are created and destroyed from scratch for every test suite run, using the amazing MySQL::Sandbox. My colleague Terje had to patch the sandbox creation script to fix a problem with 64-bit environments. We filed a bug about this on the MySQL::Sandbox RT queue. It's the only small problem we found in a really excellent Perl tool. If you haven't looked at it, do it now.
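To illustrate the templating step: filling in one of those config templates could look like this (a sketch assuming Template Toolkit, which the [% suite.apache.port %] syntax suggests; paths and variable names are made up):

use strict;
use warnings;
use Template;

# The unique temporary directory for this test run
my $prefix = "/tmp/myproject_$ENV{USER}_testsuite_$$";

my $tt = Template->new({ ABSOLUTE => 1 });
my $vars = {
    suite => {
        apache => { port => 8031 },
        mysql  => { port => 8030 },
    },
};

# Render the Apache config template into the private test run directory
$tt->process("$prefix/conf/apache.conf.tt", $vars, "$prefix/conf/apache.conf")
    or die $tt->error();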

All in all, I'm really satisfied with the result. We applied it to the Auth project for now, which is smaller than My Opera, and we fine-tuned the wrapping shell script to compensate for the underlying distribution automatically. We've been able to run this functional test suite on Ubuntu from version 8.04 to 10.04, and on Debian Lenny, with no modifications. The required MySQL version is detected according to your system, and MySQL::Sandbox is instructed accordingly.

It also results in a faster execution of the entire functional test suite.

What can we improve?

There's lots of things to improve:

  • Speed. Instead of creating and destroying the database for every test group, it would be nice to use transactions, and roll everything back at the end of each test group (t/{group-name}/*.t). I used to do this many years ago with Postgres, and it was perfectly safe. I'm not sure we will ever manage it with MySQL, since, for example, ALTER TABLE (and other SQL statements) completely ignores transactions.
  • Reliability. The private instances of Apache, MySQL (and recently memcached too) are started on ports calculated from the PID of the originating bash script. So, if you run the test suite from a bash script running as PID 8029, then MySQL is started on port 8030, Apache on port 8031, and memcached on port 8032. This has worked very well so far, and we made sure we don't use ports < 1024, but it could lead to mysterious failures if some local services are using ports like 5432, or others. The idea is that we could test whether a port is available before using it; a minimal probe is sketched after this list. However, this error is already detected at startup time, so it shouldn't be a huge problem.
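A minimal sketch of such a probe (the search strategy here is made up):

use strict;
use warnings;
use IO::Socket::INET;

# True if we can bind to the port, i.e. nothing else is listening on it
sub port_is_free {
    my ($port) = @_;
    my $sock = IO::Socket::INET->new(
        Listen    => 1,
        LocalPort => $port,
        ReuseAddr => 1,
    ) or return 0;
    close $sock;
    return 1;
}

# Start from the current PID and skip ahead until a free port turns up
my $port = $$ + 1;
$port++ until $port > 1024 && port_is_free($port);
print "Using port $port\n";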

Feedback welcome!

Puppet custom facts, and master-less puppet deployment

As I mentioned a few weeks ago, I'm using Puppet for some smaller projects here at work. They're pilot projects to see how Puppet behaves and scales for us before taking it on to bigger challenges.

One of the problems so far is that we're using Fabric as the "last-mile" deployment tool, and it doesn't yet have any way to run jobs in parallel. That's the reason why I'm starting to look elsewhere, at mcollective for example.

However, today I had to prepare a new Varnish box for files.myopera.com. This new machine is in a different data center from our current one, so we don't have a puppetmaster deployed there yet. That would have stopped me from using Puppet on yet another project. But lately I've been reading on the puppet-users mailing list that several people have tried a master-less puppet deployment, where you have no puppetmasterd running: you just ship the puppet files, via rsync, source control or pigeons, and then let the standalone puppet executable run.

Puppet master-less setup

To do this, you have to at least have a good set of reusable puppet modules, which I've tried to build a small piece at a time over the last few months. So I decided to give it a shot, and got everything up and running quickly. I deployed my set of modules in /etc/puppet/modules, and built a single manifest file that looks like the following:


#
# Puppet standalone no-master deployment
# for files.myopera.com varnish nodes
#
node varnish_box {

    # Basic debian stuff
    $disableservices = [ "exim4", "nfs-common", "portmap" ]
    service { $disableservices:
        enable => "false",
        ensure => "stopped",
    }

    # Can cause overload on the filesystem through cronjobs
    package { "locate": ensure => "absent", }
    package { "man-db": ensure => "absent", }

    # Basic configuration, depends on data center too
    include opera
    include opera::sudoers
    include opera::admins::core_services
    include opera::datacenters::dc2

    # Basic packages now. These are all in-house modules
    include base_packages
    include locales
    include bash
    include munin
    include cron
    include puppet
    include varnish

    varnish::config { "files-varnish-config":
        vcl_conf => "files.vcl",
        storage_type => "malloc",
        storage_size => "20G",
        listen_port => 80,
        ttl => 864000,
        thread_pools => 8,
        thread_min => 800,
        thread_max => 10000,
    }

    #
    # Nginx (SSL certs required)
    #
    include nginx

    nginx::config { "/etc/nginx/nginx.conf":
        worker_processes => 16,
        worker_connections => 16384,
        keepalive_timeout => 5,
    }

    nginx::vhost { "https.files.myopera.com":
        ensure => "present",
        source => "/usr/local/src/myopera/config/nginx/sites-available/https.files.myopera.com",
    }

    bash::prompt { "/root/.bashrc":
        description => "Files::Varnish",
        color => "red",
    }

    munin::plugin::custom { "cpuopera": }

    munin::plugin { "if_eth0":
        plugin_name => "if_"
    }

    munin::plugin {
        [ "mem_", "load", "df", "df_inode", "netstat", "vmstat",
          "iostat", "uptime", "threads", "open_files", "memory", "diskstats" ]:
    }
}

node default inherits varnish_box {
}

node 'my.hostname.opera.com' inherits varnish_box {
}

This manifest installs varnish, nginx, a bunch of basic packages I always want on every machine (vim, tcpdump, etc…), munin with the appropriate plugins already configured, and also a nice red bash prompt to warn me that this is production stuff.

This file is everything the puppet client needs to run and produce the desired effect, without needing a puppet master. Save it as varnish-node.pp and then run it with:


puppet varnish-node.pp

One problem that usually arises is how to serve static files. In this case, I assumed I'm going to check out the source code and config files from my own repository into /usr/local/src/…, so I don't need to point puppet at a server with the classic:


source => "puppet:///module/filename"

but you can just use:


source => "/usr/local/whatever/in/my/local/filesystem"

That's great and it works just fine.

Custom facts

Puppet uses a utility called facter to extract "facts" from the underlying system, sysinfo-style. A typical facter run produces output like the following:


$ facter
architecture => x86_64
domain => internal.opera.com
facterversion => 1.5.6
fqdn => cd01.internal.opera.com
...
hardwaremodel => x86_64
hostname => cd01
id => cosimo
ipaddress => 10.20.30.40
ipaddress_eth0 => 10.20.30.40
is_virtual => false
...
kernel => Linux
kernelmajversion => 2.6
...
operatingsystem => Ubuntu
operatingsystemrelease => 10.04
physicalprocessorcount => 1
processor0 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processor1 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processorcount => 2
...

and so on. Within puppet manifests, you can use any of these facts to influence the configuration of your systems. For example: if memorysize > 4.0 GB, then run varnish with 2000 threads instead of 1000. This is all very cool, but sometimes you need something that facter doesn't give you by default.

That's why facter can be extended.

I tried creating a datacenter.rb facter plugin that looks at the box's IP address and figures out which data center it's in. That in turn can be used to set up the nameservers and other data-center-specific settings.
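For example, a generic module could then consume the fact like this (a hedged sketch; the template names are illustrative):

# Pick a resolv.conf template based on the custom 'datacenter' fact
$resolv_template = $datacenter ? {
    "mars"    => "resolv/resolv.conf.mars.erb",
    "mercury" => "resolv/resolv.conf.mercury.erb",
    default   => "resolv/resolv.conf.default.erb",
}

file { "/etc/resolv.conf":
    content => template($resolv_template),
}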

Here's the code for the fact itself. My Ruby-fu is less than awesome:


#
# Provide an additional 'datacenter' fact
# to use in generic modules to provide datacenter
# specific settings, such as resolv.conf
#
# Cosimo, 03/Aug/2010
#

Facter.add("datacenter") do
    setcode do

        datacenter = "unknown"

        # Get current ip address from Facter's own database
        ipaddr = Facter.value(:ipaddress)

        # Data center on Mars
        if ipaddr.match(/^88\.88\.88\./)
            datacenter = "mars"

        # This one on Mercury
        elsif ipaddr.match(/^99\.99\.99\./)
            datacenter = "mercury"

        # And on Jupiter
        elsif ipaddr.match(/^77\.77\.77\./)
            datacenter = "jupiter"
        end

        datacenter
    end
end

However, there's one problem: when puppet runs, it doesn't pick up the new fact, even though facter from the command line can see it and execute it just fine (when the plugin is in the current directory).

Now I need to find out how to tell puppet (and/or facter) to look into one of my puppet modules' plugins (or lib, from 0.25.x onward) directories to load my additional datacenter.rb fact.

Any ideas??
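One thing that might work, though I haven't verified it in this setup: facter honors a FACTERLIB environment variable listing extra directories to load custom facts from. Something along these lines (the module path is illustrative):

# Point facter at the directory containing datacenter.rb, then run standalone puppet
export FACTERLIB=/etc/puppet/modules/opera/lib/facter
puppet varnish-node.pp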

More Perl 6 development, String::CRC32

A couple of days were plenty to whip up a new Perl 6 module, String::CRC32.

This is a straight conversion of the Perl 5 CPAN version of String::CRC32. That one uses some C/XS code, though, so I wrote the equivalent in pure Perl 6. It was quite simple actually, once I learned a bit more about the numeric bitwise operators.

The classic | and & for bitwise "OR" and "AND" are now written as:


Bitwise OR  '+|'
Bitwise AND '+&'
Bitwise XOR '+^'

So the following code:


$x &= 0xFF;

becomes:


$x +&= 0xFF;

The reason behind this is that there are now not only numeric bitwise operators, but also boolean ones. You can read more in the official specification of Perl 6 operators.
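A quick demo of the numeric versions:

my $x = 0b1100;
say $x +& 0b1010;   # 8, bitwise AND
say $x +| 0b1010;   # 14, bitwise OR
say $x +^ 0b1010;   # 6, bitwise XOR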

You can download the code for the module at http://github.com/cosimo/perl6-string-crc32. Take a look at various other Perl 6 modules on modules.perl6.org.
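Usage should look something like this, assuming the module mirrors the Perl 5 interface and exports a crc32() function:

use String::CRC32;

say crc32('Hello world');   # prints the CRC-32 checksum as an integer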

Bandwidth limiting a varnish server

I know, this might sound like blasphemy. After all, why on earth would one want to limit bandwidth, and serve clients more slowly, when using Varnish to accelerate HTTP serving?

It sounds silly, and it is. But if your Varnish servers are saturating your internet pipes and you have other services running, you might want to investigate bandwidth limiting options. On one Varnish installation here at Opera, I'm constantly seeing nodes hit 950+ Mbit/s peaks, and 600-700 Mbit/s on average. That's some damn fast bit pushing, close to the theoretical physical capacity of a 1 Gbit/s ethernet card.

Now, back to the problem. How to bw limit Varnish?

I think I know the Varnish project well enough to be sure that something like bandwidth limiting will never be implemented inside Varnish. Asking on the mailing list, I got the tip to start looking into tc and its companion howto.

I'd never heard of tc before, so I started searching around. It turns out half of the internet is heavily outdated :) I found lots of old, stale material and very few working examples. After a bit of research, I came up with a semi-working solution.

tc is a linux command, installed by default almost everywhere I think. Its purpose is to control network traffic. It's not very simple to use, and I won't pretend I know it well. I just wish there were more examples of it around: easier to understand, and recently updated.

The best example I found is archived at http://blog.edseek.com/~jasonb/articles/traffic_shaping/scenarios.html ("8.2 Guaranteeing rate"):

#!/bin/bash

RATE=8000

if [ x$1 = 'xstop' ]
then
        tc qdisc del dev eth0 root >/dev/null 2>&1
fi

tc qdisc add dev eth0 root handle 1: htb default 90
tc class add dev eth0 parent 1: classid 1:1 htb rate ${RATE}kbit ceil ${RATE}kbit

tc class add dev eth0 parent 1:1 classid 1:10 htb rate 6000kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1000kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:50 htb rate 500kbit ceil ${RATE}kbit
tc class add dev eth0 parent 1:1 classid 1:90 htb rate 500kbit ceil 500kbit

tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev eth0 parent 1:50 handle 50: sfq perturb 10
tc qdisc add dev eth0 parent 1:90 handle 90: sfq perturb 10

tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 80 0xffff classid 1:10
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 22 0xffff classid 1:20
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 25 0xffff classid 1:50
tc filter add dev eth0 parent 1:0 protocol ip u32 match ip sport 110 0xffff classid 1:50

These commands create a queue discipline tree ("qdisc"), a control structure that tc uses to decide how to shape or rate-limit network traffic. You can do pretty much anything, but the code above creates one "htb" bucket for all the traffic. "htb" means "Hierarchy Token Bucket": a bucket meant to contain other buckets, each of which can define a different, arbitrary bandwidth limit.

Simplifying: the main "htb" bucket (named "1:1") corresponds to the full pipe bandwidth, say 8 Mbit/s. Under it, we create 4 more buckets, named "1:10", "1:20", "1:50" and "1:90", of 6 Mbit/s, 1 Mbit/s, 500 kbit/s and 500 kbit/s respectively. These are managed through "sfq", Stochastic Fairness Queueing. Read: everyone gets their fair piece of the pie :)

So we have these 4 different pipes: 6 Mbit/s, 1 Mbit/s, 500 kbit/s and 500 kbit/s. In the last block of commands, we decide which pipe the traffic should go through.

sport 80 means that traffic with source port 80 (outgoing source, so your clients' destination), where your HTTP server is presumably listening, gets the big 6 Mbit/s slice; sport 22 (ssh) gets 1 Mbit/s, and so on…

Now, this did work on my test machine: I could set the bandwidth limit, download a file with wget, and see that the speed exactly matched the desired one, while other connections were unlimited. However, when I tried to put this in production on the actual Varnish machines, the same script and settings didn't work.

I figured I had to bandwidth-limit the whole "htb" bucket, instead of limiting just the HTTP traffic. Which sucks, I guess. But nevertheless, it works. So I'll copy/paste the entire magic here for whoever might be interested. And maybe someone can explain to me why this doesn't work exactly like in my tests. Traffic measured with iptraf and iftop shows consistent results.

#!/bin/sh
#
# Set up bandwidth limiting for an interface / service. Based on 'tc'.
# Defaults can be overridden by /etc/default/traffic-shaper
# Cosimo, 2010/07/13
#
TC=/sbin/tc

test -f /etc/default/traffic-shaper && . /etc/default/traffic-shaper

IF=${IF:-eth0}
RATE=${RATE:-100Mbit}
HTTP_RATE=${HTTP_RATE:-50Mbit}
HTTP_PORT=${HTTP_PORT:-80}
SSH_RATE=${SSH_RATE:-500kbit}

echo "[$IF] HTTP (:$HTTP_PORT) rate=$HTTP_RATE/$RATE"
echo "[$IF] SSH  (:22) rate=$SSH_RATE"

#exit

if [ "x$1" = "xstop" ]; then
        echo 'Stopping traffic shaper...'
        $TC qdisc del dev $IF root >/dev/null 2>&1 && echo 'Done'
        exit
elif [ "x$1" = "xshow" ]; then
        $TC qdisc show dev $IF
        exit
elif [ "x$1" = "xstats" ]; then
        $TC -d -s qdisc show dev $IF
        exit
fi

echo "Traffic shaping setup ($HTTP_RATE/$RATE) on port $HTTP_PORT."
echo "Reserving $SSH_RATE for interactive sessions."

$TC qdisc add dev $IF root handle 1: htb default 10

# I should be using this line, but I had to replace it with the following
### $TC class add dev $IF parent 1: classid 1:1 htb rate ${RATE} ceil ${RATE}
$TC class add dev $IF parent 1: classid 1:1 htb rate ${HTTP_RATE} ceil ${RATE}

# Doesn't seem to have any effect (?)
$TC class add dev $IF parent 1:1 classid 1:10 htb rate ${HTTP_RATE} ceil ${RATE}
$TC class add dev $IF parent 1:1 classid 1:90 htb rate ${SSH_RATE} ceil ${RATE}

$TC qdisc add dev $IF parent 1:10 handle 10: sfq perturb 10
$TC qdisc add dev $IF parent 1:90 handle 90: sfq perturb 10

$TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport $HTTP_PORT 0xffff classid 1:10
$TC filter add dev $IF parent 1:0 protocol ip u32 match ip sport 22 0xffff match ip dport 22 0xffff classid 1:90

Have fun, but don't try this at home :)

Perl 6 LWP::Simple gets chunked transfers support

With this one I think we're basically done!

Perl 6 LWP::Simple gets chunked transfer support. It's probably not perfect or universally working, but for the examples I could try and test, it's totally fine. If you find some URLs where it's broken, please tell me.

I also threw in the getstore() method to save URLs locally.
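Usage is along these lines (a sketch from memory; the URL is just an example):

use LWP::Simple;

# Fetch a page; chunked responses are handled too now
my $content = LWP::Simple.get('http://www.opera.com/');

# Save a URL straight to a local file
LWP::Simple.getstore('http://www.opera.com/', 'opera.html');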

So, LWP::Simple for Perl 6 is here and it's working. It's not yet "complete" compared to the Perl 5 version, but now that I've got the hard bits working, and the internals can perform full http response parsing, I'll try to reach 100% API compatibility with the Perl 5 one (where it makes sense).

Try it and let me know. Have fun!