Monthly Archives: August 2010

Self-contained instant functional test suites

I first learned about functional test suites when I started working on the My Opera code. Two years and some time later, I found myself slightly hating the My Opera functional test suite.

The main reasons:

  • It's too slow. Currently, it takes anywhere between 20 and 30 minutes to complete. Sure, it's thousands of tests, divided into hundreds of test scripts, grouped by functional area, like login, blogs, albums, etc…

    Even considering all of this, I think we should aim to have a single functional test run complete in 5-10 minutes.

    Most of the time is wasted in communication with the test server, and in creating and destroying the test database.

  • It's unreliable and sometimes cumbersome to manage. In our setup, CruiseControl fires a functional test suite run after every commit to the source code repository.

    That initiates an ssh connection to the test server, where a shell script takes care of restarting the running apache, dropping and creating the test database, updating required packages, etc…

    This implies that you need an Apache instance running on a specific port, a MySQL instance running on another port, etc… In the long run, this approach proved not to scale very well: it's too error prone and not very reliable. Maybe your test run fails because MySQL mysteriously crashed, or some other random bad thing happened.

A functional test suite that sucks less

Given the motivation, I came up with this idea of a functional test suite that is:

  1. instant: check out the source code from the repository, and you're ready to go.
  2. self-contained: it shouldn't have any external servers that need to be managed or even running.
  3. reusable: it doesn't have to be a functional test suite. The same concept can be reused for unit test suites, or anything really.

A few months after the first idea, and a couple of weeks of work on it, we were ready with the first prototype. Here's how we use it:

  • Check out the source code from the repository in ~/src/myproject
  • cd ~/src/myproject
  • ./bin/run-functional-test-suite
  • Private instances of Apache, MySQL, and the main application are created and started
  • A custom WWW::Mechanize-based client runs the functional test cases against the Apache instance
  • A TAP stream from the test run is produced and collected
  • All custom instances are destroyed
  • ???
  • Profit!

The temporary test run directory is left there untouched, so you can inspect it in case of problems. This is priceless, because that folder contains all the configuration and log files that the test run generated.

The main ideas

  • The entire functional test suite should run within a unique temporary directory created on the fly. You can run as many instances as you want, with your username or a different one. They won't conflict with each other.
  • Use the same configuration files as the other environments as much as possible. If you already have development, staging and production, then functional-test is just another one of your environments. If possible, avoid creating special config files just for the functional test suite.
  • The Apache, MySQL, and application configuration files are simple templates where you need to fill in the apache hostname, apache port, mysql port, username, password, etc… (a sketch of this filling step follows the list).
  • Apache is started up from the temporary directory using the full command invocation, like:

    /usr/sbin/apache2 -k start -f /tmp/{project}_{user}_testsuite_{pid}/conf/apache.conf
    

    and all paths in the config files are absolute, to avoid relative references that would lead to files not being found. The configuration template files have to take this into account, and use a $prefix-style variable everywhere, as in this Apache config example:

    Timeout 10
    Keepalive On
    HostnameLookups Off
    ...
    Listen [% suite.apache.port %]
    ...
    

  • The MySQL instances are created and destroyed from scratch for every test suite run, using the amazing MySQL::Sandbox (sketched below). My colleague Terje had to patch the sandbox creation script to fix a problem with 64-bit environments. We filed a bug on the MySQL::Sandbox RT queue about this. It's the only small problem we found in a really excellent Perl tool. If you haven't looked at it, do it now.
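
Here is roughly what the template-filling step looks like. This is a minimal sketch, not our actual wrapper code: file names, ports and the directory layout are illustrative, and I'm assuming Template Toolkit, which is where the [% ... %] syntax above comes from.

use strict;
use warnings;
use Template;

# Unique per-run prefix, as described above
my $tmpdir = "/tmp/myproject_$ENV{USER}_testsuite_$$";

my $tt = Template->new() or die Template->error;

# Values that get filled into the apache/mysql/app config templates
my $vars = {
    suite => {
        apache => { hostname => 'localhost', port => 8031 },
        mysql  => { port => 8030 },
    },
};

$tt->process('conf/apache.conf.tt', $vars, "$tmpdir/conf/apache.conf")
    or die $tt->error;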
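
And this is the idea behind the MySQL::Sandbox step, again as a rough sketch with made-up ports, paths and MySQL version rather than the real wrapper code:

# Create and start a throwaway MySQL instance inside the run directory
TMPDIR=/tmp/myproject_${USER}_testsuite_$$
make_sandbox 5.1.47 -- \
    --upper_directory=$TMPDIR \
    --sandbox_directory=mysql_sandbox \
    --sandbox_port=$(($$ + 1)) \
    --no_confirm

# ... run the functional tests against 127.0.0.1:<port> ...

# Tear it down when the run is over
$TMPDIR/mysql_sandbox/stop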

All in all, I'm really satisfied with the result. We applied it to the Auth project for now, which is smaller than My Opera, and we fine-tuned the wrapping shell script to adapt automatically to the underlying distribution. We've been able to run this functional test suite on Ubuntu from version 8.04 to 10.04, and on Debian Lenny, with no modifications. The required MySQL version is detected according to your system, and MySQL::Sandbox is instructed accordingly.

It also results in a faster execution of the entire functional test suite.

What can we improve?

There are lots of things to improve:

  • Speed. Instead of creating and destroying the database for every test group, it would be nice to use transactions, and roll everything back at the end of the test group (t/{group-name}/*.t). I used to do this many years ago with Postgres, and it was perfectly safe. I'm not sure we will ever make it work with MySQL, since DDL statements like ALTER TABLE cause an implicit commit and so effectively ignore transactions (see the first sketch after this list).
  • Reliability. The private instances of Apache, MySQL (and recently memcached too) are started on ports calculated from the PID of the originating bash script. So, if you run the test suite from a bash script running as PID 8029, then MySQL is started on port 8030, Apache on 8031, and memcached on 8032. This has worked very well so far, and we made sure we never use ports below 1024, but it could lead to mysterious failures if some local service is already using a port like 5432. The idea is that we could test whether a port is available before using it (see the second sketch below). However, this error is already detected at startup time, so it shouldn't be a huge problem.
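
Here's a minimal sketch of the transaction idea, with a hypothetical DSN and table, and not our actual harness code. Note that this only helps for plain DML; any ALTER TABLE in a test would still commit implicitly.

use strict;
use warnings;
use DBI;
use Test::More tests => 1;

my $dbh = DBI->connect(
    'dbi:mysql:database=testsuite;host=127.0.0.1;port=8030',
    'testuser', 'testpass',
    { RaiseError => 1, AutoCommit => 1 },
);

$dbh->begin_work;

# The test group runs here and can create all the fixtures it needs
$dbh->do('INSERT INTO users (login) VALUES (?)', undef, 'test_user');
my ($count) = $dbh->selectrow_array(
    'SELECT COUNT(*) FROM users WHERE login = ?', undef, 'test_user'
);
is($count, 1, 'fixture row is visible inside the transaction');

# Instead of dropping and recreating the database, just roll back
$dbh->rollback;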
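
And this is the kind of port-picking logic I have in mind, sketched in Perl with illustrative names: derive candidate ports from the wrapper's PID, and probe each one before handing it to a service.

use strict;
use warnings;
use IO::Socket::INET;

sub port_is_free {
    my ($port) = @_;
    my $sock = IO::Socket::INET->new(
        LocalPort => $port,
        Proto     => 'tcp',
        ReuseAddr => 1,
        Listen    => 1,
    ) or return 0;
    close $sock;
    return 1;
}

my $candidate = $$ + 1;                   # just above our own PID
$candidate = 1025 if $candidate < 1025;   # never touch ports < 1024

my %port;
for my $service (qw(mysql apache memcached)) {
    $candidate++ until port_is_free($candidate);
    $port{$service} = $candidate++;
}

print "$_ => $port{$_}\n" for sort keys %port;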

Feedback welcome!

Puppet custom facts, and master-less puppet deployment

As I mentioned a few weeks ago, I'm using Puppet for some smaller projects here at work. They're pilot projects to see how puppet behaves and scales for us before taking it into bigger challenges.

One of the problems so far is that we're using fabric as the "last-mile" deployment tool, and that doesn't yet have any way to run jobs in parallel. That's the reason why I'm starting to look elsewhere, like mcollective for example.

However, today I had to prepare a new varnish box for files.myopera.com. This new machine is in a different data center than our current one, so we don't have any puppetmaster deployed there yet. This had already stopped me from using puppet on another project, too. But lately I've been reading on the puppet-users mailing list that several people have tried a master-less puppet configuration, where you have no puppetmasterd running at all. You just deploy the puppet files, via rsync, source control or pigeons, and then let the standalone puppet executable run.
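
In practice, the whole cycle can be as simple as the following two commands (host name and paths are made up here):

rsync -az --delete ./puppet/ root@varnish-box:/etc/puppet/
ssh root@varnish-box puppet /etc/puppet/manifests/varnish-node.pp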

Puppet master-less setup

To do this, you have to at least have a good set of reusable puppet modules, which I've tried to build up a piece at a time over the last few months. So I decided to give it a shot, and got everything up and running quickly. I deployed my set of modules in /etc/puppet/modules, and built a single manifest file that looks like the following:


#
# Puppet standalone no-master deployment
# for files.myopera.com varnish nodes
#
node varnish_box {

    # Basic debian stuff
    $disableservices = [ "exim4", "nfs-common", "portmap" ]
    service { $disableservices:
        enable => "false",
        ensure => "stopped",
    }

    # Can cause overload on the filesystem through cronjobs
    package { "locate": ensure => "absent", }
    package { "man-db": ensure => "absent", }

    # Basic configuration, depends on data center too
    include opera
    include opera::sudoers
    include opera::admins::core_services
    include opera::datacenters::dc2

    # Basic packages now. These are all in-house modules
    include base_packages
    include locales
    include bash
    include munin
    include cron
    include puppet
    include varnish

    varnish::config { "files-varnish-config":
        vcl_conf => "files.vcl",
        storage_type => "malloc",
        storage_size => "20G",
        listen_port => 80,
        ttl => 864000,
        thread_pools => 8,
        thread_min => 800,
        thread_max => 10000,
    }

    #
    # Nginx (SSL certs required)
    #
    include nginx

    nginx::config { "/etc/nginx/nginx.conf":
        worker_processes => 16,
        worker_connections => 16384,
        keepalive_timeout => 5,
    }

    nginx::vhost { "https.files.myopera.com":
        ensure => "present",
        source => "/usr/local/src/myopera/config/nginx/sites-available/https.files.myopera.com",
    }

    bash::prompt { "/root/.bashrc":
        description => "Files::Varnish",
        color => "red",
    }

    munin::plugin::custom { "cpuopera": }

    munin::plugin { "if_eth0":
        plugin_name => "if_"
    }

    munin::plugin {
        [ "mem_", "load", "df", "df_inode", "netstat", "vmstat",
          "iostat", "uptime", "threads", "open_files", "memory", "diskstats" ]:
    }
}

node default inherits varnish_box {
}

node 'my.hostname.opera.com' inherits varnish_box {
}

This manifest installs varnish, nginx, a bunch of basic packages I always want on every machine (vim, tcpdump, etc…), munin with the appropriate plugins already configured, and also a nice red bash prompt to warn me that this is production stuff.

This file is everything the puppet client needs to run and produce the desired effect, without needing a puppet master. Save it as varnish-node.pp, then run it with:


puppet varnish-node.pp

One problem that usually arises is how to serve the static files. In this case, I assumed I would check out the source code and config files from my own repository into /usr/local/src/... so I don't need to point puppet to a server with the classic:


source => "puppet:///module/filename"

but can instead just use:


source => "/usr/local/whatever/in/my/local/filesystem"

That's great and it works just fine.

Custom facts

Puppet uses a utility called facter to extract "facts" from the underlying system, sysinfo-style. A typical facter run produces the following output:


$ facter
architecture => x86_64
domain => internal.opera.com
facterversion => 1.5.6
fqdn => cd01.internal.opera.com
...
hardwaremodel => x86_64
hostname => cd01
id => cosimo
ipaddress => 10.20.30.40
ipaddress_eth0 => 10.20.30.40
is_virtual => false
...
kernel => Linux
kernelmajversion => 2.6
...
operatingsystem => Ubuntu
operatingsystemrelease => 10.04
physicalprocessorcount => 1
processor0 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processor1 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processorcount => 2
...

and so on. Within puppet manifests, you can use any of these facts to influence the configuration of your system. For example, if memorysize is above 4 GB, you may want to run varnish with 2000 threads instead of 1000.
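
A minimal sketch of what that could look like, assuming Puppet 0.25 or later (which supports regexes in selectors). The threshold and thread counts are made up, and since memorysize is a string fact like "7.80 GB", a coarse regex match stands in for real arithmetic:

# Pick a thread count based on the memorysize fact
$varnish_threads = $memorysize ? {
    /^(\d\d+|[4-9])\.\d+ GB$/ => 2000,   # 4 GB or more
    default                   => 1000
}

This is all very cool, but sometimes you need something that facter doesn't give you by default.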

That's why facter can be extended.

I tried creating a datacenter.rb facter plugin that looks at the box's IP address and figures out which data center we're in. That in turn can be used to set up the nameservers and other stuff.

Here's the code. My Ruby-fu is less than awesome:


#
# Provide an additional 'datacenter' fact
# to use in generic modules to provide datacenter
# specific settings, such as resolv.conf
#
# Cosimo, 03/Aug/2010
#

Facter.add("datacenter") do
    setcode do

        datacenter = "unknown"

        # Get current ip address from Facter's own database
        ipaddr = Facter.value(:ipaddress)

        # Data center on Mars
        if ipaddr.match("^88.88.88.")
            datacenter = "mars"

        # This one on Mercury
        elsif ipaddr.match("^99.99.99.")
            datacenter = "mercury"

        # And on Jupiter
        elsif ipaddr.match("^77.77.77.")
            datacenter = "jupiter"
        end

        datacenter
    end
end

However, there's one problem. When puppet runs, it doesn't get the new fact, even though facter from the command line can see it and execute it just fine (when the plugin is in the current directory).

Now I need to know how to inform puppet (and/or facter) that it has to look into one of my puppet modules' plugin (or lib from 0.25.x) directory to load my additional datacenter.rb fact.

Any ideas?

More Perl 6 development, String::CRC32

A couple of days were enough to whip up a new Perl 6 module, String::CRC32.

This is a straight conversion from the Perl 5 CPAN version of String::CRC32. That one uses some C/XS code though; I wrote the equivalent in pure Perl 6. It was quite simple actually, once I learned a bit more about the numeric bitwise operators.

The classic | and & for bitwise "OR" and "AND" are now written as:


Bitwise OR  '+|'
Bitwise AND '+&'
Bitwise XOR '+^'

So the following code:


$x &= 0xFF;

becomes:


$x +&= 0xFF;

The reason behind this is that now there are not only numeric bitwise operators, but also boolean ones. You can read more in the official specification for Perl 6 operators.
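
As a small illustration, here's roughly what one step of a table-driven CRC32 update looks like with these operators. This is a sketch, not the actual module code:

# One step of a table-driven CRC32: shift, then XOR with a table entry
sub crc32-update(Int $crc, Int $byte, @table) {
    return (($crc +> 8) +^ @table[($crc +^ $byte) +& 0xFF]) +& 0xFFFFFFFF;
}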

You can download the code for the module from http://github.com/cosimo/perl6-string-crc32. Take a look at various other Perl 6 modules on modules.perl6.org.