
Verifying MySQL behaviour with automated test suites and mytap

You know everything about how MySQL treats UTF8 and LATIN1 charsets, and how collation affects selection and insertion of data, right?

Great, then stop reading :)

I don't, and since I'm in the process of setting up a new version of the Opera accounts database, I really don't want to screw things up. I tried to fully understand how MySQL works in this respect (charsets, collations, etc…), but reading the documentation and memorizing it wasn't very easy. Plus, there are thousands of blog posts on the matter, not always 100% accurate.

So I thought I'd better get hands-on, and I wrote a kind of database test suite.

Now this test suite is hooked up to the main project builds on Jenkins. Here's a sample output:


[...]
[workspace] $ /bin/sh -xe /tmp/hudson3255767718598715423.sh
+ ./bin/run-dbtest-suite
basedir=/var/lib/jenkins/jobs/auth-db/workspace
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/__initdb__.my ........................................... 
1..2
ok 1 - Using utf8tests database
ok 2 - Server charset is latin1
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_bin.my ................................... 
1..6
ok 1 - All our records are there. No duplicate key error.
ok 2 - utf8_bin collation does not collate a/â/à/A/...
ok 3 - utf8_bin collation does not collate a/â/à/A/...
ok 4 - utf8_bin collation does not collate a/â/à/A/...
ok 5 - Query for mixed-case username does not return lowercase username
ok 6 - Query for upper-case username does not return lowercase username
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_general_ci.my ............................ 
1..7
ok 1 - Collation for t007 is utf8_general_ci
ok 2 - utf8_general_ci collation normalizes accents, diacritics and the like
ok 3 - A and Å are collated to the same character in the utf8_general_ci table
ok 4 - å and Å are collated to the same character in the utf8_general_ci table
ok 5 - lower/upper case chars are collated in the utf8_general_ci table
ok 6 - lower/upper case chars are collated in the utf8_general_ci table
ok 7 - We are allowed to insert all records just because there is no unique constraint
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_unicode_ci.my ............................ 
1..7
ok 1 - Collation for t005 is utf8_unicode_ci
ok 2 - utf8_unicode_ci collation normalizes accents, diacritics and the like
ok 3 - A and Å are collated to the same character in the utf8_unicode_ci table
ok 4 - å and Å are collated to the same character in the utf8_unicode_ci table
ok 5 - lower/upper case chars are collated in the utf8_unicode_ci table
ok 6 - lower/upper case chars are collated in the utf8_unicode_ci table
ok 7 - We are allowed to insert all records just because there is no unique constraint
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/default-table-charset.my ................................ 
1..3
ok 1 - Default character set is utf8 when no charset is specified (from server)
ok 2 - Default character set is utf8 when "CHARSET utf8" specified in the CREATE TABLE
ok 3 - Default character set is utf8 when "CHARSET utf8" and "COLLATE" specified in the CREATE TABLE
ok
...
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/username-with-utf8-chars.my ............................. 
1..5
ok 1 - We have some UTF-8 encoded string in our hands (hex)
ok 2 - We have some UTF-8 encoded string in our hands (charset)
ok 3 - Can select back UTF-8 content from a CHARSET utf8 table
ok 4 - Given string is exactly 24 bytes long (length)
ok 5 - Given string is exactly 8 (wide) characters long (char_length)
ok
All tests successful.
Files=11, Tests=80, 0.739731 wallclock secs ( 0.05 usr  0.02 sys +  0.07 cusr  0.01 csys =  0.15 CPU)
Result: PASS
Recording test results
Finished: SUCCESS

And here's an example of a "sanity check" test case, which doesn't do much:


    -- Check that we can insert and retrieve UTF-8 content correctly

    BEGIN;

    SET NAMES utf8;

    SELECT tap.plan(5);

    USE auth_utf8tests;

    SET @username = '今日话题今日话题';
    SET @encoded  = 'E4BB8AE697A5E8AF9DE9A298E4BB8AE697A5E8AF9DE9A298';

    SELECT tap.eq(
        HEX(@username),
        @encoded,
        'We have some UTF-8 encoded string in our hands (hex)'
    );

    SELECT tap.eq(
        CHARSET(@username),
        'utf8',
        'We have some UTF-8 encoded string in our hands (charset)'
    );

    INSERT INTO t001 (f1) VALUES (@username);

    SELECT tap.eq(
        (SELECT HEX(f1) FROM t001 WHERE f1 = @username),
        @encoded,
        'Can select back UTF-8 content from a CHARSET utf8 table'
    );

    SELECT tap.eq(
        (SELECT LENGTH(f1) FROM t001 WHERE f1 = @username),
        24,
        'Given string is exactly 24 bytes long (length)'
    );

    SELECT tap.eq(
        (SELECT CHAR_LENGTH(f1) FROM t001 WHERE f1 = @username),
        8,
        'Given string is exactly 8 (wide) characters long (char_length)'
    );

    -- Finish the tests and clean up.
    CALL tap.finish();
    ROLLBACK;

This SQL test code uses mytap. The SELECT tap.* calls are the equivalents of Perl's TAP testing functions: SELECT tap.eq() is the equivalent of Test::More::is(), and so on.
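To give a rough idea of the mapping, here's a minimal skeleton in the same style, with made-up values (not from the real suite) and the Test::More equivalents noted in the comments:

    BEGIN;

    SELECT tap.plan(2);             -- plan tests => 2;

    SELECT tap.ok(
        1 = 1,
        'ok() takes a boolean'      -- ok(1 == 1, 'ok() takes a boolean');
    );

    SELECT tap.eq(
        UPPER('abc'),
        'ABC',
        'eq() compares two values'  -- is($got, 'ABC', 'eq() compares two values');
    );

    CALL tap.finish();              -- done_testing();
    ROLLBACK;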

Another, more interesting test case is the following:


   1 --
   2 -- Verify how the utf8_unicode_ci collation works
   3 --
   4 
   5 BEGIN;
   6 
   7 SET NAMES utf8;
   8 
   9 SELECT tap.plan(12);
  10 
  11 USE auth_utf8tests;
  12 

  [...]

  40 SELECT tap.eq(
  41     (SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA=SCHEMA() AND TABLE_NAME='t015'),
  42     'utf8_unicode_ci',
  43     'Collation for t015 is utf8_unicode_ci'
  44 );

  [...]

  48 
  49 SELECT tap.eq(
  50     (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1a' ORDER BY id),
  51     '10',
  52     'utf8_unicode_ci collation normalizes accents, diacritics and the like'
  53 );
  54 
  55 SELECT tap.eq(
  56     (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1Å' ORDER BY id),
  57     '10',
  58     'A and Å are collated to the same character in the utf8_unicode_ci table'
  59 );
  60 
  61 SELECT tap.eq(
  62     (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1å' ORDER BY id),
  63     '10',
  64     'å and Å are collated to the same character in the utf8_unicode_ci table'
  65 );
  66 
  67 SELECT tap.eq(
  68     (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'TestUser1A' ORDER BY id),
  69     '10',
  70     'lower/upper case chars are collated in the utf8_unicode_ci table'
  71 );
  72 
  73 SELECT tap.eq(
  74     (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'TESTUSER1A' ORDER BY id),
  75     '10',
  76     'lower/upper case chars are collated in the utf8_unicode_ci table'
  77 );
  78 
  79 SELECT tap.eq(
  80     (SELECT COUNT(*) FROM t015),
  81     1,
  82     'We are allowed to insert only 1 record, because the others collate to the same string'
  83 );
  84 
  85 -- Finish the tests and clean up.
  86 CALL tap.finish();
  87 ROLLBACK;

An interesting thing that I didn't know how to do at the beginning is trapping errors. I left that part out of the test code above to keep it simple, but here it is (the line numbers show where it fits into the previous test):


  13 DELIMITER //
  14 
  15 DROP PROCEDURE IF EXISTS populate_table //
  16 
  17 CREATE PROCEDURE populate_table ()
  18 BEGIN
  19 
  20     DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' BEGIN
  21         SELECT tap.ok(
  22             1,
  23             'We should get dupkey errors when inserting data with collation utf8_unicode_ci'
  24         );
  25     END;
  26 
  27     INSERT INTO t015 (id,username,note) VALUES (10, 'testuser1a', 'plain');
  28     INSERT INTO t015 (id,username,note) VALUES (20, 'testuser1â', 'circumflex a');
  29     INSERT INTO t015 (id,username,note) VALUES (30, 'testuser1à', 'a grave');
  30     INSERT INTO t015 (id,username,note) VALUES (40, 'testuser1Å', 'A circ');
  31     INSERT INTO t015 (id,username,note) VALUES (50, 'TestUser1A', 'mixed case');
  32     INSERT INTO t015 (id,username,note) VALUES (60, 'TESTUSER1A', 'upper case');
  33 
  34 END;
  35 
  36 //
  37 
  38 DELIMITER ;
  39 
  46 /* Should generate 5 dupkey errors (taken as successful tests) */
  47 CALL populate_table;

It's a bit convoluted. To trap errors, you have to use the DECLARE HANDLER statement. DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' means: whenever SQLSTATE is '23000', which corresponds to a duplicate key error, execute this block of code. All of that must be wrapped into a stored procedure, because handlers are not allowed outside of stored procedures.
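Reduced to its bare bones, the pattern looks like this (a sketch, assuming a hypothetical table t with a unique or primary key on id; not part of the real suite):

    DELIMITER //

    CREATE PROCEDURE insert_or_note ()
    BEGIN
        -- Runs whenever a statement below raises SQLSTATE '23000';
        -- execution then continues with the next statement.
        DECLARE CONTINUE HANDLER FOR SQLSTATE '23000'
            SELECT 'duplicate key, carrying on' AS note;

        INSERT INTO t (id) VALUES (1);
        INSERT INTO t (id) VALUES (1);  -- fires the handler, does not abort
    END //

    DELIMITER ;

    CALL insert_or_note;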

In this particular test, the table uses the utf8_unicode_ci collation, so we expect a duplicate key error on username whenever we insert the string 'testuser1à' or 'TESTUSER1A', because 'testuser1a' was already inserted at the beginning. Of all the INSERT statements, only the first one is bound to succeed, so I put a SELECT tap.ok(1) in the duplicate key HANDLER, and I expect 5 of those tests to fire when I run CALL populate_table;.
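By the way, the collation behaviour that causes those errors can also be observed in isolation, without any table, by forcing a collation on plain comparisons. A quick sketch, assuming the connection charset is utf8:

    SET NAMES utf8;

    SELECT 'testuser1a' = 'TESTUSER1A' COLLATE utf8_unicode_ci;  -- 1
    SELECT 'testuser1a' = 'testuser1à' COLLATE utf8_unicode_ci;  -- 1
    SELECT 'testuser1a' = 'testuser1à' COLLATE utf8_bin;         -- 0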

This of course may seem trivial. And I guess it is, but for me it's a much better way of learning than scouring the manuals or the many blog posts out there, which may or may not reflect the environment I'm working with.

Routinely running this kind of test suite makes it easy to verify the database behaviour:

  • instantly
  • after upgrades (5.1 -> 5.5? -> 6?) or storage engine changes
  • after MySQL configuration changes. For example, this is how I discovered that adding default-charset=utf8 to my MySQL config breaks everything (see the quick check below).
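On that last point, a quick manual check of the effective settings is also possible with standard queries (the test suite asserts the same things with tap.eq() instead):

    -- What the server actually ended up with after a config change:
    SHOW VARIABLES LIKE 'character_set%';
    SHOW VARIABLES LIKE 'collation%';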

I consider this my live documentation on how MySQL works. I would really appreciate any feedback on this. Have fun!

Self-contained instant functional test suites

I first learned about functional test suites when I started working on the My Opera code. Two years or so later, I found myself slightly hating the My Opera functional test suite.

The main reasons:

  • It's too slow. Currently, it takes anywhere between 20 and 30 minutes to complete. Sure, it's thousands of tests, divided into hundreds of test scripts, grouped by functional area: login, blogs, albums, etc…

    Even considering all of this, I think we should aim to have a single functional test run complete in 5-10 minutes.

    Most of the time is wasted in communication with the test server, and in the creation and destruction of the database.

  • It's unreliable and sometimes cumbersome to manage. In our setup, CruiseControl fires a functional test suite run after every commit to the source code repository.

    That initiates an ssh connection to the test server, where a shell script takes care of restarting the running Apache, dropping and recreating the test database, updating required packages, etc…

    This implies that you need an Apache instance running on a specific port, a MySQL instance running on another port, etc… In the long run, this approach proved not to scale very well: it's too error prone and not very reliable. Maybe your test run fails because MySQL has mysteriously crashed, or some other random bad thing happened.

A functional test suite that sucks less

Given this motivation, I came up with the idea of a functional test suite that is:

  1. instant: check out the source code from the repository, and you're ready to go.
  2. self-contained: it shouldn't have any external servers that need to be managed or even running.
  3. reusable: it doesn't have to be a functional test suite. The same concept can be reused for unit test suites, or anything really.

A few months after the first idea, and a couple of weeks of work on it, we were ready with the first prototype. Here's how we use it:

  • Check out the source code from the repository in ~/src/myproject
  • cd ~/src/myproject
  • ./bin/run-functional-test-suite
  • Private instances of Apache, MySQL, and the main application are created and started
  • A custom WWW::Mechanize-based client runs the functional test cases against the Apache instance
  • A TAP stream from the test run is produced and collected
  • All custom instances are destroyed
  • ???
  • Profit!

The temporary test run directory is left there untouched, so you can inspect it in case of problems. This is priceless, because that folder contains all the configuration and log files that the test run generated.

The main ideas

  • The entire functional test suite should run within a unique temporary directory created on the fly. You can run as many instances as you want, under your username or a different one. They won't conflict with each other.
  • Use, as much as possible, the same configuration files as the other environments. If you already have development, staging and production, then functional-test is just another one of your environments. If possible, avoid creating special config files just for the functional test suite.
  • The Apache, MySQL, and application configuration files are simple templates where you fill in the Apache hostname, Apache port, MySQL port, username, password, etc…
  • Apache is started up from the temporary directory using the full command invocation:

    /usr/sbin/apache2 -k start -f /tmp/{project}_{user}_testsuite_{pid}/conf/apache.conf
    

    and all paths in config files are absolute, to avoid any relative references that would lead to files not being found. The configuration template files have to take this into account and use a $prefix-style variable everywhere, like this (Apache config, for example):

    Timeout 10
    KeepAlive On
    HostnameLookups Off
    ...
    Listen [% suite.apache.port %]
    ...
    

  • The MySQL instances are created and destroyed from scratch for every test suite run, using the amazing MySQL::Sandbox. My colleague Terje had to patch the sandbox creation script to fix a problem with 64-bit environments. We filed a bug about this on the MySQL::Sandbox RT queue. It's the only small problem we found in a really excellent Perl tool. If you haven't looked at it yet, do it now.

All in all, I'm really satisfied with the result. We applied it to the Auth project for now, which is smaller than My Opera, and we fine-tuned the wrapping shell script to automatically compensate for the underlying distribution. We've been able to run this functional test suite on Ubuntu from version 8.04 to 10.04, and on Debian Lenny, with no modifications. The required MySQL version is detected according to your system, and MySQL::Sandbox is instructed accordingly.

It also results in faster execution of the entire functional test suite.

What can we improve?

There are lots of things to improve:

  • Speed. Instead of creating and destroying the database for every test group, it would be nice to use transactions and roll back everything at the end of each test group (t/{group-name}/*.t). I used to do this many years ago with Postgres, and it was perfectly safe. I'm not sure we will ever make it with MySQL, since, for example, ALTER TABLE (and other DDL statements) implicitly commits the current transaction, leaving nothing to roll back (see the sketch after this list).
  • Reliability. The private instances of Apache, MySQL (and, recently added, memcached) are started up on ports calculated from the PID of the originating bash script. So, if you run the test suite from a bash script with PID 8029, then MySQL is started up on port 8030, Apache on 8031, and memcached on 8032. This has worked very well so far, and we made sure we don't use ports < 1024, but it could lead to mysterious failures if some local service is already listening on a port like 5432, or others. The idea is that we could test whether a port is available before using it. However, this error is already detected at startup time, so it shouldn't be a huge problem.
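Regarding the transaction idea in the first point, here's a minimal sketch (with a hypothetical InnoDB table t1) of the MySQL behaviour that gets in the way: DDL statements commit the open transaction implicitly, so a final ROLLBACK cannot undo anything that happened before them:

    CREATE TABLE t1 (f1 VARCHAR(64)) ENGINE=InnoDB;

    BEGIN;
    INSERT INTO t1 (f1) VALUES ('scratch data for one test group');
    ALTER TABLE t1 ADD COLUMN f2 INT;  -- DDL: implicit COMMIT happens here
    ROLLBACK;                          -- too late, the INSERT is already committed
    SELECT COUNT(*) FROM t1;           -- returns 1, not 0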

Feedback welcome!