Tag Archives: database

Verifying MySQL behaviour with automated test suites and mytap

You know everything about how MySQL treats the UTF8 and LATIN1 charsets, and how the collation table impacts selection and insertion of data, right?

Great, then stop reading :)

I don't, and since I'm in the process of setting up a new version of the Opera accounts database, I really don't want to screw things up. I tried to fully understand how MySQL works in this respect (charsets, collations, etc…), but reading the documentation and memorizing it wasn't very easy. Plus, there are thousands of blog posts on the matter, not always 100% accurate.

So I thought I'd better get hands-on, and I wrote a kind of database test suite.

Now this test suite is hooked up to the main project builds on Jenkins. Here's a sample output:


[...]
[workspace] $ /bin/sh -xe /tmp/hudson3255767718598715423.sh
+ ./bin/run-dbtest-suite
basedir=/var/lib/jenkins/jobs/auth-db/workspace
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/__initdb__.my ........................................... 
1..2
ok 1 - Using utf8tests database
ok 2 - Server charset is latin1
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_bin.my ................................... 
1..6
ok 1 - All our records are there. No duplicate key error.
ok 2 - utf8_bin collation does not collate a/â/à/A/...
ok 3 - utf8_bin collation does not collate a/â/à/A/...
ok 4 - utf8_bin collation does not collate a/â/à/A/...
ok 5 - Query for mixed-case username does not return lowercase username
ok 6 - Query for upper-case username does not return lowercase username
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_general_ci.my ............................ 
1..7
ok 1 - Collation for t007 is utf8_general_ci
ok 2 - utf8_general_ci collation normalizes accents, diacritics and the like
ok 3 - A and Å are collated to the same character in the utf8_general_ci table
ok 4 - å and Å are collated to the same character in the utf8_general_ci table
ok 5 - lower/upper case chars are collated in the utf8_general_ci table
ok 6 - lower/upper case chars are collated in the utf8_general_ci table
ok 7 - We are allowed to insert all records just because there is no unique constraint
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/collation-utf8_unicode_ci.my ............................ 
1..7
ok 1 - Collation for t005 is utf8_unicode_ci
ok 2 - utf8_unicode_ci collation normalizes accents, diacritics and the like
ok 3 - A and Å are collated to the same character in the utf8_unicode_ci table
ok 4 - å and Å are collated to the same character in the utf8_unicode_ci table
ok 5 - lower/upper case chars are collated in the utf8_unicode_ci table
ok 6 - lower/upper case chars are collated in the utf8_unicode_ci table
ok 7 - We are allowed to insert all records just because there is no unique constraint
ok
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/default-table-charset.my ................................ 
1..3
ok 1 - Default character set is utf8 when no charset is specified (from server)
ok 2 - Default character set is utf8 when "CHARSET utf8" specified in the CREATE TABLE
ok 3 - Default character set is utf8 when "CHARSET utf8" and "COLLATE" specified in the CREATE TABLE
ok
...
/var/lib/jenkins/jobs/auth-db/workspace/t/database-tests/username-with-utf8-chars.my ............................. 
1..5
ok 1 - We have some UTF-8 encoded string in our hands (hex)
ok 2 - We have some UTF-8 encoded string in our hands (charset)
ok 3 - Can select back UTF-8 content from a CHARSET utf8 table
ok 4 - Given string is exactly 24 bytes long (length)
ok 5 - Given string is exactly 8 (wide) characters long (char_length)
ok
All tests successful.
Files=11, Tests=80, 0.739731 wallclock secs ( 0.05 usr  0.02 sys +  0.07 cusr  0.01 csys =  0.15 CPU)
Result: PASS
Recording test results
Finished: SUCCESS
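The bin/run-dbtest-suite wrapper itself isn't shown here, but as a rough sketch it can be little more than a prove invocation that feeds each .my file to the mysql client; these client options are the ones I recall from the mytap documentation, and the user and database names below are placeholders:

$ prove --exec 'mysql -u test_user --batch --raw --skip-column-names --unbuffered --database utf8tests' t/database-tests/*.my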

And here's an example of a "sanity check" test case, which doesn't do much:


-- Check that we can insert and retrieve UTF-8 content correctly

BEGIN;

SET NAMES utf8;

SELECT tap.plan(5);

USE auth_utf8tests;

SET @username = '今日话题今日话题';
SET @encoded  = 'E4BB8AE697A5E8AF9DE9A298E4BB8AE697A5E8AF9DE9A298';

SELECT tap.eq(
    HEX(@username),
    @encoded,
    'We have some UTF-8 encoded string in our hands (hex)'
);

SELECT tap.eq(
    CHARSET(@username),
    'utf8',
    'We have some UTF-8 encoded string in our hands (charset)'
);

INSERT INTO t001 (f1) VALUES (@username);

SELECT tap.eq(
    (SELECT HEX(f1) FROM t001 WHERE f1 = @username),
    @encoded,
    'Can select back UTF-8 content from a CHARSET utf8 table'
);

SELECT tap.eq(
    (SELECT LENGTH(f1) FROM t001 WHERE f1 = @username),
    24,
    'Given string is exactly 24 bytes long (length)'
);

SELECT tap.eq(
    (SELECT CHAR_LENGTH(f1) FROM t001 WHERE f1 = @username),
    8,
    'Given string is exactly 8 (wide) characters long (char_length)'
);

-- Finish the tests and clean up.
CALL tap.finish();
ROLLBACK;

This SQL test code uses mytap. The SELECT tap.* calls are the equivalents of the functions in Perl's TAP-based testing framework: SELECT tap.eq() is the equivalent of Test::More::is(), and so on.
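To make the mapping concrete, here's a minimal hypothetical skeleton; the comments show the rough Test::More equivalent of each call:

SELECT tap.plan(2);                            -- Test::More: plan tests => 2;
SELECT tap.ok(1 = 1, 'truthiness test');       -- ok($condition, $description);
SELECT tap.eq('got', 'got', 'equality test');  -- is($got, $expected, $description);
CALL tap.finish();                             -- roughly done_testing(): checks the plan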

Another, more interesting test case is the following:


--
-- Verify how the utf8_unicode_ci collation works
--

BEGIN;

SET NAMES utf8;

SELECT tap.plan(12);

USE auth_utf8tests;

[...]

SELECT tap.eq(
    (SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA=SCHEMA() AND TABLE_NAME='t015'),
    'utf8_unicode_ci',
    'Collation for t015 is utf8_unicode_ci'
);

[...]

SELECT tap.eq(
    (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1a' ORDER BY id),
    '10',
    'utf8_unicode_ci collation normalizes accents, diacritics and the like'
);

SELECT tap.eq(
    (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1Å' ORDER BY id),
    '10',
    'A and Å are collated to the same character in the utf8_unicode_ci table'
);

SELECT tap.eq(
    (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'testuser1å' ORDER BY id),
    '10',
    'å and Å are collated to the same character in the utf8_unicode_ci table'
);

SELECT tap.eq(
    (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'TestUser1A' ORDER BY id),
    '10',
    'lower/upper case chars are collated in the utf8_unicode_ci table'
);

SELECT tap.eq(
    (SELECT GROUP_CONCAT(id) FROM t015 WHERE username = 'TESTUSER1A' ORDER BY id),
    '10',
    'lower/upper case chars are collated in the utf8_unicode_ci table'
);

SELECT tap.eq(
    (SELECT COUNT(*) FROM t015),
    1,
    'We are allowed to insert only 1 record, because the others collate to the same string'
);

-- Finish the tests and clean up.
CALL tap.finish();
ROLLBACK;

One interesting thing that I didn't know how to do at first was trapping errors. I left that part out of the test code above to keep it simple, but here it is:


DELIMITER //

DROP PROCEDURE IF EXISTS populate_table //

CREATE PROCEDURE populate_table ()
BEGIN

    DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' BEGIN
        SELECT tap.ok(
            1,
            'We should get dupkey errors when inserting data with collation utf8_unicode_ci'
        );
    END;

    INSERT INTO t015 (id,username,note) VALUES (10, 'testuser1a', 'plain');
    INSERT INTO t015 (id,username,note) VALUES (20, 'testuser1â', 'circumflex a');
    INSERT INTO t015 (id,username,note) VALUES (30, 'testuser1à', 'a grave');
    INSERT INTO t015 (id,username,note) VALUES (40, 'testuser1Å', 'A circ');
    INSERT INTO t015 (id,username,note) VALUES (50, 'TestUser1A', 'mixed case');
    INSERT INTO t015 (id,username,note) VALUES (60, 'TESTUSER1A', 'upper case');

END;

//

DELIMITER ;

[...]

/* Should generate 5 dupkey errors (taken as successful tests) */
CALL populate_table;

It's a bit convoluted. To trap errors you have to use the DECLARE … HANDLER statement. DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' means: whenever SQLSTATE '23000' is raised, which corresponds to a duplicate key error, execute this block of code, then continue with the statement after the one that failed. All of this must be wrapped in a stored procedure: handlers are not allowed outside of stored programs.

In this particular test, the table uses the utf8_unicode_ci collation, so we expect a duplicate key error on username whenever we insert a string like 'testuser1à' or 'TESTUSER1A', because 'testuser1a' was already inserted at the beginning. Of all the INSERT statements, only the first one is bound to succeed, so I put a SELECT tap.ok(1) in the duplicate key HANDLER, and I expect 5 of these tests to fire when I run CALL populate_table;.
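The table definition isn't shown in the post, but for the test to behave this way it needs a unique index on username and the utf8_unicode_ci collation, something along these lines (a hypothetical reconstruction, not the actual schema):

CREATE TABLE t015 (
    id       INT         NOT NULL,
    username VARCHAR(64) NOT NULL,
    note     VARCHAR(64),
    PRIMARY KEY (id),
    UNIQUE KEY (username)    -- this is what triggers the dupkey errors
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;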

This may of course seem trivial. And I guess it is, but for me it's a much better way of learning than scouring the manuals or the many blog posts out there that may or may not reflect the environment I'm working with.

Routinely running this kind of test suite makes it easy to verify the database behaviour:

  • instantly
  • after upgrades (5.1 -> 5.5? -> 6?) or storage engine changes
  • after MySQL configuration changes. For example, I discovered this way that adding default-charset=utf8 to my MySQL config breaks everything (see the snippet below).
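For reference, this is the kind of server-level setting I mean. The exact option name varies between MySQL versions, so treat this as an illustrative sketch rather than a recommended config:

[mysqld]
# Forces a server-wide default charset; an innocent-looking change
# that the test suite caught for me. In MySQL 5.5+ the equivalent
# option is character-set-server.
default-character-set = utf8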

I consider this my live documentation on how MySQL works. I would really appreciate any feedback on this. Have fun!

MySQL alter table in a transaction

I had quite a painful experience yesterday, I would say…

We were about to run live database migrations, when I decided to try them first on a staging database server, to get a more precise idea of how long they would take. So I connected to the MySQL test system:

mysql> use myopera
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> BEGIN WORK;
Query OK, 0 rows affected (0.00 sec)

mysql> ALTER TABLE blah ADD COLUMN xxx ...
...

The ALTER operation completed in a minute or so. Then I wanted to revert it, so I typed:

mysql> ROLLBACK;

So the transaction was rolled back. Just to double check, I verified that the table had indeed been left in its original state. To my surprise, it hadn't. Yes, the table is an InnoDB table, and yes, I had put the ALTER statement inside a transaction.

Even after a ROLLBACK, the table is left altered… Isn't that surprising? It was for me. Of course, it is documented MySQL behaviour: ALTER TABLE is a DDL statement, and DDL statements cause an implicit commit, so there is no transaction left to roll back. It's also filed as a bug/feature request.
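Here's roughly what happens, as a sketch (table and column names are made up):

mysql> BEGIN WORK;
mysql> ALTER TABLE blah ADD COLUMN xxx INT;   -- DDL: implicitly commits the transaction
mysql> ROLLBACK;                              -- succeeds, but there's nothing left to roll back
mysql> SHOW CREATE TABLE blah \G              -- the xxx column is still there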

Primary to secondary master failover

Here's how to fail over from the primary to the secondary master.
This was written following the My Opera case, and we use MySQL, but it should be fairly generic.

Disable monitoring checks

  • Pause any Pingdom checks that are running for the affected systems
  • Set downtime or disable notifications in Nagios for the affected systems

Record log file and position

Assuming your secondary master (DB2) is idle, record the current log file and position by issuing a SHOW MASTER STATUS command:


mysql> SHOW MASTER STATUS \G
*************************** 1. row ***************************
            File: mysql-bin.000024    <- MASTER LOG FILE
        Position: 91074774            <- MASTER LOG POSITION
    Binlog_Do_DB:
Binlog_Ignore_DB:

1 row in set (0.00 sec)

Write them down somewhere.
If you need to perform any kind of write/alter query on this host afterwards, you have to issue SHOW MASTER STATUS again, because the position will have changed.

Also try repeating this command. You should see that the log file and position do not change between different runs.

Enable maintenance mode

Now is the time to enable your maintenance or downtime mode for the site or service. That will of course depend on your deployment tools.

Stop backend servers

Your backend/application servers might need to stay up and running. For example, in the case of the Auth service we want this, because we're going to serve static responses (html, xml, etc…) to the clients instead of just letting the connections hang.

In other cases, it's fine to just shut down the backends. You may want to do this for 2 reasons:

  • to make sure that nobody is accidentally hitting your master database, from your internal network or otherwise
  • because doing so should close all the connections to your master database. This actually depends on the wait_timeout variable in the MySQL server: the connections won't go away until wait_timeout seconds have passed. This is normal behaviour, so don't panic if you still see connections after you shut down the backends (see the check below).
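To see what's going on, you can check the timeout value and watch the remaining connections drain:

mysql> SHOW VARIABLES LIKE 'wait_timeout';
mysql> SHOW PROCESSLIST;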

Switch to the new master now

This depends on how you actually perform the switch. I can imagine at least 2 ways to do this:

  • by instructing LVS to direct all connections to the secondary master
  • by taking over the IP address, either manually or using keepalived

On My Opera, we use keepalived with a private group between the two master database servers, so it's just a matter of:

  • stopping keepalived on the primary master database
  • starting keepalived on the secondary master database

Here is a quick and dirty bash script that verifies which host is the master and performs the switch.

#!/bin/bash
# Uses bashisms ([[ ]], function), so it needs bash, not plain sh.

DB1=pri-master-hostname
DB2=sec-master-hostname

function toggle_keepalive() {
        host=$1
        if [[ `ssh $host pidof keepalived` == "" ]]; then
                ssh $host /etc/init.d/keepalived start
                if [[ `ssh $host pidof keepalived` == "" ]]; then
                        echo '*** KEEPALIVED START FAILED ***'
                        echo 'Aborting the master failover procedure'
                        exit 1
                fi
        else
                ssh $host /etc/init.d/keepalived stop
                if [[ `ssh $host pidof keepalived` != "" ]]; then
                        echo '*** KEEPALIVED STOP FAILED ***'
                        echo 'Aborting the master failover procedure'
                        exit 1
                fi
        fi
}

echo "Master Database failover"
echo

# Find out who's the primary master now, and swap them
if [[ `ssh $DB1 pidof keepalived` == "" ]]; then
        PRIMARY=$DB2
        SECONDARY=$DB1
else
        PRIMARY=$DB1
        SECONDARY=$DB2
fi

echo Primary is $PRIMARY
echo Secondary is $SECONDARY

# Shutdown primary first, then enable secondary
toggle_keepalive $PRIMARY
toggle_keepalive $SECONDARY

As soon as you do that, the secondary master will be promoted to primary master.
Since the two masters are assumed to be already replicating from each other, nothing changes for them. It does change for all the slaves that were replicating from the old primary master; we'll see what to do about that later.

Restart backend servers

Now it's the right time to restart the backend servers, and check that they correctly connect to the new primary master.

On My Opera, we're using a virtual address, w-mlb (write-mysql-load-balancer), to refer to the active primary master database. We use this name in the configuration files everywhere.

This means that we don't have to change anything in the backend servers' configuration. We just restart them, and they will connect to the new primary master, thanks to the IP takeover step described above.

Turn off maintenance mode

If the backends are working correctly and connecting to the new master DB, it's time to remove the maintenance page, so do that.

We toggle maintenance mode by enabling or disabling a virtual host configuration in our frontends and reloading or restarting the frontend httpd servers.

From now on, your application is hopefully up and running and receiving client requests, so your downtime window is over.

Check replication lag

The database slaves at this point are still replicating from the former primary master database (DB1).

But DB1 is not receiving any traffic (queries) anymore, so it's basically idle, and it should be: any queries happening on DB1 now mean that something is seriously wrong. There might be lingering connections, but no activity.

It's important that all the slaves show no replication lag, so issuing a SHOW SLAVE STATUS command should report zero seconds behind the master.

mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: <DB1-ip>
                Master_Port: <port>
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000025
        Read_Master_Log_Pos: 13691126
...
      Seconds_Behind_Master: 0

1 row in set (0.00 sec)

It's important that Seconds_Behind_Master is zero.
If it's not, it means that the slave needs more time to fully replay all the past traffic that had been going on on the former primary master, DB1.

Remember that the primary master is now DB2, while DB1 is the secondary master.

Change master on the slaves

Now you can perform the CHANGE MASTER TO command on all the slaves.

This is where you need the master log file and position you wrote down earlier.

First, stop the slave replication.

mysql> STOP SLAVE;

Then the exact command to issue, if nothing else about your replication changed, is:

mysql> CHANGE MASTER TO MASTER_HOST='<DB2-ip>', MASTER_LOG_FILE='<master_log_file>', MASTER_LOG_POS=<master_log_position>;

Then restart the slave replication:

mysql> START SLAVE;
mysql> SHOW SLAVE STATUS \G

The SHOW SLAVE STATUS \G command should show the replication running and, depending on how long it took you to change master after the new master took over the IP, some seconds of replication lag.

This number should rapidly go down towards zero.

If it's not, then you might have a problem. Go hide now or take the first flight to Australia or something.
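To keep an eye on the lag while it drains, something like this works (hypothetical host name; adjust credentials as needed):

$ watch -n 5 "mysql -h <your_slave> -e 'SHOW SLAVE STATUS \G' | grep Seconds_Behind_Master"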

We wrote a switch-master Perl script that proved to be very effective and useful. Example:

./switch-master --host <your_slave> --new-master <new_master_ip> --log-file <master_log_file> --log-pos <master_log_position>

This script performs a lot of sanity checks. Before switching master, it checks that the replication lag is zero; if it's not, it waits a bit and checks again, etc…

It's made to try to prevent disaster from striking. Very useful and quick to use.

Enable monitoring checks

Now verify that everything looks fine: replication lag is zero, the backends are working correctly, and the site holds up when you use it a bit.

If everything's fine, enable or unpause the monitoring checks.
You have made it!

Looking at Cassandra

When I start to get interested in a project, I usually join the project's users mailing list and read it for a couple of months, to get a general feeling of what's going on, what problems people have, etc…

I became very interested in Cassandra, "A highly scalable, eventually consistent, distributed, structured key-value store".
So, almost 2 months ago, I joined the cassandra-users mailing list.

Turns out that Cassandra is in production at huge sites like Twitter and Digg. Which doesn't really tell you much if you don't know how they use it, and what they use it for. However, guess what? There's a lot of information out there.

Here are a few links I'd like to share about Cassandra:

The MySQL Sandbox

I learned about the MySQL Sandbox last year in Lisbon, at the European Perl conference. I remember talking about it with Giuseppe Maxia, the original author, and promising him I would try it.

I wasn't really convinced about it until the first time I tried it.

That was when, on My Opera, we switched from master-slave to master-master replication, during a weekend last November. I was a bit scared of doing the switch. Then I remembered MySQL Sandbox, and I tried it on an old laptop.

I was absolutely amazed to discover that I could simulate pretty much any setup, from simple to complex: master-slave, master-master, circular multi-master, etc…
It was also very quick to set up, and installing new sandboxed servers is very fast.
You can set up a five-server replication topology in less than 30 seconds.

MySQL Sandbox is a Perl tool that you can easily install from CPAN with:

$ sudo cpan MySQL-Sandbox

You get the make_sandbox command, which allows you to create new sandboxes. Right now I'm trying it again for a maintenance operation we have to do on My Opera soon. I'm using the master-master setup like this:

make_replication_sandbox --master_master --server-version=/home/cosimo/mysql/mysql-5.0.51a-....tar.gz

so I can simulate the entire operation, try to minimize the risk of messing up the production boxes, and get a bit more confident about these procedures.
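Once a sandbox is created, each node comes with little wrapper scripts around the mysql client. For a plain master-slave replication sandbox, poking at the replication looks something like this (directory and script names from memory, they may differ slightly on your setup, and a master-master sandbox names its nodes differently):

$ cd ~/sandboxes/rsandbox_5_0_51a
$ ./start_all                          # start every sandboxed server
$ ./m -e 'SHOW MASTER STATUS'          # query the master
$ ./s1 -e 'SHOW SLAVE STATUS \G'       # query the first slave
$ ./check_slaves                       # quick replication health check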

MySQL Sandbox, try it. You won't go back :-)

MySQL is just crazy or what?

Ok, I was working on the staging/test database, checking out some performance problems I was experiencing.
Then I thought of optimizing one table. Optimizing should be something like rebuilding the physical data file, to compact record structures, remove invalidated data, …

MySQL has this OPTIMIZE TABLE command.
Ok, fire!
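For reference, on a healthy table the command normally returns after a while with a small status result set, something like this (table name is made up):

mysql> OPTIMIZE TABLE sometable;
+----------------+----------+----------+----------+
| Table          | Op       | Msg_type | Msg_text |
+----------------+----------+----------+----------+
| mydb.sometable | optimize | status   | OK       |
+----------------+----------+----------+----------+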

Two or three minutes passed. Then another two or three. Then again.

I started to worry and opened new sessions, to see if the table was locked. It was.
Completely locked up.

Ok, I left the session running. Went to lunch. Came back.
Still running. WTF? I decided to interrupt the command. CTRL + C.

MySQL responded with some error code, 22 or something.
Ok, let me just check whether I have to rebuild the indexes, just in case.

SELECT * FROM sometable

Boom!

Table corrupted. No kidding. I tried REPAIR TABLE and REPAIR TABLE QUICK.
No way. The table was completely DESTROYED.

Amazing…