Tag Archives: cpan

Matching IPv6 addresses with Regexp::Common

I wish Regexp::Common had a $RE{net}{IPv6} regular expression, but it doesn't (yet).

So I tried to implement this myself, but ripped off the IPv6 matching bit from the existing Regexp::IPv6 which happens to have a working IPv6 regular expression with a reasonable test suite. Now, why Regexp::IPv6 is not part of Regexp::Common?

By the way, I'll copy/paste the full regular expression to match IPv6 addresses, just for fun:

:(?::[0-9a-fA-F]{1,4}){0,5}(?:(?::[0-9a-fA-F]{1,4}){1,2}|:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})))|[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:
[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}|:)|(?::(?:[0-9a-fA-F]{1,4})?|(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2}))))|:(?:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})
[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|[0-9a-fA-F]{1,4}(?::[0-9a-fA-F]
{1,4})?|))|(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|
2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|:[0-9a-fA-F]{1,4}(?::(?:(?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){0,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,2}(?::(?:
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,3}
(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|
[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4})
{0,4}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))

Of course, I will never be able to tell if it's right or wrong, but the fact is that it passes the test suite :)
However, the actual code is not like that: it generates the full regular expression from a few components. Anyway, I've pushed a ipv6 branch on my fork of Regexp::Common. I hope it will be included soon in Regexp::Common or improved it enough to be included in it, so we can finally match IPv6 addresses with:


use Regexp::Common;

my $addr = '2001:0db8:0000:0000:0000:0000:1428:57ab';
if ($addr =~ $RE{net}{IPv6}) {
    print "Yes, it is an IPv6 address";
}
else {
    print "No, it isn't";
}

Continuous integration of Perl-based projects in Hudson/Jenkins

I didn't find massive amounts of information about how to link any Perl-based project to Jenkins for continuous integration, but there's a few presentations on Slideshare that carry some nice ideas.

While some older pages say that "there's no out-of-the-box integration, etc…", I think there is. A very simple, very straight-forward way to integrate any (Perl) project that uses TAP into Jenkins.

TAP::Harness::JUnit

Here we go then:

TAP::Harness::JUnit will capture all the standard TAP output and turn it into the default JUnit XML output that Jenkins expects. And you don't need to do anything to make this happen. How cool is that? Read below.

Build instructions

You need to instruct Jenkins on how to build your project. So, in the "Build" panel, I usually put:

prove -I ./lib -v

If you don't use prove, be ashamed and start using it :) You'll never look back. So, getting Jenkins to understand TAP is just a matter of modifying that command to read:

prove -I ./lib -v --harness=TAP::Harness::JUnit

Here's the actual Build panel screenshot:

That's it. prove will produce a junit_output.xml file with the JUnit-compatible XML output that corresponds to the standard TAP output.

Post-build actions

Now you need to tell Jenkins that the file is actually there. I'm not sure why, but this is not automatic. You need to tell it to "Publish JUnit test results". Now, if you ask me that's totally surprising, but it works. So:

That should be it. Run your build and you should see your tests output picked up.

Geo::IP support for IPv6 geolocation

We're currently looking into IPv6-enabling our services. One of the missing bits is being able to geolocate IPv6 client addresses. We're using the MaxMind GeoIP database. The main Perl library for this is Geo::IP.

The current version of Geo::IP out on CPAN, 1.38, does not support IPv6 lookups. I contacted the maintainer of Geo::IP asking for more information. In the meanwhile, I hacked together just enough of IPv6 support to be able to successfully geolocate a test address. Later on, I discovered that IPv6 is already available in the hopefully soon-to-be-released version of Geo::IP archived at Sourceforge.

Let's hope it lands on CPAN soon. In the meantime, if you really really want, you can try out my changes against CPAN v1.38. It was enough for me to start testing the integration with our other code and projects.

My Geo::IP with IPv6 support

Ubuntu 10.10, modperl and Apache segfaulting fixed

Last month, before moving to Melbourne, where I am now, to work in the Opera Australia office for a few months, I had to setup a laptop for all the development work I normally do. So I chose Ubuntu 10.10 amd64. I have to say I'm quite happy with it. Everything works out of the box for me, including a Quickcam 9000 USB camera I used to shoot this poor time-lapse video from my new office window. Woot!

Anyway, the development environment for one particular project consists of Apache and mod_perl. So I setup the usual list of dependencies, but when I tried to start Apache to run the test suite, it would always stop right away with a segmentation fault.

Didn't really dig into the problem. Just straced the apache process, and that's what I got:

[apache starts up, reads a bunch of Perl modules, and opens the access  
log...]
...
brk(0x7f342adbd000)                     = 0x7f342adbd000
...
...
brk(0x7f342adde000)                     = 0x7f342adde000
brk(0x7f342adff000)                     = 0x7f342adff000
brk(0x7f342ae20000)                     = 0x7f342ae20000
brk(0x7f342ae41000)                     = 0x7f342ae41000
stat("/usr/lib/perl5/auto/DBI/DESTROY.al", 0x7f341e8459b0) = -1 ENOENT (No  
such file or directory)
stat("/home/cosimo/src/auth-svn/lib/auto/DBI/DESTROY.al", 0x7fffc4e2d520)  
= -1 ENOENT (No such file or directory)
stat("/home/cosimo/src/myopera-trunk/lib/auto/DBI/DESTROY.al",  
0x7fffc4e2d520) = -1 ENOENT (No such file or directory)
stat("/etc/perl/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1 ENOENT (No such  
file or directory)
stat("/usr/local/lib/perl/5.10.1/auto/DBI/DESTROY.al", 0x7fffc4e2d520) =  
-1 ENOENT (No such file or directory)
stat("/usr/local/share/perl/5.10.1/auto/DBI/DESTROY.al", 0x7fffc4e2d520) =  
-1 ENOENT (No such file or directory)
stat("/usr/lib/perl5/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1 ENOENT (No  
such file or directory)
stat("/usr/share/perl5/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1 ENOENT  
(No such file or directory)
stat("/usr/lib/perl/5.10/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1 ENOENT  
(No such file or directory)
stat("/usr/share/perl/5.10/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1  
ENOENT (No such file or directory)
stat("/usr/local/lib/site_perl/auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1  
ENOENT (No such file or directory)
stat("./auto/DBI/DESTROY.al", 0x7fffc4e2d520) = -1 ENOENT (No such file or  
directory)
stat("/var/tmp/test_cosimo_22931/auto/DBI/DESTROY.al", 0x7fffc4e2d520) =  
-1 ENOENT (No such file or directory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

I thought it would be wiser to ask for advice on the DBI and mod_perl mailing lists. Tim Bunce suggested to try and get a stack trace of Apache. Why didn't I think of that in the first place? A few days later, I got my stack trace:

# gdb -c ./core /usr/sbin/apache2 
... 
Reading symbols from ... 
... 
Core was generated by `/usr/sbin/apache2 -d /var/tmp/test_cosimo_9727 -k 
start -C User cosimo -C Group ...'. 

Program terminated with signal 11, Segmentation fault. 
#0 0x00007fdaedfed858 in XS_Class__XSAccessor_END () 
from /usr/lib/perl5/auto/Class/XSAccessor/XSAccessor.so 
(gdb) backtrace 
#0 0x00007fdaedfed858 in XS_Class__XSAccessor_END () 
from /usr/lib/perl5/auto/Class/XSAccessor/XSAccessor.so 
#1 0x00007fdaf83cf845 in Perl_pp_entersub () from /usr/lib/libperl.so.5.10 
#2 0x00007fdaf83752c6 in Perl_call_sv () from /usr/lib/libperl.so.5.10 
#3 0x00007fdaf86ad40b in modperl_perl_call_list () 
from /usr/lib/apache2/modules/mod_perl.so 
#4 0x00007fdaf86b5786 in modperl_perl_destruct () 
from /usr/lib/apache2/modules/mod_perl.so 
#5 0x00007fdaf86a6256 in modperl_interp_destroy () 
from /usr/lib/apache2/modules/mod_perl.so 
#6 0x00007fdaf86a6715 in modperl_tipool_destroy () 
from /usr/lib/apache2/modules/mod_perl.so 
#7 0x00007fdaf86a62b2 in modperl_interp_pool_destroy () 
from /usr/lib/apache2/modules/mod_perl.so 
#8 0x00007fdaf98fd4e3 in ?? () from /usr/lib/libapr-1.so.0 
#9 0x00007fdaf98fc3b1 in apr_pool_destroy () from /usr/lib/libapr-1.so.0 
#10 0x00007fdaf98fc27f in apr_pool_clear () from /usr/lib/libapr-1.so.0 
#11 0x00007fdafa1b960d in main (argc=11, argv=0x7fff93b50ef8) 
at /build/buildd/apache2-2.2.16/server/main.c:692

Even if you don't know anything about stack traces, this output gently points to Class::XSAccessor. Perrin Harkins on the mod_perl list suggested to update Class::XSAccessor to the latest CPAN version, since its changelog mentioned some segmentation faults fixed in 0.10.

And that did it. No more segfaults on Ubuntu 10.10. Solution: upgrade Class::XSAccessor to 0.10+. Thanks to Class::XSAccessor maintainer(s)!

IPv6 and Perl. What’s the status?

This is more like a request for help than the usual babbling about something. IPv4 address space appears to be almost entirely allocated. IPv6 is there, ready for us to use (more or less).

I'm trying to reserve some time to prepare and test our systems for client and server IPv6 addresses. What is your experience in particular with Perl software? I know almost nothing about it, except there's:

I tried to download and test Net::IPv6Address, but that didn't work, tests fail, and the module is also last updated in 2008… Can you enlighten me on what's the state of the art regarding Perl (but not only, ofc) and IPv6? KTHXBYE.

In the meanwhile, I'll read a bit more on the whole IPv6 matter. First task: write an anonymizer for a IPv6 address.

Adding the irc NOTICE capability to Bot::BasicBot

Bot::BasicBot is a Perl module that provides a really easy, fast and convenient way to build plugin-based IRC bots. I'm playing around with an IRC bot that should assist in continuously deploying projects.

This bot has two main functionalities:

  • keep track of continous integration builds
  • initiate and keep track of deployments

Right now the bot reads a main configuration file with data about projects, repositories, continuous integration, etc… and answers commands. This is an example:


21:58 <@cosimo> projects-list
21:58 < deployer> auth, geodns, libopera, link, myopera, sso
21:58 <@cosimo> build-status geodns
21:58 < deployer> 97ad24e success cosimo https://git.server/functests/builds/geodns/97ad24e
21:58 <@cosimo> latest-revision sso
21:58 < deployer> 5207cfe, https://git.server/?p=sso.git;a=commit;h=fe977d32e9580551dffe8139396106ba25207cfe
21:59 <@cosimo> build-status auth
21:59 < deployer> 24135 success cosimo https://test.server/functests/builds/auth-unit/24135
21:59 < deployer> 24135.2 success (manual) https://test.server/functests/builds/auth-functional/24135.2

Another functionality of the bot is to detect new builds, and automatically send updates to a given channel, stating the project, the new VCS revision, the committer and a link to the continuous integration test run. Example:


17:22 -deployer:#chan- sso, fe977d3 success cosimo https://test.server/functests/builds/opera-sso/fe977d3

In the future, I also want to command the bot to initiate deployments. Anyway, the problem was that Bot::BasicBot apparently lacked support for sending IRC notices. This caused all the bot messages to interrupt the flow of IRC conversations. Bot::BasicBot has source code also on github, so I just forked it and added support for IRC notices. I just noticed that the author already pulled in the new changes. o/

It will still take years for this to land in Debian, but still… :-)

If Text::Hunspell never worked for you, now it’s time to try it again!

If you don't know, Hunspell is the spell checker engine of OpenOffice.org, and it's also included in the Opera and Mozilla browsers.

We were trying to use it from Perl, using the old Text::Hunspell module, version 1.3, but we had problems with it. Big problems. Like segfaults and tests that wouldn't run.

A bright hacker from Italy :) was then called in to fix the problem, with the promise of a fantastic prize he hasn't seen yet… [ping?] :-)

During the process, I found out I know absolutely nothing about dictionary files and stuff, and my fixes were – I would say – definitely horrible.

But! There's a bright side, of course, and that is that the module works just fine now, at least on Debian/Ubuntu systems. Before using Text::Hunspell, you want to install the following packages:

  • hunspell
  • libhunspell-dev

The example in the POD documentation (and in the examples dir) uses the standard US english dictionary. If you don't have that, you will need to change the script slightly. But the code is tested and should work without a problem. If you try it out and you have feedback, by all means let me know. Thanks!

Source code available on GitHub at:
http://github.com/cosimo/perl5-text-hunspell/

The module, tagged as 2.00 because it's cool :), will be up on CPAN shortly at this address:
http://search.cpan.org/dist/Text-Hunspell/

Self-contained instant functional test suites

I first learned about functional test suites when I started working on the My Opera code. Two years and some time later, I found myself slightly hating the My Opera functional test suite.

The main reasons:

  • It's too slow. Currently, it takes anywhere between 20 and 30 minutes to complete. Sure, it's thousands of tests, divided into hundreds of test scripts, grouped by functional area, like login, blogs, albums, etc…

    Even considered all of this, I think we should aim to have a single functional test run complete in 5-10 minutes.

    Most of the time is being wasted in the communication with the test server and creation and destruction of the database.

  • It's unreliable, sometimes cumbersome to manage. In our setup, CruiseControl fires a functional test suite run after every commit into the source code repository.

    That initiates an ssh connection to the test server, where a shell script takes care of restarting the running apache, dropping and creating the test database, updating required packages, etc…

    This implies that you need an apache instance running on a specific port, a mysql instance running on another port, etc… In the long run, this proved to be an approach that doesn't scale very well. It's too error prone, and not very reliable. Maybe your test run will fail because mysql has mysteriously crashed, or some other random bad thing.

A functional test suite that sucks less

Given the motivation, I came up with this idea of a functional test suite that is:

  1. instant: check out the source code from the repository, and you're ready to go.
  2. self-contained: it shouldn't have any external servers that need to be managed or even running.
  3. reusable: it doesn't have to be a functional test suite. The same concept can be reused for unit test suites, or anything really.

A few months later the first idea, and a couple of weeks of work on it, and we were ready with the first prototype. Here's how we use it:

  • Check out the source code from the repository in ~/src/myproject
  • cd ~/src/myproject
  • ./bin/run-functional-test-suite
  • Private instances of Apache, MySQL, and the main application are created and started
  • A custom WWW::Mechanizer-based client runs the functional test cases against the Apache instance
  • A TAP stream from the test run is produced and collected
  • All custom instances are destroyed
  • ???
  • Profit!

The temporary test run directory is left there untouched, so you can inspect it in case of problems. This is priceless, because that folder contains all the configuration and log files that the test run generated.

The main ideas

  • The entire functional test suite should run within a unique temporary directory created on the fly. You can run many instances as you want, with your username or a different one. They won't conflict with each other.
  • Use as much as possible the same configuration files as the other environments. If you already have development, staging and production, then functional-test is just another one of your environments. If possible, avoid creating special config files just for the functional test suite.
  • The Apache, MySQL, and Application configuration files are simple templates where you need to fill in the apache hostname, apache port, mysql port, username, password, etc…
  • Apache is started up from the temporary directory using the full command invocation, as:

    /usr/sbin/apache2 -k start -f /tmp/{project}_{user}_testsuite_{pid}/conf/apache.conf
    

    and all paths in config files are absolute, to avoid any relative references that would lead to files not found.
    The configuration template files have to take this into account, and put a $prefix style variable
    everywhere, like (Apache config, for example):

    Timeout 10
    Keepalive On
    HostnameLookups Off
    ...
    Listen [% suite.apache.port %]
    ...
    

  • The MySQL instances are created and destroyed from scratch for every test suite run, using the amazing MySQL::Sandbox. My colleague Terje had to patch the sandbox creation script to fix a problem with 64-bit environments. We filed a bug on the MySQL::Sandbox RT queue about this. This is the only small problem found in a really excellent Perl tool. If you haven't looked at it, do it now.

All in all, I'm really satisfied of the result. We applied it to the Auth project for now, which is smaller than My Opera, and we fine-tuned the wrapping shell script to compensate for the underlyiung distribution automatically. We've been able to run this functional test suite on Ubuntu from version 8.04 to 10.04, and on Debian Lenny, with no modifications. The required MySQL version is detected according to your system, and MySQL::Sandbox is instructed accordingly.

It also results in a faster execution of the entire functional test suite.

What can we improve?

There's lots of things to improve:

  • Speed. Instead of creating and destroying the database for every test group, it would be nice to use transactions, and rollback everything at the end of the test group (t/{group-name}/*.t). I used to do this many years ago with Postgres, and it was perfectly safe. I'm not sure we will ever make it with MySQL, since for example, ALTER TABLE (and other SQL statements) completely ignores transactions.
  • Reliability. The private instances of Apache, MySQL (and recently we added memcached too), are started up from ports that are calculated from the originating bash script PID. So, if you run the test suite from the bash script running as PID 8029, then MySQL is started up on port 8030, Apache on 8031, memcached on 8032. This has worked very well for now, and we made sure we don't use ports < 1024, but could lead to mysterious failures if some local services are using ports like 5432, or others. The idea is that we could test if a port is available before using it. However, this error is already detected at startup time, so it shouldn't be a huge problem.

Feedback welcome!

“DebPAN”, a production-grade Debian CPAN repository

The problem

This is a proposal I came up with after talking to Gabor Szabo about his Perl Ecosystem Development proposal.

One of the major "problems" we face while developing and deploying production Perl-based systems with Debian is that the state of the Debian CPAN modules is depressingly outdated. As an example, we're using Catalyst in Lenny, and that dates back to 2008 for the most parts.

This is just not enough.

A solution: maintain your own APT repository

Our current solution is to manually package every bit and maintain our own internal Opera APT repository that our servers and applications depend on. That's not optimal for two reasons:

  • we have to package lots of modules, due to interdependencies, dedicating a fair amount of time to this activity that is not exactly "productive"
  • we can't trust our systems to be Debian anymore, since we're updating bits and pieces with the bold assumption that everything will work fine

Of course, 99.999% will work fine, since it's Perl, but the problem is that one day this could fall down on our heads.

So, given the problem, what are the solutions?

  • Continuing to manually keep an apt repository. Downside: Some(tm) waste of time
  • Use a different packaging/deployment system, like PAR::Repository. I would personally like this, but it doesn't eliminate the need to maintain an own repository. You just don't use dh-make-perl and friends, that are, IMHO, nice to have and useful
  • The "DebPAN"

The DebPAN

I know Jeremiah Foster, and at a couple of Perl events I heard him talking to other CPAN/Perl developers about these issues. His answer would probably be to file a request for packaging for the modules we're interested in.

The reason why that doesn't work is that the lead time for a given RFP to land on Debian stable is unacceptably long, and I realize that is for a good reason. After all, it's supposed to be stable, right?

What about a "DebPAN" repository?

That could be a 3rd party APT repository, something like debpan.perl.org:

  • maintained by a close group of Perl/Debian/CPAN developers
  • guaranteed to have a selection of the most important modules (more on that…) in a reasonably recent version
  • targeted to Debian stable, and maybe other distributions? I think Ubuntu 10.04 suffers from this same problem, but much less than Debian Lenny, just to pick two versions
  • maybe even with patches applied?, but that might be way too much, actually

The can of worms

Of course, lots of problems can arise. However, if we think this is a good idea, then we should try to have something even minimal up and running. Then we'll worry about all the problems…

However, who gets to decide the most useful modules? That should go by popular demand I guess. Even looking at Debian requests for packaging stats, maybe? I can also imagine that bigger companies using Perl would be interested in this to potentially save lots of "infrastructural work".

I'd be really interesting to know other people opinions on this, especially if they use Debian stable, Debian developers, or the Debian-Perl group itself.

Cache::Memcached::Mock, instant in-process memcached mock

A week ago, I wrote a Cache::Memcached mock module for some complicated unit tests in this project I'm working on.

A few people asked to upload it to CPAN, so here it is:

Cache::Memcached::Mock v0.01 is on CPAN.

I didn't spend that much time polishing it and making documentation, so it's a bit rough around the edges, but you get the idea.

You can use it as a drop-in replacement for Cache::Memcached when you don't want, or can't afford, to run your own memcached daemon.

I've already got a feature request from a colleague: making sure set() fails if you try to store a value bigger than 1Mb.