Tag Archives: development

Porting Facter from Ruby to Perl 6

I've been playing with Puppet for a while now. One of the most interesting (and simple!) components of Puppet is Facter.

Facter is a small piece of software that reports "facts" about your computer. When you run facter, its output looks like the following:

architecture => x86_64
facterversion => 1.5.6
fqdn => cd01.localdomain.lan
hardwaremodel => x86_64
hostname => cd01
id => cosimo
interfaces => eth0,pan0
ipaddress => 10.0.0.1
...
uptime_hours => 422
uptime_seconds => 1519256
virtual => physical

It's also interesting because it's extensible with your own custom plugins. A custom plugin is just a Ruby file, usually containing a call to Facter.add:

Facter.add(:kernel) do
    setcode do
        require 'rbconfig'
        case Config::CONFIG['host_os']
        when /mswin|win32|dos|cygwin|mingw/i
            'windows'
        else
            Facter::Util::Resolution.exec("uname -s")
        end
    end
end

As a little experiment to become more familiar with Ruby, and at the same time to enjoy writing some Perl 6, I decided to study the Facter project and then port it to Perl 6.

That took me a few hours over a couple of weekends, and it's almost completely functional. It is a straight port from the Ruby code, so it doesn't really use the magic powers of Perl 6 yet. The idea is to use a different branch, now that I know it well under the hood, and rewrite it from the ground up in Perl 6.

I found out that Ruby code maps very closely to Perl 6, apart from the yield instruction and a different model for static/instance variables. The closest Perl 6 construct to yield is gather/take, but I'm not really sure it's the appropriate one to use here. yield is used in the fact value resolution algorithm, and that is currently the only thing that doesn't work properly in my port; everything else is in place. Regarding static/instance variables, Ruby uses @attribute for instance variables and @@attribute for static variables. In Perl 6, public instance variables are declared as $.attribute, while static variables are package globals, so you can use our $attribute.
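
Here's a minimal Perl 6 sketch of those two mappings, just for illustration; this is not code from the actual port, and the Fact class below is made up:

# gather/take, the closest thing to Ruby's yield: the block produces
# values lazily, one take at a time.
my @commands = gather {
    take 'uname -s';
    take 'uname -r';
};
say @commands.elems;      # 2

# Instance vs "static" variables:
class Fact {
    has $.name;           # public instance attribute, like Ruby's @name
    our $count = 0;       # package global used as a static variable, like Ruby's @@count

    method register { $count++ }
    method how-many { $count }
}

my $fact = Fact.new(name => 'kernel');
$fact.register;
say $fact.name;           # kernel
say $fact.how-many;       # 1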

Of course, there's a lot more! If you're interested, take a look at the code on Github, the URL is http://github.com/cosimo/perl6-facter.

Another round of fixes for the varnish Accept-Language VCL extension

This very specific VCL extension to parse and normalize Accept-Language headers is becoming more robust as time goes by. Here's the latest round of fixes:

  • the original req.http.Accept-Language header could be overwritten when calling the vcl_rewrite_accept_language() function. Now this is fixed by copying the original header string into a static buffer and executing the processing on the copy.
  • improved the style of the C code a bit in a few places. Nothing miraculous really, but it feels cleaner, to me at least :)
  • fixed the use of a wrong define (string max length instead of languages list max length)
  • used sizeof() instead of repeating the same constants in strncpy() calls, etc…
  • added a small intro to the generated file too. This way, if you find the code on some server, you can immediately understand what it's supposed to do, and where it came from :)

Enough blah blah, here's the new code. I won't tell you this is experimental stuff, because at this point it's not experimental anymore. And if you were already running this piece of VCL, then by all means upgrade: it is definitely better :)

http://github.com/cosimo/varnish-accept-language

Enjoy!

Adding the irc NOTICE capability to Bot::BasicBot

Bot::BasicBot is a Perl module that provides a really easy, fast and convenient way to build plugin-based IRC bots. I'm playing around with an IRC bot that should assist in continuously deploying projects.

This bot has two main functionalities:

  • keep track of continuous integration builds
  • initiate and keep track of deployments

Right now the bot reads a main configuration file with data about projects, repositories, continuous integration, etc… and answers commands. This is an example:


21:58 <@cosimo> projects-list
21:58 < deployer> auth, geodns, libopera, link, myopera, sso
21:58 <@cosimo> build-status geodns
21:58 < deployer> 97ad24e success cosimo https://git.server/functests/builds/geodns/97ad24e
21:58 <@cosimo> latest-revision sso
21:58 < deployer> 5207cfe, https://git.server/?p=sso.git;a=commit;h=fe977d32e9580551dffe8139396106ba25207cfe
21:59 <@cosimo> build-status auth
21:59 < deployer> 24135 success cosimo https://test.server/functests/builds/auth-unit/24135
21:59 < deployer> 24135.2 success (manual) https://test.server/functests/builds/auth-functional/24135.2

Another functionality of the bot is to detect new builds, and automatically send updates to a given channel, stating the project, the new VCS revision, the committer and a link to the continuous integration test run. Example:


17:22 -deployer:#chan- sso, fe977d3 success cosimo https://test.server/functests/builds/opera-sso/fe977d3

In the future, I also want to be able to tell the bot to initiate deployments. Anyway, the problem was that Bot::BasicBot apparently lacked support for sending IRC notices, which caused all the bot messages to interrupt the flow of IRC conversations. Bot::BasicBot's source code is also on GitHub, so I just forked it and added support for IRC notices. I just noticed that the author has already pulled in the new changes. o/
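
To give an idea of how little code such a bot needs, here's a minimal sketch; it assumes the new notice() method takes the same channel/body arguments as the existing say() method, and the server/channel names are made up:

#!/usr/bin/env perl
# Minimal Bot::BasicBot-based announcer sketch.
package DeployerBot;
use strict;
use warnings;
use base 'Bot::BasicBot';

# Called for every message addressed to the bot
sub said {
    my ($self, $msg) = @_;
    return "auth, geodns, libopera, link, myopera, sso"
        if $msg->{body} eq 'projects-list';
    return;    # ignore everything else
}

# Something external (e.g. a build watcher) could call this to announce a build
sub announce_build {
    my ($self, $channel, $line) = @_;
    # NOTICE instead of PRIVMSG, so it doesn't interrupt conversations
    $self->notice(channel => $channel, body => $line);
}

package main;
DeployerBot->new(
    server   => 'irc.example.com',
    channels => ['#deploys'],
    nick     => 'deployer',
)->run;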

It will still take years for this to land in Debian, but still… :-)

Surge 2010 scalability conference in Baltimore, USA – DAY 2

This is a summary of day 2 of the Surge conference that took place in Baltimore, USA, 30th of September and 1st of October 2010. For a quite comprehensive blog post about day 1, you can read my previous post.

Here comes the list of talks I attended during Day 2.

Bryan Cantrill – failures in commodity hardware

What happens when commodity hardware is used in an "enterprise" hardware project? Bryan guided the audience through one such industrial hardware project. There was no recorded video of this talk, since the content was potentially "sensitive". Very interesting talk, and Bryan is IMO a very good speaker.

Benjamin Black – FastIP

Benjamin presented a (for me) new way to analyze network metrics, named "Flow". Flow-based network metrics can represent network activity in a way that is completely different from, and much more accurate than, what's usually done by operations and sysadmin departments. The downside is that it generates a lot of data. The advantage is that you can analyze, and even replay(?), any traffic that took place between any two nodes of the network. I'm sure I didn't understand that correctly, because it would be amazing.

There are products out there that offer flow-based network analysis: Cisco NetFlow, ntop nProbe, etc… There's also an IETF working group about flow. We couldn't see any examples/demos because there was a problem with the slides, IIRC.

FastIP also offers a related service. I contacted Benjamin about this after his talk. Maybe we'll be able to try something out or at least have a demonstration.

My TODO list:

Gavin Roy – Scaling MyYearBook.com

One of the most interesting talks in this conference IMO. MyYearbook is a Postgres shop, among the top 25 trafficked sites in the USA.

Gavin talked about many things they did to scale their site as the traffic was growing. Here's some of the things I remember:

  • DB connection pooling was very important for them, and made a world of difference. They use PgBouncer and pgpool-II
  • DB horizontal scaling with PL/Proxy. TODO: look it up
  • DB replication with Londiste, Slony, Bucardo
  • PostgreSQL 9.0-based standbys to increase read-only capacity, and for hot standby
  • Partitioned the database by table, a feature available since Pg 8.1

They have a primary-to-secondary master failover procedure. They looked into automating it, but a technical judgement call is really necessary in case something goes wrong, so they will keep it manual. This was a question I asked Gavin, since we've thought about automating our failover procedure for MySQL, but it's not so easy to decide when to trigger the failover…

For user storage, they use the Isilon IQ Series, apparently a FreeBSD appliance with on-board NFS. For DB servers, they looked at different solutions, but they keep coming back to direct attached storage. Their main DB server is a massively powerful machine: IIRC, 512 GB of RAM and 128 cores. I have to double check this because it seems really impressive.

John Allspaw – Go or No Go

Another great talk by John, well presented and with great content. Not easy to summarize. The main topic was the "Go or no-go meeting", a 10-minute get-together of all involved parties before releasing changes or launching any new feature live.

This meeting basically consists of Yes/No questions:

  • Have you tested enough to deploy? QA still needed?
  • Has the feature been communicated (blog/forum/…)?
  • Does everyone know: when it will go live? who will push the feature?
  • Has the feature been in production for staff (or beta users)? That can be tricky to implement if the new feature implies social interactions (beta user tagging non-beta user)
  • Is it possible to dark launch this feature? Will we?
  • Is it possible to turn on this feature on a % of users? Will we?
  • Does it involve new infrastructure? If so: is there monitoring in place? (BLOCKER)
  • On/Off switch in the code/config is in place? Is it documented?
  • Are all the relevant people available for communication and launch?
  • Is there a place for users to provide feedback about the feature?
  • Post-launch "it's all done" time agreed?
  • Contingency checklist done, and everyone has reviewed it? (BLOCKER)

The "Contingency list" should answer the question: "What could possibly go wrong? What will we do about it?", with a list of potential issues and how to solve them in case shit hits the fan.

Apart from the Go/No-go meeting, which, also according to my past experience, would be a great way to avoid problems, there are at least a couple more really nice things to keep in mind when developing or launching a new feature:

  • "Dark launches": a dark launch is essentially a full launch of the new feature, but in such a way that is invisible to users. So if you're making db queries and processing stuff, you keep doing all that, you just throw the data away. You will be able to realize the (almost) full impact of the new feature on your application and compensate accordingly.
  • Feature "sampling" (% of users): you just enable the full feature for a small, and then growing, percentage of your user base. You can gradually grow to 100% and test the effect of the changes.

Great stuff.

Neil Gunther – Quantifying scalability

Here I was a bit too excited, with my own talk coming up next, so unfortunately I didn't pay too much attention. It's a full analysis of scalability seen as a mathematical function: the capacity of your system as the load increases.

Cosimo Streppone – Scaling challenges of my.opera.com

I think I used 5 minutes to show a live demo of the My Opera realtime monitor application that we built, and afterwards I got very interesting questions, and also some nice Twitter messages about it.

I also talked about how we've experimented in distributing requests across the different datacenters with our little geodns tool.

All in all, for me it was a fantastic experience. Practice will make me better, so I look forward to the next time :-)

Baron Schwartz – Scaling without sharding

Baron works for Percona. I had read some of his talks before, and I think he's a really good speaker. He explained in detail the scenarios that arise when dealing with database scaling, the typical characteristics of reads and writes, and single server vs. multiple server deployments.

Basically, what the talk suggests is that very few situations require you to shard your database. Single server setups can go very far by optimizing the way the DB works. Quote: "Sharding should be your last resort". Sharding should only happen when write demand exceeds write capacity, so avoid sharding if you can: try to buffer/collate writes, defer update work, etc…

Closing day 2

Theo Schlossnagle closed the conference with a plenary keynote about a semi-serious "brief history of computing". Lots of fun, and a goodbye until next year's Surge.

For a glimpse of what happened live at the conference, you can also check out the Twitter stream for #surgecon.

Definitely a great conference. Stay tuned for videos and slides on the official site, http://omniti.com/surge/2010.

Surge 2010 scalability conference in Baltimore, USA – DAY 1

This was the first year the Surge conference took place, in Baltimore, USA. OmniTI is the company that organized it.

30" summary (TL;DR)

The conference was amazing. Main topic was scalability. Met a lot of people. 2 days, 2 tracks and 20+ speakers. Several interesting new products and technologies to evaluate.

The long story

The conference topics were scalability, databases and web operations. It took place over two days filled with high-level talks about experiences, failures, and advice on scaling web sites.

The only downside is that I had to miss half of the talks, being alone :). The good thing is that all videos and slides will be up on the conference website Soon™

Lots of things could be mentioned, but I'll try to summarize what happened in Day 1.

John Allspaw – Web Engineering

First keynote session by John Allspaw, former Flickr dev, now at Etsy.com.

Summary: Web engineering (aka Web Operations) is still a young field. We must set out to achieve much higher goals, be more scientific. We don't need to invent anything. We should be able to get inspiration and prior art from other fields like aerospace, civil engineering, etc…

He had lots of examples in his slides. I want to go through this talk again. Really inspiring.

Theo Schlossnagle – Scalable Design Patterns

Theo's message was clear. Tools can work no matter what the technology is. Bend technologies to your needs. You don't need the shiniest/awesomest/webscalest. Monitoring is key. Tie metrics to your business. Be relevant to your business people.

Ronald Bradford – Most common MySQL scalability mistakes

If you're starting with MySQL, or don't have too much experience with it, then you definitely want to listen to Ronald's talk. It will save you a few years of frustration. :)

Companion website, monitoring-mysql.com.

Ruslan Belkin – Scaling LinkedIn

Ruslan is very well prepared and technical, but maybe I expected a slightly different type of content. I must read the slides again when they're up. LinkedIn is mostly ("99%") Java, and uses Lucene as its main search tier. Very interesting: they mentioned that since 2005-2006 they have been using several specific services (friends, groups, profiles, etc…) instead of one big database. This allows them to scale better and more predictably.

They also seem to use a really vast array of different technologies, like Voldemort, and many others whose names I don't remember right now.

Robert Treat – Database scalability patterns

Robert is no doubt a very experienced DBA. He talked about all the different types of MySQL configurations available to developers in need of scaling their apps, explaining them and providing examples: horizontal/vertical partitioning, horizontal/vertical scaling, etc…

I was late for this talk so I only got the final part.

Tom Cook – A day in the life of Facebook operations

I listened to the first 10-15 minutes of this talk, and I had the impression that this was probably the 3rd time I had listened to the same talk: how big Facebook is, upload numbers, status updates, etc… without going into specific details. This is of course very impressive, but it's the low-level stuff that's more interesting, at least for me.

The last time I attended this talk was at FOSDEM in Brussels. I was a bit disappointed, so I left early. According to some later tweets, the last part was the most interesting. I'll have to go back to this one and watch the video. Well… at least I got to listen to the last part of…

Artur Bergman – Scaling Wikia

Lots of Varnish knowledge (and more) in this talk!

I had read some earlier talks by Artur, always about Varnish, and I have learnt a lot from him. I strongly suggest going through his talks if you're interested in Varnish.

They "abused" Urchin tracker (Google Analytics) javascript code to measure their own statistics about server errors and client-side page loading times. Another cool trick is the use of a custom made-up X-Vary-URL HTTP header to keep all linked URLs (view/edit/etc.. regarding a single wiki page) in one varnish hash slot. In this case, with a single purge command you can get rid of all relevant pages linked to the same content.

They use SSDs extensively. A typical Wikia server (Varnish and/or DB) has:

  • 2 x 6-core Westmere processors
  • 6 x Intel X25 SSDs (~ $2000)
  • 2 x spinning drives for transaction logs (DB)

"SSD allows you JOINs with no performance degradation."

Peak speeds reached (this is random not sequential: amazing!):

  • 500 Mbyte/s random read with avg latency of 0.2 ms
  • 220 Mbyte/s random writes

They use their own CDN based on Dynect (I think it's a Dyn Inc. service, see below). They're still using Akamai for a minor part of their static content.

Wikia is looking into using Riak, and a Riak-based filesystem to hook up directly to Varnish for really fast file serving.

Mike Malone – SimpleGeo

SimpleGeo implemented a geographic database on top of Apache Cassandra, able to answer spatial queries. They looked into using PostGIS (the Postgres-based GIS DB, a very common product), but it wasn't as flexible as they needed (I don't remember exactly why).

TODO: look into "distributed indexes over DHT". He indicated it as prior art for their system. This talk was a bit complicated for me to follow, so I'll have to watch it again.

Closing day 1

At the end of the day, there was a SQL vs NoSQL panel, which I skipped entirely. Maybe it was interesting :) The after-hours event that closed day 1 was organized by Dyn Inc. It was fantastic. Lots of good beer, martinis, and good food. I went to bed early, since I was still jetlagged. Day 2 started at 9 AM.

Time for a break :)

And then on to Day 2:

http://my.opera.com/cstrep/blog/2010/10/07/surge-2010-scalability-conference-in-baltimore-usa-day-2

Self-contained instant functional test suites

I first learned about functional test suites when I started working on the My Opera code. Two years and some time later, I found myself slightly hating the My Opera functional test suite.

The main reasons:

  • It's too slow. Currently, it takes anywhere between 20 and 30 minutes to complete. Sure, it's thousands of tests, divided into hundreds of test scripts, grouped by functional area, like login, blogs, albums, etc…

    Even considering all of this, I think we should aim to have a single functional test run complete in 5-10 minutes.

    Most of the time is wasted in the communication with the test server and in the creation and destruction of the database.

  • It's unreliable, sometimes cumbersome to manage. In our setup, CruiseControl fires a functional test suite run after every commit into the source code repository.

    That initiates an ssh connection to the test server, where a shell script takes care of restarting the running apache, dropping and creating the test database, updating required packages, etc…

    This implies that you need an apache instance running on a specific port, a mysql instance running on another port, etc… In the long run, this proved to be an approach that doesn't scale very well. It's too error prone, and not very reliable. Maybe your test run will fail because mysql has mysteriously crashed, or some other random bad thing.

A functional test suite that sucks less

Given the motivation, I came up with this idea of a functional test suite that is:

  1. instant: check out the source code from the repository, and you're ready to go.
  2. self-contained: it shouldn't have any external servers that need to be managed or even running.
  3. reusable: it doesn't have to be a functional test suite. The same concept can be reused for unit test suites, or anything really.

A few months after the first idea, plus a couple of weeks of work on it, and we were ready with the first prototype. Here's how we use it:

  • Check out the source code from the repository in ~/src/myproject
  • cd ~/src/myproject
  • ./bin/run-functional-test-suite
  • Private instances of Apache, MySQL, and the main application are created and started
  • A custom WWW::Mechanize-based client runs the functional test cases against the Apache instance
  • A TAP stream from the test run is produced and collected
  • All custom instances are destroyed
  • ???
  • Profit!

The temporary test run directory is left there untouched, so you can inspect it in case of problems. This is priceless, because that folder contains all the configuration and log files that the test run generated.

The main ideas

  • The entire functional test suite should run within a unique temporary directory created on the fly. You can run as many instances as you want, with your username or a different one. They won't conflict with each other.
  • Use as much as possible the same configuration files as the other environments. If you already have development, staging and production, then functional-test is just another one of your environments. If possible, avoid creating special config files just for the functional test suite.
  • The Apache, MySQL, and application configuration files are simple templates where you need to fill in the Apache hostname, Apache port, MySQL port, username, password, etc… (a sketch of how this filling-in can work follows right after this list)
  • Apache is started up from the temporary directory using the full command invocation, as:

    /usr/sbin/apache2 -k start -f /tmp/{project}_{user}_testsuite_{pid}/conf/apache.conf
    

    and all paths in the config files are absolute, to avoid relative references that would lead to files not being found. The configuration template files have to take this into account and use a $prefix-style variable everywhere, like this (Apache config, for example):

    Timeout 10
    Keepalive On
    HostnameLookups Off
    ...
    Listen [% suite.apache.port %]
    ...
    

  • The MySQL instances are created and destroyed from scratch for every test suite run, using the amazing MySQL::Sandbox. My colleague Terje had to patch the sandbox creation script to fix a problem with 64-bit environments. We filed a bug on the MySQL::Sandbox RT queue about this. This is the only small problem found in a really excellent Perl tool. If you haven't looked at it, do it now.
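
To make the template idea a bit more concrete, here's a rough sketch of how a wrapper script could fill in one of those templates with Template Toolkit; the paths, template names and port numbers below are made up, not the actual suite code:

# Rough sketch: fill in the Apache template for one self-contained test run.
use strict;
use warnings;
use Template;

my $run_dir = "/tmp/myproject_$ENV{USER}_testsuite_$$";
mkdir $run_dir        or die "mkdir $run_dir: $!";
mkdir "$run_dir/conf" or die "mkdir $run_dir/conf: $!";

# In the real suite these ports are derived from the script's PID
# (see the "What can we improve?" section below)
my %suite = (
    apache => { host => 'localhost', port => 18081 },
    mysql  => { port => 18080 },
);

my $tt = Template->new({ INCLUDE_PATH => '.' });
$tt->process(
    'templates/apache.conf.tt',                # contains [% suite.apache.port %] etc.
    { suite => \%suite, prefix => $run_dir },  # absolute prefix for all paths
    "$run_dir/conf/apache.conf",
) or die $tt->error;

# Apache is then started against the generated, self-contained config
system('/usr/sbin/apache2', '-k', 'start', '-f', "$run_dir/conf/apache.conf") == 0
    or die "Could not start the test Apache instance";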

All in all, I'm really satisfied with the result. We applied it to the Auth project for now, which is smaller than My Opera, and we fine-tuned the wrapping shell script to automatically compensate for the underlying distribution. We've been able to run this functional test suite on Ubuntu from version 8.04 to 10.04, and on Debian Lenny, with no modifications. The required MySQL version is detected according to your system, and MySQL::Sandbox is instructed accordingly.

It also results in a faster execution of the entire functional test suite.

What can we improve?

There's lots of things to improve:

  • Speed. Instead of creating and destroying the database for every test group, it would be nice to use transactions, and roll back everything at the end of the test group (t/{group-name}/*.t). I used to do this many years ago with Postgres, and it was perfectly safe. I'm not sure we will ever make it work with MySQL, since, for example, ALTER TABLE (and other SQL statements) completely ignores transactions.
  • Reliability. The private instances of Apache, MySQL (and recently we added memcached too) are started on ports that are calculated from the PID of the originating bash script. So, if you run the test suite from the bash script running as PID 8029, then MySQL is started on port 8030, Apache on 8031, and memcached on 8032. This has worked very well so far, and we made sure we don't use ports < 1024, but it could lead to mysterious failures if some local service is already using a port like 5432, or others. The idea is that we could test whether a port is available before using it (a quick sketch of such a check follows below). However, this error is already detected at startup time, so it shouldn't be a huge problem.
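
For the record, the port availability check mentioned above could be as simple as trying to bind the candidate port first; this is just a sketch of the idea, not what the suite currently does:

# Sketch of a "is this port free?" check for the PID-derived ports.
use strict;
use warnings;
use IO::Socket::INET;

sub port_is_free {
    my ($port) = @_;
    # Try to bind the port ourselves: if it fails, someone else is using it
    my $sock = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 1,
        ReuseAddr => 1,
    ) or return 0;
    close $sock;
    return 1;
}

# Start from the PID-derived base port and skip over anything already taken
my $mysql_port = $$ + 1;
$mysql_port++ until port_is_free($mysql_port);
print "MySQL sandbox will use port $mysql_port\n";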

Feedback welcome!

“DebPAN”, a production-grade Debian CPAN repository

The problem

This is a proposal I came up with after talking to Gabor Szabo about his Perl Ecosystem Development proposal.

One of the major "problems" we face while developing and deploying production Perl-based systems with Debian is that the state of the Debian CPAN modules is depressingly outdated. As an example, we're using Catalyst in Lenny, and that dates back to 2008 for the most part.

This is just not enough.

A solution: maintain your own APT repository

Our current solution is to manually package every bit and maintain our own internal Opera APT repository that our servers and applications depend on. That's not optimal for two reasons:

  • we have to package lots of modules, due to interdependencies, dedicating a fair amount of time to this activity that is not exactly "productive"
  • we can't trust our systems to be Debian anymore, since we're updating bits and pieces with the bold assumption that everything will work fine

Of course, 99.999% will work fine, since it's Perl, but the problem is that one day this could fall down on our heads.
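
For context, the "manual packaging" part typically boils down to one dh-make-perl run per module (and per dependency); the module name below is just an example:

# Build a Debian package straight from CPAN (one run per module/dependency)
dh-make-perl --build --cpan Catalyst::Runtime

# ...then the resulting .deb gets pushed into the internal APT repository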

So, given the problem, what are the solutions?

  • Continue to manually maintain an APT repository. Downside: Some™ waste of time
  • Use a different packaging/deployment system, like PAR::Repository. I would personally like this, but it doesn't eliminate the need to maintain our own repository. You just don't use dh-make-perl and friends, which are, IMHO, nice to have and useful
  • The "DebPAN"

The DebPAN

I know Jeremiah Foster, and at a couple of Perl events I heard him talking to other CPAN/Perl developers about these issues. His answer would probably be to file a request for packaging for the modules we're interested in.

The reason why that doesn't work is that the lead time for a given RFP to land on Debian stable is unacceptably long, and I realize that is for a good reason. After all, it's supposed to be stable, right?

What about a "DebPAN" repository?

That could be a 3rd party APT repository, something like debpan.perl.org:

  • maintained by a close group of Perl/Debian/CPAN developers
  • guaranteed to have a selection of the most important modules (more on that…) in a reasonably recent version
  • targeted to Debian stable, and maybe other distributions? I think Ubuntu 10.04 suffers from this same problem, but much less than Debian Lenny, just to pick two versions
  • maybe even with patches applied?, but that might be way too much, actually

The can of worms

Of course, lots of problems can arise. However, if we think this is a good idea, then we should try to have something even minimal up and running. Then we'll worry about all the problems…

However, who gets to decide which modules are the most useful? That should go by popular demand, I guess. Maybe even by looking at Debian request-for-packaging stats? I can also imagine that bigger companies using Perl would be interested in this, to potentially save lots of "infrastructural work".

I'd be really interested to know other people's opinions on this, especially from people using Debian stable, Debian developers, or the Debian Perl group itself.

LWP::Simple for Perl 6, now with (partial) BasicAuth support and getstore()

I just pushed out another update for the LWP::Simple module for Perl 6. This time, the main work was:

  • refactoring the code and adding unit tests for the URL parsing (that might even grow into a Perl 6 URI module)
  • adding partial basic auth support. To be complete and working, it needs to base64-encode the user/password pair. That part is not implemented yet; I'll see if I get around to it, or if someone has done it already (a sketch of the missing bit follows after this list).
  • adding a getstore() method, that writes the downloaded content to disk. Unfortunately it needs to strip the HTTP headers and understand chunked transfers for it to be remotely useful.
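
For the record, the missing basic auth bit is tiny once a base64 encoder exists. In this sketch, encode-base64() is just a stub, exactly the piece a Perl 6 MIME::Base64 would have to provide:

# Sketch only: encode-base64() is not an existing Perl 6 routine.
sub encode-base64(Str $credentials) {
    die 'base64 encoding not implemented yet';
}

my $user = 'cosimo';
my $pass = 'secret';
my $auth-line = 'Authorization: Basic ' ~ encode-base64("$user:$pass") ~ "\r\n";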

It was nice to see the module grow in functionality, code and unit test coverage. I had to work around a couple of problems I couldn't understand. I was extremely lazy and didn't even look up the Synopses, so I assume it's my fault. However.

The first is the use of .match() and ~~ to match against a regular expression. I found that the following code:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname.match('^cosimo') {
    # Doesn't enter here
}

doesn't trigger a match. However, this other one does:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^cosimo/ {
    # Does match
}

And here's something similarly surprising. The following code correctly matches:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^ .+ : .+ @ .+ $/ {
    say '(user:pass@host) matches';
} else {
    say '(user:pass@host) does not match';
}

but adding captures makes the same exact regex fail:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^ (.+) : (.+)  @ (.+) $/ {
    say '(with captures) matches';
} else {
    say '(with captures) does not match';
}

There are also very nice things about programming in Perl 6 that are slowly sucking me into this fantastic language. This is part of a test script for LWP::Simple:

#
# Test the stringify_headers() method
#
use v6;
use Test;
use LWP::Simple;

my @test = (
    { User-Agent => 'Opera/9.80 (WinNT; 6.0) Version/10.60' },
    "User-Agent: Opera/9.80 (WinNT; 6.0) Version/10.60rn",

    { Connection => 'close' },
    "Connection: closern",
);

for @test -> %headers, $expected_str {
    my $hdr_str = LWP::Simple.stringify_headers(%headers);
    is($hdr_str, $expected_str, 'OK - ' ~ $hdr_str);
}

Note how in the for statement we can "extract" the hash and string from the @test array with:

for @test -> %headers, $expected_string {
    # Loop body
}

It's not a big deal, and other languages have it, but Perl 6 is filled with these small niceties that make the resulting code still feel like Perl, but also, I don't know, more robust perhaps?

So, to conclude:

  • Anyone with a Perl 6 implementation of MIME::Base64 ? Speak up before I create a monster :)
  • Anyone cares enough to take on the chunked transfer encoding support?

Dist::Zilla, CPAN hacker, do you use it?!

If not, you should. Full stop.

So I've been reading about Dist::Zilla (dzil) for a while now. The first articles didn't really convince me too much. Yes, it's a nice tool, but why would I want to use it? Ok, to save time. Right. But how do I use it the first time? I need a config file. Ok where do I get (a sane) one?

Then I read a clear walk-through example of a Dist::Zilla config by Dave Rolsky. So, the quick start route was clear. Copy/paste the dist.ini file and try it.

Except that the file in that blog post contains so much stuff (read: additional plugins required) that it's not practical to use as a first dist.ini config.

Basic premise

The premise is that you have a module/class that you want to upload to CPAN, and you want to save yourself all the "package-wrapping" activities that a CPAN distribution needs. That is where dzil shines.

Start-using-Dist::Zilla-in-2-minutes guide

Good, you're on your way to spending countless hours less on maintaining your CPAN stuff.

  • First things first: write the thing that you want to package and upload to CPAN. Done? Good.
  • So you probably have a lib folder with your class and a t folder with the test scripts for your class. Good.
  • Then, usually if you want to package everything for CPAN, you need to add a MANIFEST, a README, INSTALL, Changes, possibly a LICENSE, etc… Forget about those!
  • Copy paste the following into a dist.ini file:
    ; Everything starting with ';' is a comment
    
    name    = Your-Awesome-Module
    author  = Joe J. Hacker <jjhacker@example.org>
    license = Perl_5
    copyright_holder = Joe J. Hacker
    copyright_year   = 2010
    
    version = 0.01
    
    [@Basic]
    [InstallGuide]
    [MetaJSON]
    
    [MetaResources]
    bugtracker.web    = http://rt.cpan.org/NoAuth/Bugs.html?Dist=Your-Awesome-Module
    bugtracker.mailto = bug-your-awesome-module@rt.cpan.org
    
    ; If you have a repository...
    ;repository.url    = git://github.com/jjhacker/your-awesome-module.git
    ;repository.web    = http://github.com/jjhacker/your-awesome-module
    ;repository.type   = git
    
    ; You have to have Dist::Zilla::Plugin::<Name> for these to work
    ;[PodWeaver]
    ;[NoTabsTests]
    ;[EOLTests]
    ;[Signature]
    ;[CheckChangeLog]
    
    [Prereq]
    ;DBI = 1.600
    ;SomeOtherModule = 0
    
    [Prereq / TestRequires]
    Test::More = 0
    
    ; If you're using git, this is interesting
    ; You need to install Dist::Zilla::PluginBundle::Git
    ;[@Git]
    
    
  • Add the following line at the top of your main class (.pm) file:
    # ABSTRACT: Supercool generator of twisty little passages, all alike
    
  • Run dzil build
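
For reference, the day-to-day commands after that are just:

dzil build     # generate the full distribution (MANIFEST, META files, tarball, ...)
dzil test      # build into a temporary directory and run the test suite there
dzil release   # build, test and upload to CPAN (see the note below)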

That's it! The build command will do everything for you. When I realized how quick this tool makes my CPAN activities, I decided I had to try to convince some more people out there. Until I actually tried it, I wasn't 100% convinced. Now I definitely am. Life's just too short. Use DZIL.

There's also a release command that will send your generated distribution archive directly to CPAN. I couldn't get that to work. I remember there was some kind of dot file I had to write my credentials to before I could release, but I couldn't find the details today. If anyone knows, please comment here… :-)

Mocking Cache::Memcached

Today I've been working on some elaborate unit tests that require a database and a memcached object. In My Opera, like probably everywhere else, we use our own classes for both DBI and memcached access. The memcached class in particular is just a subclass of Cache::Memcached, so nothing special there.

I was looking at already existing modules that would mock Cache::Memcached without requiring a running memcached server. The only relevant one I could find is Test::Memcached.

However, Test::Memcached interacts with the memcached binary, trying to start/stop it. This is not really what I want to do. I could do that, but it would be slightly complicated, because we have lots of test installations, and we'd have to install, or at least require, a memcached daemon running everywhere. A single shared memcached daemon wouldn't be so smart either, since the different installations would interfere with each other. We could probably use key namespaces for that. Mmh.

Anyway, I decided that having a mock object, something that could be named Cache::Memcached::Mock or Test::Memcached::Mock, could be simpler and more easily testable as well. So I wrote a prototype, subclassing Test::MockObject, that works fine for now and covers all my needs. It uses just a regular hash as memory storage, and supports get(), set(), flush_all(), and delete().
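
To give an idea, the core of such a mock is not much more than this; a simplified sketch of the concept, not the actual prototype:

# Simplified sketch of a Cache::Memcached mock built on Test::MockObject:
# a plain hash acts as the storage backend.
use strict;
use warnings;
use Test::MockObject;

sub mock_memcached {
    my %storage;
    my $mock = Test::MockObject->new;
    $mock->mock(set       => sub { my (undef, $key, $value) = @_; $storage{$key} = $value; 1 });
    $mock->mock(get       => sub { my (undef, $key) = @_; $storage{$key} });
    $mock->mock(delete    => sub { my (undef, $key) = @_; delete $storage{$key}; 1 });
    $mock->mock(flush_all => sub { %storage = (); 1 });
    return $mock;
}

# Drop-in replacement wherever the code expects a Cache::Memcached-like object
my $memd = mock_memcached();
$memd->set(answer => 42);
print $memd->get('answer'), "\n";   # 42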

Not sure if I should upload it to CPAN…