Monthly Archives: June 2010

Digest::MD5 for Perl 6 finally works!

It took me an awfully long time, and lots of help from the folks on #parrot and #perl6, but in the end, it's done!

It needs a tiny patch to Parrot, but I believe it will be added to the next Parrot release. I tried to document the fixes to the code to help others that might have the same problems.

So now Digest::MD5 for Perl 6 works as well as the Perl 5 one. I'm too tired to say anything else :)

Good night!

LWP::Simple for Perl 6, now with (partial) BasicAuth support and getstore()

I just pushed out another update for the LWP::Simple module for Perl 6. This time, the main work was:

  • refactoring the code and adding unit tests for the URL parsing (that might even grow into a Perl 6 URI module)
  • adding partial basic auth support. To be complete and working, it needs to base64 encode the user/password pair. Not implemented yet. I'll see if I get around to it, or if someone has done it already.
  • adding a getstore() method that writes the downloaded content to disk. Unfortunately, it needs to strip the HTTP headers and understand chunked transfers to be even remotely useful.
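For reference, the missing Base64 step is small. Below is a minimal Perl 5 sketch of how a Basic auth header is built, using the core MIME::Base64 module. The helper name basic_auth_header is made up for illustration and is not part of LWP::Simple; a Perl 6 port would need its own Base64 encoder.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: build the HTTP Basic auth header line
# from a user/password pair.
sub basic_auth_header {
    my ($user, $pass) = @_;

    # The empty second argument suppresses the trailing newline
    # that encode_base64() appends by default.
    my $token = encode_base64("$user:$pass", '');

    return "Authorization: Basic $token";
}

print basic_auth_header('user', 'pass'), "\n";
```

The user and password would come from the user:pass@host part of the URL, which is exactly what the URL parsing tests cover.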

It was nice to see the module grow in functionality, code and unit test coverage. I had to work around a couple of problems I couldn't understand. I was extremely lazy and didn't even look up the Synopses, so I assume it's my fault. Still, here they are.

The first is the use of .match() and ~~ to match against a regular expression. I found that the following code:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname.match('^cosimo') {
    # Doesn't enter here
}

doesn't trigger a match. However, this one does:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^cosimo/ {
    # Does match
}

Here is something similarly surprising. The following code correctly matches:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^ .+ : .+ @ .+ $/ {
    say '(user:pass@host) matches';
} else {
    say '(user:pass@host) does not match';
}

but adding captures makes the same exact regex fail:

my $hostname = 'cosimo:eelst@faveclub.eelst.com';
if $hostname ~~ /^ (.+) : (.+)  @ (.+) $/ {
    say '(with captures) matches';
} else {
    say '(with captures) does not match';
}

There are also very nice things about programming in Perl 6 that are slowly sucking me into this fantastic language. This is part of a test script for LWP::Simple:

#
# Test the stringify_headers() method
#
use v6;
use Test;
use LWP::Simple;

my @test = (
    { User-Agent => 'Opera/9.80 (WinNT; 6.0) Version/10.60' },
    "User-Agent: Opera/9.80 (WinNT; 6.0) Version/10.60\r\n",

    { Connection => 'close' },
    "Connection: close\r\n",
);

for @test -> %headers, $expected_str {
    my $hdr_str = LWP::Simple.stringify_headers(%headers);
    is($hdr_str, $expected_str, 'OK - ' ~ $hdr_str);
}

Note how in the for statement we can "extract" the hash and string from the @test array with:

for @test -> %headers, $expected_str {
    # Loop body
}

It's not a big deal, other languages have it, but Perl 6 is filled with these small niceties that make the resulting code still feel like Perl, but also, I don't know exactly, more robust perhaps?

So, to conclude:

  • Anyone with a Perl 6 implementation of MIME::Base64? Speak up before I create a monster :)
  • Does anyone care enough to take on the chunked transfer encoding support?
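To give an idea of the size of the chunked transfer task, here is a rough Perl 5 sketch of a decoder. It assumes the HTTP headers have already been stripped and the whole body is in memory; the name decode_chunked is hypothetical, and chunk extensions and trailers are simply ignored.

```perl
use strict;
use warnings;

# Decode an HTTP/1.1 chunked body (hypothetical helper; ignores trailers).
sub decode_chunked {
    my ($raw) = @_;
    my ($body, $pos) = ('', 0);

    while (1) {
        # Each chunk starts with its size in hex, terminated by CRLF.
        my $eol = index($raw, "\r\n", $pos);
        last if $eol < 0;

        my $size_line = substr($raw, $pos, $eol - $pos);
        last unless $size_line =~ /^([0-9A-Fa-f]+)/;
        my $size = hex($1);

        # A zero-sized chunk marks the end of the body.
        last if $size == 0;

        $body .= substr($raw, $eol + 2, $size);

        # Skip the chunk data and the CRLF that follows it.
        $pos = $eol + 2 + $size + 2;
    }

    return $body;
}

print decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"), "\n";
```

A real implementation would have to read from the socket incrementally instead of slurping everything first, which is where most of the work hides.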

Dist::Zilla, cpanhacker, do you use it??!

If not, you should. Full stop.

So I've been reading about Dist::Zilla (dzil) for a while now. The first articles didn't really convince me too much. Yes, it's a nice tool, but why would I want to use it? Ok, to save time. Right. But how do I use it the first time? I need a config file. Ok where do I get (a sane) one?

Then I read a clear walk-through example of a Dist::Zilla config by Dave Rolsky. So, the quick start route was clear. Copy/paste the dist.ini file and try it.

Except, the file in that blog post contains so much stuff (read: additional plugins required) that it is not practical to use as a first dist.ini config.

Basic premise

The premise is that you have a module or class that you want to upload to CPAN, and you want to save yourself all the "package-wrapping" activities that a CPAN distribution needs. That is where dzil shines.

Start-using-Dist::Zilla-in-2-minutes guide

Good, you're on your way to spending countless fewer hours maintaining your CPAN stuff.

  • First things first: write the thing that you want to package and upload to CPAN. Done? Good.
  • So you probably have a lib folder with your class and a t folder with the test scripts for your class. Good.
  • Then, usually if you want to package everything for CPAN, you need to add a MANIFEST, a README, INSTALL, Changes, possibly a LICENSE, etc… Forget about those!
  • Copy and paste the following into a dist.ini file:
    ; Everything starting with ';' is a comment
    
    name    = Your-Awesome-Module
    author  = Joe J. Hacker <jjhacker@example.org>
    license = Perl_5
    copyright_holder = Joe J. Hacker
    copyright_year   = 2010
    
    version = 0.01
    
    [@Basic]
    [InstallGuide]
    [MetaJSON]
    
    [MetaResources]
    bugtracker.web    = http://rt.cpan.org/NoAuth/Bugs.html?Dist=Your-Awesome-Module
    bugtracker.mailto = bug-your-awesome-module@rt.cpan.org
    
    ; If you have a repository...
    ;repository.url    = git://github.com/jjhacker/your-awesome-module.git
    ;repository.web    = http://github.com/jjhacker/your-awesome-module
    ;repository.type   = git
    
    ; You have to have Dist::Zilla::Plugin::<Name> for these to work
    ;[PodWeaver]
    ;[NoTabsTests]
    ;[EOLTests]
    ;[Signature]
    ;[CheckChangeLog]
    
    [Prereq]
    ;DBI = 1.600
    ;SomeOtherModule = 0
    
    [Prereq / TestRequires]
    Test::More = 0
    
    ; If you're using git, this is interesting
    ; You need to install Dist::Zilla::PluginBundle::Git
    ;[@Git]
    
    
  • Add the following line at the top of your main class (.pm) file:
    # ABSTRACT: Supercool generator of twisty little passages, all alike
    
  • Run dzil build

That's it! The build command will do everything for you. When I realized how much this tool speeds up my CPAN activities, I decided I had to try to convince some more people out there. Until I actually tried it, I wasn't 100% convinced. Now I definitely am. Life's just too short. Use DZIL.

There's also a release command, that will also send your generated distribution archive to the CPAN directly. I couldn't get that to work. I remember there was some kind of dot file I had to write my credentials to before I could release, but I couldn't find the details today. If anyone knows, please comment here… :-)

Ubiquity for Opera gets a “bing” command

Another micro update to Ubiquity for Opera. This time I added the Bing command, to search the web with Microsoft's Bing search engine.

So, Bing fans, it wasn't fair to discriminate against you, so now you can search Bing through Ubiquity for Opera as well. Enjoy!

As usual, you can download the script here:

Batteries and installation instructions included. Have fun.

Mocking Cache::Memcached

Today I've been working on some elaborate unit tests that require a database and a memcached object. In My Opera, like probably everywhere else, we use our own classes for both DBI and memcached access. The memcached class in particular is just a subclass of Cache::Memcached, so nothing special there.

I was looking at already existing modules that would mock Cache::Memcached without requiring a running memcached server. The only relevant one I could find is Test::Memcached.

However, Test::Memcached interacts with the memcached binary, trying to start/stop it. This is not really what I want to do. I could do that, but it would be slightly complicated because we have lots of test installations, and we'd have to install or require a memcached daemon running everywhere.
A single shared memcached daemon wouldn't be so smart either since the different installations would interfere with each other. We could probably use key namespaces for that. Mmh.

Anyway, I decided that having a mock object, something that could be named Cache::Memcached::Mock, or Test::Memcached::Mock could be simpler and more easily testable as well. So I wrote a prototype that subclasses Test::MockObject that works fine for now and covers all my needs. It uses just a regular hash as memory storage, and supports get(), set(), flush_all(), and delete().
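To make the idea concrete, here is a stripped-down sketch of such a mock. Note the assumptions: it is a plain hash-backed class rather than the Test::MockObject subclass described above, the package name My::Memcached::Mock is invented for the example, and the return values are only an approximation of what Cache::Memcached gives back. Expiration times are ignored.

```perl
package My::Memcached::Mock;    # hypothetical name, for illustration only

use strict;
use warnings;

# The "server" is just a hash held inside the object.
sub new {
    my ($class) = @_;
    return bless { store => {} }, $class;
}

sub set {
    my ($self, $key, $value) = @_;
    $self->{store}{$key} = $value;
    return 1;
}

sub get {
    my ($self, $key) = @_;
    return $self->{store}{$key};
}

# Returns true only if the key was actually there.
sub delete {
    my ($self, $key) = @_;
    my $existed = exists $self->{store}{$key};
    CORE::delete $self->{store}{$key};
    return $existed ? 1 : 0;
}

sub flush_all {
    my ($self) = @_;
    $self->{store} = {};
    return 1;
}

package main;

my $cache = My::Memcached::Mock->new();
$cache->set(answer => 42);
print $cache->get('answer'), "\n";
```

In a test, an object like this can be handed to whatever code normally receives the real memcached client, with no daemon running anywhere.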

Not sure if I should upload it to CPAN…

Dependencies suck

We love dependencies. For example, in the CPAN universe. They make our job so much easier. Thousands of production quality, unit tested modules at your fingertips.

But dependencies also suck really badly, for example when you're using a Linux distribution that has packages that are just too old to be useful. Hey, but they are stable! More stable-as-dead or more stable-as-production-quality? You decide.

It's been many months since I installed a local instance of Transifex, a Django application that allows translators to easily contribute to projects. We're using it for My Opera, but also trying to get other internal projects to use it.

So far, it has worked nicely. I think Transifex is a really good application, its feature set is just right for what we need etc… Last week I decided to upgrade our Transifex instance from v0.8.0-devel to 0.9.0-devel. The improvements were really nice and needed, so I just decided to go for it. I had been upgrading in the 0.8.0 series from their repository (aka HEAD, aka master, aka trunk).

This time though, the list of dependencies was a bit more specific than usual. Also, please note that 0.9.0 is a **BLEEDING EDGE** development version as of June 2010.

Anyway, first dependency listed was "Django = 1.1.2". I think I started going down the wrong path when I upgraded Django with:


$ sudo easy_install 'Django>=1.1.2'

Here you can see that my mind is somewhat hardwired to the Perl culture, where backward compatibility is of paramount importance. I wrote code 10 years ago, using perl 5.005, that is still in production, unmodified, with perl 5.10, and I'm talking about commercial stuff, not silly home projects. The terrible mistake here is to think that this also applies everywhere else. Forget it. It's not true.

In fact, easy_install picked up Django 1.2.1, which is an entirely different beast that breaks at least a couple of assumptions that Transifex was making. I don't remember exactly now, but one had to do with the automatic export of email.MIMEBase into django.core.mail, and of the other I only remember that it broke horribly.

So, a couple of hours later, thanks to the guys on the #transifex channel, I figured out that what I really needed to write was:


$ sudo easy_install 'Django==1.1.2'

This forces easy_install to install the given version instead of any later one. So far so good. Then I had another, completely unrelated problem that required me to strace the ./manage.py Django script to figure out that it was using a totally screwed up sqlite database coming from a year-old test version of Transifex I had installed through easy_install, and was completely ignoring my local settings, which pointed to a MySQL db. How nice.

So, yes, we always complain about CPAN, dependencies, Module::Install, ExtUtils::MakeMaker and whatnot, but a look at other worlds (easy_install, ruby-gems anyone?) can remind Perl people of the fantastic toolchain and especially culture "we" have built, and that's still kicking everyone else's ass, on any platform.

So, regarding the debate in the Perl community, my vote goes to keeping Sane(tm) backward compatibility standards, as we always did. It matters, especially for commercial software companies!

My silly twitter OAuth command line client

I needed to test a new My Opera API module that is coming out soon. Since this module, to be named Net::MyOpera, is modeled exactly after Net::Twitter, I tried changing my example script, replacing all occurrences of Net::MyOpera with Net::Twitter. And there you go, a Twitter command line client was born.

I know, there are plenty of them already, and as I said, I didn't really need one, but since it's there already, it's nice to have it. So I saved it into my ~/bin folder, and aliased it to tw, so whenever I feel the urge to communicate stupid things to the universe, I can now do that. Ehm wait… :-)

The code is out on Github.

As always, there's a hidden (poor) excuse for this. And it's that I'm working to port this Twitter command line client to Perl6. OAuth support needs a good deal of modules that are not immediately available for Perl6, so it's going to be exciting.

The first things we're going to need are Digest::SHA1 and Digest::HMAC, for the HMAC-SHA1 signatures. These modules are not impossible to write, except currently there's a problem using Parrot libraries from Perl6.
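For comparison, this is roughly what the signing step looks like on the Perl 5 side, sketched with the core Digest::SHA and MIME::Base64 modules. The function name sign_hmac_sha1 and the sample values are placeholders, and the sketch assumes the base string and the two secrets have already been percent-encoded as OAuth requires.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha1);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: compute an HMAC-SHA1 signature, Base64 encoded.
sub sign_hmac_sha1 {
    my ($base_string, $consumer_secret, $token_secret) = @_;

    # The signing key is the two secrets joined by '&'.
    my $key = join '&', $consumer_secret, $token_secret;

    # hmac_sha1() returns the raw 20-byte digest; encode it without
    # the trailing newline.
    return encode_base64(hmac_sha1($base_string, $key), '');
}

print sign_hmac_sha1('GET&http%3A%2F%2Fexample.org%2F&a%3D1', 'secret1', 'secret2'), "\n";
```

A Perl 6 port needs both halves: a SHA1 digest and the HMAC construction on top of it.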

I'm trying to do the same for my Perl6 Digest::MD5 module, but I'm stumbling on the following error:

t/perl5-compat.t ... Null PMC access in find_method('signature')
  in 'Digest::MD5::md5_hex' at line 11:lib/Digest/MD5.pm
  in main program body at line 17:t/perl5-compat.t
t/perl5-compat.t ... Dubious, test returned 1 (wstat 256, 0x100)

I will need some help on this :-)

Perl6 hacking, grammars, Digest::MD5 and caffeine levels

I'll be brief. Need some sleep. :)

Perl 6 is here. Now. And there's an immense amount of work waiting to be done: rewriting Perl5's CPAN. Ain't that easy? :)

Anyway, during the last couple of weeks, I spent most of my spare time playing with Perl 6:

Grammars

My poor excuse to (try to) learn grammars was to build a Perl6 class that could parse a puppet module and build some documentation for it. Puppet itself uses a grammar to parse its modules, so I thought it wouldn't be impossible to port it to Perl6 and use it to parse puppet code.

Well, turns out it's not so easy, but at least I'm learning how grammars work and having fun.

Perl6 Digest::MD5

This is extremely fascinating, because it's touching the Parrot core. In Parrot, there's already a Digest::MD5 module, so all you have to do (but again, not so easy), is to write a Perl6 "wrapper" around the Parrot code.

And how do you do that? With PIR blocks. This stuff is great. Seriously. It's like going back to Assembly, in some sense(tm). Here's an example of this glue PIR code:

class Digest::MD5 {

    multi method md5_hex (Str $message) {

        pir::load_bytecode('Digest/MD5.pbc');

        my $md5_sum = Q:PIR {
            .local pmc md5sum, md5_sum_get
            md5sum = get_root_global ['parrot'; 'Digest'], '_md5sum'
            $P0 = find_lex '$message'
            $P1 = md5sum($P0)
            md5_sum_get = get_root_global ['parrot'; 'Digest'], '_md5_hex'
            %r = md5_sum_get($P1)
        };

        return $md5_sum;
    }

    multi method md5_hex (@message) {
        my Str $message = @message.join('');
        return Digest::MD5.md5_hex($message);
    }

}

Even if you don't understand Perl 6 or PIR, you can probably recognize a class definition, and polymorphic methods. md5_hex() is in fact defined twice:

  • multi method md5_hex (Str $message)
  • multi method md5_hex (@message)

You don't have to write polymorphic methods, but you can do it if you want. Yes, there's more than one way, and there always will be. I don't like dictators, even if they are benevolent, and Python code looks so flat and dull, seriously. There's no personality in Python code. Yes, sigils are great.
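As a point of comparison, Perl 5 has no multi methods, so the same string-or-array behaviour has to be dispatched by hand. A small sketch using the core Digest::MD5 module follows; the wrapper name md5_hex_any is invented for the example.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical wrapper: accept either a plain string or an array
# reference of string pieces, like the two multi variants.
sub md5_hex_any {
    my ($message) = @_;

    my $data = ref($message) eq 'ARRAY'
        ? join('', @$message)
        : $message;

    return Digest::MD5::md5_hex($data);
}

print md5_hex_any('hello world'), "\n";                # 5eb63bbbe01eeed093cb22bb8f5acdc3
print md5_hex_any([ 'hello', ' ', 'world' ]), "\n";    # same digest
```

With multi methods the type check disappears from the body and moves into the signatures, which is the part I find elegant.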

Digest::MD5 is also using alien technology (UFO).

Synopsis documents

Nothing fancy there, just improved the existing CSS. For an example, go read Synopsis 03 about operators.

Good night!

Disassembling a real world Plack PSGI application

After I started playing with Plack, I tried to evaluate whether to continue using it for our mission-critical production stuff or give up, going back to the same techniques we already use (successfully).

I think it's time to develop and deploy a Plack based application. In my grand plan, :-), I'd like to deploy nginx with PSGI support, or even more ambitiously, nginx or apache with Starman as "backend" http server. We'll see…

In the meantime, I'd like to write here a couple of niceties about Plack and Starman, showing some real code I wrote when I started.

A real world PSGI application

Here's a sample PSGI application currently under development:

#!/usr/bin/env perl
#
# Sample PSGI application
#

use strict;
use warnings;
use constant ENVIRONMENT         => 'development';
use constant APACHE_DEPLOYMENT   => (ENVIRONMENT eq 'production');
use constant ENABLE_ACCESS_LOG   => (ENVIRONMENT eq 'development');
use constant ENABLE_DEBUG_PANELS => (ENVIRONMENT eq 'development');

use Plack::Builder;
use AuthOpera;
use AuthOpera::Account;

my $app = AuthOpera::Account->new(); 

builder {

    enable_if { not APACHE_DEPLOYMENT }
        'Plack::Middleware::Static', 
        path => qr{^/(bitmaps/|images/|js/|css/|downtime/|favicon.ico$|ping.html$)},
        root => '..',
        ;

    mount "/account" => builder {

        enable_if { ENABLE_DEBUG_PANELS } 'StackTrace';
        enable_if { ENABLE_DEBUG_PANELS } 'Debug';   # panels => [ qw(DBITrace Memory) ];
        enable_if { ENABLE_DEBUG_PANELS } 'Lint';
        enable_if { ENABLE_DEBUG_PANELS } 'Runtime';
        enable_if { ENABLE_ACCESS_LOG   } 'AccessLog';

        $app;

    }

}

Of course, the main application code is not here, but in the AuthOpera::Account class. That's not really relevant to what we're discussing here. Let's just say that any class, to be a valid and complete PSGI application, has to:

  • subclass from Plack::Component
  • have a call() method
  • the call() method must return a valid PSGI response. Example:
    package MyPSGIApp;
    
    use strict;
    use Data::Dumper ();
    use parent 'Plack::Component';
    
    sub call {
        # $env is the full PSGI environment
        my ($self, $env) = @_;
    
        return [
    
            # HTTP Status code
            200,
    
            # HTTP headers as arrayref
            [ 'Content-type' => 'text/html' ],
    
            # Response body as array ref
            [ '<!DOCTYPE html>',
              '<body><h1>Hello world</h1><pre>',
              Data::Dumper::Dumper($env),
              '</pre></body></html>',
            ],
        ];
    }
    
    1;
    

That's it, this is a full PSGI application that dumps its entire PSGI environment.
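Underneath the class, what PSGI specifies is just a code reference that takes the environment hash and returns the three-element response. The following sketch needs no Plack at all and calls the application by hand with a made-up, heavily abbreviated environment; a real server passes many more keys.

```perl
use strict;
use warnings;

# A PSGI application at its simplest: a code reference.
my $hello_app = sub {
    my ($env) = @_;

    my $path = defined $env->{PATH_INFO} ? $env->{PATH_INFO} : '/';

    return [
        200,                                    # HTTP status code
        [ 'Content-Type' => 'text/plain' ],     # headers as array ref
        [ "Hello from $path\n" ],               # body as array ref
    ];
};

# Call it directly, the way a PSGI server would (abbreviated env).
my $response = $hello_app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/account' });

print "Status: $response->[0]\n";
print @{ $response->[2] };
```

Being able to call the application like a plain function is also what makes PSGI applications pleasant to unit test.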

Of course in a real example, you probably want a template engine to return the page content, etc… That's what we are building for our applications. Actually just assembling the components we already have developed during these years, so we have template classes, config classes, localization, database access, etc…

So we're basically just gluing these ready made components inside the PSGI application, and then using them. I don't think this is particularly original, but it allows us to quickly "port" our code to PSGI and thus run anywhere we want to.

app.psgi in detail

Now, let's see the PSGI app in more detail.

use constant ENVIRONMENT         => 'development';
use constant APACHE_DEPLOYMENT   => (ENVIRONMENT eq 'production');
use constant ENABLE_ACCESS_LOG   => (ENVIRONMENT eq 'development');
use constant ENABLE_DEBUG_PANELS => (ENVIRONMENT eq 'development');

These constants are used to turn on and off certain features mentioned later in the builder {} block. I just found out the other day that these constants are nearly useless. That is because plackup and starman already provide a -E environment switch. If you start your application with:

starman -E development myapp.psgi     # same with plackup, the default server

then Plack will by default enable the debugging panels and the Apache-style access log. I found out about this after having written that file. This means that the following enable_ifs are unnecessary:

mount "/myroot" => builder {
    enable_if { ENABLE_DEBUG_PANELS } 'StackTrace';
    enable_if { ENABLE_DEBUG_PANELS } 'Debug';   # panels => [ qw(DBITrace Memory) ];
    enable_if { ENABLE_DEBUG_PANELS } 'Lint';
    enable_if { ENABLE_DEBUG_PANELS } 'Runtime';
    enable_if { ENABLE_ACCESS_LOG   } 'AccessLog';
    $app;
}

I think Plack enables by default at least StackTrace, Debug, and AccessLog. In my case, however, I'm also enabling Runtime and Lint. But more importantly, I need to differentiate between Apache deployment and Starman deployment. That affects the way static files are served.

When deploying under Apache, I don't need the following:

enable_if { not APACHE_DEPLOYMENT }
    'Plack::Middleware::Static',
    path => qr{^/(bitmaps/|images/|js/|css/|downtime/|favicon.ico$|ping.html$)},
    root => '..';

because my PSGI application is enabled in an Apache <Location> block, as in:

<Location /myroot/>
    SetHandler perl-script
    PerlResponseHandler Plack::Handler::Apache2
    PerlSetVar psgi_app /my/path/to/app.psgi
</Location>

So Apache already takes care of serving the static files for me. However, when running completely under Starman, I need to tell it which folders or paths need to be served as static files, and where they are located. This is the purpose of the Static middleware:

enable_if { not APACHE_DEPLOYMENT } 'Plack::Middleware::Static',
    path => qr{^/(images/|js/|css/|favicon.ico$)},
    root => '/var/www/something';

If you're always deploying through plackup or starman, then, again, you don't need any enable_if, just enable. Maybe it's also a good idea to put everything under /static. For me that wasn't possible, since I already had existing content:

enable 'Plack::Middleware::Static',
    path => qr{^/static/},
    root => '/var/www/something';

Plack::Builder

About the Plack::Builder bit, and the related builder function: it is a function that helps you specify what you want Plack to run and how. Example:

builder {
    enable 'StackTrace';
    enable 'Debug';
    enable 'AccessLog';
    $app
}

where StackTrace, Debug, and AccessLog are all middleware classes. This causes Plack to wrap your final $app application first with the AccessLog middleware, then Debug and then StackTrace. I didn't check the code, but I believe this creates 3 different PSGI applications that are meant to fiddle with the response that your own application generates.
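The wrapping order is easier to see with a toy version. This sketch does not use Plack; it fakes three middlewares as closures that only record when they are entered and left. The wrap helper and the log are invented for the example, and only the names are borrowed from the real middleware.

```perl
use strict;
use warnings;

my @log;

# Hypothetical helper: wrap a PSGI app in a closure that logs
# entry and exit, then passes the response through untouched.
sub wrap {
    my ($name, $inner) = @_;
    return sub {
        my ($env) = @_;
        push @log, "enter $name";
        my $res = $inner->($env);
        push @log, "leave $name";
        return $res;
    };
}

my $wrapped = sub {
    push @log, 'app';
    return [ 200, [ 'Content-Type' => 'text/plain' ], ['ok'] ];
};

# Wrap innermost first: AccessLog, then Debug, then StackTrace.
for my $name (qw(AccessLog Debug StackTrace)) {
    $wrapped = wrap($name, $wrapped);
}

$wrapped->({});
print "$_\n" for @log;
```

The request passes through StackTrace first and AccessLog last, and the response travels back in the opposite order, so the middleware enabled first in the builder block ends up outermost.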

PSGI makes this possible, and it's just great. More middleware means easier and faster development. And ultimately, very good middleware makes for great reuse too.

The mount wrapper

I used mount in my example in a very basic way, but you can use mount to assemble compounds of applications in a very simple way. The same thing you do, for example, with Django and urls.py, except that, if you have seen a non-trivial urls.py, it looks like spaghetti after a while. Compare with this:

my $app1 = MyApp->new();
my $app2 = MyApp2->new();
my $app3 = MyApp3->new();
#...

builder {

    enable 'Plack::Middleware::Static',
        path => qr{^/static/},
        root => '/var/www/something';

    mount "/path1" => builder {
        enable 'StackTrace';
        $app1;
    };

    mount "/path2" => $app2;

    mount "/path3" => builder {
        enable 'SomeMiddleware';
        $app3;
    };

}

Of course, then you have to add some dispatcher logic to your applications, but in the Plack world, we don't lack good dispatchers.
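To show what mounting boils down to, here is a toy prefix dispatcher in plain Perl. It is only a sketch of the idea, not how Plack implements it: the mount_map helper is invented, host-based mounts are not handled, and the mounted applications are trivial code references.

```perl
use strict;
use warnings;

# Hypothetical helper: build a dispatcher from a prefix => app map.
sub mount_map {
    my (%apps) = @_;

    # Try longer prefixes first so the most specific mount wins.
    my @prefixes = sort { length($b) <=> length($a) } keys %apps;

    return sub {
        my ($env) = @_;
        my $path = defined $env->{PATH_INFO} ? $env->{PATH_INFO} : '';

        for my $prefix (@prefixes) {
            next unless $path =~ m{^\Q$prefix\E(?:/|\z)};

            # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
            my %inner = (
                %$env,
                SCRIPT_NAME => ($env->{SCRIPT_NAME} || '') . $prefix,
                PATH_INFO   => substr($path, length($prefix)),
            );
            return $apps{$prefix}->(\%inner);
        }

        return [ 404, [ 'Content-Type' => 'text/plain' ], ['Not Found'] ];
    };
}

my $dispatcher = mount_map(
    '/path1' => sub { [ 200, [ 'Content-Type' => 'text/plain' ], ["app1 saw $_[0]{PATH_INFO}"] ] },
    '/path2' => sub { [ 200, [ 'Content-Type' => 'text/plain' ], ["app2 saw $_[0]{PATH_INFO}"] ] },
);

my $result = $dispatcher->({ PATH_INFO => '/path1/list' });
print $result->[2][0], "\n";
```

Each mounted application only sees the part of the path below its mount point, so it can do its own dispatching without knowing where it was mounted.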

Plack rocks.