Tag Archives: perl

Calling all Mojolicious users: patches welcome?

So you're using Mojolicious. Good. We started using it too, and it's great. We started having some patches lying around, which wouldn't be integrated into the mainline.

We're starting to reach a critical mass and I have been considering the idea of starting our own Mojolicious "branch". I'd like to know how many of you are in the same situation, and issue a call to action:

if you need or have needed patches to Mojolicious that for whatever reason were not integrated into the official repository, please contact me, leave a comment here or send me an email. I'd like to hear from you!

Fixed temporary files handling in HTTP::DAV

It's more than 3 years already that I took over maintainership for HTTP::DAV. I've been fixing several bugs, last one today (and 0.45 is just out on CPAN), and I have to say that it was a fantastic exercise, that I really suggest to anyone even moderately interested in open source development and improving their own programming skills.

Here's how it works:

  • Pick a CPAN distribution that has been put up for adoption, or one that your $work depends on (my case for HTTP::DAV)
  • Contact its author or current maintainer
  • Take a look at its RT queue (usually it's something like https://rt.cpan.org/Public/Dist/Display.html?Name=Some-Dist-Name
  • Pick whatever bug you fancy from the list
  • Write a test case for it, naming it t/RT_[ticket_number].t
  • Fix the bug in the code, and see your test case pass

That's what I've been trying to do with HTTP::DAV, that was back then completely unknown code to me. I hope the results are decent. At least there hasn't been any regression reported so far… :-)

Enjoy, https://github.com/cosimo/perl5-http-dav and https://metacpan.org/module/HTTP::DAV.

Got maintainership of Bookmarks::Parser, and gave back some Opera love to it

A new project just came in last week. We need to analyze Opera desktop builds (like this one) and extract all settings from them and populate a nice database.

We can do that quite easily since Opera stores most of its settings as either .ini or bookmarks (.adr) files.

In Perl land, we can use Config::IniFiles and Bookmarks::Parser to do most of this. We had to subclass Config::IniFiles to skip the non-standard Opera Preferences file ... first line, while Bookmarks::Parser, that includes an Opera-specific Bookmarks::Opera class, wasn't updated with all the latest Opera-specific properties we use in our Desktop builds.

Working on our first prototype of this build cataloger tool, we "patched" Bookmarks::Opera to do what we needed, but the solution that makes the most long-term sense is to bring back some love upstream, so have our patches in the CPAN version of Bookmarks::Parser. It's what makes the most sense to me anyway, so I usually always try to get in touch with the current author or maintainer.

That's what I did in this case too, with great results. I was given co-maintainer bit in less than 24 hours. Rest of the story is in the github repository now:

  • Fixed a couple of bad bugs in the Opera bookmarks parser
  • Added test cases for those bugs, plus a sample of our current Desktop build bookmarks file

Then, since I had a spare couple of hours during this weekend:

  • I fixed all bugs ever reported in the RT queue (2 LOL), which were more than 5 years old
  • Added some documentation love

So, now we'll just package the stock CPAN version of Bookmarks::Parser instead of maintaining our own patched version, plus whoever attempts to use it to work with Opera files will be happier.

Source code, as usual, up on https://github.com/cosimo/Bookmarks-Parser, and CPAN is now at v0.04.

Internationalization (i18n) with Mojolicious and Template Toolkit

In a previous post I talked about this new Mojolicious-based application that I've been working on, that btw was rolled out in production today (yay!)

Classic I18N with TT

One of the required features of this app was "i18n", internationalization. To be less vague, the requirement was to present the UI in different languages. We're using Template Toolkit, so our templates need to have strings marked in a special way to allow translation to kick-in at run-time. Usually in TT you do this with:

<html>
 
<head>
<title>[% l('This is the title of the page') %]</title>
</head>
 
<body>
<h1>[% l('Hello, world!') %]</h1>
<p>
[% l('Some text here') %]
</p>
</body>
</html>

so all the strings that have to be translated according to the user language have to be marked up with:

[% l('<your string here>') %]

Enter Mojolicious

Mojolicious includes a built-in I18N plugin that simplifies your life allowing the <% l('somestring)' %> syntax to work. That is, it gives you a l() helper.

Helper?

A helper is a method that it's available both as part of your controller object, and within templates.

Back to Mojolicious…

In the example helper syntax I wrote <% l('somestring)' %> because that's Mojolicious default templating system syntax. However, under Template Toolkit, you can't use that syntax! You have to pass through an extra level, as in:

<!-- This is my TT template -->
[% c.l('<your string here>') %]

I'm not exactly sure why that c. is required, but that's how it is.

I18N workflow: extracting the strings

Everything would be fantastic, except there's one tricky problem. After you worked so hard on your TT templates, now it's time to collect all the marked up strings, presumably to build a .PO file to be shipped to translation agencies or whatever system you're using for that. More on that later.

In the Perl world, there is an equivalent of GNU xgettext, which is xgettext.pl. This tool is part of the Locale::Maketext::Lexicon CPAN distribution, which is kind of "the standard" way to i18n in Perl. Or it is for us here anyway since we started building i18n for my.opera.com in 2008.

The tricky problem is that even though xgettext.pl understands quite a few syntax variants, it didn't understand [% c.l('string') %]. After a few Perl debugger sessions, I managed to teach Locale::Maketext::Extract::Plugin::TT2 how to parse Mojolicious-style syntax. I knew that Clinton Gormley, the maintainer of L::M::Lexicon had a source repository for it on Github, so I forked his repository and pushed my changes on a dedicated branch.

CPAN, Github and the Community

This is where the Github + CPAN model really shines. You're using a CPAN module. You stumble on a problem. Fix the problem. Find its repository on Github. Fork it, push your fix, and if you're lucky, you have your fix merged and out on CPAN the same day.

This is what actually happened. Clinton got in touch the very same day I sent him the pull request and later pushed out the changes on CPAN. If you ask me, that's just awesome. I wish everything worked that way :)

Closing the i18n workflow

Fixed the c.l() problem, everything else was easier. xgettext.pl allows you to collect strings from your code and templates and build a master .PO file with all the strings. Then msgmerge, a standard GNU gettext tool, allows you to take the generated master PO file and merge it with any existing language-specific PO if any. If you don't have any, just copy the master PO file (usually called POT, or reference PO file) to <language>.po and start translating.

Last step is either:

  • compiling the .po files to .mo, a lookup-optimized form of the .po file
  • creating the "lexicon" files. In the Perl world, these are nothing more than Perl modules with a %Lexicon hash that contains all string IDs and their translations

We're long time fans of the latter approach, so our lexicon files look like this:

package AuthOpera::Locale::it;
 
use strict;
use utf8;
use base qw(AuthOpera::Locale);
 
### LEXICON STARTS HERE (don't remove this line)
our %Lexicon = (
 
    # Automatic fallback to string ID when no translation available
    _AUTO => 1,
 
    # String IDs                  # Translations
    "Application name:"        => "Nome dell'applicazione:",
    "Application registration" => "Registrazione dell' applicazione",
    "Data provider:"           => "Provider dei dati:",
 
    # ...
);
### LEXICON ENDS HERE
 
1;

and we use a simple subclass of Locale::PO to read the PO file in memory and write back a lexicon based on a fixed template, hence the ### LEXICON lines above.

Transifex

Currently we also use Transifex, that allows to have external translators contribute to PO files directly from a web page, and if you configure it to do so, commit straight to your source code repository. You can then trigger automated builds of the lexicon files, having completed the full i18n workflow.

I find this system pretty simple but at the same time fully automated and very powerful. I'd love to hear comments or feedback about this stuff, especially from people adopting a different process.

Perl6 LWP::Simple now uses the URI module for added awesome

My little LWP::Simple module for Perl 6 is growing beyond my control. I just merged a a pull request with a patch to remove all existing URL parsing to replace it with the new URI class.

That's fantastic, because:

  • it reduces the amount of code in LWP::Simple
  • the URI module is based on the actual grammar for URIs and IPv6 addresses. Plain and simple. It doesn't adhere to the standard. It is the standard. Isn't that amazing?

So, just clicky clicky and the pull request was merged. Running the test suite confirmed that everything still works fine passing all tests. Can it get much better than this? :)

If you want to have a look at Perl 6 code, modules.perl6.org is the place to go for inspiration. Have fun!

Using Template Toolkit with Mojolicious

For an upcoming project, I decided to try and use Mojolicious in production. That would be the first time, so I'm quite excited to see what's going to happen.

A few days ago I wrote some sample application that just loads a basic Template Toolkit template and renders it, and benchmarked it using both:

  • mod_perl and Plack::Handler::Apache2 and,
  • using starman as self-contained HTTP server running the psgi application

I have to say that I was quite impressed with the performance level of Starman. I got 1,000+ (a thousand plus) requests per second without the server even breaking a sweat. The command line, just in case, was:

starman --workers 32 ./app.psgi

Anyway, back to using TT. I found myself searching for recipes on how to use TT with Mojolicious because there wasn't a clear documented answer on how to do it, or at least I didn't find it. An example of what I came up with follows.

Step 1: the Mojolicious application class

First you have to create your application class. You should probably use the script that generate the basic skeleton for you. There's nice documentation on how to do that. My class looks like this:


package My::PSGI::App;

use strict;
use base 'Mojolicious';
use Opera::Config;

sub startup {
    my $self = shift;
    
    $self->secret('some-secret-random-string');

    # Our internal configuration system
    my $conf = Opera::Config->new();
    my $tmpl_dir = $conf->get('Template:include_dir');
    my $cache_dir = $conf->get('Template:cache_dir');

    # Tell Mojolicious we want to load the TT renderer plugin
    $self->plugin(tt_renderer => {
        template_options => {
            # These options are specific to TT
            INCLUDE_PATH => $tmpl_dir,
            COMPILE_DIR => $cache_dir,
            COMPILE_EXT => '.ttc',
            # ... anything else to be passed on to TT should go here
        },
    });

    $self->renderer->default_handler('tt');

    my $r = $self->routes;

    # Your routes should go here
    $r->route('/login')->to('account#login');
    # ... and so on ...

}

1;

To have your TT templates picked up, you only need a few more things.

Mojolicious::Plugin::TtRenderer

When you declare that you want to load the tt_renderer plugin (see above, $self->plugin(tt_renderer=>...)), then Mojolicious will "camelize" the tt_renderer string, turn it into Mojolicious::Plugin::TtRenderer, and try to load that plugin, if available.

Turns out there was a MojoX::Renderer::TT CPAN module that also contained a class called Mojolicious::Plugin::TtRenderer. I said there was because Sebastian Riedel, the main developer of Mojolicious had in the meantime deprecated the MojoX namespace.

Since we're building the modules we want to use in production as deb packages, we would have run the risk to package MojoX::Renderer::TT to have it changed later because of this namespace conflict. To avoid this, I decided to fork its repository and put together a patch to remove the use of the MojoX:: namespace. With this, I hoped to get the thing done and hopefully picked up quickly by the maintainer of MojoX::Renderer::TT.

Turned out that he was super responsive (thanks Ask!) to merge the change and release it to CPAN, so ladies and gentlemen, I hereby announce we have Mojolicious::Plugin::TtRenderer 1.20+ out!

In fact, the old deprecated MojoX:: module is still there, just don't use it, and install Mojolicious::Plugin::TtRenderer instead.

Templates naming

Another thing you need for TT to work out of the box is that your templates should(*) be named sometemplate.html.tt. (*) probably you can deviate from this convention, I just don't know yet.

Your controller should specify TT as the renderer

UPDATE: this is not needed. If you're using:

$self->renderer->default_handler('tt');

in your main application class, then you won't need to specify format and handler in every controller.

Again, not sure it's really needed (no, it's not, read above), check before you copy/paste. Here's a simple action from one of my controllers (following the previous example):


package My::PSGI::App::Account;
 
use strict;
use base 'Mojolicious::Controller';
 
sub login {
    my $self = shift;
    
    $self->render(
        template => 'path/to/template', # *without* .html.tt
        format   => 'html',
        handler  => 'tt',
    );
    
}

1;

That should be it: have fun!

EDIT: Thanks Robert for the suggestions.

How to detect TCP retransmit timeouts in your network

Some months ago, while investigating on a problem in our infrastructure, I put together a small tool to help detecting TCP retransmits happening during HTTP requests.

TCP retransmissions can happen, for example, when a client sends a SYN packet to the server, the server responds with a SYN-ACK but, for any reason, the client never receives the SYN-ACK. In this case, the client correctly waits for a given time, called the TCP Retransmission Timeout. In the usual case, this time is set to 3 seconds.

There's probably a million reasons why the client may never receive a SYN-ACK. The one I've seen more often is packet loss, which in turn can have lots of reasons, for example a malfunctioning or misconfigured network switch.

However, you can immediately spot if your timeout/hang problems are caused by TCP retransmission because they happen to cause response times that are unusually frequently distributed around 3, 9 and 21 seconds (and on, of course).

In fact, the TCP retransmission timeout starts at 3 seconds, but if the client tries to resend after a timeout and still receives no answer, it doubles the wait to 6 s, so the total response time will be 9 seconds, assuming that the client now finally receives the SYN-ACK. Otherwise, 3 + 6 + 12 = 21, then 3 + 6 + 12 + 24 = 45 s and so on and so forth.

So, this little tool fires a quick batch of HTTP requests to a given server and measures the response times, highlighting slow responses (> 0.5s). If you see that the reported response times are 3.002s, 9.005s or similar, then you are probably in presence of TCP retransmission and/or packet loss.

Finally, here it is:


#!/usr/bin/env perl
#
# https://gist.github.com/1101500
#
# Fires HTTP request batches at the specified hostname
# and analyzes the response times.
#
# If you have suspicious frequency of 3.00x, 9.00x, 21.00x
# seconds, then most probably you have a problem of packet loss
# in your network.
#
# cosimo@opera.com, sometime in 2011
#

use strict;
use LWP::UserAgent ();
use Time::HiRes ();

$| = 1;

my $ua = LWP::UserAgent->new();
$ua->agent("$0/0.01");

# Tests this hostname
my $server = $ARGV[0] || die "Usage: $0 <hostname>n";

# Picks the URLs to test in this list, one after the other
my @url_pool = qw(
	/ping.html
);

my $total_reqs = 0;
my $total_elapsed = 0.0;
my $n_pick = 0;
my $url_to_fire;

my $max_elapsed = 0.0;
my $max_elapsed_when = '';
my $failed_reqs = 0;
my $slow_responses = 0;
my $terminate_now = 0;

sub output_report {
	print "Report for:            $server at " . localtime() . "n";
	printf "Total requests:        %d in %.3f sn", $total_reqs, $total_elapsed;
	print "Failed requests:       $failed_reqsn";
	print "Slow responses (>1s):  $slow_responses (slowest $max_elapsed s at $max_elapsed_when)n";
	printf "Average response time: %.3f s (%.3f req/s)n", $total_elapsed / $total_reqs, $total_reqs / $total_elapsed;
	print "--------------------------------------------------------------------n";
	sleep 1;
	return;
}

$SIG{INT} = sub { $terminate_now = 1 };

while (not $terminate_now) {

	$url_to_fire = "http://" . $server . $url_pool[$n_pick];

	my $t0 = [ Time::HiRes::gettimeofday() ];
	my $resp = $ua->get($url_to_fire);
	my $elapsed = Time::HiRes::tv_interval($t0);

	$failed_reqs++ if ! $resp->is_success;

	$total_reqs++;
	$total_elapsed += $elapsed;

	if ($elapsed > $max_elapsed) {
		$max_elapsed = $elapsed;
		$max_elapsed_when = scalar localtime;
		printf "[SLOW] %s, %s served in %.3f sn", $max_elapsed_when, $url_to_fire, $max_elapsed;
	}

	$slow_responses++ if $elapsed >= 0.5;
	$n_pick = 0       if ++$n_pick > $#url_pool;
	output_report()   if $total_reqs > 0 and ($total_reqs % 1000 == 0);

}
continue {
    Time::HiRes::usleep(100000);
}

output_report();

# End

It's also published here on Github, https://gist.github.com/1101500. Have fun!

Matching IPv6 addresses with Regexp::Common

I wish Regexp::Common had a $RE{net}{IPv6} regular expression, but it doesn't (yet).

So I tried to implement this myself, but ripped off the IPv6 matching bit from the existing Regexp::IPv6 which happens to have a working IPv6 regular expression with a reasonable test suite. Now, why Regexp::IPv6 is not part of Regexp::Common?

By the way, I'll copy/paste the full regular expression to match IPv6 addresses, just for fun:

:(?::[0-9a-fA-F]{1,4}){0,5}(?:(?::[0-9a-fA-F]{1,4}){1,2}|:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})))|[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:
[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}|:)|(?::(?:[0-9a-fA-F]{1,4})?|(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2}))))|:(?:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})
[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|[0-9a-fA-F]{1,4}(?::[0-9a-fA-F]
{1,4})?|))|(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|
2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|:[0-9a-fA-F]{1,4}(?::(?:(?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){0,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,2}(?::(?:
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,3}
(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|
[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4})
{0,4}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))

Of course, I will never be able to tell if it's right or wrong, but the fact is that it passes the test suite :)
However, the actual code is not like that: it generates the full regular expression from a few components. Anyway, I've pushed a ipv6 branch on my fork of Regexp::Common. I hope it will be included soon in Regexp::Common or improved it enough to be included in it, so we can finally match IPv6 addresses with:


use Regexp::Common;

my $addr = '2001:0db8:0000:0000:0000:0000:1428:57ab';
if ($addr =~ $RE{net}{IPv6}) {
    print "Yes, it is an IPv6 address";
}
else {
    print "No, it isn't";
}

Continuous integration of Perl-based projects in Hudson/Jenkins

I didn't find massive amounts of information about how to link any Perl-based project to Jenkins for continuous integration, but there's a few presentations on Slideshare that carry some nice ideas.

While some older pages say that "there's no out-of-the-box integration, etc…", I think there is. A very simple, very straight-forward way to integrate any (Perl) project that uses TAP into Jenkins.

TAP::Harness::JUnit

Here we go then:

TAP::Harness::JUnit will capture all the standard TAP output and turn it into the default JUnit XML output that Jenkins expects. And you don't need to do anything to make this happen. How cool is that? Read below.

Build instructions

You need to instruct Jenkins on how to build your project. So, in the "Build" panel, I usually put:

prove -I ./lib -v

If you don't use prove, be ashamed and start using it :) You'll never look back. So, getting Jenkins to understand TAP is just a matter of modifying that command to read:

prove -I ./lib -v --harness=TAP::Harness::JUnit

Here's the actual Build panel screenshot:

That's it. prove will produce a junit_output.xml file with the JUnit-compatible XML output that corresponds to the standard TAP output.

Post-build actions

Now you need to tell Jenkins that the file is actually there. I'm not sure why, but this is not automatic. You need to tell it to "Publish JUnit test results". Now, if you ask me that's totally surprising, but it works. So:

That should be it. Run your build and you should see your tests output picked up.

Geo::IP support for IPv6 geolocation

We're currently looking into IPv6-enabling our services. One of the missing bits is being able to geolocate IPv6 client addresses. We're using the MaxMind GeoIP database. The main Perl library for this is Geo::IP.

The current version of Geo::IP out on CPAN, 1.38, does not support IPv6 lookups. I contacted the maintainer of Geo::IP asking for more information. In the meanwhile, I hacked together just enough of IPv6 support to be able to successfully geolocate a test address. Later on, I discovered that IPv6 is already available in the hopefully soon-to-be-released version of Geo::IP archived at Sourceforge.

Let's hope it lands on CPAN soon. In the meantime, if you really really want, you can try out my changes against CPAN v1.38. It was enough for me to start testing the integration with our other code and projects.

My Geo::IP with IPv6 support