How to detect TCP retransmit timeouts in your network

Some months ago, while investigating on a problem in our infrastructure, I put together a small tool to help detecting TCP retransmits happening during HTTP requests.

TCP retransmissions can happen, for example, when a client sends a SYN packet to the server, the server responds with a SYN-ACK but, for any reason, the client never receives the SYN-ACK. In this case, the client correctly waits for a given time, called the TCP Retransmission Timeout. In the usual case, this time is set to 3 seconds.

There's probably a million reasons why the client may never receive a SYN-ACK. The one I've seen more often is packet loss, which in turn can have lots of reasons, for example a malfunctioning or misconfigured network switch.

However, you can immediately spot if your timeout/hang problems are caused by TCP retransmission because they happen to cause response times that are unusually frequently distributed around 3, 9 and 21 seconds (and on, of course).

In fact, the TCP retransmission timeout starts at 3 seconds, but if the client tries to resend after a timeout and still receives no answer, it doubles the wait to 6 s, so the total response time will be 9 seconds, assuming that the client now finally receives the SYN-ACK. Otherwise, 3 + 6 + 12 = 21, then 3 + 6 + 12 + 24 = 45 s and so on and so forth.

So, this little tool fires a quick batch of HTTP requests to a given server and measures the response times, highlighting slow responses (> 0.5s). If you see that the reported response times are 3.002s, 9.005s or similar, then you are probably in presence of TCP retransmission and/or packet loss.

Finally, here it is:


#!/usr/bin/env perl
#
# https://gist.github.com/1101500
#
# Fires HTTP request batches at the specified hostname
# and analyzes the response times.
#
# If you have suspicious frequency of 3.00x, 9.00x, 21.00x
# seconds, then most probably you have a problem of packet loss
# in your network.
#
# cosimo@opera.com, sometime in 2011
#

use strict;
use LWP::UserAgent ();
use Time::HiRes ();

$| = 1;

my $ua = LWP::UserAgent->new();
$ua->agent("$0/0.01");

# Tests this hostname
my $server = $ARGV[0] || die "Usage: $0 <hostname>n";

# Picks the URLs to test in this list, one after the other
my @url_pool = qw(
	/ping.html
);

my $total_reqs = 0;
my $total_elapsed = 0.0;
my $n_pick = 0;
my $url_to_fire;

my $max_elapsed = 0.0;
my $max_elapsed_when = '';
my $failed_reqs = 0;
my $slow_responses = 0;
my $terminate_now = 0;

sub output_report {
	print "Report for:            $server at " . localtime() . "n";
	printf "Total requests:        %d in %.3f sn", $total_reqs, $total_elapsed;
	print "Failed requests:       $failed_reqsn";
	print "Slow responses (>1s):  $slow_responses (slowest $max_elapsed s at $max_elapsed_when)n";
	printf "Average response time: %.3f s (%.3f req/s)n", $total_elapsed / $total_reqs, $total_reqs / $total_elapsed;
	print "--------------------------------------------------------------------n";
	sleep 1;
	return;
}

$SIG{INT} = sub { $terminate_now = 1 };

while (not $terminate_now) {

	$url_to_fire = "http://" . $server . $url_pool[$n_pick];

	my $t0 = [ Time::HiRes::gettimeofday() ];
	my $resp = $ua->get($url_to_fire);
	my $elapsed = Time::HiRes::tv_interval($t0);

	$failed_reqs++ if ! $resp->is_success;

	$total_reqs++;
	$total_elapsed += $elapsed;

	if ($elapsed > $max_elapsed) {
		$max_elapsed = $elapsed;
		$max_elapsed_when = scalar localtime;
		printf "[SLOW] %s, %s served in %.3f sn", $max_elapsed_when, $url_to_fire, $max_elapsed;
	}

	$slow_responses++ if $elapsed >= 0.5;
	$n_pick = 0       if ++$n_pick > $#url_pool;
	output_report()   if $total_reqs > 0 and ($total_reqs % 1000 == 0);

}
continue {
    Time::HiRes::usleep(100000);
}

output_report();

# End

It's also published here on Github, https://gist.github.com/1101500. Have fun!

Capture a webcam image with a simple bash script

I did that a few months ago to make a poor man's time-lapse video. This one. I said I'd post how to do it somewhere. Here it is.

This is a bash script that will capture an image from a webcam or similar video acquisition device. Works on Linux only I guess. Makes use of gstreamer and video for linux.


#!/bin/bash

DEST_DIR="$HOME/Pictures/Webcam"

if [ ! -e $DEST_DIR ]; then
    mkdir $DEST_DIR
fi

DAY_DIR="$DEST_DIR/`date +'%Y%m%d'`"
if [ ! -e $DAY_DIR ]; then
    mkdir $DAY_DIR
fi

PIC_FILE="$DAY_DIR/webcam-`date +'%H%M%S'`.jpg"
gst-launch v4l2src num-buffers=1 ! jpegenc ! filesink location=$PIC_FILE

Save as ~/bin/camera-snapshot, and put it in crontab every minute if you want :)
After a while you will have a folder filled with pictures.

If you also want to assemble them into a movie file, use the excellent ffmpeg.

How to tag a remote git repository or… vcs support for fabric

With svn, you can tag a remote repository with:

svn cp http://{your-svn-server}/svn/{project}/trunk http://{your-svn-server}/svn/{project}/tags/{tag-name}

or if you're already in a working copy:

svn cp ^/{project}/trunk ^/{project}/tags/{tag-name}

The latter case assumes you have a working copy already checked out, but the first case is more interesting for what I needed.

Tagging when deploying

Lately I've been working on some deployment tools in the form of a few fabric classes. One of the things I want to do when launching a production deployment is auto-tagging the repository with the new build name.

The tag naming I went for is something like:

<project_name> - <date> - <time> - <who_deployed>

Example:

geodns-20110409-133701-cosimo

Every time there's a new production deployment using these tools, the
repository / revision that is being deployed is tagged with names like those.
The plan to use this added metadata for a "deployment console", but I didn't
have time to do anything about it yet.

vcs.py

Having planned the move from svn to git, I had to add a thin abstraction to the
fabric deployment classes to make sure that when the repository url changed from
svn to git, nothing really changed from the deployment point of view.

I ended up with a generic vcs.py class for fabric that implements vcs-related actions such as:

  • exporting a remote repository to a local directory
  • listing available tags on a remote repository
  • tagging a remote repository

This means I had to find out how to do these things in both svn and git.

Exporting a remote repository

With svn:

svn export [--force] http://svn.server/project/trunk /your/local/dir

and you can use --force if the local directory already exists, or svn will refuse to do it by default.

Git requires an intermediate step:

git archive --prefix=some-dir-name/ --remote=git.server:/var/git/project.git master | tar xvC /path/where/to/export

Listing available tags (remotely)

With svn:

svn list http://svn.server/project/tags/

With git:

git ls-remote --tags git.server:/var/git/project.git

Thanks to my colleague Alfie for the ls-remote tip.

Tagging a remote URL

I mentioned how you do it with svn:

svn cp http://svn.server/project/trunk http://svn.server/project/tags/tagname

What about git though? I searched a bit around, and I found no git command to
directly tag a remote repository.

I looked at the Jenkins git plugin source code but AFAICS there's no magical way to do it, so I figured out I would just clone the remote repository, tag locally and then push the tag to origin.

In theory, this should be just fine, except it has some drawbacks:

  • Execution time: if the remote repository is very large, we need to clone it first, and that can take a long time.
  • Size: when cloning a large git repository, the local copy will take up disk space for nothing. We don't need it, as we just want to tag the remote repository.

Not sure this is the best thing to do, but what I'm using right now is:

  • Cloning with --depth=1:

    git clone has a --depth option that limits the amount of history that is cloned. In this case, we don't need any history, so --depth=1 is great:

    git clone --depth=1 <git-remote-url> <local-dir>

    Example:

    git clone --depth=1 git.server:/var/git/project.git /var/tmp/deploy.$USER.$$
  • Tagging locally:
    cd /var/tmp/deploy.$USER.$$
    git tag -as <tag-name>
    
  • Pushing the tag remotely:
    git push origin --tags
  • Removing the temporary local copy:
    rm -rf /var/tmp/deploy.$USER.$$

That's it. Not very brilliant, but works great for now. If you know of a better way to tag a remote git repository, or some existing work on these things, please get in touch or add a comment below. Thanks! :)

A command line tool for Debian to purge Varnish objects

I've been using varnish mostly on Debian systems. I found the reload-vcl script included in Debian to be useful.

The reload-vcl script

It's part of the standard varnish debian package. It uses the system defaults in /etc/defaults/varnish, so it knows how to correctly invoke the varnishadm utility to perform administrative commands. As the name implies, it reloads the default VCL file using the vcl.load and vcl.use commands, checking that every step succeeds properly before continuing so it's safe to use. It loads the new VCL file and labels it automatically with a unique id.

Something analogous but regarding the purge functionality could have been useful, so I looked at the source code for reload-vcl. Most of it deals with loading of /etc/defaults/varnish and various sanity checks. I reused that bit to make another script, to control cache purging.

The purge-cache script

Here's the full source code. Below there's a link to download the latest version from github.


#!/bin/sh

# purge-cache: Script to purge varnish cache. Defaults are defined in
# /etc/default/varnish.
#
# Cosimo <cosimo@cpan.org>
# Based on reload-vcl, by Stig Sandbeck Mathisen <ssm at debian dot org>

# Settings
defaults=/etc/default/varnish

# Paths
varnishadm=/usr/bin/varnishadm
date=/bin/date 
tempfile=/bin/tempfile

# Messages
# msg_no_varnishadm: varnishadm
msg_no_varnishadm="Error: Cannot execute %sn"
msg_no_management="Error: $DAEMON_OPTS must contain '-T hostname:port'n"
# msg_defaults_not_readable: defaults
msg_defaults_not_readable="Error: %s is not readablen"
# msg_defaults_not_there: defaults
msg_defaults_not_there="Error: %s does not existn"
msg_usage="Usage: $0 [-h][-q][-u <url>|-r <regex>|-a]nt-htdisplay helpnt-qtbe quietnt-utpurge by exact (relative) url (ex.: /en/products/)nt-rtpurge objects with URL matching a regex (ex.: ^/blogs/)nt-atpurge all objects from cachen"
msg_purge_failed="Error: purge command failedn"
# msg_purge_url: url
msg_purge_url="Purging objects by exact url: %sn"
# msg_purge_regex: regex
msg_purge_regex="Purging objects with URL matching regex: %sn"
msg_purge_all="Purging all cachen"
msg_purge_ok="Purge command successfuln"

# Load defaults file
if [ -f "$defaults" ]
then
    if [ -r "$defaults" ]
    then
        . "$defaults"
    else
        printf >&2 "$msg_defaults_not_readable" $defaults
        exit 1 
    fi
else
    printf >&2 "$msg_defaults_not_there" $defaults
    exit 1
fi

# parse command line arguments
while getopts hqu:r:a flag
do
    case $flag in
        h)
            printf >&2 "$msg_usage"
            exit 0
            ;; 
        u)
            purge_method=url
        url="$OPTARG"
            ;; 
        r)
            purge_method=regex
        regex="$OPTARG"
            ;; 
        a)
            purge_method=all
            ;; 
        q)
            quiet=1
            ;; 
        *)
            printf >&2 "$msg_usagen"
            exit 1
            ;; 
    esac
done

# Parse $DAEMON_OPTS (options must be kept in sync with varnishd).
# Extract the -f and the -T option, and (try to) ensure that the
# management interface is on the form hostname:address
OPTIND=1
while getopts a:b:dFf:g:h:l:n:P:p:s:T:t:u:Vw: flag $DAEMON_OPTS
do
    case $flag in
        f)
            if [ -f "$OPTARG" ]; then
                vcl_file="$OPTARG"
            fi 
            ;; 
        T)
            if [ -n "$OPTARG" -a "$OPTARG" != "${OPTARG%%:*}" ]
                then
                mgmt_interface="$OPTARG"
            fi  
            ;;  
    esac
done

# Sanity checks 
if [ ! -x "$varnishadm" ]
then
    printf >&2 "$msg_no_varnishadm" $varnishadm
    exit 1
fi

if [ -z "$mgmt_interface" ]
then
    printf >&2 "$msg_no_management"
    exit 1
fi

logfile=$($tempfile)
purge_command="vcl.list"

# Now run the purge command against the admin interface
if [[ $purge_method = "url" ]]
then
        purge_command="purge req.url == $url"
        printf >&2 "$msg_purge_url" $url | grep -v "^$" > $logfile
else
    if [[ $purge_method = "regex" ]]
    then
        purge_command="purge.url $regex"
        printf >&2 "$msg_purge_regex" $regex | grep -v "^$" > $logfile
    else
        if [[ $purge_method = "all" ]]
        then
            purge_command="purge.url ."
            printf >&2 "$msg_purge_all" | grep -v "^$" > $logfile
        fi
    fi
fi

# For some reason, using:
#
#   fi | grep -v "^$" > $logfile
#
# results in purge_command assignment being wiped out
# at the end of the block??

if [ -z "$purge_command" ]
then
    printf >&2 "$msg_usagen"
    exit 1
fi

# echo "cmd: $varnishadm -T $mgmt_interface $purge_command"

if $varnishadm -T $mgmt_interface $purge_command
then
    printf >&2 "$msg_purge_ok"
else
    printf >&2 "$msg_purge_failed"
    exitstatus=1
fi | grep -v "^$" > $logfile

# Blather
if [ -z "${quiet}" -o -n "$exitstatus" ]
then
    cat >&2 $logfile
fi

# Cleanup
rm -f $logfile  
exit $exitstatus

You can control how objects are purged from the cache with 3 options:

  • -a: purges all objects
  • -u <url>: purges an exact url
  • -r <regexp>: purges objects matching a regular expression
  • Examples

    
    # Purges all objects
    purge-cache -a
     
    # Purges all objects starting with "/products"
    purge-cache -r '^/products'
    
    # Purges objects with exact URL
    purge-cache -u '/en/homepage'
    

    Goal: no downtime

    Both reload-vcl and purge-cache can be combined together in a single script to be triggered when deploying new VCL code or new backend applications. Instead of restarting varnish, which I really don't like, and it's not very reliable either (on Debian sometimes it won't come back up), I use purge-cache -a to purge all objects and then reload-vcl to load and use the newly deployed VCL code.

    This procedure has no downtime at all. The effect of purging all objects can potentially be hard on the backends, but we're not at that point yet. Usually in the busiest applications we have, it takes around 10-20 seconds to reach 70%-75% of hit rate, so I would say that's not really a problem right now.

    Download!

    You can download the purge-cache script from github. I contacted the maintainer of the reload-vcl script. Maybe he will include purge-cache in the next release of the varnish debian package… or maybe I could package it as a Perl CPAN module.

Matching IPv6 addresses with Regexp::Common

I wish Regexp::Common had a $RE{net}{IPv6} regular expression, but it doesn't (yet).

So I tried to implement this myself, but ripped off the IPv6 matching bit from the existing Regexp::IPv6 which happens to have a working IPv6 regular expression with a reasonable test suite. Now, why Regexp::IPv6 is not part of Regexp::Common?

By the way, I'll copy/paste the full regular expression to match IPv6 addresses, just for fun:

:(?::[0-9a-fA-F]{1,4}){0,5}(?:(?::[0-9a-fA-F]{1,4}){1,2}|:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})))|[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:
[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}|:)|(?::(?:[0-9a-fA-F]{1,4})?|(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2}))))|:(?:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})
[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|[0-9a-fA-F]{1,4}(?::[0-9a-fA-F]
{1,4})?|))|(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|
2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|:[0-9a-fA-F]{1,4}(?::(?:(?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){0,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,2}(?::(?:
(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?
[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,3}
(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|
[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4})
{0,4}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4]
[0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))

Of course, I will never be able to tell if it's right or wrong, but the fact is that it passes the test suite :)
However, the actual code is not like that: it generates the full regular expression from a few components. Anyway, I've pushed a ipv6 branch on my fork of Regexp::Common. I hope it will be included soon in Regexp::Common or improved it enough to be included in it, so we can finally match IPv6 addresses with:


use Regexp::Common;

my $addr = '2001:0db8:0000:0000:0000:0000:1428:57ab';
if ($addr =~ $RE{net}{IPv6}) {
    print "Yes, it is an IPv6 address";
}
else {
    print "No, it isn't";
}

“Loadsnake” AKA the Novell Netware snake screensaver clone

For those that didn't have the pleasure to see the old Novell Netware snakes screensaver, I'll say here that it was the default Netware screensaver, in console/text mode. It showed one snake for each CPU you had (99.9% of people had just 1 really). The cool thing is that the snakes became longer and longer as your server load increased. They also started going faster.

Anyway, this is one of those little time-wasting projects that usually go nowhere. I started working on a clone of this Netware screensaver in 2007. I remember I wanted to figure out how to write an xscreensaver "hack", so I spent a weekend looking at the source code for all the existing hacks, and I picked popsquares.c as a base and started tearing it apart, injecting dubious amounts of crappy C code until it did what I wanted.

Fast forward 4 years. Yesterday, for some reason, I got back to it, cleaned up the code a bit, and implemented a "fantastic" new feature I've always wanted: different snake colors for every different CPU, instead of all snakes being red. So I did, and the result is, well, see it for yourself:

Source code, but don't take inspiration from it, please… :) is up on github at http://github.com/cosimo/xscreensaver-loadsnake. You can also download the xscreensaver binary module if you want (only for Linux x86_64), as compiling it requires a bit of fiddling on the xscreensaver source code.

I have to admit that it's cool to run your own screensaver :)

Continuous integration of Perl-based projects in Hudson/Jenkins

I didn't find massive amounts of information about how to link any Perl-based project to Jenkins for continuous integration, but there's a few presentations on Slideshare that carry some nice ideas.

While some older pages say that "there's no out-of-the-box integration, etc…", I think there is. A very simple, very straight-forward way to integrate any (Perl) project that uses TAP into Jenkins.

TAP::Harness::JUnit

Here we go then:

TAP::Harness::JUnit will capture all the standard TAP output and turn it into the default JUnit XML output that Jenkins expects. And you don't need to do anything to make this happen. How cool is that? Read below.

Build instructions

You need to instruct Jenkins on how to build your project. So, in the "Build" panel, I usually put:

prove -I ./lib -v

If you don't use prove, be ashamed and start using it :) You'll never look back. So, getting Jenkins to understand TAP is just a matter of modifying that command to read:

prove -I ./lib -v --harness=TAP::Harness::JUnit

Here's the actual Build panel screenshot:

That's it. prove will produce a junit_output.xml file with the JUnit-compatible XML output that corresponds to the standard TAP output.

Post-build actions

Now you need to tell Jenkins that the file is actually there. I'm not sure why, but this is not automatic. You need to tell it to "Publish JUnit test results". Now, if you ask me that's totally surprising, but it works. So:

That should be it. Run your build and you should see your tests output picked up.

Geo::IP support for IPv6 geolocation

We're currently looking into IPv6-enabling our services. One of the missing bits is being able to geolocate IPv6 client addresses. We're using the MaxMind GeoIP database. The main Perl library for this is Geo::IP.

The current version of Geo::IP out on CPAN, 1.38, does not support IPv6 lookups. I contacted the maintainer of Geo::IP asking for more information. In the meanwhile, I hacked together just enough of IPv6 support to be able to successfully geolocate a test address. Later on, I discovered that IPv6 is already available in the hopefully soon-to-be-released version of Geo::IP archived at Sourceforge.

Let's hope it lands on CPAN soon. In the meantime, if you really really want, you can try out my changes against CPAN v1.38. It was enough for me to start testing the integration with our other code and projects.

My Geo::IP with IPv6 support

Disable evil page back/forward keys on a Lenovo T500 Thinkpad

Currently I'm using a Lenovo Thinkpad T500. A fine machine. Apart from the awful keyboard layout. The ESC key is way too high, causing useless "F1" hits. But the stupidest thing on this layout is the placement of two evil evil keys that perform browser back and forward functions.

These keys are labeled with a page/document icon and arrows going left and right. I would really like to find whoeve r decided to place these keys there, and punish him with a couple of years of typing with his own keyboard layout :) It happened to me already a couple of times (even more, like right now for example) that I'm typing this long text in a browser text area, and then by accident I hit the evil "page back" key. Oops. F**k! And you just lost 10 minutes of editing because the browser is stupid and won't allow you to go back to your half-edited textarea. Isn't that great?. Fantastic, I'd say.

Today I had had enough of it. That's how you fix it:

Create a ~/.xmodmaprc file or similar, since the name doesn't matter, with the following content:


!
! Disable the idiot XF86Back and XF86Forward
! keys on the Lenovo T500 keyboard
!
! They mess up when editing textareas within
! the browser, causing so much waste of time and
! frustration.
!
! Cosimo, 23/Feb/2011

! XF86Back
keycode 166 = NoSymbol

! XF86Forward
keycode 167 = NoSymbol

And then run:


$ xmodmap ~/.xmodmaprc

VoilĂ , done. Thanks, xmodmap and thanks unnamed Lenovo keyboard engineer.

How to convert Opera contacts file to Mutt aliases format

Recently I've been looking more and more into mutt, the email client. I've been a very happy M2 (Opera built-in email client) user for almost 3 years now. But still I felt I was missing something if I didn't try out mutt. I've been a pine user as well, many many years ago :) So, decided to give it a go, I started about a month ago.

I struggled a bit while getting a reasonable .muttrc file together. Fortunately, there's plenty of examples out there. After getting a working config, the problem was to get back my contacts list.

Mutt has a simple address book integration (through abook) and stores the contacts into an alias file, typically ~/.mutt/aliases. Now, Opera can of course export all your mail contacts to an .adr file, a simple "addressbook" text file. Did that, and I needed to convert it to mutt's aliases format.

Ten minutes later, a Perl script to do just that was ready. Here it is:


#!/usr/bin/env perl
#
# Convert Opera contacts file (.adr) into
# mutt aliases file format.
#
# Usage:
#   perl opera-adr-to-mutt-aliases.pl < ~/.opera/contacts.adr >> ~/.mutt/aliases
#
# Cosimo, 31/Jan/2011
#

use strict;
use warnings;
use utf8;

sub harvest ($) {
    my ($contact_info) = @_;

    my ($id)    = $contact_info =~ m{^ s+ ID   = (.*) $}mx;
    my ($name)  = $contact_info =~ m{^ s+ NAME = (.*) $}mx;
    my ($email) = $contact_info =~ m{^ s+ MAIL = (.*) $}mx;

    return if ! $id and ! $email;

    return {
        ID    => $id,
        NAME  => $name,
        MAIL => $email,
    };

}

my $adr_file_contents = q{};
$adr_file_contents .= $_ while <STDIN>;

my @contacts = split m{#CONTACT}, $adr_file_contents;

for (@contacts) {
    my $contact = harvest($_) or next;
    my ($first_word) = $contact->{MAIL} =~ m{ (S+) @ }x;
    printf "alias %s %s <%s>n",
        lc($first_word), $contact->{NAME}, $contact->{MAIL};
}

Download link: https://gist.github.com/803454