Tag Archives: deployment

Using hypnotoad in production, anyone?

So, you're using hypnotoad in production. And it works perfectly for you. Maybe you have Nginx or Apache in front of it, configured as a reverse proxy. Everything's great. Right? Right. Then I have a zillion questions for you.

Maybe I don't understand how it works, but I'm having the following problems:

  • "sometimes" hypnotoad won't stop. I usually try to stop it with:
    hypnotoad --stop /path/to/my/script
  • I use symlinks to deploy applications, so for example I deploy in /opt/myapp and each new deployment gets a timestamped folder, /opt/myapp/releases/20120224-180801.

    Then there's a symlink that always points to the last deployed version: /opt/myapp/current → /opt/myapp/releases/{whatever-datetime}. (A small sketch of this layout follows after this list.) Now, using hypnotoad --stop /opt/myapp/current doesn't work, because hypnotoad probably uses the actual filename, not the symlink, to identify the running application.

    That's fine, but then how can I stop it reliably? I wish it had a hypnotoad --force-stop mode or something.

  • Last problem: when I push a new deployment and stop and restart hypnotoad, the application often doesn't work properly and just generates exceptions for unknown reasons. Stopping and restarting it again manually usually fixes the problems…
  • I was a bit frustrated today, so I decided to switch back to starman. I have never ever had a problem with it, so I will stick to it for now. But I would still be interested to know whether you use hypnotoad in production and how well it works. Write in the small box below; you don't need to register. Thanks :)
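
Just for illustration, here is a minimal sketch of that timestamped-releases plus "current" symlink layout, in plain Python. The paths and the function name are mine, not part of any real tool:

import os

APP_ROOT = "/opt/myapp"

def activate_release(release_dir):
    """Point /opt/myapp/current at release_dir, atomically."""
    current = os.path.join(APP_ROOT, "current")
    tmp = current + ".tmp"
    # Build the new symlink under a temporary name first
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release_dir, tmp)
    # rename() over an existing path is atomic on POSIX, so readers see
    # either the old release or the new one, never a missing "current"
    os.rename(tmp, current)

# activate_release("/opt/myapp/releases/20120224-180801")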

How to tag a remote git repository or… vcs support for fabric

With svn, you can tag a remote repository with:

svn cp http://{your-svn-server}/svn/{project}/trunk http://{your-svn-server}/svn/{project}/tags/{tag-name}

or if you're already in a working copy:

svn cp ^/{project}/trunk ^/{project}/tags/{tag-name}

The latter assumes you already have a working copy checked out; the first form was the more interesting one for what I needed.

Tagging when deploying

Lately I've been working on some deployment tools in the form of a few fabric classes. One of the things I want to do when launching a production deployment is auto-tagging the repository with the new build name.

The tag naming I went for is something like:

<project_name>-<date>-<time>-<who_deployed>

Example:

geodns-20110409-133701-cosimo

Every time there's a new production deployment using these tools, the repository revision that is being deployed is tagged with a name like that. The plan is to use this added metadata for a "deployment console", but I haven't had time to do anything about it yet.
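
Generating names in that scheme is trivial; here's a sketch (the helper name is mine, purely illustrative):

import getpass
import time

def make_tag_name(project):
    # <project_name>-<date>-<time>-<who_deployed>
    return "%s-%s-%s" % (project,
                         time.strftime("%Y%m%d-%H%M%S"),
                         getpass.getuser())

# make_tag_name("geodns") => "geodns-20110409-133701-cosimo"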

vcs.py

Having planned the move from svn to git, I had to add a thin abstraction to the fabric deployment classes, to make sure that when the repository URL switched from svn to git, nothing really changed from the deployment point of view.

I ended up with a generic vcs.py class for fabric that implements vcs-related actions such as:

  • exporting a remote repository to a local directory
  • listing available tags on a remote repository
  • tagging a remote repository

This means I had to find out how to do these things in both svn and git.
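
I won't reproduce the original classes here, but the interface can be as thin as the following sketch (class and method names are mine, not the original code); concrete Svn and Git subclasses then fill in the commands shown in the next sections:

class Vcs(object):
    def __init__(self, repo_url):
        self.repo_url = repo_url

    def export(self, local_dir):
        """Export the remote repository into local_dir, without VCS metadata."""
        raise NotImplementedError

    def list_tags(self):
        """Return the list of tags available on the remote repository."""
        raise NotImplementedError

    def tag(self, tag_name):
        """Create tag_name on the remote repository."""
        raise NotImplementedError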

Exporting a remote repository

With svn:

svn export [--force] http://svn.server/project/trunk /your/local/dir

You can use --force if the local directory already exists; otherwise svn will refuse to export into it.

Git requires an intermediate step:

git archive --prefix=some-dir-name/ --remote=git.server:/var/git/project.git master | tar xvC /path/where/to/export

Listing available tags (remotely)

With svn:

svn list http://svn.server/project/tags/

With git:

git ls-remote --tags git.server:/var/git/project.git

Thanks to my colleague Alfie for the ls-remote tip.

Tagging a remote URL

I mentioned how you do it with svn:

svn cp http://svn.server/project/trunk http://svn.server/project/tags/tagname

What about git, though? I searched around a bit, and found no git command to directly tag a remote repository.

I looked at the Jenkins git plugin source code, but AFAICS there's no magical way to do it, so I figured I would just clone the remote repository, tag locally and then push the tag to origin.

In theory, this should be just fine, except it has some drawbacks:

  • Execution time: if the remote repository is very large, we need to clone it first, and that can take a long time.
  • Size: when cloning a large git repository, the local copy will take up disk space for nothing. We don't need it, as we just want to tag the remote repository.

Not sure this is the best thing to do, but here's what I'm using right now (a consolidated sketch follows the list):

  • Cloning with --depth=1:

    git clone has a --depth option that limits the amount of history that is cloned. In this case, we don't need any history, so --depth=1 is great:

    git clone --depth=1 <git-remote-url> <local-dir>

    Example:

    git clone --depth=1 git.server:/var/git/project.git /var/tmp/deploy.$USER.$$
  • Tagging locally:
    cd /var/tmp/deploy.$USER.$$
    git tag -as <tag-name>
    
  • Pushing the tag remotely:
    git push origin --tags
  • Removing the temporary local copy:
    rm -rf /var/tmp/deploy.$USER.$$
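
Wrapped into a single function for the fabric classes, the four steps above might look like this sketch (the function name is mine, and I use a plain annotated tag with an inline message for brevity, instead of the signed tag shown above):

import os
from fabric.api import local

def tag_remote_git(remote_url, tag_name):
    workdir = "/var/tmp/deploy.%s.%d" % (os.getenv("USER", "nobody"),
                                         os.getpid())
    # Shallow clone: no history needed, just a commit to hang the tag on
    local("git clone --depth=1 %s %s" % (remote_url, workdir))
    local("cd %s && git tag -a -m '%s' %s" % (workdir, tag_name, tag_name))
    local("cd %s && git push origin --tags" % workdir)
    # Throw away the temporary local copy
    local("rm -rf %s" % workdir)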

That's it. Not very brilliant, but works great for now. If you know of a better way to tag a remote git repository, or some existing work on these things, please get in touch or add a comment below. Thanks! :)

Adding the irc NOTICE capability to Bot::BasicBot

Bot::BasicBot is a Perl module that provides a really easy, fast and convenient way to build plugin-based IRC bots. I'm playing around with an IRC bot that should assist in continuously deploying projects.

This bot has two main functionalities:

  • keep track of continuous integration builds
  • initiate and keep track of deployments

Right now the bot reads a main configuration file with data about projects, repositories, continuous integration, etc… and answers commands. This is an example:


21:58 <@cosimo> projects-list
21:58 < deployer> auth, geodns, libopera, link, myopera, sso
21:58 <@cosimo> build-status geodns
21:58 < deployer> 97ad24e success cosimo https://git.server/functests/builds/geodns/97ad24e
21:58 <@cosimo> latest-revision sso
21:58 < deployer> 5207cfe, https://git.server/?p=sso.git;a=commit;h=fe977d32e9580551dffe8139396106ba25207cfe
21:59 <@cosimo> build-status auth
21:59 < deployer> 24135 success cosimo https://test.server/functests/builds/auth-unit/24135
21:59 < deployer> 24135.2 success (manual) https://test.server/functests/builds/auth-functional/24135.2

Another functionality of the bot is to detect new builds, and automatically send updates to a given channel, stating the project, the new VCS revision, the committer and a link to the continuous integration test run. Example:


17:22 -deployer:#chan- sso, fe977d3 success cosimo https://test.server/functests/builds/opera-sso/fe977d3

In the future, I also want to command the bot to initiate deployments. Anyway, the problem was that Bot::BasicBot apparently lacked support for sending IRC notices, which caused all the bot messages to interrupt the flow of IRC conversations. Bot::BasicBot's source code is also on GitHub, so I just forked it and added support for IRC notices. I just noticed that the author has already pulled in the new changes. o/
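
For the curious, the difference at the IRC protocol level is just the command verb. Here's a schematic illustration (server, nick and channel are made up, and a real client would wait for the registration replies first):

import socket

s = socket.create_connection(("irc.example.com", 6667))
s.sendall(b"NICK deployer\r\nUSER deployer 0 * :deploy bot\r\n")
s.sendall(b"JOIN #chan\r\n")

# A PRIVMSG renders as a regular message and interrupts conversation:
s.sendall(b"PRIVMSG #chan :sso, fe977d3 success\r\n")

# A NOTICE is rendered less intrusively by most clients, and RFC 1459
# forbids automatic replies to it, which also prevents bot loops:
s.sendall(b"NOTICE #chan :sso, fe977d3 success\r\n")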

It will still take years for this to land in Debian, but still… :-)

Puppet custom facts, and master-less puppet deployment

As I mentioned a few weeks ago, I'm using Puppet for some smaller projects here at work. They're pilot projects to see how puppet behaves and scales for us before taking it into bigger challenges.

One of the problems so far is that we're using fabric as the "last-mile" deployment tool, and that doesn't yet have any way to run jobs in parallel. That's the reason I'm starting to look at alternatives like mcollective.

However, today I had to prepare a new varnish box for files.myopera.com. This new machine is in a different data center than our current one, so we don't have any puppetmaster deployed there yet, and that had already stopped me from using puppet in another project. But lately I've been reading on the puppet-users mailing list that several people have tried a master-less puppet deployment, where no puppetmasterd is running: you just ship the puppet files, via rsync, source control or pigeons, and then let the standalone puppet executable run.

Puppet master-less setup

To do this, you at least have to have a good set of reusable puppet modules, which I've tried to build up a small piece at a time over the last few months. So I decided to give it a shot, and got everything up and running quickly. I deployed my set of modules in /etc/puppet/modules, and built a single manifest file that looks like the following:


#
# Puppet standalone no-master deployment
# for files.myopera.com varnish nodes
#
node varnish_box {

    # Basic debian stuff
    $disableservices = [ "exim4", "nfs-common", "portmap" ]
    service { $disableservices:
        enable => "false",
        ensure => "stopped",
    }

    # Can cause overload on the filesystem through cronjobs
    package { "locate": ensure => "absent", }
    package { "man-db": ensure => "absent", }

    # Basic configuration, depends on data center too
    include opera
    include opera::sudoers
    include opera::admins::core_services
    include opera::datacenters::dc2

    # Basic packages now. These are all in-house modules
    include base_packages
    include locales
    include bash
    include munin
    include cron
    include puppet
    include varnish

    varnish::config { "files-varnish-config":
        vcl_conf => "files.vcl",
        storage_type => "malloc",
        storage_size => "20G",
        listen_port => 80,
        ttl => 864000,
        thread_pools => 8,
        thread_min => 800,
        thread_max => 10000,
    }

    #
    # Nginx (SSL certs required)
    #
    include nginx

    nginx::config { "/etc/nginx/nginx.conf":
        worker_processes => 16,
        worker_connections => 16384,
        keepalive_timeout => 5,
    }

    nginx::vhost { "https.files.myopera.com":
        ensure => "present",
        source => "/usr/local/src/myopera/config/nginx/sites-available/https.files.myopera.com",
    }

    bash::prompt { "/root/.bashrc":
        description => "Files::Varnish",
        color => "red",
    }

    munin::plugin::custom { "cpuopera": }

    munin::plugin { "if_eth0":
        plugin_name => "if_"
    }

    munin::plugin {
        [ "mem_", "load", "df", "df_inode", "netstat", "vmstat",
          "iostat", "uptime", "threads", "open_files", "memory", "diskstats" ]:
    }
}

node default inherits varnish_box {
}

node 'my.hostname.opera.com' inherits varnish_box {
}

This manifest installs varnish, nginx, a bunch of basic packages I always want on every machine (vim, tcpdump, etc…), munin with the appropriate plugins already configured, and a nice red bash prompt to warn me that this is production stuff.

This file is everything the puppet client needs to run and produce the desired effect, without needing a puppet master. Save it as varnish-node.pp and then run it with:


puppet varnish-node.pp

One problem that usually arises is how to serve the static files. In this case, I assumed I'm going to check out the source code and config files from my own repository into /usr/local/src/... so I don't need to point puppet to a server with the classic:


source => "puppet:///module/filename"

but you can just use:


source => "/usr/local/whatever/in/my/local/filesystem"

That's great and it works just fine.

Custom facts

Puppet uses a utility called facter to extract "facts" from the underlying system, sysinfo-style. A typical facter run produces the following output:


$ facter
architecture => x86_64
domain => internal.opera.com
facterversion => 1.5.6
fqdn => cd01.internal.opera.com
...
hardwaremodel => x86_64
hostname => cd01
id => cosimo
ipaddress => 10.20.30.40
ipaddress_eth0 => 10.20.30.40
is_virtual => false
...
kernel => Linux
kernelmajversion => 2.6
...
operatingsystem => Ubuntu
operatingsystemrelease => 10.04
physicalprocessorcount => 1
processor0 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processor1 => Intel(R) Core(TM)2 Duo CPU     E6550  @ 2.33GHz
processorcount => 2
...

and so on. Within puppet manifests, you can use any of these facts to influence the configuration of your system. For example, if memorysize is greater than 4.0 GB, run varnish with 2000 threads instead of 1000. This is all very cool, but sometimes you need something that facter doesn't give you by default.

That's why facter can be extended.

I tried creating a datacenter.rb facter plugin that looks at the box's IP address and figures out which data center we're located in. That in turn can be used to set up the nameservers and other stuff.

Here's the code. My Ruby-fu is less than awesome:


#
# Provide an additional 'datacenter' fact
# to use in generic modules to provide datacenter
# specific settings, such as resolv.conf
#
# Cosimo, 03/Aug/2010
#

Facter.add("datacenter") do
    setcode do

        datacenter = "unknown"

        # Get current ip address from Facter's own database
        ipaddr = Facter.value(:ipaddress)

        # Data center on Mars. Note the escaped dots: an unescaped "."
        # in the pattern would match any character.
        if ipaddr.match(/^88\.88\.88\./)
            datacenter = "mars"

        # This one on Mercury
        elsif ipaddr.match(/^99\.99\.99\./)
            datacenter = "mercury"

        # And on Jupiter
        elsif ipaddr.match(/^77\.77\.77\./)
            datacenter = "jupiter"
        end

        datacenter
    end
end

However, there's one problem. When puppet runs, it doesn't pick up the new fact, even though facter from the command line can see it and execute it just fine (when the plugin is in the current directory).

Now I need to know how to inform puppet (and/or facter) that it has to look into one of my puppet modules' plugin (or lib from 0.25.x) directory to load my additional datacenter.rb fact.

Any ideas?

Puppet, Fabric and a Perl alternative?

Some time later this month I'm going to write more extensively about a project that I've been working on, not continuously, but for the last couple of months. It is about configuration management and deployment for small and medium scale projects.

For configuration management I evaluated several products like bcfg2, puppet, cfengine and lcfg, and I finally chose puppet.

For "the last mile", as I call it, the alternatives that I considered were fabric, capistrano, ControlTier and TheNewShinyWheel(tm)

So I settled on puppet + fabric. Puppet is a Ruby system, while Fabric is Python code. Neither of them is particularly fast; Puppet is actually slow, while Fabric is acceptable. The main problem I'm confronted with, after having learnt how to use these tools, is that Fabric does not support parallel processing of tasks.

This is a severe limitation for us. This was a pilot project; if it works well, it could be applied to many other deployment tasks. That could also mean a single deployment having to send code or files to tens of servers, and you don't want to do that sequentially, waiting for each task to complete.

At the moment, this is impossible to do with Fabric. There is an experimental fork in the works that might support parallel execution by adding a @parallel task decorator, but it still requires work and a good dose of testing.
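
From what I've seen, usage would look roughly like this sketch (hostnames are made up, and the decorator comes from the experimental fork, so consider this untested):

from fabric.api import env, run
from fabric.decorators import parallel

env.hosts = ["web1.example.com", "web2.example.com", "web3.example.com"]

@parallel
def restart_app():
    # With @parallel, each host would run this task concurrently
    # instead of one after the other
    run("/etc/init.d/myapp restart")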

During my survey I looked for mature Perl-based deployment tools, but failed to find any. While Fabric is nice, I might be tempted to reconsider my choice. Any suggestions?