We've been using Jenkins to build and test all our projects for a good part of this year now. I think Jenkins is one of the very few Java projects I've seen and worked with that actually works, and it's a real pleasure to use. Except that every now and then it seems to crash for no apparent reason.
I haven't had time to dig into this problem yet. I've only seen the frontend Apache process logging errors because it cannot connect to the Tomcat backend on port 8080. My theory so far is that Jenkins tries to auto-update and crashes, or maybe there's a runaway test run that brings everything down…
Time is really limited these days, and I have heard good things about monit, so I decided to try it and see if we could have Jenkins kicked when it dies for some reason. This way we can avoid cases where the test suites haven't been running for a day or two and nobody noticed… :-|
So, long story short, here's the quick and dirty monit recipe to kick Apache + Jenkins (this is on Debian Squeeze):
    check process jenkins with pidfile /var/run/jenkins/jenkins.pid
      start program = "/etc/init.d/jenkins start" with timeout 90 seconds
      stop program = "/etc/init.d/jenkins stop"
      if failed host my.host.name port 8080 protocol http and request "/" then restart

    check process apache with pidfile /var/run/apache2.pid
      start program = "/etc/init.d/apache2 start" with timeout 60 seconds
      stop program = "/etc/init.d/apache2 stop"
      if failed host my.host.name port 80 protocol http and request "/" then restart
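For the curious, here is roughly what each of those `if failed ... then restart` rules boils down to, as a minimal Python sketch. It is an illustration only, not how monit is implemented: the function names `http_check` and `check_and_restart` are made up for this example, and the "status below 400 counts as healthy" rule is my understanding of monit's default HTTP test, so treat it as an assumption.

```python
import http.client
import subprocess


def http_check(host, port, path="/", timeout=5.0):
    """Return True if an HTTP GET on host:port answers with a status below 400.

    Assumption: this mirrors monit's default HTTP test, where a refused
    connection, a timeout, or an error status all count as a failure.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a garbled reply.
        return False
    finally:
        conn.close()


def check_and_restart(host, port, init_script):
    """Hypothetical helper: bounce the service through its init script
    when the HTTP check fails. Returns True if a restart was attempted."""
    if http_check(host, port):
        return False
    subprocess.call([init_script, "stop"])
    subprocess.call([init_script, "start"])
    return True
```

Calling `check_and_restart("my.host.name", 8080, "/etc/init.d/jenkins")` from a cron job would approximate the Jenkins rule above. Monit is still the better tool here, since it also tracks the pidfile, honours the start timeout, and can send alerts.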
And, just for kicks, there's a complete Monit module for Puppet up on GitHub. Have fun!