
“Loadsnake” AKA the Novell Netware snake screensaver clone

For those who never had the pleasure of seeing the old Novell Netware snake screensaver, I'll say here that it was the default Netware screensaver, in console/text mode. It showed one snake for each CPU you had (99.9% of people had just one, really). The cool thing was that the snakes grew longer and longer as your server load increased. They also moved faster.

Anyway, this is one of those little time-wasting projects that usually go nowhere. I started working on a clone of this Netware screensaver in 2007. I remember I wanted to figure out how to write an xscreensaver "hack", so I spent a weekend looking at the source code for all the existing hacks, and I picked popsquares.c as a base and started tearing it apart, injecting dubious amounts of crappy C code until it did what I wanted.

Fast forward 4 years. Yesterday, for some reason, I got back to it, cleaned up the code a bit, and implemented a "fantastic" new feature I've always wanted: a different snake color for each CPU, instead of all snakes being red. The result is, well, see it for yourself.

The source code (don't take inspiration from it, please… :)) is up on GitHub at http://github.com/cosimo/xscreensaver-loadsnake. You can also download the xscreensaver binary module if you want (Linux x86_64 only), since compiling it requires a bit of fiddling with the xscreensaver source code.

I have to admit that it's cool to run your own screensaver :)

Handling a file-server workload with varnish – part 2

I recently wrote about tuning Varnish for a file server. I'd like to continue, detailing what we had to change compared to the Varnish defaults to achieve good and stable performance.

These were the main topics I had mentioned:

  • threads-related parameters
  • hash bucket size
  • session-related parameters
  • grace config and health checking

Threads-related parameters

One thing is certain: you'd better bump up the minimum number of threads. When Varnish starts up, it creates "n" threads. If more threads are needed, it can create as many as "m" threads, but no more.

"n" is given by thread_pool_min * thread_pools. The defaults are 2 thread pools, and thread_pool_min is 200, so varnish will create 400 threads when starting up. We found that we need at least 6,000 threads, sometimes peaking at 8,000. In this case, it's better to start up directly with 7-8,000 threads. We set:

  • thread_pools = 8, since we have an 8-core machine
  • thread_pool_min = 800

With these settings, Varnish will create 6,400 threads on startup and keep them running all the time.

We also set a related parameter, thread_pool_add_delay, to 2 ms instead of the default (20 ms, if I remember correctly). This allows Varnish to create lots of threads quickly when it starts up, or whenever more threads are needed. The default 20 ms value could slow down thread creation and prevent Varnish from serving requests quickly enough.
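
For reference, here's a sketch of how these settings end up on the varnishd command line, using the Varnish 2.x syntax we were running (where thread_pool_add_delay is expressed in milliseconds). The "..." stands for the rest of your usual options:

varnishd \
   -p thread_pools=8 \
   -p thread_pool_min=800 \
   -p thread_pool_add_delay=2 \
   ...

The same parameters can also be changed at runtime through the management CLI (varnishadm) with param.set, which is handy for experimenting before making them permanent.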

Hash bucket size

I don't know much about the hashing internals, but I know we have tens of millions of files, possibly more, so we have to make sure the hash table used to store cached objects is big enough to prevent too many hash collisions.

This is controlled by the -h option to varnishd. The default number of buckets is 50023. We set it to 500009 (-h classic,500009). This way, even if we kept 10 million files in memory, each bucket would only hold 20 entries on average. That's not ideal, but it's better than the default.
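
On the command line, that looks something like the following sketch; the file-storage path and size here are made-up examples, not our actual values:

varnishd -h classic,500009 -s file,/var/cache/varnish_storage.bin,100G ...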

We didn't experiment with the new hashing algorithms like critbit.

Session-related parameters

Not so much on this particular server, but in general, we had to bump up the sess_workspace parameter. The default is 16 kB (16384 bytes). sess_workspace controls the amount of memory Varnish dedicates to each connection ("session" in Varnish speak), used as working memory for HTTP header manipulations. We set it to 32 kB here. On other servers, where we use a more elaborate VCL config, we use 128 kB as the default value.
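
Like the thread settings, this is a startup parameter, set the same way; as a sketch (32768 bytes = 32 kB):

varnishd -p sess_workspace=32768 ...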

Grace and health checking

Varnish can check that your defined backends are "healthy", meaning they respond to queries within the defined time and don't miss heartbeats. You enable health checks just by defining a .probe block in your backend definition (search the Varnish wiki for details).
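
As a sketch, a backend definition with a probe can look like the following, using the same "mybackend" name referenced in the VCL further down. The host, port, probe URL, and thresholds here are hypothetical values, not our production config:

backend mybackend {
   .host = "files-backend.example.com";  # hypothetical backend host
   .port = "80";
   .probe = {
      .url = "/ping";       # hypothetical health-check URL on the backend
      .interval = 5s;       # send a probe every 5 seconds
      .timeout = 1s;        # a probe counts as failed after 1 second
      .window = 5;          # consider the last 5 probes...
      .threshold = 3;       # ...at least 3 must succeed for a healthy backend
   }
}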

Having health checks is very convenient: you can instruct Varnish to extend the grace period when/if your backends are dead. This means that if Varnish detects your backends are dead or overloaded and missing heartbeats, it will keep serving stale objects from its cache, even if they have expired (their TTL is already over). You enable this behaviour by saying:

sub vcl_recv {
   set req.backend = mybackend;

   # Default grace period is 10s
   set req.grace = 10s;

   # OMG. Backend dead. Keep serving stuff until we recover them.
   if (! req.backend.healthy) {
      set req.grace = 4h;
   }
    
   ...
}

sub vcl_fetch {

   # Renew cached objects every minute ...
   set obj.ttl = 60s;

   # ... but keep all objects way past their expire date
   # in case we need them because backends died
   set obj.grace = 4h;

   ...

}

That's it. We're continuing to refine our configs and best practices for Varnish servers. If you have feedback, leave a comment or drop me an email.

Handling a file-server workload with varnish

During the last couple of months, we have been playing with Varnish for files.myopera.com, the main My Opera file server.

I'm not sure this is a typical use case for Varnish (maybe it is), but it has a few unique challenges, which I'll try to explain here:

  • a really high number of connections (in the 10k range)
  • a large file set, ~100 million files
  • the longer the TTL, the better (10 days is the default)
  • really simple or no VCL logic

In other Varnish installations we maintain here at Opera, the real challenge is interfacing seamlessly with the backend application servers, but in this case the "backend" is just another HTTP file server with a little extra logic.

Searching around, and using the resources I mentioned a few blog posts ago, we found a few critical settings that need to be tuned to achieve consistent performance. Some are obvious, others not so much:

  • threads-related parameters
  • hash bucket size
  • session-related parameters
  • grace config and health checking

I'll explain all the settings we had to change and how they affected us in a later post.