Recently we migrated most of our services from Varnish 2.0 to 2.1.
I'd like to explain what we changed with code (VCL) examples side by side,
in case anyone still needs to migrate to 2.1 and needs some help as well :-)
req., bereq., beresp., and obj.
Usually this naming difference in VCL is not really explained. They say "x has been renamed to y",
and you are supposed to just change the name. In reality, yes, the names changed, and at first
that is annoying, but once you understand why they changed, the new names stick
in your mind very easily.
In vcl_fetch(), what used to be obj. is now beresp. Why?
Because vcl_fetch() is the part of the request stage where Varnish has
already performed a request against a backend and got a response from it. That means that
if you refer to obj. in vcl_fetch(), you're really
touching the backend response, hence beresp.
The same goes for vcl_pipe(), which is executed when the result of vcl_recv()
is to switch to pipe mode. In that case, however, Varnish hasn't made the request
to the backend yet, so if you used req. in vcl_pipe() you really meant
to change the request that was going to be made to the backend, hence bereq.
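As a rough map of the prefixes discussed above, here is a minimal sketch that only reuses statements appearing elsewhere in this post. The comments are an interpretation of the renaming, not official documentation:

```vcl
sub vcl_recv {
    # The client request as received by Varnish: req.
    set req.grace = 5s;
}

sub vcl_pipe {
    # The request that is about to be sent to the backend: bereq.
    set bereq.http.connection = "close";
}

sub vcl_fetch {
    # The response that just came back from the backend: beresp.
    set beresp.ttl = 88s;
}
```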
Let's see the changes we had to make:
 sub vcl_fetch {
-    set obj.ttl = 88s;
-    set obj.grace = 10m;
-    set obj.http.X-My-Opera = "http://youtube.com/watch?v=br79xGSpgF4";
+    set beresp.ttl = 88s;
+    set beresp.grace = 60m;
+    set beresp.http.X-dramatic = "http://www.youtube.com/watch?v=a1Y73sPHKxw";
 }
And:
 sub vcl_pipe {
     # Streaming files too (see related vcl_recv() rule).
     # We need to close the request, or varnish remains in pipe
     # mode for the entire session with that client.
-    set req.http.connection = "close";
+    set bereq.http.connection = "close";
 }
Backend probes and .initial
Backend probing allows Varnish to detect whether backends are "healthy"
or "sick". The probe VCL config block lets you tweak how this works. In
particular, .threshold is the number of successful probes that are necessary
for Varnish to consider a backend healthy, and .interval is the number of seconds
between one probe and the next.
As an example, you can define that a backend should be considered working
(healthy) when it answers successfully to at least 3 probes, with an interval of
10 seconds between each probe. In Varnish 2.0.4, this means that if restarted,
Varnish will wait 3 times 10 = 30 seconds before serving any requests
from that backend, because all backends were considered dead (sick) at startup.
In 2.1 this limitation is removed by introducing an .initial attribute
in the probe block. .initial is the number of probes considered successful
when the service is started, or the backend is added, and there's no information about it.
The default value is assumed to be equal to .threshold, so backends are considered
healthy as soon as they are introduced.
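To make the arithmetic concrete, here is a sketch of a probe block using the numbers from the example above. The backend name `example`, the host, the port and the /ping.html URL are placeholders, and the comments on .timeout and .window are a general description of those attributes rather than something this post covers in detail:

```vcl
backend example {
    .host = "localhost";
    .port = "8080";
    .probe = {
        .url = "/ping.html";
        .interval = 10s;   # one probe every 10 seconds
        .timeout = 2s;     # a probe that takes longer counts as failed
        .window = 10;      # only the 10 most recent probes are examined
        .threshold = 3;    # healthy once 3 probes in the window succeeded
        # Starting from zero good probes (the 2.0.4 behaviour described
        # above), it takes 3 x 10s = 30s to reach the threshold.
        # With .initial = 3 the threshold is already met at startup.
        .initial = 3;
    }
}
```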
I think you can understand from these tiny details how well Varnish is engineered.
This just makes sense, doesn't it? :-) Here's the diff from 2.0 to 2.1:
 backend nginx {
     .host = "localhost";
     .port = "8080";
-
-    # Disabled to avoid the 15s startup
-    # 2.0.4-5 doesn't have .initial
-    #
-    #.probe = {
-    #    .url = "/ping.html";
-    #    .interval = 5s;
-    #    .timeout = 1s;
-    #    .window = 5;
-    #    .threshold = 3;
-    #}
+    .probe = {
+        .url = "/ping.html";
+        .interval = 10s;
+        .timeout = 2s;
+        .window = 10;
+        .threshold = 3;
+        .initial = 3;
+    }
 }
And in vcl_recv():
 sub vcl_recv {
     [...]
-    #----------
-    # DISABLED: Only enable when .probe block above is enabled
-    #----------
     # Detect broken backends and keep serving requests if possible
-    #if (! req.backend.healthy) {
-    #    set req.grace = 10m;
-    #} else {
-    #    set req.grace = 5s;
-    #}
+    if (! req.backend.healthy) {
+        set req.grace = 60m;
+    } else {
+        set req.grace = 5s;
+    }
Regular expression matching
Another "big" difference is the use in 2.1 of a Perl-compatible regular expression engine
(PCRE) instead of the POSIX-style regex matching that used to be in 2.0.
This is a good change for me, as I'm pretty much used to Perl regex and I know next to nothing
about POSIX.
This change actually created a subtle problem that I caught only through thorough testing
of our configurations. We use regex matching in a few places in our VCL configuration,
usually to analyze cookies and set special "flags" that are then used to force
an HTTP Vary header, to make Varnish store different cached versions of the same
URL.
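To illustrate that last step, here is a minimal sketch, assuming the flag is a request header called X-Language (the name used in the rule further down). This post does not show where the Vary header is actually set, so placing it in vcl_fetch is an assumption:

```vcl
sub vcl_fetch {
    # Assumption: vcl_recv has already copied the cookie value
    # into req.http.X-Language.
    if (req.http.X-Language) {
        # Cache one variant of the URL per X-Language value.
        # Note: this overwrites any Vary header sent by the backend.
        set beresp.http.Vary = "X-Language";
    }
}
```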
One of these cases is the language cookie, where we store a sticky
user preference about site language. Here's how the code changed:
 # STD: Sticky language cookie
 if (req.http.Cookie ~ "language=") {
     set req.http.X-Language =
-        regsub(req.http.Cookie, "^.*?language=([^;]*?);*.*$", "\1");
+        regsub(req.http.Cookie, "^.*?language=([^;]*);*.*$", "\1");
 }
 ...
 # Mobile view cookie
 if (req.http.Cookie ~ "mobile=") {
-    set req.http.X-Mobile =
-        regsub(req.http.Cookie, "^.*?mobile=([^;]*?);*.*$", "\1");
+    set req.http.X-Mobile =
+        regsub(req.http.Cookie, "^.*?mobile=([^;]*);*.*$", "\1");
 }
In case you find it difficult to spot the change, it's the removal of the *?
(non-greedy star) operator. I had used non-greedy matching in 2.0, with its POSIX engine, to make
sure that the * didn't match too many characters and thus eat part of other cookies. Except
POSIX regex matching does NOT have a non-greedy star operator. I just
didn't know that, and it's of course a bug, but it had worked perfectly so far… WTF???
For even more weirdness, why did I take the non-greedy star (*?) away now that it should
be supported with PCRE matching? I removed it because otherwise the result of those
regsub() expressions is always empty!
Believe it or not, it looks exactly like 2.0 had PCRE and 2.1 has POSIX, which is
obviously not what's happening. If you know more about this and you can shed some light,
please contact me or leave a comment below.
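A possible explanation, offered as a hypothesis rather than something checked against the Varnish source: POSIX leaves the meaning of *? undefined, and a POSIX engine may simply treat it as a greedy star that is also optional, which would explain why the old pattern worked in 2.0. PCRE does implement lazy quantifiers, so in 2.1 the group ([^;]*?) is allowed to match nothing at all. The sketch below annotates both patterns for a hypothetical header "Cookie: foo=1; language=en; bar=2", assuming the replacement string is the backreference \1:

```vcl
sub vcl_recv {
    if (req.http.Cookie ~ "language=") {
        # Lazy group under PCRE: "([^;]*?)" first tries to match nothing.
        # Since ";*" may match zero characters and ".*$" can swallow the
        # rest of the header, the match succeeds with an empty \1:
        #   regsub(req.http.Cookie, "^.*?language=([^;]*?);*.*$", "\1")
        #   gives "" for the example header.

        # Greedy group: "([^;]*)" takes everything up to the next ";",
        # so \1 is "en" for the example header.
        set req.http.X-Language =
            regsub(req.http.Cookie, "^.*?language=([^;]*);*.*$", "\1");
    }
}
```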
Hope you liked this 2.0 -> 2.1 migration journey. I'm looking forward to 2.1 -> 3.0!
It's a bit more work there, because I will need to migrate
my accept-language C extension
to the new vmod system, which I already started working on :-)
Have fun!