May | 2013 | Random hacking

Haven’t written much in the last few months. The reason is that I’ve been at work building the Opera Discover service backend, that we launched on Opera mobile for Android just a few days ago.

A few weeks before, during the first stress test sessions of Discover, I wrote this little tool called vlogdump that Opera allowed me to put up on github. The main purpose, besides learning awk :-) is to display and filter traffic coming into your varnish daemon..

vlogdump is not meant to replace varnishlog but I know that sometimes varnishlog gives me too much output to deal with, especially if I want to pinpoint a single client or a single request. I know that the varnishlog that ships with varnish 3.0.x is way better in this regard, but we’re using 2.1.x, and that version of varnishlog is not as capable.

vlogdump is easier to look at than varnishlog, but at the same time it conveys much more information than varnishncsa or the typical access.log format.

Here’s an example of output:

$ varnishlog | vlogdump -v only_misses=1

  172.22.0.15 => GET /assets/e85ed0a7b1b87120a0a2bfa025531c6733a48802 HTTP/1.0 MISS

            <= 200 OK                                                          28.432 ms
  172.22.0.18 => GET /assets/5a9e9440c5c85e8dc5d65e03e15c95e390901fa7 HTTP/1.0 MISS

            <= 200 OK                                                          36.905 ms
  172.22.0.18 => GET /icons/categories/te/icon32x32-technology.png HTTP/1.0 MISS

            <= 304 Not Modified                                                 0.589 ms
  172.22.0.15 => GET /api/fetch/article-preview/?client=2&language=en-GB HTTP/1.1 MISS

            <= 301 MOVED PERMANENTLY                                            8.381 ms
  172.22.0.18 => GET /assets/c3830e95b717761005e26ce49ebab253e0ccb40b HTTP/1.0 MISS

            <= 200 OK                                                         291.354 ms
  172.22.0.18 => GET /api/category?client=2&language=en-GB HTTP/1.1 MISS

            <= 200 OK                                                          58.025 ms
   ...

Another interesting example.

Show request and response headers of transactions that resulted in cache hits and had request headers (any of them) matching "Android":

$ varnishlog | vlogdump -v show_req_headers=1 -v show_resp_headers=1 -v req_headers_match=Android -v only_hits=1

  83.149.37.122 => GET /api/category/?... HTTP/1.1 HIT

            <= 200 OK         0.088 ms

          req.http.Accept = application/json;version=1

          req.http.Accept-Encoding = gzip

          req.http.Host = ...opera.com

          req.http.Connection = Keep-Alive

          req.http.User-Agent = Mozilla/5.0 (Linux; Android 4.1.2; GT-N7100 Build/JZO54K) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.58 Mobile Safari/537.31 OPR/14.0.1074.57768

 

          beresp.http.Server = Apache

          beresp.http.Content-Encoding = gzip

          beresp.http.Content-Type = application/json

          beresp.http.Vary = Accept-Encoding, Origin

          beresp.http.Content-Length = 4217

          beresp.http.Date = Sat, 25 May 2013 07:59:53 GMT

          beresp.http.X-Varnish = 1611090407 1611007435

          beresp.http.Age = 267

          beresp.http.Via = 1.1 varnish

          beresp.http.Connection = keep-alive

Now that you're eager to try it :-), you can do so in a few commands, and assuming you have the right awk installed:

wget -q -Ovlogdump https://raw.github.com/cosimo/vlogdump/master/vlogdump varnishlog | ./vlogdump [options]

The documentation lists all the available options.

You can do more interesting things:

display request or response headers for each transaction
(-v show_req_headers=1, -v show_resp_headers=1)
show only requests slower than 200ms (vlogdump -v only_slow=200)
show only cache misses or hits (-v only_hits=1 or only_misses=1)
show only transactions where the URL matches regexp X
(-v url_match='X' or -v url_match='!X' for negative match)
show only transactions where the HTTP status code was X
(-v only_status=X)
show only transactions where the request or response headers match a given regular expression (-v req_headers_match=Blah, -v resp_headers_match=Error)

You can also combine most of these options together. That is very useful when you are interested in a small fraction of the traffic, but you want to see the whole in-flight transactions.

One recommendation though. It is my first (last?) significant awk script :-) I know it works well, and I'm using it, but due to the way it works, I wouldn't leave it running for long periods of time, as it will slowly eat your memory keeping track of all transactions and clients.

If you have feedback or questions, feel free to comment on github or send me an email.

Random hacking

Assume nothing. Code defensively. Keep it simple, stupid!

Monthly Archives: May 2013

Display and filter traffic at the varnish level: vlogdump