Tag Archives: conference

My experience at Velocity Europe 2011 in Berlin

TL;DR

This year there was the 1st edition of Velocity Europe. I got to present a talk on a DDoS attack we faced at Opera, and it was really awesome to be there.

The long version…

Around July this year I knew there was going to be a Velocity Conference in Europe, and I decided I would try to propose a few ideas for talks. I didn't have my hopes too high, but I wanted to give it a shot anyways, pushing myself way out of my comfort zone :)

The worst that could happen was that the talks didn't get accepted. After a month or so the crazy thing happened, and I got an invite to speak at Velocity, due in November.

Preparation

The first few months passed while I was slowly gathering material for the talk. The idea was talking about the DDoS attack that struck us in October 2010. Almost a year had passed, so if we hadn't taken notes and collected all sorts of logs and information, we wouldn't have had any chance to reconstruct all the "story" with enough detail to be interesting.

Anyway, weeks went by, and in September I started writing down an outline of the talk. It consisted in describing what happened during the DDoS and how we faced it, what we did, how we figured out what to do, etc… but I didn't have a clear idea of what to convey with the actual presentation. What would be the core message, if any?

If I learned anything out of all this, is that writing an outline is absolutely the best favor you can do to yourself to avoid so many problems later on. Just write it down as a text, a blog post, a story. Mind maps are also useful for me.

Last 3-4 weeks flew away while I was trying to put together a decent deck of slides.

In Berlin: pre-conference

"Birds of feathers" was the pre-conference event that took place on Monday 7th (November 2011, if you're reading this in the future), put together by a local team led by Schlomo Schapiro, which I got to talk to also during the conference. It was a good event, Steve Souders and John Allspaw and many other conference attendees were already there. There were sponsor companies presenting their products.

The most interesting sessions of the pre-conference IMO were:

  • 100ms: Steve Souders pushed everyone to think about the next level of web performance. How to bring down the "loading time" of web pages to 100ms. There was an interesting discussion about that. My point was that loading time really needs to be divided into at least dns resolution, server processing, network transfers, client rendering. So there's at least 4 totally different chunks that make up the load time and all of them can be optimized, but with varying levels of gain and complexity.
  • Dyn inc presentation about their product dashboard, that led to a better productivity and communication between teams. Cory van Wollerstein explained their mash-up of Jira and Confluence, used to automatically pull information from the tickets db and provide high-level overviews to executive teams. Very cool. He also argued whether having product managers is a good thing for a company.

The rest of the day I was busy polishing my presentation, and trying to rehearse at the hotel. A month before the conference, I had bought The Naked Presenter (ebook edition), hoping that it would help me do a decent presentation. The book of course recommended to rehearse. It felt very weird and embarassing, but I'm *so* glad I did it. I managed to streamline the presentation, and memorize the sequence of slides.

The Conference – Day 1

Schedule:

http://velocityconf.com/velocityeu/public/schedule/grid/2011-11-08

Keynotes

Opening remarks, plus Theo Schlossnagle, one of the minds behind SurgeCon, on how good operations dudes are usually generalists and need to have a wide spectrum knowledge instead of being "(Perl|Python|Ruby|Java) developers". I really recognize myself in this more generalist role than, for example, the Ruby-on-Rails guy.

Lightning demos

These were lightning demos during the first morning:

Rest of Day 1 went to hell

I had to convert all my slides to 4:3 and test again with the on-site equipment. I was also freaking out at the same time, so I missed everything else until my talk. Sorry :)

Most talks have been recorded and are already up on the Velocity site. Particularly interesting IMO, but video not available yet, are:

My talk

As I said, it was about the DDoS attack to my.opera.com of October 2010. I basically talked about how we found out we were under DDoS, and how we struggled to find our way to keep the site up and running despite the traffic. This was a mid-scale DDoS with around 18k distinct IPs attacking us. We had a hard time, but it was also very much fun in retrospect :) We learned quite a lot in the process, about HTTP and TCP/IP, nkiller2 and the TCP zero-window exploit. Most importantly, we learned to make better use of old and new tools to do troubleshooting. You will find all of this in the slides.

I did my best, and I think it was well received by the audience. While on stage, I really had the feeling that people enjoyed it, plus several folks came to say hello afterwards. One of the most frequent comments I heard was that people found my talk honest. That is the single thing I appreciate the most, because that had been one of my goals since the start. To tell an honest and detailed story of how things went, without pretending to be the super awesome heroes that know everything and can fix anything in no time.

Unfortunately, after the conference I was informed that there had been no recording of the talk. That is really sad. However, since there's no recording, I can pretend I was a nice speaker, given the ratings :). Seriously, if you have a picture or video recording, contact me :)

Here's the slides if you're interested:

http://velocityconf.com/velocityeu/public/schedule/detail/21653

The Conference – Day 2

Schedule:

http://velocityconf.com/velocityeu/public/schedule/grid/2011-11-09

Keynotes

Very inspirational talk by Jeff Veen, Typekit.com

Very well presented, great visuals. Great overall. How to create conditions for teams to work and work well.

http://velocityconf.com/velocityeu/public/schedule/detail/21788

Anticipation: What could possibly go wrong? by John Allspaw, Etsy

A great talk about how to prevent, analyze, respond to Operations problems. I very much like John's style, I think he's a pioneer, at least he introduced me to many great ideas, one above all, continuous deployment. I also like his many references to aviation, aerospace and military engineering fields.

http://velocityconf.com/velocityeu/public/schedule/detail/22258

Full stack awareness, Artur Bergman, Fastly

He's Artur Bergman. Listen to him :-) If anything, because he's really authentic.

http://velocityconf.com/velocityeu/public/schedule/detail/22914

Lightning demos

Another session of lightning demos, for our pleasure:

Browser performance track

This was a track in itself. I lost all of it, since I mainly followed the Operations track, but this was really interesting I heard. Recent speed enhancements in Opera, Chrome, Firefox and Javascript in general were explained in detail.

Afternoon talks

Deploying large payloads at scale, Ramon van Alteren (hyves.nl)

Biggest social network in the Netherlands (4M daily active users, ~10M total users). Ramon is a very cool guy. They have 3.5K servers, and their main application consists of 750Mb compiled php binaries to deploy. And they are experimenting with bittorrent tools to do that :)

I had a few hours of engaging talk with Ramon at one of the social events that followed the conference. We found lots of similarities in how we're dealing with infrastructural growth, scaling, etc… We both use config management tools like puppet extensively in our organizations. We promised each other to remain in touch about deployment matters.

http://velocityconf.com/velocityeu/public/schedule/detail/21571

HTTP connection management from 10 users to 100 millions, Bradley Heilbrun, YouTube

Really interesting dive into YouTube early (2005-2007) architecture with Apache, load balancers, GSLB.

I met Bradley later on that day and we had a quick chat. Turns out they use(d) PowerDNS with its pipe backend for geographic load balancing, much like as we do in Opera with GeoDNS. That made my day :-) It's a pity that companies like YouTube don't talk much about their current technology. They usually tell you about 2-3 years old architectures. That's still very valuable, of course.

http://velocityconf.com/velocityeu/public/schedule/detail/21708

Conclusions

If you're even remotely interested in operations, devops, running a service, scaling, performance, infrastructure, then Velocity is the conference. Surge is another one, probably even better, more hardcore-engineering focused. From my perspective, there's a couple of things that could be improved:

  • while I understand that sponsors are what makes conferences like Velocity possible, some sponsors took too much time out of the actual talk tracks. One or two talks were very promotional in nature, and it was clear to everyone that these companies were pushing their products or themselves. Maybe it wasn't their intention, but to me and to others I talked to, it came out that way.
    I think Velocity needs to screen better this type of talks and separate them from the authentic content that people want, the "stories from the trenches". As a counter-example, Google, among other companies, were doing sponsoring (and recruiting!) activities in a separate hall. That worked very well for everyone. Please let's keep it that way.
  • the on-site technical team wasn't fully prepared to handle presentations made with Open Office. That is not acceptable if you ask me, even if the majority of speakers have a Mac. It's 2011 (2012 now even), so you really need to be prepared to read OpenOffice files. I realize that wasn't Velocity organizers' fault, but I think it's something to consider for next time.

That said, I'm really really happy about my experience at Velocity Europe, both as a speaker and as attendee. It was really awesome, and worth every moment I spent working to prepare for it. Thank you O'Reilly, and I hope to be able to participate again some day :)

From disaster to stability: the scaling challenges of My Opera (Surge 2010)

I just read John Allspaw's blog post about his talk at Surge 2010. John's talks were among the best of the conference IMO. So I was amazed to read his post.

I was really honored to take part in this conference as speaker. Just before my proposal being accepted, I had bought the book Web Operations. When the Surge speakers list was announced, I was thrilled to discover that many contributors to the book, including Allspaw, were also speaking at Surge, and even more honored to have the privilege to being in the same group and knowing them in person.

If you're interested in the talk I presented, while the Omniti folks bring up videos and slides on the Surge website, you can download them from Slideshare:

I hope the video is delayed, so I can avoid the embarrassment for a little while :-)

Surge 2010 scalability conference in Baltimore, USA – DAY 2

This is a summary of day 2 of the Surge conference that took place in Baltimore, USA, 30th of September and 1st of October 2010. For a quite comprehensive blog post about day 1, you can read my previous post.

Here comes the list of talks I attended during Day 2.

Brian Cantryll – failures in commodity hardware

What happens when commodity hardware is used in an "enterprise" hardware project? Brian guided the audience through this industrial hw project. There was no recorded video of this talk, due to the content being potentially "sensitive". Very interesting talk, and Brian is IMO a very good speaker.

Benjamin Black – FastIP

Benjamin presented a – for me – new way to analyze metrics of a network, named "Flow". The flow-based network metrics can represent a network activity in a way that is completely different and much more accurate than what's usually done by operations and sysadmin departments. The downside is that is generates a lot of data. The advantage is that you can analyze and even replay? any traffic that took place between any two nodes of the network. I'm sure I didn't understand correctly because this would be amazing.

There's products out there that offer flow-based network analysis: Cisco Netflow, Ntop NProbe, etc… There's also a IETF working group about flow. We couldn't see any example/demo because there was a problem with the slides, IIRC.

FastIP also offers a related service. I contacted Benjamin about this after his talk. Maybe we'll be able to try something out or at least have a demonstration.

My TODO list:

Gavin Roy – Scaling MyYearBook.com

One of the most interesting talks in this conference IMO. MyYearbook is a Postgres shop, among the top 25 trafficked sites in the USA.

Gavin talked about many things they did to scale their site as the traffic was growing. Here's some of the things I remember:

  • DB connection pooling very important for them. Made a world of difference. They use PgBouncer and pgPool2
  • DB Horizontal scaling with pIProxy. TODO: look it up
  • DB Replication w/ Londiste, Slony, Bucardo
  • Postgresql 9.0 based standby to increase read-only capacity, and for hot-standby.
  • Partitioned the database by table, feature available since Pg 8.1

They have a primary-to-secondary master failover procedure. They looked into automating it, but a tech judgement is really necessary in case something goes wrong, so they will keep it manual. This was a question I asked to Gavin, since we've thought about automating our failover procedure for MySQL, but it's not so easy to just decide when to trigger the failover…

For user storage, they use Isilon IQ Series, apparently a FreeBSD appliance with on-board NFS. For DB servers, they looked at different solutions, but they keep coming back to direct attached storage. Their man db server, they have a massively powerful machine, IIRC, 512Gb of RAM and 128 cores machine. I have to double check this because it seems really impressive.

John Allspaw – Go or No Go

Another great talk by John, well presented and with great content. Not easy to summarize. The main topic was the "Go or no-go meeting", a 10 minutes get together of all involved parties before releasing changes or launching any new feature live.

This meeting basically consists of Yes/No questions:

  • Have you tested enough to deploy? QA still needed?
  • Has the feature being communicated (blog/forum/…)?
  • Does everyone know: when it will go live? who will push the feature?
  • Has the feature been in production for staff (or beta users)? That can be tricky to implement if the new feature implies social interactions (beta user tagging non-beta user)
  • Is it possible to dark launch this feature? Will we?
  • Is it possible to turn on this feature on a % of users? Will we?
  • Does it involve new infrastructure? If so: is there monitoring in place? (BLOCKER)
  • On/Off switch in the code/config is in place? Is it documented?
  • Are all the relevant people available for communication and launch?
  • Is there a place for users to provide feedback about the feature?
  • Post-launch "it's all done" time agreed?
  • Contingency checklist done and everytime reviewed it? (BLOCKER)

The "Contingency list" should answer the question: "What could possibly go wrong? What will we do about it?", with a list of potential issues and how to solve them in case shit hits the fan.

Apart from the Go/No-go meeting, which would be, also according to my past experience, a great way to avoid problems, there's at least a couple more really nice things to keep in mind when developing or launching a new feature:

  • "Dark launches": a dark launch is essentially a full launch of the new feature, but in such a way that is invisible to users. So if you're making db queries and processing stuff, you keep doing all that, you just throw the data away. You will be able to realize the (almost) full impact of the new feature on your application and compensate accordingly.
  • Feature "sampling" (% of users): you just enable the full feature for a small, and then growing, percentage of your user base. You can gradually grow to 100% and test the effect of the changes.

Great stuff.

Neil Gunther – Quantifying scalability

Here I was a bit too excited, due to my talk coming next, so unfortunately I didn't pay too much attention. It's a full analysis of scalability seen as a mathematical function, as capacity of your system as the load increases.

Cosimo Streppone – Scaling challenges of my.opera.com

I think

I used 5 minutes to show a live demo of the My Opera realtime monitor application that we built and afterwards I got very interesting questions, and also some nice twitter messages about it.

I also talked about how we've experimented in distributing requests across the different datacenters with our little geodns tool.

All in all, for me it was a fantastic experience. Practice will make me better, so I look forward to a next time :-)

Baron Schwartz – Scaling without sharding

Baron works for Percona. I had read some talks of his. I think he's a really good speaker. He explained in detail the scenarios that arise when dealing with database scaling, the typical characteristics of reads and writes, single server vs multiple servers deployments.

Basically what the talk tries to suggest is that very few situations require to shard your database. Single server setups can go very far, by optimizing the way the db works. Quote: "Sharding should be your last resort". Sharding should be enforced when write demand exceeds write capacity, so avoid sharding if you can, try to buffer/collate writes, defer update work, etc..

Closing day 2

Theo Schlossnagle closed the conference with a plenary keynote about a semi-serious "brief history of computing". Much fun, and a goodbye to next year's Surge.

For a glimpse of what happened live at the conference, you can also check out the Twitter stream for #surgecon.

Definitely a great conference. Stay tuned for videos and slides on the official site, http://omniti.com/surge/2010.

Communities in Action 2010 in Oslo

Last Monday, 10th of May, the regular Oslo.pm (Oslo Perl Mongers) meeting was a little special. In fact, there wasn't any Oslo.pm meeting, but we went to a free, one evening mini conference sponsored by several norwegian companies: "Communities in Action".

The basic idea was to put together several different communities in the Oslo area, so the conference program was diverse and exciting. There were 7 different tracks. Among them, the most interesting for me, apart from Oslo.pm, were:

  • XP (extreme programming, agile, etc…)
  • scalabin (about the scala language)
  • OWASP
  • Oslo C++ users group

Other tracks were by the NNUG, Norwegian .NET users group and IASA, some norwegian association about something…

There were a couple of colleagues from Opera, and other guys I know from Oslo.pm, but being there alone, I couldn't follow more than one track, so I tried jumping a bit between different rooms. I found the XP talk a bit boring, so I settled on the Oslo.pm track. There was a talk on rakudo * by Karl Rune Nilsen and a talk about meta-object programming in Perl vs Ruby by Matt Trout.

Both talks rocked. I think I had attended a very similar rakudo talk before, but the Ruby one was entirely new. It was funny when Matt, just before starting, tried to attract Ruby folks screaming and shouting
throughout the hotel hall :-)

After the conference, a small group of us gathered and headed over to tilt, a nice pub/pinball place, where I ended up making the day record even if I suck at pinballs…

European Perl Conference, Day 1

Every YAPC::EU (Yet Another Perl Conference Europe) is a really big event in the Perl world, with lots of people from every part of the planet. I got to know some of them already, so we just meet like good friends :-) This year's theme was Corporate Perl, how Perl is used in the corporate world.

This time though I was presenting a talk during the first day of the conference: How Opera uses Perl, that's up on Slideshare right now. If you take a look at it, you will find out that we actually use Perl for a lot of systems, from the very tiny to very complex, mission-critical ones. It's been quite some fun preparing the talk, and I think it also went decently.

There were lots of other interesting talks, even lightning talks, like Giuseppe Maxia's MySQL Sandbox, or Sue Spencer's talk about "Perl at Cisco Systems". There was also a talk on roles and inheritance in OO systems by Curtis Poe of the BBC, and a really funny lightning talk by Alex Kapranoff, a russian guy, but I don't remember the title. Merijn Brand presented lots of ways to improve your Perl modules. This guy's amazing. Also avid Opera user.

During lunch we met up with Martin Berends and Carl Mäsak and talked about Perl 6 syntax, CPAN 6, etc… really cool people.