On the Oct 19th AWS us-east-1 outage

These will be personal comments on the text that AWS put out after the October 19th 2025 us-east-1 outage, which you can find here: https://aws.amazon.com/message/101925/ and a bit more readable version here.

Important premises before starting:

  • While I’ve been working on web operations for a long time, I have never dealt with services as big as AWS, and also I don’t know anything about how they operate internally.
  • I have the utmost respect for AWS SREs and engineers that had to deal with this outage, so this is in no way intended to downplay the quality of the services or the recovery work done there. On the contrary…

Let’s start.

At Kahoot!, the first signal that something was wrong was Slack being slow or unresponsive during our Central European morning. We didn’t know that AWS or the us-east-1 region was involved at all. Once we realized Slack was not operating correctly, a few of us sent test messages in our backup channel, a Google Chat room named “SRE Team” that had been used only a handful of times over a few years. It can be cumbersome to establish a backup channel when Slack is suddenly down.

Takeaway 1: establish and document your backup comms channel before you need it, so it’s ready if and when the primary fails.

In my case, I had another problem. My Firefox install had started acting up, in the special way Firefox fails when you have used snap to install it. I will spare you my thoughts on snap itself, none of which are positive. After 15 minutes of head scratching, I realized the issue might be with Firefox and not related to the AWS outage, and restarted my browser.

The impact on our infrastructure was minimal. We saw a few AWS API calls fail, but essentially nothing else. Our EC2 instances and AutoScalingGroups in us-east-1 stayed up and running with no issues.

It’s always DNS! Except when it isn’t…

Many cited the “It’s always DNS!” meme. My guess is that they probably haven’t read the AWS text. This was not a DNS failure. It was the system AWS designed to update those “hundreds of thousands” of DynamoDB DNS records that failed, due to a race condition. More on that below.

Is that a sign that few people take the time to read things through?

Outage text commentary follows

If you haven’t, I’d suggest reading Lorin Hochstein’s blog about the outage. I won’t be mentioning any of Lorin’s points here.

Timeline of the outage as generated by Claude Code (will definitely be incorrect)

Engineering teams for impacted AWS services were immediately engaged and began to investigate. By 12:38 AM on October 20, our engineers had identified DynamoDB’s DNS state as the source of the outage.

11:48 PM to 12:38 AM means 50 minutes from when the issue started to detection. That seems … quite a lot of time. I don’t know the details, of course. My guess is that DynamoDB is so core to most AWS services and so reliable that it’s hard to imagine it could have issues. This is also confirmed by the fact that “key internal tooling” depends on DynamoDB, meaning it must be very rare that it’d be down or unavailable.

Makes me think of those times when some component, script or cronjob that had been working reliably for years turns out to be the one failing. You’re thinking: “No way, it can’t be THAT! It’s been working perfectly fine for at least 5 years!”. And yet, something has changed this time and caused the failure. Such cases happen. I’ve been smacked in the face a few times :-)

By 2:25 AM, all DNS information was restored, and all global tables replicas were fully caught up by 2:32 AM. Customers were able to resolve the DynamoDB endpoint and establish successful connections as cached DNS records expired between 2:25 AM and 2:40 AM.

We can conclude that no DNS records for us-east-1 DynamoDB endpoints were available between 11:48 PM and (partially) 2:40 AM. The DynamoDB endpoint hostname for the us-east-1 region is dynamodb.us-east-1.amazonaws.com. Negative DNS lookups, which I take to be NXDOMAIN responses in this case, are cached by resolvers. This can be problematic if the TTL for such negative responses is high. In the case of the DynamoDB regional DNS zone, it is currently set to 5 (five) seconds.

To understand this, let’s look at the SOA record for the us-east-1 DNS zone, which (among other things) controls how long negative DNS responses are cached for:

$ dig soa us-east-1.amazonaws.com +multiline +noall +answer
us-east-1.amazonaws.com. 895 IN SOA dns-external-master.amazon.com. root.amazon.com. (
                                22366      ; serial
                                180        ; refresh (3 minutes)
                                60         ; retry (1 minute)
                                2592000    ; expire (4 weeks 2 days)
                                5          ; minimum (5 seconds)
                                )

The last number in the SOA record is the minimum TTL, used as the TTL for negative responses. Hence any lookup for dynamodb.us-east-1.amazonaws.com that returned an NXDOMAIN response (record not existing) would be cached for just 5 seconds. I wonder whether AWS lowered this value after the outage: if it had been 5 seconds during the incident, clients should have recovered much faster than they did. I also wonder what sort of load such a low TTL imposes on their DNS serving infrastructure…

Takeaway 2: if you have particularly critical services, verify that the negative DNS response TTL you advertise in your SOA records is appropriately set, so that clients can recover quickly when DNS records are restored. Five seconds might be a bit extreme for anyone except huge companies, also because it can impose a tremendous load on the DNS infrastructure. Something like 60s might be more appropriate for mere mortals. YMMV.
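
If you want to check what negative-caching TTL one of your own zones advertises from code, rather than eyeballing dig output, here is a small sketch using dnspython; the zone name is just an example:

# Print the negative-caching TTL advertised by a zone's SOA record.
# Requires dnspython: pip install dnspython
import dns.resolver

zone = "us-east-1.amazonaws.com"  # replace with your own zone

answer = dns.resolver.resolve(zone, "SOA")
soa = answer[0]
# Per RFC 2308, the negative-caching TTL is min(SOA record TTL, SOA minimum field)
print(f"{zone}: SOA minimum = {soa.minimum}s, SOA record TTL = {answer.rrset.ttl}s")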

EC2

DynamoDB being the backbone of many internal AWS services meant that EC2 was also impacted. I won’t reiterate here how or why. Instead:

Existing EC2 instances that had been launched prior to the start of the event remained healthy and did not experience any impact for the duration of the event.

That’s what we saw. Our existing EC2 instances in us-east-1 kept running with no issues. We were also “lucky” that our AutoScalingGroups didn’t initiate any scale-in or scale-out events, being night time in us-east-1. Those would have probably failed, at least the scale-out ones, as launching new instances was one of the impacted operations.

Other more “complex” services were impacted. Our AWS usage is relatively basic, we don’t use esoteric services or configurations, so we weren’t affected by the EC2 issues during the outage. Keeping things simple has been advantageous in this case.

Takeaway 3: Simple, “boring” infrastructure choices can be surprisingly resilient. Complex service configurations and dependencies increase your surface area for cascading failures.

DropletWorkflow Manager

Each DWFM manages a set of droplets within each Availability Zone and maintains a lease for each droplet currently under management. This lease allows DWFM to track the droplet state, ensuring that all actions from the EC2 API or within the EC2 instance itself, such as shutdown or reboot operations originating from the EC2 instance operating system, result in the correct state changes within the broader EC2 systems.

As part of maintaining this lease, each DWFM host has to check in and complete a state check with each droplet that it manages every few minutes.

Starting at 11:48 PM PDT on October 19, these DWFM state checks began to fail as the process depends on DynamoDB and was unable to complete.

While this did not affect any running EC2 instance, it did result in the droplet needing to establish a new lease with a DWFM before further instance state changes could happen for the EC2 instances it is hosting.

Between 11:48 PM on October 19 and 2:24 AM on October 20, leases between DWFM and droplets within the EC2 fleet slowly started to time out.

At 2:25 AM PDT, with the recovery of the DynamoDB APIs, DWFM began to re-establish leases with droplets across the EC2 fleet. Since any droplet without an active lease is not considered a candidate for new EC2 launches, the EC2 APIs were returning “insufficient capacity errors” for new incoming EC2 launch requests.

This description of the inner workings of the DropletWorkflow Manager is quite fascinating. From my point of view, DWFM was designed really well: it failed in a remarkably graceful way. Being defensive is a useful trait in systems design. This is an excellent example of a “failing open” design philosophy: the system degraded gracefully rather than causing widespread instance failures.

After attempting multiple mitigation steps, at 4:14 AM engineers throttled incoming work and began selective restarts of DWFM hosts to recover from this situation. Restarting the DWFM hosts cleared out the DWFM queues, reduced processing times, and allowed droplet leases to be established.

The oldest trick in the book! A well-placed server restart can help :-) I did this regularly years ago, then came to prefer trying to understand the actual failure at hand first, since kicking the server can sometimes destroy the observations you need to understand the cause of the fault. Sometimes it’s still a viable way to get out of trouble, even at AWS apparently.

Network Load Balancers

NLBs are based on EC2 instances, so they were impacted by the outage.

Our monitoring systems detected this at 6:52 AM, and engineers began working to remediate the issue.

That means 80 minutes from the first NLB issues to detection at the monitoring layer. That suggests either that the ongoing recovery work was quite demanding, or simply that it took that long to notice. Again, it would be very interesting to know what exactly happened during that time.

Other AWS Services

By 2:24 AM, service operations recovered except for SQS queue processing, which remained impacted because an internal subsystem responsible for polling SQS queues failed and did not recover automatically. We restored this subsystem at 4:40 AM and processed all message backlogs by 6:00 AM.

One aspect I haven’t seen mentioned anywhere else: with all these different subsystems involved, we can assume many SREs must have been on deck to deal with this outage. Given that, the coordination work must have been absolutely massive, and critical as well. No doubt it would have been extremely fascinating to observe how this coordination went on, and how the different teams communicated and collaborated to get things back up and running. Or maybe it was a single team of three to five people instead? It was night time in the US, so who knows.

Inbound callers experienced busy tones, error messages, or failed connections. Both agent-initiated and API-initiated outbound calls failed. Answered calls experienced prompt playback failures, routing failures to agents, or dead-air audio.

It was the first time I read the term “dead-air audio”. A detour to Wikipedia was definitely worth it.

Customers with IAM Identity Center configured in N. Virginia (us-east-1) Region were also unable to sign in using Identity Center.

Fortunately, our IAM Identity Center is in a different region, so we weren’t impacted there either. I certainly don’t envy teams who were shut off from access to the AWS console. Our observability systems also weren’t affected, but I can see how losing console access and perhaps also losing your observability platform could be a completely paralyzing situation.

Takeaway 4: Spend some time pondering how not only your own systems, but also the 3rd party systems you depend on, would react to an AWS outage. Would their failure leave you unable to react? Can you do something about it?

In Conclusion

Finally, as we continue to work through the details of this event across all AWS services, we will look for additional ways to avoid impact from a similar event in the future, and how to further reduce time to recovery.

The AWS message clearly wasn’t meant to be a post-mortem. With that said, there are zero mentions of a human element anywhere in the text. Perhaps because the DNS automation was … automation, and no manual intervention caused the issues. I would have appreciated learning more about the human factors involved in any case. For example, what challenges did the teams face in identifying that DynamoDB DNS records were missing? Perhaps this is more material for an AWS-internal post-mortem, which people might still be working on.

Multiple DNS Enactor instances applying potentially outdated plans simultaneously seems (in hindsight, clearly) quite risky. It’s always easy to criticize after the fact, but since we’re missing such a huge amount of context and background, the only thing we can do is speculate and learn from this as much as we can. If you have more insight into AWS internals, or know where to get it, reach out and let me know!

eBPF (Extended Berkeley Packet Filter) for dummies

This is a simple eBPF primer post, written with generous help from Claude.

ELI5 version

eBPF is like having magic glasses for your computer. These glasses let you see what’s happening inside your computer without stopping it or slowing it down. You can watch programs talk to each other, see how fast things are moving, and even catch bad behaviors. The best part is you can program these glasses to look for specific things and take action when they happen.

What is eBPF?

eBPF is a technology in the Linux kernel that allows you to run small programs in a safe, sandboxed environment directly in the kernel. It was originally designed for network packet filtering but has evolved into a powerful, general-purpose monitoring and tracing framework.

Key features:

  • Runs safely inside the kernel without modifying kernel code
  • High performance with minimal overhead
  • Versatile application across networking, security, and observability
  • JIT (Just-In-Time) compilation for near-native performance

eBPF Tools Ecosystem

  1. BCC (BPF Compiler Collection): A toolkit for creating eBPF programs using Python and Lua frontends (see the example right after this list).
  2. bpftrace: A high-level tracing language for eBPF, similar to awk or DTrace. It provides a simple, powerful scripting interface for writing eBPF programs.
  3. Cilium: Uses eBPF for container networking, observability, and security.
  4. Falco: Security monitoring tool that uses eBPF to detect anomalous behavior.
  5. Hubble: Network and security observability platform built on eBPF.
  6. Pixie: Observability platform for Kubernetes applications using eBPF.
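
To make the BCC entry above a bit more concrete, here is a classic, minimal BCC sketch in Python (my own, assuming the bcc package and kernel headers are installed; run it as root). It attaches a tiny eBPF program to the execve syscall and prints a trace line every time a new process is executed:

# trace_exec.py - minimal BCC example (run as root)
from bcc import BPF

# A tiny eBPF program, compiled and loaded into the kernel by BCC at runtime
prog = """
int hello(void *ctx) {
    bpf_trace_printk("new process executed\\n");
    return 0;
}
"""

b = BPF(text=prog)
# Attach our eBPF function to the execve syscall entry point (a kprobe)
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="hello")
print("Tracing execve()... Ctrl-C to stop")
b.trace_print()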

What is bpftrace?

bpftrace is a high-level tracing language for eBPF that makes it easy to write small programs to trace and analyze system behavior. Think of bpftrace as the friendly interface to eBPF’s power.

Relationship to eBPF:

  • bpftrace is to eBPF what SQL is to a database engine
  • It compiles your human-readable scripts into eBPF bytecode
  • Handles the complexity of loading and running your eBPF programs
  • Provides built-in functions and easy syntax for common tracing needs

Simple bpftrace example:

# Count system calls by process name
bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[comm] = count(); }'

This one-liner counts all system calls grouped by process name, demonstrating bpftrace’s concise yet powerful syntax.

Kprobes and Uprobes

Kprobes

Kprobes (Kernel Probes) are debugging mechanisms in the Linux kernel that allow you to dynamically break into any kernel routine and collect debugging and performance information non-disruptively. They’re essentially dynamic breakpoints you can insert anywhere in the kernel code.

Key features:

  • Can be attached to virtually any instruction in the kernel
  • Minimal performance impact when not triggered
  • Collect register and memory state at the probe point
  • Available in two flavors: kprobes (at function entry) and kretprobes (at function return)

Uprobes

Uprobes (User Probes) are similar to kprobes but work in userspace. They allow you to trace and instrument user applications by inserting breakpoints at specific functions or instructions.

Key features:

  • Trace applications without modifying their source code
  • Attach to specific functions in userspace programs
  • Monitor application behavior in production
  • Available as both uprobes (function entry) and uretprobes (function return)

Relationship to eBPF

Kprobes and uprobes provide the attachment points for eBPF programs to hook into kernel and user application code. The relationship works like this:

  1. Attachment mechanism: eBPF programs use kprobes/uprobes as the “hooks” to insert themselves into kernel or application execution paths
  2. Data collection: When a probe is triggered, the associated eBPF program executes, collecting data and potentially making decisions
  3. Performance: eBPF added JIT compilation to make probe handlers extremely efficient
  4. Programmability: Before eBPF, probes were limited in functionality; eBPF adds a programmable layer to determine what happens when a probe triggers

An example in bpftrace showing both kprobe and uprobe:

# Trace kernel function
bpftrace -e 'kprobe:do_sys_open { printf("Opening file: %s\n", str(arg1)); }'

# Trace user function in libc
bpftrace -e 'uprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc { printf("malloc called, size: %d\n", arg0); }'

eBPF transformed kprobes and uprobes from simple debugging tools into a powerful, programmable observability framework, turning them from basic breakpoints into sophisticated monitoring tools with minimal performance impact.

How to pin a specific apt package version

I’d like to pin a specific package, say redis-server, to a specific version, in my case 7.0.*, and that seems straightforward to do with:

Package: redis-server
Pin: version 7.0.*
Pin-Priority: 1001

Now, I would also like to have apt fail when 7.0.* is not available, either because there are only newer versions available, f.ex. 7.2.* or 7.4.* or perhaps because only older versions like 6.* are available.

I can’t seem to find a way to achieve that. I’ve read various resources online and consulted man 5 apt_preferences, but I’m still not sure how to do it.

I tried combining the previous pinning rule with another one with priority -1, as in the following:

Package: redis-server
Pin: release *
Pin-Priority: -1

But that seems to make all versions unavailable unfortunately. Here’s what I’m seeing:

$ apt-cache policy redis-server
redis-server:
  Installed: (none)
  Candidate: 5:7.0.15-1build2
  Version table:
     5:7.0.15-1build2 500
        500 http://no.archive.ubuntu.com/ubuntu noble/universe amd64 Packages

$ cat > /etc/apt/preferences.d/redis-server
Package: redis-server
Pin: version 7.0.15*
Pin-Priority: 1001

Package: redis-server
Pin: release *
Pin-Priority: -1

$ apt-cache policy redis-server
redis-server:
  Installed: (none)
  Candidate: (none)
  Version table:
     5:7.0.15-1build2 -1
        500 http://no.archive.ubuntu.com/ubuntu noble/universe amd64 Packages

I expected this configuration to provide an available candidate, since one exists (7.0.15), but that doesn’t work.

Note that a successful outcome for me is:

  • define a target wanted version, f.ex. redis-server=7.0.*
  • provide an apt preferences.d file such that:
    • when any 7.0.* versions are available, apt will install that version
    • when no 7.0.* versions are available, apt will fail, installing nothing

A bad outcome is when redis-server is installed, but with a package version that does not match what I had specified as requirement (hence, different from 7.0.*).

This is on Ubuntu 24.04, although there is nothing specific to 24.04 or Ubuntu here I would think.

Any ideas?

Posted on the Unix & Linux Stack Exchange, let’s see! https://unix.stackexchange.com/questions/790837/how-to-pin-an-apt-package-to-a-version-and-fail-if-its-not-available

UPDATE: based on the Stack Exchange feedback, it seems the solution wasn’t far off.

Package: redis-server
Pin: version 5:7.0.15*
Pin-Priority: 1001

Package: redis-server
Pin: release *
Pin-Priority: -1

I needed to prepend the epoch "5:" to the version.

TIL: Styling Obsidian text paragraphs

TIL that it’s possible to style Obsidian text paragraphs in a way that lets me focus on the actual text content, instead of having to add an artificial line break every time I type a paragraph :-)

Obsidian can use CSS snippets to style the application itself and the text/markdown content. The CSS snippets need to be saved in <vault_directory>/.obsidian/snippets/whatever.css and enabled in Obsidian’s appearance settings.
This is how to get that “natural” book-like spacing between paragraphs, and avoid adding spurious line breaks in the markdown code:


.cm-contentContainer {
  line-height: 1.70rem;
}

.markdown-source-view.mod-cm6 .cm-content > .cm-line {
  padding-bottom: 12px !important;
}

Of course, the values for line-height and padding will depend on your particular screen and font settings. In my case I use a screen rotated to portrait orientation for writing and coding, and my font of choice is the beautiful Berkeley Graphics’ Berkeley Mono.


My experience at SREcon EMEA 2022

A couple of weeks ago I attended SREcon EMEA in Amsterdam. Here are some sparse thoughts about it, with no pretense of being exhaustive or coherent.

Looking Back

There are only a handful of conferences I’ve attended where I felt “at home”. Going back in time, Surge was the first one, then came Velocity. I’m adding SREcon to that list. It definitely felt like I was among people who speak the same language and have similar breadth and depth of expertise, and yet it felt somewhat strange at the same time.

As I see it, there are at least three “tiers” at such a big yet niche conference: the FAANG folks, the tiny companies with a sysadmin or a devops person or two, and then the big ocean of mid-sized companies, where people like us are. Our SRE team is four people and we manage a service with millions of monthly users. Needless to say, we have a lot on our plate :-)

I came to SREcon after a hiatus from conferences of some years. After a while, conferences tend to become self-referential and people start talking about the same things over and over again. I wanted to understand how things had changed in our field, what people were talking about the most, get some fresh perspectives and perhaps connect with people from other companies. What finally prompted me was Niall Murphy tearing the SRE “bible” apart.

The Question of SRE Identity

This year’s conference topic was “What could SRE be?”.
No surprise, then, that a good portion of the talks were about what I refer to as the question of identity for SREs. We have seen the same happen, and a lot more, over the years with the DevOps movement.

What could SRE be, then? According to some presentations, one would conclude that whatever SRE is, it’s no longer what Google intended, it’s not what anyone else thinks it is either, it’s just what you think it is: a subjectivist view.

Among the Usenix Slack conversations, there was a lot of chit-chat about SRE identity. My personal contribution was the following meme:

Other funny memes that were shared:

An interesting fact I learned during the conference is that the Google SRE book was written by assembling contributions from the best teams at Google, picking out their respective best practices. Paradoxically, this implies that the SRE book is not representative of how even Google itself does SRE. If you also consider that, at the time the SRE book was published (2016), Google employed about 1,200 people in the various SRE teams, the only possible conclusion is… if you are not Google, there is likely very little that you can apply to your everyday mere-mortal-SRE life.

Before you think I’m exaggerating, such a conclusion was voiced by (ex-)Google engineers themselves, for example in Alex Hidalgo’s “Diamonds under Pressure” talk and (in my opinion) in one of the best talks of the conference, Emil Stolarsky’s Unified Theory of SRE. Another entertaining presentation in the same vein was Andrew Clay Shafer’s “SRE as She Is Spoke”. Andrew expressed the thesis that “progress [on the SRE journey] stops when the needs are met”, which seems a reasonable and pragmatic approach.
The videos are not up yet, but they should be in a few weeks.

Alongside the “subjectivist” view, there were other talks, which could be classified as systems thinking, that focused on the more general and broad aspects of what SREs do, how to handle complex systems, human factors, etc… Among the best, IMO, were:

What else?

The question of SRE identity accounted for a notable part of the talks, but thankfully not all. It’s good to pause and reflect on our role, but personally that’s not why I was interested in SREcon, not primarily at least. What I like are the deep technical talks, where I get to know more about how other companies actually do the stuff we call SRE. Given my past conference experience, I expected Facebook/Meta’s talk to be somewhat disappointing, and it was. While some details of how Meta is structured were shared, which are always interesting, I expected a bit more on how the incident actually unfolded.

I loved Effie Mouzeli’s talk on how to make teams resilient, “Is Our Team as Resilient as Our Systems?”. We naturally focus on systems, but teams are a crucial part of the equation. My team and I have had to work on this a lot in the past years, and I’m hoping to share more about this soon. I felt this talk had a lot of good insights, some of which we’ve also applied over time.

Another talk that deserves a mention is Chris Sinjakli’s reflection on broadening the scope of how we work on reliability for our systems. This is sometimes difficult to do when toil is a big part of our jobs. Luckily it’s not for our team, not anymore at least, so this talk felt very relevant to me, and I recommend it.

I couldn’t attend some of the talks due to the two parallel tracks. I hope to catch up when slides and videos are published later on.

What about the hallway track?

In general, people say that conferences are most useful because of the casual conversations you can have in the hallways. While I do agree with it, the opportunities to have conversations vary depending on the type of person you are, and the people you meet, of course. My impression is that while some people at SREcon were happy to have conversations, most were likewise happy to be left alone, which is fair enough :-)
Just to say that it was really nice to meet people and chat; almost everyone I talked to knew Kahoot! directly and was happy to share details about what they’re doing, and equally interested in what we’re doing.

In some of these conversations I’ve been trying to push for more concrete, down-to-earth talks on how smaller companies like ours do SRE. It’s OK to aspire to, or be interested in, how Google runs things, but you come away with absolutely zero information that’s useful to your work life. Possibly there’s even a downside: people going home thinking they have to do whatever Google does (see the chapters above), so ultimately… let’s give less importance to the Googles of the world, please!

Besides the hallway track, there was a nice “sidewalk” track. We walked around the city, 15 km a day on average — you gotta track those SLOs… — and I also managed to snap some nice pictures of Amsterdam at sunrise and sunset.

The Venue and Organization

Loved all of it, honestly the best conference I’ve ever been to. The venue was spectacular, there was plenty of space, slides were clearly visible on screen, and the food was awesome! We also used one of the available meeting rooms to participate in our own company hackathon after the conference finished, until they kicked us out. Here’s a sneak peek of what our team was working on:

I hope to return to SREcon next year in Dublin. By then, I’d love to see more not-Google, not-Meta, etc… talks on the program. Perhaps we (or you!) should think about presenting too, why not?

On feeling stupid, how mathematics is taught in school, and an Ikea bowl

This story begins with a tweet:

In other words: how do you find the quantity (volume) of jelly for each layer, to make sure each layer in the bowl is of equal height?

Initially, I thought of a quick solution: mark height levels every n cm on the bowl with a pen, and then just fill up to each mark with the different liquid. There won’t be any need to calculate anything.

This can work assuming the bowl is made of glass. Maybe one could mark the bowl on the outside and then fill it up and still have a sufficiently clear reference for when to stop.

But… this didn’t feel satisfying. How would one approach the problem if they had to solve it without filling the bowl, with mathematics only?

One approach could be to find the area – labeled as B below – between the X axis and the curve of the “bowl”.

We know what A + B is: the height of the layer we desire, multiplied by the width of the bowl at that point. From that area (the green-colored one above) we subtract B, and we find A, the area of the liquid.

Can we then find the curve or mathematical function that characterizes our particular bowl? How do we find it? I tried to use different methods, unsuccessfully, for instance tracing the bowl profile on a sheet of paper. In the end, I used my phone to take a picture of the bowl, a common Ikea metal bowl we have in the kitchen.

This bowl is 20 cm in diameter. By using some amount of zoom and taking the picture from farther away, it’s possible to get a picture with less lens distortion, more representative of the actual curvature of the bowl.

I opened the picture in Photoshop and changed the image resolution to match the real dimensions, so that 20 cm in Photoshop corresponded to the width of the bowl in the picture, then added guides every 1 cm in width and height and sampled the bowl curve at every cm.

By doing some curve fitting, I imagined I would be able to find a formula for a hypothetical f(x) function that could approximate a half-bowl shape. This required a bit of fiddling, but in the end I got something semi-accurate.

I tried several different functions to fit the curve, but the exponential was the smoothest, without the artifacts that polynomial fits tend to have. Even though the fit is still not perfect, I thought it would be good enough. In the end, the function that approximates one half of my Ikea bowl is the following:

f(x) = 0.0902209 · e^(0.458166 · x)

This function defines the height of the bowl profile given the horizontal coordinate.

Now we can find the area below the curve by calculating the definite integral of this function from 0 to the x coordinate. All we need to do now is find the points on the x axis that correspond to the equal-height layers on the bowl. Let’s call them x1, x2, and x3. The last layer will fill up the bowl completely, so x4 is 10 cm, since the bowl’s radius is 10 cm.

Below is an example for the first layer of my bowl, where x1 = 6.76314 cm.

I did refresh my derivative and integral rules for this, but ultimately, to avoid stupid mistakes and spending another afternoon on it, I resorted to Wolfram once again, and the resulting area was 4.16831 square centimeters.

This is the B area I have marked in my earlier diagram:

Let’s now find A + B, the product of the layer height, which I have set to 2 cm, and the x1 coordinate, so:

A + B = 2 cm × x1
= 2 cm × 6.76314 cm
= 13.52628 cm²

We know A + B, and we know B, so we know A as well now.

A = (A + B) − B = 13.52628 cm² − 4.16831 cm²
A = 9.35797 cm²

Ok, so finally we have the area A of our layer. An area is not a volume though, and we want to know the volume of the liquid or jelly we need to use to fill our first layer.

To do that, we need to calculate the volume of the solid obtained by the rotation (or revolution) of our curve around the y axis. Now, this is easy to visualize when thinking about a cylinder, for example, but how do you calculate the volume of a solid of revolution for an arbitrary curve?

I tried several web searches, but I could not make much sense of the explanations; some even have errors in the formulas… Once again I got the feeling that I’m too stupid or slow to understand. Sadly, such a feeling has been a constant companion in my life, as a child but not only… The more I grow personally and professionally, the more I think this is often related to the quality of the teaching, articles, texts, or papers. In my opinion, some of these materials are not made to be understood; they’re made to make the authors seem smart and competent.

Anyway… while I understand the idea of a revolution solid, the calculations escaped me in this case, so I resorted once again to WolframAlpha, which understands the query directly, if formulated in a way it can digest. In this case the query is:

volume of solid of revolution about the y axis for y = f(x) for x = 0 to x1

If this is not an example of AI, or in Alan Kay’s words,  an amplifier of human intellect, I don’t know what is.

Wolfram calculated a volume for our solid of revolution of 128.333 cm³, or 1.28333 dl.
Consider that the volume in question is the volume external to the bowl, underneath it. To get the volume inside the bowl, we need, once again, like in the earlier case of the A and B areas, to subtract this quantity from the volume of an ideal cylinder of height 2 cm and radius = x1 = 6.76314 cm (2 cm is the layer height I chose). If we do that, we obtain:

Vcyl = π × r² × h = π × 6.76314² × 2 ≈ 287.39329 cm³

The internal volume of liquid to fill our bowl to 2 cm of height is then:

Vlayer1 = Vcyl − 128.333 cm³
Vlayer1 = 287.39329 − 128.333 = 159.06029 cm³

This result can’t be correct, can it? It seems too small a volume, there must be something wrong…

Instead of trying to fill the bowl with 159 ml of water, I slightly changed course and tried to calculate the volume of the whole bowl, to see how much water it would contain if it were to be filled to the brim. Following the same method:

Vbowl = Vcyl(h=9, r=10) − Vrev(x=0..10)
with Vcyl(h=9, r=10) = 2827.43339 cm³

and Vrev(x=0..10) = 947.447 cm³

which results in the full bowl having a volume of:

Vbowl = 2827.43339 − 947.447 ≈ 1,879.98 cm³

At this point, I felt the anticipation of a child waiting for a birthday cake. I was a bit doubtful a 20 cm bowl could hold almost 1.9 liters of water… nevertheless I took a big enough container, filled it with about 1879 grams of water, and then slowly poured the water into the Ikea bowl.

I watched as the water almost immediately filled the bowl up to about half. I still had so much water left that it seemed impossible it would all fit but, to my excitement and complete amazement, the bowl ate up all the water, with not even a millimeter of height to spare.

Oh, the absolute joy I felt in that moment! I started screaming from the excitement; my family thought I had lost my mind for a minute :-) Such a nerdy thing, but so cool!

If you ask me, this is what the joy of mathematics (if you consider this mathematics, maybe it’s more engineering?) should be all about! In my opinion, this is a perfect example of what we should be teaching our children when we teach them mathematics. It was an ultimately pointless, but so intellectually satisfying achievement. I loved it!

Of course, I wasn’t done with this yet. I used the exact same procedure to calculate the volume of the other 2 cm high layers, and it turned out the first layer indeed consists of a very small volume, and the higher in the bowl you go, the more water is “absorbed” by the upper layers.
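
If you’d rather check these numbers in code than with WolframAlpha queries, here is a small sketch of how I could have done it with scipy, using the fitted exponential above; the shell-method integral 2π·∫x·f(x)·dx plays the role of the “solid of revolution” query:

# Volumes of the 2 cm layers, computed from the fitted bowl profile (requires scipy)
import math
from scipy.integrate import quad
from scipy.optimize import brentq

def f(x):
    # Fitted half-bowl profile: wall height (cm) at radius x (cm)
    return 0.0902209 * math.exp(0.458166 * x)

def fill_volume(h):
    # Radius at which the bowl wall reaches height h, capped at the 10 cm rim
    r = 10.0 if h >= f(10.0) else brentq(lambda x: f(x) - h, 0.0, 10.0)
    # Solid of revolution of the profile about the y axis (shell method)
    v_rev = 2 * math.pi * quad(lambda x: x * f(x), 0.0, r)[0]
    # Liquid volume = enclosing cylinder minus the solid under the profile
    return math.pi * r**2 * h - v_rev

# Cumulative fill volumes at each 2 cm level; differences give the per-layer volumes
levels = [2, 4, 6, 8]
cumulative = [fill_volume(h) for h in levels]
layers = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
print(layers)  # the first value should land close to the ~159 cm³ computed above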

Why write this long and very boring post?

I wanted to share this positive feeling with others who, like me, may think they don’t “get” math, that they were never good at it in school, or who, maybe like me, were taught mathematics in a joyless, boring, mechanical or overly abstract way, completely deprived of the satisfaction that comes from discovering and applying math to the real world.

I now wonder, perhaps there is already a website for cakes or similar that lets you calculate the volume of these layers, for different bowls, maybe with proportions of recipes, … who knows?
If you enjoyed this, I’d love to hear from you.

Thanks to Excalidraw and WolframAlpha for being awesome tools one can use to calculate layers of a bowl. Like and subscribe for more :-)

Deploying Large Deep Learning Models in Production

Most deep learning or machine learning (ML) articles and tutorials focus on how to build, train and evaluate a model. The model deployment stage is rarely covered in detail, even though it is just as important a part of an ML system, if not a fundamental one. In other words, how do we take a working ML model from a Jupyter notebook to a production ML-powered API?

I hope more and more practitioners will cover the deployment aspect of ML models. For now, I can offer my own experience about how I approached this problem, hoping this will be useful to some of you out there.

Creating a useful ML model

How to create a useful ML model is the part of the work I won’t cover in this post. :-)

I assume that you already have:

  • a model or pipeline that is either pre-trained or that you have trained yourself
  • a model based on PyTorch, though most of the information here will probably help with any ML framework
  • some idea on how to make your model available as a RESTful API

First step: defining a simple API

The rest of this article will use Python as a programming language, for various reasons, the most important being that the ML model is based on PyTorch. In my specific case, the problem I worked on was text clustering.

Given a set of sentences, the API should output a list of clusters. A cluster is a group of sentences that have a similar meaning, or as similar as possible. This task is usually referred to as “semantic similarity”.
Here’s an example. Given the sentences:

  • “Dog Walking: 10 Simple Steps”
  • “The Secrets of Dog Walking”
  • “Why You Need To Dog Walking”
  • “The Art of Dog Walking”
  • “The Joy of Dog Walking”
  • “Public Speaking For The Modern Age”,
  • “Learn The Art of Public Speaking”
  • “Master The Art of Public Speaking”
  • “The Best Way To Public Speaking”

The API should return the following clusters:

  • Cluster 1 = (“Dog Walking: 10 Simple Steps”, “The Secrets of Dog Walking”, “Why You Need To Dog Walking”, “The Art of Dog Walking”, “The Joy of Dog Walking”)
  • Cluster 2 = (“Public Speaking For The Modern Age”, “Learn The Art of Public Speaking”, “Master The Art of Public Speaking”, “The Best Way To Public Speaking”)
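
The clustering algorithm itself is not the point of this post (and not necessarily what I’ll describe in the future post mentioned below); just to give a flavor of the task, here is a minimal sketch using a recent sentence-transformers release, its community_detection utility and an off-the-shelf embedding model (the model name and thresholds are illustrative):

# Minimal semantic clustering sketch (requires a recent sentence-transformers)
from sentence_transformers import SentenceTransformer, util

sentences = [
    "Dog Walking: 10 Simple Steps",
    "The Secrets of Dog Walking",
    "The Art of Dog Walking",
    "Public Speaking For The Modern Age",
    "Learn The Art of Public Speaking",
    "Master The Art of Public Speaking",
]

# Any sentence-embedding model will do; this one is small and publicly available
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(sentences, convert_to_tensor=True)

# Group sentences whose embeddings are close in cosine similarity
clusters = util.community_detection(embeddings, threshold=0.5, min_community_size=2)
for i, cluster in enumerate(clusters, start=1):
    print(f"Cluster {i}:", [sentences[idx] for idx in cluster])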

The model

I plan to describe the details of the specific model and algorithm I used in a future post. For now, the important aspect is that this model can be loaded in memory with some function we define as follows:

model = get_model()

This model will likely be a very large in-memory object. We only want to load it once in our backend process and reuse it across requests. A typical model will take a long time to load. Ten seconds or more is not unheard of, and we can’t afford to load it for every request: it would make our service terribly slow and unusable.

A simple Python backend module

Last year I discovered FastAPI, and I immediately liked it. It’s easy to use, intuitive and yet flexible. It allowed me to quickly build up every aspect of my service, including its documentation, auto-generated from the code.

FastAPI provides a well-structured base to build upon, whether you are just starting with Python or you are already an expert. It encourages the use of type hints and model classes for each request and response. Even if you have no idea what these are, just follow FastAPI’s good defaults and you will likely find this way of working quite neat.

Let’s build our service from scratch. I usually start from a python virtualenv, an isolated python environment where you can install your dependencies.

virtualenv --python /usr/bin/python3.8 .venv
source .venv/bin/activate

If you are not familiar with virtualenv, there are many tutorials you can read online.
Next step, we write our requirements file, with all the python modules we need to run our project. Here’s an example:

# --- requirements.txt
fastapi~=0.61.1
# uvicorn is needed to run the development server shown below
uvicorn

Save the file as requirements.txt. You can install the modules with pip. There are plenty of guides on how to get pip on your system if you don’t have it:

pip install -r requirements.txt

Doing so will install FastAPI. Let’s create our backend now. Copy the following skeleton API into a main.py file. If you prefer, you can clone the FastAPI template published at https://github.com/cosimo/fastapi-ml-api:

from fastapi import FastAPI

app = FastAPI()
# get_model() is your own loader: it returns the (large) ML model,
# loaded once at import time rather than once per request
model = get_model()

@app.post("/cluster")
def cluster():
    return {"Hello": "World"}

You can run this service with:

uvicorn main:app --reload

You’ll notice right away that any change to the code will trigger a reload of the server: if you are using the production ML model, the model’s own load time will quickly become a nuisance. I haven’t managed to solve this problem yet. One approach I could see working is to either mock the model results if possible, or use a lighter model for development.

Invoking uvicorn in this way is recommended for development. For production deployments, FastAPI’s docs recommend using gunicorn with the uvicorn workers. I haven’t looked into other options in depth; there might be better ways to deploy a production service. For now this has proven reliable for my needs. I did have to tweak gunicorn’s configuration for my specific case.

Running our service with gunicorn

The gunicorn start command looks like the following:

gunicorn -c gunicorn_conf.py -k uvicorn.workers.UvicornWorker --preload main:app

Note the arguments to gunicorn:

  • -k tells gunicorn to use a specific worker class
  • main:app instructs gunicorn to load the main module and use app (in this case the FastAPI instance) as the application code that all workers should be running
  • --preload causes gunicorn to change the worker startup procedure

Preloading our application

Normally gunicorn would create a number of workers, and then have each worker load the application code. The --preload option inverts the sequence of operations by loading the application instance first and then forking all worker processes. Because of how fork() works, each worker process will be a copy of the main gunicorn process and will share (part of) the same memory space.

Making our ML model part of the FastAPI application (or making our model load when the FastAPI application is first created) will cause our model variable to be “shared” across all processes!

The effect of this change is massive. If our model, once loaded into memory, occupies 1 GB of RAM, and we want to run 4 gunicorn workers, the net gain is 3 GB of memory that we will have available for other uses. In a container-based deployment, it is especially important to keep memory usage low. Reclaiming 75% of the total memory that would otherwise be used is an excellent result.

I don’t know enough details about PyTorch models or Python itself to understand how this sharing remains valid across the process lifetime. I believe that modifying the model in any way will trigger copy-on-write operations and ultimately cause the model variable to be copied into each process’s memory space.
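
A rough way to keep an eye on this (a sketch, assuming psutil is installed and that you know your gunicorn master PID) is to compare each worker’s total RSS with its unique (USS) memory: if the USS stays small, the preloaded model is most likely still being shared:

# Compare shared vs unique memory of gunicorn workers (Linux; requires psutil)
import psutil

MASTER_PID = 12345  # hypothetical: put your gunicorn master process PID here

master = psutil.Process(MASTER_PID)
for worker in master.children():
    mem = worker.memory_full_info()  # uss = memory unique to this worker process
    print(f"worker {worker.pid}: rss={mem.rss / 1e6:.0f} MB, uss={mem.uss / 1e6:.0f} MB")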

Complications

It turns out we don’t get this advantage for free. There are a few complications with having a PyTorch model shared across different processes. The PyTorch documentation covers them in detail, even though I’m not sure I did in fact understand all of it.

In my project I tried several approaches, without success:

  • use pytorch.multiprocessing in the gunicorn configuration module
  • modify gunicorn itself (!) to use pytorch.multiprocessing to load the model. I did it just as a prototype, but even then… bad idea
  • investigate alternative worker models instead of prefork. I don’t remember the results of this investigation, but they must have been unsuccessful
  • use /dev/shm (the Linux shared-memory tmpfs) as the filesystem on which to store the PyTorch model file

A Solution?

The approach I ended up using is the following.

gunicorn must create the FastAPI application to start it, so I loaded the model (as a global) when creating the FastAPI application, making sure it was loaded before the application instance was created, and loaded only once.

I added the preload_app = True option to gunicorn’s configuration module.

I limited the number of workers (my tests showed 3 to work best for my use case), and limited the number of requests each gunicorn worker will serve, using max_requests = 50. I limited the number of requests because I noticed a sudden increase in memory usage in each worker, regularly, some minutes after startup. I couldn’t trace it back to anything specific, so I used this dirty workaround.

Another tweak was to allow the gunicorn workers a longer-than-default startup time; otherwise they would be killed and respawned by gunicorn’s own watchdog because they were taking too long to load the ML model on startup. I used a timeout of 60 seconds instead of the default 30.

The most difficult problem to troubleshoot was workers suddenly stopping and not serving any more requests after a short while. I solved that by not using `async` on my FastAPI application methods. Other people have reported this solution not working for them… This remains to be understood.

Lastly, when loading the PyTorch model, I used the .eval() and .share_memory() methods on it before returning it to the FastAPI application. This happens just on the first load.

For example, this is what my model loading looks like:

def load_language_model() -> SentenceTransformer:
    language_model = SentenceTransformer(SOME_MODEL_NAME)
    language_model.eval()
    language_model.share_memory()

    return language_model

The value returned by this method is assigned to a global that is populated before the FastAPI application instance is created.
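
Putting the pieces together, the wiring looks roughly like this (a sketch: the module, endpoint and response field names are illustrative, not necessarily the ones from my project):

# main.py (sketch): the model is loaded once, as a global, before the app is created.
# With gunicorn's preload_app = True this happens in the master process,
# and the forked workers share the already-loaded model.
from typing import List

from fastapi import FastAPI

from model_loader import load_language_model  # hypothetical module containing the loader above

language_model = load_language_model()

app = FastAPI()

@app.post("/cluster")
def cluster(sentences: List[str]):
    # Use the already-loaded global model; the clustering step is omitted in this sketch
    embeddings = language_model.encode(sentences)
    return {"num_sentences": len(sentences), "num_embeddings": len(embeddings)}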

I doubt this is the way to do things, but I did not find any clear guide on how to do this. Information about deploying production models seems quite scarce, if you remember the premise to this post.

In summary:

  • preload_app = True
  • Load the ML model before the FastAPI (or wsgi) application is created
  • Use .eval() and .share_memory() if your model is PyTorch-based
  • Limit the number of workers/requests
  • Increase the worker start timeout period

Read on for other tips about dockerization of all this. But first…

Gunicorn configuration

Here’s more or less all the customizations needed for the gunicorn configuration:

# Preload the FastAPI application, so we can load the PyTorch model
# in the parent gunicorn process and share its memory with all the workers
preload_app = True

# Limit the amount of requests a single worker will handle, so as to
# curtail the increase in memory usage of each worker process
max_requests = 50

Bundling model and application in a Docker container

Your choice of deployment target might be different. What I used for our production environment is a Dockerfile. It’s easily applicable as a development option, but also good for production if you deploy to a platform like Kubernetes, as I did.

Initially I tried to build a Dockerfile with everything I needed, keeping the PyTorch model file as a binary in the git repository. The binary was larger than 500 MB, which required the use of git-lfs, at least for GitHub repositories. I found that to be a problem when trying to build Docker containers from GitHub Actions: I couldn’t easily reconstruct the git-lfs objects at build time. Another shortcoming of this approach is that the large model file makes the Docker build context huge, increasing build times.

Two stage Docker build

In cases like this, splitting the Docker build into two stages can help. I decided to bundle the large model binary into a first-stage Docker image, and then build up my application layer on top as stage two.

Here’s how it works in practice:

# --- Dockerfile.stage1

# https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8

# Install PyTorch CPU version
# https://pytorch.org/get-started/locally/#linux-pip
RUN pip3 install torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Here I'm using sentence_transformers, but you can use any library you need
# and make it download the model you plan using, or just copy/download it
# as appropriate. The resulting docker image should have the model bundled.
RUN pip3 install sentence_transformers==0.3.8
RUN python -c 'from sentence_transformers import SentenceTransformer; model = SentenceTransformer("")'

Build and push this container image to your Docker container registry with the stage1 tag.

After that, you can build your stage2 docker image starting from the stage1 image.

# --- Dockerfile
FROM $(REGISTRY)/$(PROJECT):stage1

# Gunicorn config uses these env variables by default
ENV LOG_LEVEL=info
ENV MAX_WORKERS=3
ENV PORT=8000

# Give the workers enough time to load the language model (30s is not enough)
ENV TIMEOUT=60

# Install all the other required python dependencies
COPY ./requirements.txt /app
RUN pip3 install -r /app/requirements.txt

COPY ./config/gunicorn_conf.py /gunicorn_conf.py
COPY ./src /app
# COPY ./tests /tests

You may need to increase the runtime shared memory to be able to load the ML model in a preload scenario.
If that’s the case, or if you get errors on model load when running your project in Docker or Kubernetes, you need to run docker with --shm-size=1.75G for example, or any suitable amount of memory for your own model, as in:

docker run --shm-size=1.75G --rm <command>

The equivalent directive for a helm chart to deploy in Kubernetes is (WARNING: POSSIBLY MANGLED YAML AHEAD):

apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
    spec:
      volumes:
        - name: modelsharedmem
          emptyDir:
            sizeLimit: "1750Mi"
            medium: "Memory"
      containers:
        - name: {{ .Chart.Name }}
          ...
          volumeMounts:
            - name: modelsharedmem
              mountPath: /dev/shm
          ...

A Makefile to bind it all together

I like to add a Makefile to my projects, to create a memory of the commands needed to start a server, run tests or build containers. I don’t need to use brain power to memorize any of that, and it’s easy for colleagues to understand what commands are used for which purpose.

Here’s my sample Makefile:

# --- Makefile
PROJECT=myproject
BRANCH=main
REGISTRY=your.docker.registry/project

.PHONY: docker docker-push start test

start:
    ./scripts/start.sh

# Stage 1 image is used to avoid downloading 2 Gb of PyTorch + nlp models
# every time we build our container
docker-stage1:
    docker build -t $(REGISTRY)/$(PROJECT):stage1 -f Dockerfile.stage1 .
    docker push $(REGISTRY)/$(PROJECT):stage1

docker:
    docker build -t $(REGISTRY)/$(PROJECT):$(BRANCH) .

docker-push:
    docker push $(REGISTRY)/$(PROJECT):$(BRANCH)

test:
    JSON_LOGS=False ./scripts/test.sh

Other observations

I had initially opted for Python 3.7, but I tried upgrading to Python 3.8 because of a comment on a related FastAPI issue on Github, and in my tests I found that Python 3.8 uses slightly less memory than Python 3.7 over time.

See also

I published a sample repository to get started with a project like the one I just described: https://github.com/cosimo/fastapi-ml-api.

And these are the links to issues I either followed or commented on while researching my solutions:

The Perl echo chamber, marketing and … is Perl really dying?

Recently I came across this tweet from Curtis/Ovid, which references a longer post about a proposal to integrate a better, more modern object-oriented “system” (Corinna) into Perl 5.

The proposal itself is not what I’d like to address here. I haven’t followed Corinna’s evolution. I believe it goes in a positive direction for the language, FWIW.

From that original tweet, a comment from Rafael followed:

[…] but I’m still wondering what are the real factors that make companies seek an exit strategy from Perl 5. Who makes this kind of expensive decision, and why? I suspect obscure OO syntax is not a major one.

This is what I replied with:

This is indicative of the fundamental problem in the Perl echo chamber. Some people still have no idea why companies are moving away from Perl. If you want to hear the perspective from someone who has seen this happen in multiple companies, let me know :-)

Sorry for this premise, but I was afraid what follows would make no sense otherwise.

Why is Perl dying today?

First of all, I don’t think “<language> is dying” is a useful question to ask, nor is it indicative of anything particularly interesting. I’m sure everyone reading this will have encountered plenty of “C is dying”, “Java is dying” or similar, and yet C and Java are still being used everywhere. In one sense, no language ever really dies. In Perl’s situation, things are slightly different though, as (I believe) Python slowly conquered Perl’s space over time.

What does it mean for a language to die, or to be dead?

From an end user’s point of view (say, a random programmer employed at a company, or a freelancer), a language could be dying if a task they want to accomplish with that language is hard because there are no supporting libraries for it (think CPAN or PyPI), or the libraries are so old they no longer work. That situation surely conveys the idea that the language is not in use anymore, or that very few people are using it. One would expect a common task in 2021 to be easy to accomplish with a language worth using in 2021.

What about a company’s point of view? The reality is that companies don’t have opinions on languages; people do. Teams do have opinions on languages, and the group dynamics inside a team influence which languages are acceptable for current and new projects.

Is Perl dying then?

My experience

Some years ago I was a fairly active member of the Perl community: I attended and presented at various Perl conferences around Europe, talking about my experience using Perl at a few small and large companies.

I remember picking up Perl for the first time based on a suggestion from my manager back then. He gave me a hard-copy print-out of the whole of the Perl 5.004 man pages and said: “We are going to use this language. It’s amazing, take some time to study it and we’ll start!”. This was 1998, and I had such a fantastic time :-) I was such a noob, but Perl was amazing. It could do everything you needed and then some, and it was easy and simple. The language was fast already back then, and it got faster over time. At that point I was working in a very small company, three people initially, and we ended up writing a complete web framework from scratch that is still in use today, after more than 20 years. If that’s not phenomenal, I don’t know what is. It would be cool to talk about that framework some day: it was more advanced than a lot of what’s around even in 2021… a story for another time.

And by the way, we were running our Perl code on *anything*, and I mean anything: Windows PCs, Linux, NetWare and even AS/400 (a limited subset of it, at least), at a time when Java’s “write once, run everywhere” was just an empty marketing promise. Remember, this was the time of Netscape Navigator and Java applets. Ramblings, I know, but perhaps useful to understand where things have gone wrong.

In 2007 I left my job in Italy and moved to Norway to work for Opera Software. Back then, Opera’s browser was still running the Presto engine, and a little department inside Opera was in charge of web services. That’s where I was headed. Most services there were written in Perl. Glorious times for me: I would learn an awful lot there and meet a lot of skilled developers. Soon after I started, though, some colleagues were already making fun of Perl: it’s a “write-only language”, “not meant for serious stuff”, “it lacks web frameworks”, etc… Those were the times when Python frameworks started to emerge, some of which would eventually disappear. I remember a few colleagues strongly arguing to move to this Python framework called Pylons, and then eventually to Django.

I believe this general attitude towards Perl originated from different factors:

  • personal preference towards other languages and/or dislike towards Perl
  • the desire to be working with the latest “hip” framework or language
  • the discomfort of maintaining an aging codebase with problems

These factors are real, and they are understandable reasons to want to move away from any language or framework. I’m not saying they are justified, but I do understand why people wanted that. In our field, it’s quite common to try to avoid the objective difficulties of maintaining a legacy project by taking the greener path of an overly optimistic rewrite, which normally ends in tears.

Throughout the years, I noticed other contributing factors to the progressive abandonment of Perl, even in companies like Opera.
I’ll mention two that I experienced directly:

  1. Outdated or non-existent supporting libraries
  2. Team composition

There was a time, a few years ago, when CPAN was awesome: the best language support system in existence, one every other language community envied. CPAN pretty much sold Perl by itself. In my case, the libraries on CPAN educated me and made me adopt a testing culture that (to my knowledge) no other language had before Perl. Today, seeing npm modules being installed without running tests makes me uncomfortable :-)

Then over time (years) a shift happened. You would search on CPAN for a library that would help you with a common task and you wouldn’t find anything, or you would only find quick hacks that didn’t really work properly. In my case, I remember the first example of that being OAuth2. If I had to speculate, I would say this is a product of many elements, one of which is the average age of Perl programmers getting higher.

Another related shift I remember from those years is that companies publishing APIs and SDKs started dismissing Perl: at first relying on some CPAN module to eventually appear, then omitting Perl support completely. In the beginning, we politely complained to those companies, trying to make a point, but unfortunately there was no turning back. These days almost no SDK comes with a Perl component.

The second major aspect I have experienced is related to teams. In 2012 I was tasked with writing my first ever greenfield project, entirely from scratch, a project that would turn out to be one of the things I’m most proud of, Opera Discover, an online news recommendation system for the Opera browser, still working today! A team of three veteran engineers (myself included) was assembled, and there and then, we were faced with a decision: what language should we use for this?

While I was most experienced in Perl and knew Python a little, the other two colleagues didn’t know Perl. They had experience mostly in C++, as this was Opera after all. We were chosen not based on our programming language expertise, but rather (I suppose) on our capability to tackle such a big and complex project. While I could have proposed that the project be written in Perl, in good conscience I knew that choice was not viable. Django was readily available and could provide a wide range of functionality we actually needed. No alternative in the Perl world came close to such a good value proposition. The fact that Python was (like Perl had been for me!) a very accessible choice, simple to pick up, easily installed on any Linux system, and with plenty of solid, up-to-date libraries, made the choice obvious.

With the Discover project, I started learning Python properly as a day-to-day programming language. I remember initially being horrified by (and making fun of) the httplib2/httplib3 situation. Then I learned about the requests module and forgot all about it. This is to say, Python has its quirks too, of course. The disastrous Python 2 vs Python 3 decision in the Python community caused a lot of grief and uncertainty for people (Perl could have learned something from that…). Nowadays that’s a non-argument: everything runs on Python 3, and if you still haven’t moved, you will soon.

In general, having learned Python quite well, my mindset with regard to programming and my job changed completely. I’m not a Perl programmer. I’m not a Python programmer either. I can use different tools whenever they are better suited to what I need to do. In fact, in the last four years I have written software in NodeJS and Java of all things… I used to despise and make fun of Java, but I had never worked with it on a professional project before. While I do maintain that Java has some horrible aspects, contrary to my expectations I have enjoyed working with it: it has an efficient runtime, awesome threading, solid libraries and good debugging/inspection tools.

While I do understand Ovid’s point about wanting to keep the business going, and I enjoy Perl as a language, I personally moved on many years ago. I still use Perl for the occasional script when it’s convenient, but for other use cases, like web APIs, I prefer Python and FastAPI, PyTorch for machine learning, and so on. My conclusion is that it’s the libraries and the ecosystem that drive language use, not the language itself.

A better OO system will unfortunately do nothing for Perl (in my opinion at least). Better marketing will without a doubt do nothing for Perl. As if a prettier website could change the situation and the aspects I talked about… it can’t! The situation we have in front of us in 2021 is the result of technological and social changes that started at least a decade ago.

I realize this may be an incoherent post. Sorry about that; I had to write it down right away, or it would probably never have come out.
If you have questions or comments, let me know and I’ll try to address them if I can.

Most importantly, I do not wish to convince anyone that what I wrote is true. It is simply my experience. If there’s one thing I wish people would take from it, it’s to move away from thinking of yourself as an “X programmer” and to broaden your horizons and the set of tools available to you. It was a tremendously positive move for me, one I wish I had made sooner.

Peace.

pgtop – a top clone for PostgreSQL

According to meta::cpan records, the first release of pgtop is dated April 26, 2005, which makes this little software more than 15 years old!

Back then I had just found out about the brilliant mytop by Jeremy Zawodny, and since my day-to-day work was on Postgres, IIRC version 6.5.3, I decided to try and “convert” mytop to Postgres.

Being quite naive, I thought the endeavour would be much easier than it really was. I’m glad I started though, which is why pgtop exists in the first place. It’s not the only one either. I seem to remember a few similar pgtop projects by other programmers.

After using MySQL and Percona Server for many years, a new job brought me back to Postgres, versions 9.5 and 10 at this time. In recent months I have done some work to improve the performance of our database queries, and I remembered writing and using pgtop years before.

Since I lost(*) the original sources, I tried the pgtop version I last uploaded to CPAN, 0.05, dated 2008. It did work, in the sense that I could run the same Perl code unmodified, a great testament to Perl as a language and as a runtime. And it didn’t work, because the underlying Postgres meta tables that were used in version 6 have changed their schema in the 10-12 years since :-)

I spent some time adapting the metadata queries to work with recent Postgres versions, and was slightly amused by the quality of my 15-year-old code… The best part of reviving this little tool was rediscovering how useful a few dozen lines of code can be. The service provider’s monitoring helps, but doesn’t even come close to the level of detail pgtop can provide.

After getting pgtop to work again, I quickly added a few more useful features. I was pleased by the efficiency with which I could work on this tool, considering its age.

So far I added just what was strictly necessary to me:

  • Updated pgtop to the current decade. Now requires perl >= 5.014
  • Fixed to work with Postgres >= 9.0
  • Added a sample Dockerfile to build and run pgtop as a Docker container
  • Added a --config option, to load arbitrary config files. This is useful if you want to monitor several databases at once, for example in a tmux session. The config file supports all the options that are available on the command line.
  • Implemented a query killer command, activated by pressing K, to kill at once all queries slower than a given threshold, in seconds. This is useful if the database is overwhelmed by a lot of slow queries. I don’t recommend using it lightly, particularly if it involves killing UPDATE or INSERT queries, but it can be quite useful.
  • Added a --slow_threshold option, to consider queries slow if they have been running for longer than the given value (in seconds). Now the tool highlights slow queries in bold yellow, and logs all the slow queries to a pgtop.log file.
  • Added a --slack_webhook option, to automatically notify a slack channel if a query crosses the slow threshold runtime value. All the information about the slow query including the SQL will be included in the slack message.
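To give an idea of how these options combine, here is a hypothetical invocation (the config file path, threshold and webhook URL below are made up, and any connection options are omitted):

# Monitor a database from a tmux pane, highlight queries running
# longer than 5 seconds and notify a Slack channel about them
pgtop --config ~/.pgtop/production.conf \
      --slow_threshold 5 \
      --slack_webhook https://hooks.slack.com/services/XXX/YYY/ZZZ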

Please let me know if you give it a try! :-)

Five tips to be a more effective command line user

In the movies, heroes manipulate complex graphical environments using only their keyboard; no mouse in sight. Descending from the movie realm to reality, the command line, not the GUI, is where heroes save the day.

This article is intended for those who are already command line (CLI) users. Complete beginners are of course encouraged to read on, even though they may not grasp all the advantages immediately, and there are probably other, more important things to learn when starting out. On the other hand, I expect long-time CLI users to already work in a similar way; I do hope they might still find an interesting trick or two to adopt.

The motivation

First of all, why be more effective? Not everyone wants to be, and that is fine. Looking back on everything I tried over the years, I'd like to illustrate the tips that I believe brought me the most "bang for the buck": the most value for the smallest effort, and the ones most easily applicable to anyone else.

Assumptions

I'm going to assume you are using bash on Linux; a recent MacOSX install shouldn't be too different. Windows has bash too these days, so hopefully these suggestions will be widely applicable.

I'm also going to assume that you, reader, already know how to comfortably move around the CLI (CTRL + A, CTRL + E, CTRL + W, …), recall past commands (!<nnn>, !!, CTRL + R) or arguments (ALT + .). There is enough material to write other posts about this. Let me know and I'll be happy to write it!
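For reference, since a few of these will come up again below, here is what those default bash/readline shortcuts do:

# CTRL + A / CTRL + E   move to the beginning / end of the line
# CTRL + W              delete the word before the cursor
# CTRL + R              reverse-search the shell history
# !!                    repeat the last command
# !<nnn>                run entry number <nnn> from the history
# ALT + .               insert the last argument of the previous command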

1. Shell History

Here we go, first recommendation: use your shell history capabilities.

If you are not already doing that, you can search through your shell history — all the commands you have typed — with CTRL + R. However, the default configuration for bash only keeps up to a certain number of commands.

Computers and hard drives being what they are in 2020, there's no reason why you shouldn't extend your shell history to record all the commands you have ever typed, from the very beginning of your system's history. When I set up a new computer, I normally copy over all my $HOME files, so my command history extends, time-wise, well beyond the system I am writing this on.

My shell command history starts in October 2015, when I first learned this trick. Here's how to do it:

# /etc/profile.d/extended_history.sh

# Show the timestamp for each entry of the history file
export HISTTIMEFORMAT="%Y-%m-%dT%H:%M:%S "

# Ensure the history file size and number of entries are large
# enough to record years upon years of history
export HISTFILESIZE=500000000
export HISTSIZE=50000000

At least on Debian and derivative systems, dropping a file into /etc/profile.d/ makes it part of the system-wide profile settings, so that is a handy way of applying those settings to all users.

As a result, the history command will work as before, but the numeric index of each command will not reset every time you open a new shell, or every time the history file gets over a certain size, either in number of entries or in file size.

Here's what the history command output looks like with those settings:

23  2015-10-06T19:51:30 git diff
24  2015-10-06T19:51:33 git add locale/en/LC_MESSAGES/django.pot
25  2015-10-06T19:51:49 git status -uno
26  2015-10-06T19:51:51 git commit -a
27  2015-10-06T19:52:11 git push
28  2015-10-06T20:11:35 make test-recommender_translations
29  2015-10-07T18:53:33 vim ~/notes/recsys/impressions-tracking.txt

At the moment, my shell history file (~/.bash_history) is almost 7 MB, corresponding to a little less than five years' worth of commands. There is really no risk of running out of disk space, so keep those commands around.

Keeping a full history has obvious advantages:

  • If you don't remember how you did something or specific options to a command, you can always use history | grep xyz (or CTRL + R) to find out, and all the commands from months (or years!) back will be there. Obviously this does not apply retroactively :-)
  • If you remember only when you did something but not what it was, it's also easy to grep for specific dates and times.
  • You can easily analyze your shell usage patterns, for example finding the top 50 shell commands you have ever used:
$ history \
    | awk '{ print substr($0, length($1 $2) + 3) }' \
    | sort | uniq -c \
    | sort -rn \
    | head -50

# on one line:
$ history | awk '{ print substr($0, length($1 $2) + 3) }' | sort | uniq -c | sort -rn | head -50

In order, those lines do the following:

  1. history: take all history entries
  2. awk ...: strip the numeric index and the timestamp, leaving only the command itself and its arguments
  3. sort | uniq -c: count the occurrences of each distinct command line
  4. sort -rn: sort the counts numerically in reverse order, most frequent first
  5. head -50: take the first 50 commands

If you are confused by all these commands, don't worry too much about them. It's just a way to count the most typed commands in your history.
As a curiosity, here are some of my top commands:

13071  ls -l
 7422  git diff
 6338  git status
 3469  cd ..
 2219  git push
 1816  git pull
 1499  git commit -a
 1367  git log
  940  git commit
  851  gpr
  687  gcs
  400  srdm platf
  348  vimp
  333  l1
  314  srdm merl
  306  dcu
  302  mp;rl-f
  206  gce
  196  realias
  169  gcm
  153  mptr;rl-f
  152  gc-
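Speaking of searching by date: since every entry is timestamped, grepping the history output for a day or even a specific hour works nicely. For example, reusing the dates from the sample output above:

# What was I doing around 7 PM on October 6th, 2015?
history | grep '2015-10-06T19:'

# Or the whole day
history | grep '2015-10-06'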

2. Fast Directory Changes

One of the most frequent operations on the command line is moving among directories, with the cd built-in command.

Especially if you've worked for a long time on many projects, or if you work with Java, you tend to have a lot of directory levels nested quite deeply. The cd command then becomes tedious to type. Using tab to invoke your shell's autocomplete comes in handy, but not having to type at all easily beats that.

This is a trick I learned from Damian Conway's Productive Programmer course. He was in Oslo a few years ago, and with the help of my company we arranged for him to hold the course internally.

The idea is to use a bespoke shell (or Perl, Python, Node, …) script to quickly navigate to any directory. An example: currently I am working on a project called merlin, whose parent directory is ~/src/work. Every time I want to do something in this project, I have to type:

cd ~/src/work/merlin

Within that project, there are a bunch of directories, so you could end up writing something like:

cd ~/src/work/merlin/gameserver/prototype/java/src/

The idea is to construct a program that can do the "typing" for you, so you'd use the following command instead:

cd2 src w merl g p j s

I called it cd2, but of course you can call it whatever you like. This program should:

  • take as input a list of string arguments
  • try to expand them to the closest directory name entry
  • if a directory is found, navigate to it
  • take the next argument and repeat this cycle

When this is done, your shell will be left in the target directory of your choice, without any long typing or misfired autocomplete tabs.

I chose to implement my script in bash for simplicity and call it ~/bin/search-directory.sh. The code is almost trivial and here it is in its entirety:

#!/bin/bash
#
# Search through the home directory space based on the
# list of string arguments given as input.
# From an idea by Damian Conway.
#

# Start from my home directory
SRC_DIR=~

# Take all arguments given as input
ARGS="$*"

# For each argument, try to expand it into the nearest directory name
# at the current level
for dir in $ARGS ; do
    sub=$(find -L "$SRC_DIR/" -mindepth 1 -maxdepth 1 -type d -name "$dir*" \
        | sort \
        | egrep -v '\.egg-info' \
        | head -1)
    if [ ! -z "$sub" ]; then
        # We found a subdir, search will proceed from there
        SRC_DIR=$sub
    else
        # Stop: we didn't find any matching entry.
        exit 1
    fi
done

echo "$SRC_DIR"

exit 0

One could clearly do better than this by employing more sophisticated logic. Initially I thought I'd need something better, but this simple script has served me well for the past few years, and I don't want to complicate it unnecessarily.

There is one more obstacle to clear though. The script will print the final directory match and exit, without affecting the parent shell's current directory.

How do we make the active shell actually change directory? I added a tiny function to my ~/.bashrc file:

# `srd` stands for 'source directory'
srd () {
    match=$(~/bin/search-directory.sh src $*)
    if [ ! -z "$match" ]; then
        echo "→  $match"
        cd "$match"
    fi
}

I made the function always supply the src directory by default, so I don't have to type that either. With these bits set up, you can then move to the example directory above with the command:

srd w merl g p j s

And this is just the beginning :-)
Read on for how to combine this technique with the power of aliases and shorten the command even more.

3. Aliases

Shell aliases are a simple way to define or redefine commands.
The typical example would be to shorten common options for your commands. If you know you always type ls -la, you might want to teach that to your shell. The way to do that is:

$ alias ls='ls -la'

From then on, every time you type ls, your shell will automatically expand the command to ls -la.

Based on what I have seen during my career, shell aliases are something relatively few people use. Right now, my shell configuration contains almost 500 lines of aliases, of which around 200 I keep active and probably 30-50 I normally use.

I wasn't always such a heavy alias user. I became one when I had the fantastic experience of working with the Fastmail team in Australia for a few months. I learned how they did infrastructure and development work, and from the first day I saw they were using a ton of shell commands that were completely obscure to me.

I was quite good at operations/sysadmin work, but after seeing how that team worked, the bar was forever raised, and it sank in that I still had a lot to learn. I still do :-)

I use aliases for many things, but mainly to avoid having to remember a lot of unnecessary details. Here are a few from my list:

Alias → expanded command, and what/why:

  • less → less -RS. Shortening and options expansion. -RS is to show ANSI color escapes correctly and avoid line wrapping.
  • gd → git diff. Shortening.
  • gc- → git checkout -. Switch to the previous git branch you were on.
  • vmi → vim. A saver for when I type too quickly.
  • cdb → cd .. ("cd back").
  • cdb5 → cd ../../../../../. To quickly back out of nested directories.
  • kill-with-fire → killall -9. For those docker processes…
  • f. → find . -type f -name. Find file names under the current directory tree.
  • x1 → xargs -I{} -L1. Simplifies using xargs, invoking commands for each line of input, f.ex.
  • awk<n> → awk '{ print $<n> }'. For when you need to extract field number <n> from a text file or similar. Ex.: awk5 < file extracts the 5th field from the file.
  • vde1 → ssh varnish-de-1.domain.com. Host-based alias. I don't want to have to remember hostnames, so I add aliases instead, with simple mnemonic rules, such as vde1 -> varnish node 1 in the German cluster.
  • jq. → jq -C . For when you want to inspect JSON payloads, f.ex. curl https://some.api | jq.
  • dcd → docker-compose down. Is anybody really typing docker-compose?
  • dcp → docker-compose pull.
  • dcu → docker-compose up.
  • dkwf → docker-kill-with-fire. Shorthand for docker stop + docker rm, or whatever sequence of commands you need to stop a container. See? I don't have to remember :-)
  • db → docker-bash. db postgres instead of docker exec -it container-id bash.
  • dl → docker-logs. Same for docker logs -f ...

Some aliases I added thinking they'd be useful, and then rarely used. Others have become a staple of my daily CLI life. Whether a new alias catches on often depends only on the first few days: if you make a mindful effort to use it, there's a good chance it will stick (if it's actually good).

To make aliases persistent, instead of typing the alias command in your shell, you can add it to your ~/.bashrc file, just like any other command. You can also create a ~/.aliases file and keep all your aliases there. If you do that, you then need to include the aliases file in your bash configuration, by adding (only once) this line to your ~/.bashrc:

# ~/.bashrc
...
source ~/.aliases
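For illustration, a minimal ~/.aliases could look like this, reusing a few entries from the list above:

# ~/.aliases
alias gd='git diff'
alias gc-='git checkout -'
alias cdb='cd ..'
alias f.='find . -type f -name'
alias jq.='jq -C .'
alias dcu='docker-compose up'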

Every time you feel the need to add a new alias, you can simply edit the ~/.aliases file and reload it into your active shell (source ~/.aliases). When you get tired of that, you can use another trick from Conway's course, and add the last alias you will ever need:

alias realias="${EDITOR:-vim} ~/.aliases; source ~/.aliases"

Typing realias will bring up the aliases file in your editor, and when you save it and exit, all the new aliases will be immediately available in the current shell.

Once you start down this path, your creativity won't stop finding new ways to work smarter and faster.

4. Directory Autorun

This is one of the most recent additions to my arsenal. I found myself typing the same commands over and over whenever I entered specific directories.

The idea is simply to have a sequence of commands automatically executed for me whenever I enter a directory. This is extremely useful on many occasions, for example if you want to select a specific Python virtualenv, a Node.js version or an AWS profile whenever you enter a specific project directory.

I chose to do this by dropping an .autorun file in the target directory. Here's a tiny .autorun I have in a Javascript-based project:

#!/bin/bash
REQUIRED="v11.4.0"
CURRENT=$(nvm current)

if [ "$CURRENT" != "$REQUIRED" ]; then
    nvm use $REQUIRED
fi

In this case I want the shell to automatically activate the correct node.js version I need for this project whenever I enter the directory. If the current version, obtained through nvm current, is already the one I need, nothing is done.

It's quite handy, and I immediately got used to it. I can't do without it now. Another example, to select the correct AWS credentials profile and Python virtualenv:

#!/bin/bash

if [ -z "$AWS_PROFILE" -o "$AWS_PROFILE" != "production" ] ; then
    export AWS_PROFILE=production
    echo "— AWS_PROFILE set to $AWS_PROFILE"
fi

if [ -z "$VIRTUAL_ENV" ] ; then
    source .venv/bin/activate
    echo "— Activated local virtualenv"
fi

The glue to make this work is a couple of lines added to your ~/.bashrc file:

# Support for `.autorun` commands when entering a directory
PROMPT_COMMAND+=$'\n'"[ -s .autorun ] && source ./.autorun"

If you are concerned that other users could use your machine, or in general if you like to keep things tidy, make sure you set appropriate permissions on these .autorun files. A chmod 0600 .autorun could be in order.
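Going one step further, a slightly more defensive variant of the glue above could refuse to source files that are not owned by you. This is just a sketch (the autorun_hook name is made up), assuming bash:

# ~/.bashrc: only source .autorun files owned by the current user
autorun_hook () {
    # -s: file exists and is not empty; -O: file is owned by the current user
    if [ -s .autorun ] && [ -O .autorun ]; then
        source ./.autorun
    fi
}
PROMPT_COMMAND+=$'\n'"autorun_hook"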

Remember to run source ~/.bashrc if you make changes to that file, or they won't be reflected in your active shell session.

5. SSH Configuration

SSH is one of the most powerful tools in your arsenal. It can be used to tunnel, encrypt and compress data for connections to arbitrary protocols. I'm not going to cover that functionality here; there are good tutorials about it out there already.

A smart ssh configuration can help you be more effective on the command line. I'd like to show three specific examples that I use every day:

  1. Persistent ssh connections
  2. Hostname aliases
  3. Automatic ssh key selection

Persistent ssh connections

If you connect to remote hosts often, I'm sure you have noticed the amount of time it takes to establish a new ssh connection. The higher the latency, the longer it takes. It is normal for that initial handshake — where a lot of things happen — to take 2 to 5 seconds.

Performing many small operations via ssh can waste a notable amount of time. One solution to this problem is the transparent use of persistent ssh connections.

The connection is established the first time you ssh (or scp) to a host, and the next time you perform a similar operation towards the same host, port and user, the same TCP connection will be reused. This implies that the connection remains open after the ssh command has completed.

The ssh configuration directives that enable this behaviour are the following:

# Normally this is in ~/.ssh/config
ControlMaster auto
ControlPath /var/tmp/ssh_mux_%h_%p_%r
ControlPersist 1h

ControlMaster auto enables this behaviour automatically, without you having to specify whether you want to use a shared connection (one already opened before) or not. In particular cases, you may want to pass -o ControlMaster=no on the command line to prevent ssh from using an already open connection. Generally this is not desired though, so ControlMaster auto will normally do what you want.

ControlPath is the filename template that will be used to create the socket files, where:

  • %h is the hostname
  • %p is the port number
  • %r is the username used to connect

ControlPersist determines how long the shared connection will stay open in the background, waiting for new clients, after the initial session has closed. In my case, I set it to 1h (one hour) and that works well for me.
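If you ever want to check whether a shared connection to a given host is active, or tear it down manually, ssh's -O control commands talk to the master process directly (the hostname below is just an example):

# Is there an active master connection to this host?
ssh -O check my-username@some-host.example.com

# Ask the master connection to shut down
ssh -O exit my-username@some-host.example.com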

In case you want to know more about ssh configuration, I recommend reading the related man page. On linux, that is available with:

man 5 ssh_config

Hostname aliases and key selection

I mentioned I want to get unnecessary details out of my memory as much as possible. The ssh configuration file has lots of useful directives. One of the most useful is per-host configuration blocks.

If you need to connect to a host quite often and its name is not particularly memorable, like an AWS or GCP hostname, you can add host-specific directives to your ~/.ssh/config file:

# ~/.ssh/config

...

Host aws-test
    Hostname 1.2.3.4
    User my-username

From then on, you can use the command ssh aws-test to connect to this host. You won't have to remember the IP address, or the username you need to use to connect to this host. This is particularly useful if you have dozens of hosts or even projects that use different usernames or host naming schemes.

When you work on different projects, it's good practice to employ distinct ssh key-pairs instead of a single one. When you start using ssh, you typically have a ~/.ssh/id_rsa (or ~/.ssh/id_dsa) private key, depending on the type of key, and an associated ~/.ssh/id_rsa.pub (or ~/.ssh/id_dsa.pub) public key.

I like to have several key-pairs and use them in different circumstances. For example, the key that is used to connect to a production environment is never the same used to connect to a staging or test environment. Same goes for completely different projects, or customers if you do any freelance work.

Continuing from the example above, you can tell ssh to use a specific private key when connecting to a host:

Host aws-test
   Hostname 1.2.3.4
   User my-username
   IdentityFile ~/.ssh/test_rsa

Host aws-prod
   Hostname 42.42.42.42
   User my-username
   IdentityFile ~/.ssh/prod_rsa

Host patterns work too:

Host *.amazonaws.*
   User my-aws-username
   IdentityFile ~/.ssh/aws_rsa

Host *.secretproject.com
   User root
   IdentityFile ~/.ssh/secret_rsa
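When several of these blocks could match, it is handy to verify which settings ssh will actually apply to a given destination. On reasonably recent OpenSSH versions, ssh -G prints the fully resolved configuration:

# Which hostname, user and key will be used for this destination?
ssh -G aws-test | grep -E '^(hostname|user|identityfile) '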

Final tip

The more I write, the more it feels like there is still more to write about the command line :-) I'll stop here for now, but please let me know if you'd like me to cover some more basic — or maybe more advanced? — use cases. There are a lot of useful tools that can make you more effective when using the command line.

My suggestion is to periodically gather information about how you use the command line, and spend some time reassessing which commands you use most frequently and whether there are new ways to perform the same actions, perhaps removing the need to type lots of commands entirely.

When I have to do boring, repetitive tasks, I can't help but look for ways to get myself out of them. Sometimes writing a program is the best way to automate those tasks away. It may take more time in the beginning, but at least I've transformed a potentially boring task into programming, of which luckily I never get bored :-)