12 Oct 2023

From Agility to Stability

I think we could plot basically all software projects¹ on this graph:

A graph plotting "cost of experiment" against "uncertainty"

On the X axis, I'm measuring uncertainty. For now, think of this as "How uncertain am I that I'm building the right thing?"

On the Y axis, I'm measuring cost of experiment. For now: "How much time and money will it cost to get a little more certainty that I'm building the right thing?"

I also think that good software will tend to move around the graph over the course of its lifetime. But I'm getting ahead of myself…

Below, I will talk about two things:

Part 1 is about what the graph means, and where different software lives in it.
Part 2 is about how to use this graph intentionally to make your software better.

Part 1: Different software lives in different zones

Let's plot some example projects to make this a bit clearer.

Project 1: Crowdsourced House Cooling
It's hot, and I want ideas for keeping my house cool. I could write an app for this! Let's get people on the internet to share ideas for keeping the house cool!

There is a bunch of uncertainty associated with this app:

Do people care enough about this to download an app?
Should the advice be presented as location-specific?
Should I use GPS to learn where advice is coming from…?
Does anyone want to pay money for this?
What about advertising…?

Fortunately, the cost of experiment for an app like this is really low. I can do the whole thing with my PaaS-of-choice², and I can push updates instantly. This means I can try out potential new features quickly, roll them back if folks don't like them, run A/B tests, and so on.

This project is high uncertainty, low cost of experiment.

Project 2: Drone-Delivered Ice-Cream
I'm still too hot, and I want an ice cream. Delivered through my window, by drone. Now.

This project has at least as much uncertainty as the last one:

How much extra will people pay for ice-cream delivered by drones?
How do they specify which window the drone should fly in?
What if the window is closed…?
…

This project is clearly also high uncertainty. However, it's also high cost-of-experiment. If I want to try out a new feature with real customers, I have to muck about with hardware, logistics, melting ice-cream, and whatever local laws might have to say about flying drones into people's houses.

Project 3: Selling FIDO2 Keys
FIDO2 Keys are hardware devices for logging in to things. You can use them as a second factor, or as a password-replacement, depending on your needs.

If I want to build and sell these things, I'm not at 0 uncertainty (I still need to know about my addressable market, my cost centres, and so on), but compared to the examples above my uncertainty is pretty low. There is literally a specification that will tell me exactly what software to build. That doesn't mean it's easy, but it is pretty low uncertainty.

However, it's also very high cost of experiment. Once I've built a given hardware key, I can never update the software on it. If someone finds a way to update the software on my security key, something has probably gone very badly wrong…

Here's what these projects look like plotted on our graph:

Our three example software projects plotted on our graph. Crowdsourced house-cooling is bottom-right in "the Agile zone", high uncertainty and low cost of experiment. Drone-delivered ice-cream is top-right in "the World of Pain", high uncertainty and high cost of experiment. FIDO2 Keys are top-left in "the Stable zone", low uncertainty and high cost of experiment.

I've drawn zones around each project, because I believe that the way we build and maintain software should be different depending on which zone we're in:

The "Agile Zone" is the space with high uncertainty and low cost of experiment. It contains our crowdsourced house-cooling app.
The "Stable Zone" is the space with low uncertainty and high cost of experiment. It contains our FIDO2 keys.
The "World of Pain" is the space with high uncertainty and high cost of experiment. It contains our drone-delivered ice-cream.

Different zones need different practices

Practices and techniques that most people call "agile" tend to be well suited for the agile zone. Here I'm thinking of things like Lean, XP, and so on. Disclaimer: I used to work at Pivotal Labs, and I currently work on Cloud Foundry. It should be no surprise that I think Lean/XP practices are good for these sorts of things.

Practices and techniques that go by names like "formal methods", "specification-driven", or sometimes "waterfall" can be very well suited to building software in the stable zone.

This is because these different practices optimise for different things.

"Agile" methods help you live with uncertainty

They help you discover the problems you need to solve.

They help you figure out which software you actually want to write.

"Agile" methods help you live with the pain of high uncertainty by taking advantage of your low cost of experiment. Consider, for example, the following attitude to internal docs and code comments in agile software.

In the agile zone, we can't afford to write much down (comments, docs, etc) because the world will likely change under our feet tomorrow. Docs and comments that lie are worse than no docs at all, so we shouldn't write them in the first place! This is painful.

We can mitigate that pain by pairing on experiments, and rotating our pairs all through the project. The act of performing experiments helps us learn what we need to know, and get an intuitive feel for what's changing and what isn't. By rotating pairs throughout the project, we ensure that crucial knowledge is never in the head of just one person. Instead, important knowledge tends to swirl around the team as a whole.

When a new engineer joins an existing agile project, we often talk about the amount of time it takes them to "pick up context" on that project. This is the act of pairing with people who already "have context" until that context takes root in the new engineer's brain. On a healthy project, this doesn't take very long. This "context" is the stuff that we're not writing down, because it keeps changing. Our agile methods mean we don't need to write it down, and we can live quite happily with a high rate of change.

"Formal" methods help you live without experiments

They help you solve the problem that's in front of you.

They help you avoid distracting yourself with side-quests, or writing something other than what was asked for. The spec is the spec. We write software that meets the spec.

"Formal" methods help you live with the pain of high cost of experiment by taking advantage of your low uncertainty.

In the stable zone, we can't afford to ship a quick experiment to see if it works for us. Fortunately, we already have a very good idea of what we ought to build! Since we know what we want to build, we can use whatever high-assurance software development tools we like to ensure that the software we write does the things we said we wanted it to do.

For example, when writing software that controls passenger trains, the cost of a failed experiment would be measured in lives. In this sector there is a long history of using mathematical models and techniques to ensure that our software does precisely what we said it should do, without the need for experiments on real passengers.

What about the World of Pain?

I think the best thing to do about the World of Pain is to avoid it. Do whatever it takes to reduce either your uncertainty or your cost of experiment, and then proceed from the zone you moved to. There are various tricks you might use to get out of this kind of bind, but they're out of scope for this post.

Successful software moves between zones

The same graph, with an arrow sweeping from the bottom-right first (the Agile zone) leftwards, then upwards. From high uncertainty to low uncertainty, and then from low cost-of-experiment to high cost-of-experiment. It ends in the Stable zone in the top-left.

I think most (certainly lots of) interesting commercial software starts in the Agile Zone: high uncertainty and low cost of experiment. However, once the project starts, we're going to start taking advantage of the low cost of experiment to learn things. If we do this well, this will reduce the uncertainty. In our diagram, this means the software moves leftwards.

While we're running experiments and reducing uncertainty, we're also likely to be finding customers who can use our stuff. If our stuff is really good, then our customers will start to rely on it.

As our customers start to rely on our software, they're likely to get less keen on experimenting with radically new versions of our software. As a result, our cost of experiment will increase. We'll start moving upwards in our diagram.

Customers relying on us aren't the only thing that can increase our cost-of-experiment. For example, legislation could change in the space we're working in, and force us to do much more due diligence every time we interact with a customer. Or our parent company could mandate new compliance or security processes that increase the amount of time it takes us to ship an experimental release. Some of these experiment-costs might be things we can avoid or work around…

But I believe that once our software is successful, the most fundamental experiment-cost will always eventually show up. If we're doing a good job, customers will rely on us. If customers rely on us, they won't want us to mess around with their UX.

If we're lucky, we'll trace a path from the agile-zone towards the stable zone roughly as in the diagram above. We'll reduce our uncertainty first, and then as we get more confident we'll start getting more successful and our cost-of-experiment will start increasing. There will be an exciting point in the middle of this journey where we realise that this thing really has legs! This is The Growth Zone.

Some routes are better than others…

The same graph, but this time the arrow attempts to sweep upwards to high cost-of-experiment before sweeping leftwards to low uncertainty. The arrow never makes it, because it explodes when it hits the high uncertainty and high cost-of-experiment in the top-right corner. This is the World of Pain.

If we start an agile project in the Agile Zone and we're unlucky, our cost-of-experiment will start to go up before we've sufficiently reduced our uncertainty.

This takes us into the World of Pain, and this is something we really want to avoid.

Prior Art: The Product Development Triathlon and The 3X Engineer

If you think you've seen this business about agile -> growth -> stable before, there's a good chance you have.

In 2016 Kent Beck wrote about The Product Development Triathlon. There, he described the phases:

Explore
Expand
Extract

They map exactly onto the spaces I've named:

Agile
Growth
Stable

Kent Beck doesn't talk much about the space I called "The World of Pain". He is concerned with describing best practices for successful projects, and doesn't dwell on this particular failure mode. Personally, I like to keep The World of Pain in mind. I think I've seen a few projects slip into The World of Pain when they thought they were being good agile practitioners, and I think there's some value in naming the danger.

I think Beck's names "Explore, Expand, Extract" do the job Beck wants them to do. They focus on the thing you're doing in each space. I also like the names "Agile, Growth, Stable", because they describe what we ought to optimise our processes for in each space.

Here's the relationship between Beck's terminology and mine:

What?	How?	Why?
Explore	Optimise for Agility	Because the world keeps changing around us.
Expand	Optimise for Growth	Because we have customers now, and they're demanding bigger, more reliable workloads.
Extract	Optimise for Stability	Because everything has settled down. Now we're free to take advantage of the long tail.
	Escape the World of Pain	Because it hurts.

Part 2: Navigating the zones

I think the ideal software project would follow a smooth curve from the agile zone, through the growth zone, and into a long happy life in the stable zone. However, there are all sorts of paths that a project might take in the real-world, and not all of them will feel successful. I think being intentional about our movement is likely to feel more successful than not.

Let's start with figuring out where we are in the first place.

Which zone am I in right now?

We can estimate our position on this graph by estimating our position on each axis in turn.

How much uncertainty are we living with?

If I could wave a magic wand to create any software I can imagine, and I'm still not sure if I could make money… Then this is a high-uncertainty space.
If I have happy customers who like our software, this is a low uncertainty space.

What if I'm 100% sure that if I built exactly the thing that I'm imagining, then I would have happy customers? In that case, I believe I'm in a low uncertainty space. Whether my belief is true or not remains to be seen.

How much is our cost of experiment?

If I can have an idea, code it and ship it today, have a real customer try it and get some feedback within a couple of days… Then this is a low cost-of-experiment space.
If it takes a while to code up any change, and then we have lots of compliance, security, and legal obligations before we can ship… And then when we do ship, our customers don't want to upgrade… This is a high cost-of-experiment space.
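
If you like this sort of thing written down, here is a minimal sketch of the two estimates above as a tiny Python function. Everything in it is my own assumption for illustration: the 0-to-1 scores, the 0.5 cut-off, and the numbers I've guessed for the three example projects. The real zones have fuzzy edges, so treat the output as a conversation starter rather than a measurement.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A rough position on the graph. Both scores are hypothetical guesses from 0.0 to 1.0."""
    uncertainty: float         # 0 = happy paying customers; 1 = no idea if anyone wants this
    cost_of_experiment: float  # 0 = ship and get feedback in days; 1 = months of process per change


def zone(estimate: Estimate, threshold: float = 0.5) -> str:
    """Name the zone for an estimate. The 0.5 cut-off is arbitrary."""
    high_uncertainty = estimate.uncertainty >= threshold
    high_cost = estimate.cost_of_experiment >= threshold
    if high_uncertainty and high_cost:
        return "World of Pain"
    if high_uncertainty:
        return "Agile"
    if high_cost:
        return "Stable"
    return "Growth"


# Guessed scores for the three example projects from Part 1.
house_cooling = Estimate(uncertainty=0.9, cost_of_experiment=0.1)
drone_ice_cream = Estimate(uncertainty=0.9, cost_of_experiment=0.9)
fido2_keys = Estimate(uncertainty=0.2, cost_of_experiment=0.95)

for name, estimate in [
    ("house cooling", house_cooling),
    ("drone ice-cream", drone_ice_cream),
    ("FIDO2 keys", fido2_keys),
]:
    print(name, "->", zone(estimate))
```

The useful part is not the function, it's being forced to pick a number for each axis and argue about it with your team.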

Hopefully now you have some idea of which zone you're in. Let's look at how to move intentionally between zones.

How to deliberately move from Agile to Growth

The same graph with arrows pointing left, towards less uncertainty. We can use them to move from the Agile zone on the bottom-right to the Growth zone on the bottom-left.

To move leftwards on our graph, from the Agile zone to the Growth zone, we need to reduce our levels of uncertainty. We can do this by playing stories that answer questions.

This is the bread-and-butter of Lean/XP practice the way I learned it at Pivotal Labs. Every story we write is intended to perform an experiment or answer a question. We sort them (so far as is possible) in risk-order. So we validate our highest-risk assumptions first.
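
As a rough illustration of risk-ordering, here is a sketch using the house-cooling questions from Part 1. The `Story` type, the 1-to-5 risk scores, and the `in_risk_order` helper are all hypothetical; in practice the scores come out of a conversation, not a spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class Story:
    question: str  # the assumption this story is meant to test
    risk: int      # hypothetical 1-5 score: how badly we're hurt if the assumption is wrong


def in_risk_order(backlog: list[Story]) -> list[Story]:
    """Highest-risk assumptions first. sorted() is stable, so ties keep their backlog order."""
    return sorted(backlog, key=lambda story: story.risk, reverse=True)


backlog = [
    Story("Should the advice be presented as location-specific?", risk=2),
    Story("Do people care enough about this to download an app?", risk=5),
    Story("Does anyone want to pay money for this?", risk=4),
]

for story in in_risk_order(backlog):
    print(story.risk, story.question)
```

If the answer to the riskiest question turns out to be "no", we've found that out before spending anything on the stories below it.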

This is how you take a germ of a business idea and turn it into something that could make money. …or how you discover as early as possible that there's no money to make here. Hopefully before you've re-mortgaged your house.

How to deliberately move from Growth to Agile

(…or accidentally from Stable to Pain)

The same graph with arrows pointing right, towards more uncertainty. We can use them to move from the Growth zone in the bottom-left to the Agile zone in the bottom-right. But if we start in the Stable zone in the top-left, the same movement will take us to the World of Pain in the top-right.

To move rightwards on our graph, we need to increase our levels of uncertainty. We can do this by taking risks, discovering new customers, and opening new markets.

In a lot of textbook agile projects, a good chunk of this has already happened before any of the software developers have even shown up. This is where the bright idea happens. It might involve systematic market research, or a sudden intuitive insight. From the point of view of the devs, this work is often surfaced and discussed in a project inception meeting.

How to deliberately stay Agile

(… or accidentally move from Agile to Pain)

The same graph, with a circular arrow looping back from agile to agile, in the bottom-right.

If you want to stay in the Agile zone, you could write a balance of stories: some that answer questions, and some that explore new ideas and open new markets.

Every time you complete a story that answers a question, the world gets a little less uncertain, and you move a little bit leftwards. Every time you complete a story that opens a new market, you get a whole bunch of new questions to consider, and the world gets more uncertain again.

This might be a great idea if your team is dedicated to exploring the market, and not to making money. However, be wary of what happens if any customers take your stuff seriously! If you have customers who start relying on your software, then they might start to get annoyed if you keep on changing it under their feet. Your cost of experiment might go up, and if you haven't sufficiently reduced your uncertainty yet, you might drift into the World of Pain.

How to deliberately move from Growth to Stable

The same graph with arrows pointing up towards higher cost of experiment. We can use them to move from Growth in the bottom-left to Stable in the top-left.

Here's where we're going to want to write stories that improve scalability and reliability.

When you start to enter the growth zone, the business gets excited. Now you have Real Customers, and it looks like they might be about to start Really Relying on your software. You're starting to make money.

Congratulations! Now that you have real customers, a lot of your earlier uncertainties have been resolved. You know who is willing to pay for the software, because they are in fact paying for it.

Also: Beware! Now that you have real customers, you can't get away with changing things so quickly all the time. Or at least, you won't be able to for long… Not once your customers start building on top of your stuff. And certainly not once you have more than one or two of them all building on top of your stuff. Fortunately, you don't need to do so many experiments as you did in the Agile zone, because you already know what customers are willing to pay for.

To prepare for customers relying on your software, you'll want to improve reliability. Fix any bugs you've discovered. Maybe something surprising happens roughly 1 time out of every 100 installs. But that used to be OK because it almost never came up, and when it did, you could just ask the customer to re-install. Maybe now it's time to reduce that chance to 1 time in every 10,000 or fewer.
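
To see why a failure rate that used to be fine stops being fine, here is a back-of-the-envelope sketch. It assumes installs fail independently of each other, which is my simplification, and the install counts are made up.

```python
def expected_failures(installs: int, failure_rate: float) -> float:
    """Average number of installs that hit the surprising behaviour."""
    return installs * failure_rate


def chance_of_any_failure(installs: int, failure_rate: float) -> float:
    """Probability that at least one install fails, assuming installs are independent."""
    return 1 - (1 - failure_rate) ** installs


# Hypothetical install counts: a handful of early adopters, then real growth.
for installs in (10, 1_000, 100_000):
    for one_in in (100, 10_000):
        rate = 1 / one_in
        print(
            f"{installs:>7} installs at 1 in {one_in:,}: "
            f"expect {expected_failures(installs, rate):g} failures, "
            f"{chance_of_any_failure(installs, rate):.1%} chance of at least one"
        )
```

With ten installs, a 1-in-100 problem will probably never come up. With a thousand installs you should expect about ten of them, which is ten customers you're asking to re-install.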

It's worth noting at this point that production code isn't the only thing that improves reliability. Good docs also make your software more reliable. You probably shouldn't have been writing a lot of docs in the Agile zone, because things kept changing. Now you should be writing a lot of really good docs.

To prepare for customers using your software a lot, you'll want to improve scalability. How many API requests can you handle every minute? Can you increase that number by horizontally scaling³? What's the limit before horizontal scaling stops helping? In the Agile zone, worrying too much about this kind of thing was a distraction from running experiments and reducing risk. In the Growth zone, this kind of thing should be your main focus.
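
One simple way to think about where horizontal scaling stops helping: assume every copy of your app leans on one shared resource, such as a single database. This sketch models only that case, and the numbers are invented. Real systems have messier limits, so measure rather than trust a formula.

```python
def requests_per_minute(instances: int, per_instance: int, shared_limit: int) -> int:
    """Throughput when every instance depends on one shared resource (e.g. a database)."""
    return min(instances * per_instance, shared_limit)


def useful_instances(per_instance: int, shared_limit: int) -> int:
    """Smallest instance count that saturates the shared resource (ceiling division)."""
    return -(-shared_limit // per_instance)


PER_INSTANCE = 600    # hypothetical: requests per minute that one copy can serve
SHARED_LIMIT = 5_000  # hypothetical: requests per minute the shared database can absorb

for instances in (1, 4, 8, 9, 16):
    print(instances, "instances ->", requests_per_minute(instances, PER_INSTANCE, SHARED_LIMIT))

print("no gain beyond", useful_instances(PER_INSTANCE, SHARED_LIMIT), "instances")
```

Under these made-up numbers, the ninth instance is the last one that buys you anything. After that, the next scalability story is about the database, not about adding copies.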

How to accidentally move from Agile to Pain

Exactly the same graph, with arrows pointing up towards higher cost of experiment. If we follow them from Agile in the bottom-right, they lead us to Pain in the top-right.

If you start trying to optimise for growth too early, you will stifle your ability to run experiments. If you only have one customer, who cares if you can handle hundreds of thousands?

This is what people mean when they worry about "gold plating" a product. This is why people say "YAGNI" – "you ain't gonna need it". You might want to do this stuff later, but if you're in the agile zone, doing it right now is wasted effort. At best it slows you down while the world moves on without you. At worst, it leaves you just as uncertain as you were before, but less able to make changes quickly when you need them.

How to deliberately stay Stable

The same graph, with a circular arrow looping back from stable to stable, in the top left.

Here's where you get the most benefit from automating all the things.

If you started writing your software in the Agile zone⁴, then once you enter the Stable zone, you've got customers so reliant on you that it's very difficult to make any changes at all. But that's fine! You don't need to make changes, because you're already making money! This is where we get to settle in and enjoy the long tail.

Here are some things that might spoil your enjoyment of the long tail:

You keep getting paged by customers whose stuff broke.
People keep discovering that your software is affected by newly published CVEs, and asking you to fix it.
People keep asking you to prove that every library you use in every patch release has a software license that's compatible with your software license.
Your boss wants to move all your devs to an exciting new Agile zone project.

These kinds of things either didn't happen so much, or felt much easier in the Agile zone. We knew exactly how to fix the bug, because we were working on that code anyway. We didn't have many customers, so incoming bug reports were exciting, not overwhelming. We knew every library in the system, because we only included most of them last week. The boss was excited, and we were hiring…

These kinds of things can make winning feel like losing. It might be tempting to try to re-capture the old days of Agile development by looking for ways to pivot your project and become more relevant. Resist this temptation. If you're making money, you're already more relevant than 90% of Agile zone projects will ever be.

Instead:

Fix bugs.
Document workarounds.
Automate everything that can be automated.

If you do these things you'll find that you can keep releasing stable, secure software; you can meet the demands of your legal department; you can keep your customers happy; and you can do it with fewer engineers.

How to do these things is going to have to be the topic of a future post.

Footnotes:

¹ …although the exercise might be most useful for commercial software.

² Heroku, Cloud Foundry, Lambda, etc.

³ Running more copies in more containers or on more VMs.

⁴ If your software project started in the Stable zone, then things are different. But in that case you already know exactly what software you want to write.