I’m a software engineer at heart – I love writing code, deploying functionality, and seeing the impact it has. Even when I’m just shaving a few seconds off a repetitive task by removing an unnecessary button click – it’s all about small improvements, often. So it might be surprising to hear that most of the problems which affect my working life do not originate in the code.
For many enterprises, defining work is the hardest part of the delivery process. Get that wrong and progress can’t be tracked, changes can’t be tested, the team are confused about what’s getting done, and there’s a lot of wasted effort pursuing the wrong outcomes. In this post, I hope to outline how I like to see things done. This approach is a nexus of ideas and behaviours from all the high performing teams I’ve been fortunate enough to be a part of.
Requirements
Oh yes! They’re still a thing!
Although a LOT of software gets built “just because”, it’s pretty likely that behind the work you’re doing, there are some fundamental requirements. You might not have a list of them, know where they came from, or know which are most important, but they’re there. Also, if you don’t have a list, or sources, or a priority – why not?
These days most teams at least try to be agile, working from stories; but are those stories being driven by requirements? You have to start with requirements – functional and non-functional (yes, you remember this from college – no different). In an agile world, we might not demand to know every requirement in detail, up front, but if you’re spending money without being able to point at the reason, then you’re on a slippery slope. Fundamentally, we shouldn’t be building anything which isn’t needed. It’s fine to be vague when there are uncertainties, but the process of asking the questions (yep – going to the business and asking someone) and compiling the list, will drive thought and conversation which otherwise won’t happen.
You’ll want to get these requirements into a list; it really doesn’t have to be a spreadsheet, but that works well and can even be used collaboratively these days. Whatever you’re using, you’ll want to give each requirement a unique ID so it can be referred to, categories such as source and type, and a priority. Guess what – after a lot of discussion, you’re going to want to tackle the higher priority requirements first.
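As a rough sketch, the list described above might look something like this in Python. The field names, the ID format, the example requirements, and the convention that a lower number means a higher priority are all assumptions made for illustration, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    req_id: str    # unique ID so the requirement can be referred to elsewhere
    summary: str
    source: str    # who asked for it (hypothetical example values below)
    kind: str      # "functional" or "non-functional"
    priority: int  # assumed convention: 1 is the most important


def by_priority(requirements):
    """Return the requirements with the highest priority first."""
    return sorted(requirements, key=lambda r: (r.priority, r.req_id))


def check_unique_ids(requirements):
    """Raise ValueError if two requirements share an ID."""
    seen = set()
    for r in requirements:
        if r.req_id in seen:
            raise ValueError(f"duplicate requirement ID: {r.req_id}")
        seen.add(r.req_id)


# Made-up entries, purely to show the shape of the list.
register = [
    Requirement("REQ-003", "Export quotes as PDF", "Sales", "functional", 3),
    Requirement("REQ-001", "Users log in via OpenID", "Security", "functional", 1),
    Requirement("REQ-002", "Pages load in under 2 seconds", "Operations", "non-functional", 2),
]

check_unique_ids(register)
for r in by_priority(register):
    print(r.priority, r.req_id, r.summary)
```

A spreadsheet with the same columns does exactly the same job; the point is only that every row has an ID, a source, a type, and a priority you can sort on.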
This is where you’re going to hit the first obstacle in fulfilling the CEO’s biggest expectations: the design. You may have architects on the team, or you might have a tech lead, or a technical BA, or maybe it’s a group work thing, but you need to get some idea of how these requirements are going to be fulfilled, before anyone can start generating stories.
High level technical design
Steering the development effort to produce the right architecture is a tricky process. The right design will be driven by the business – new startups will likely want very fast, very simple first versions of software, targeting a monolithic design. More mature businesses will likely be moving beyond monolithic design and will have new driving principles which will help determine the best approach. Whatever the right design might be, there are often conflicting opinions within a team as to what’s best, which drives discussion and whiteboard sessions to hammer out different possibilities.
This should be an exciting process. Whenever there’s real discussion and collaboration, the end result cannot be fully known up front. It’s like a big game of chess where all the players are on the same side, picking the best combination of tactics to force a perfect game. Eventually, even if everything isn’t fully agreed, there will be at least the basic framework: you’ll know what subdomains you’re working with, how you’re going to break them down into services, what types of events you expect to be flowing, what storage mechanism to use, and so on. Enough to get started on the first few work items.
There’s an idea called emergent design which gets bandied about as either a bit of a joke or the holy grail. I like to think of emergent design as ‘just in time design’, because it’s really the practice of putting off design decisions until the last possible moment (sometimes while code is being written). Depending on the seniority of your developers, you may feel you need to make more decisions up front, but deciding as late as you can is always the best approach. Most of the time, there are overarching patterns which are already in place, such as existing DevOps code which can build, test and deploy microservices. These will drive some earlier decisions for most cases, but you shouldn’t be afraid to break the mould where it makes sense.
At some point, after enough requirements have been gathered to start designing, and after enough design has happened to be able to start building something, you can start creating some work items or stories.
The act of delivering work can inform the next iteration of design in a number of ways. A spike story can be created which tasks someone with a deep dive of technical knowledge gathering, aimed at answering very specific questions. Particularly difficult logic flows could be developed using TDD, to allow the developer to play with different implementations without changing the overall result. It might not be until a feature has been built that it becomes apparent how big it really is – maybe it belongs in its own service, but no-one realises until someone has written some code. The constant learnings during development should be fed back into the design. Without that feedback the outcome generally becomes an unmanageable ball of spaghetti code with a high WTF/f ratio.
The measured what the fucks per function (WTF/f) is inversely proportional to the quality of the product subsequently built on top of the current code. The result of a high score is only observed in future work.
The perfect story
The perfect story doesn’t exist but if it did, it would be succinct, complete, and unambiguous.
A story is a representation of some work which needs to be done. Once that work is done, the story is archived, but still available for review in future.
A story is not a specification. Once it is being worked on (or once it has been pulled into a sprint, if you’re practising Scrum), a story does not change.
A story has 6 main elements:
- Number
- Title
- Description
- Acceptance Criteria
- Requirements
- Notes / Comments
The number is a unique id so the story can be referred to clearly in other systems.
The title is a short, succinct line of text which embodies the work to be done.
The description should explain what is to be built. It should be unambiguous and free from extraneous details. You might not want to copy all relevant detail from the design, but you should make sure that there is a process in place to prevent the design changing without revisiting the story. The amount of detail used in the description is determined by the team – this detail is generally added during a backlog grooming / elaboration session. Everyone who might need to understand the story should be present and engaged.
The acceptance criteria are the list of things which should be tested and working in order for the story to be considered done. Generating this list should involve the three amigos (someone representing the business, a developer, and a tester). The list often excludes items which are in the agreed definition of done, as these are more general and apply across the board.
The requirements are a link back to the reasons for this story to exist. It’s important for both prioritisation and scoping. This helps keep the story focused on the result and makes it easier for a Product Owner to prioritise (or deprioritise).
The notes and comments are there for team members to record activities, questions, decisions, etc. which come up during delivery. I’ve often found that keeping a log of activities in the story helps me remember what I was doing or planning to do (my memory is very good at forgetting over a weekend). If you’re ever taken ill and need to hand over a story to someone else, these are also handy for whoever has to pick up where you left off.
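To make the six elements concrete, here is a minimal sketch of a story as a Python data structure, with a small check that could be run before a story is pulled into an iteration. The class, the field names, and the readiness rules are my own illustrative assumptions; real tracking tools will model this differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Story:
    number: str                      # unique ID used to refer to the story
    title: str                       # one short line embodying the work
    description: str                 # what is to be built
    acceptance_criteria: List[str]   # what must be tested and working
    requirement_ids: List[str]       # links back to the requirements list
    notes: List[str] = field(default_factory=list)  # log kept during delivery


def readiness_problems(story):
    """Return the reasons a story is not yet ready to work on (assumed rules)."""
    problems = []
    if not story.title.strip():
        problems.append("missing title")
    if not story.description.strip():
        problems.append("missing description")
    if not story.acceptance_criteria:
        problems.append("no acceptance criteria")
    if not story.requirement_ids:
        problems.append("no linked requirement")
    return problems


# Hypothetical example story.
story = Story(
    number="STORY-101",
    title="Log in with OpenID",
    description="Protect the dashboard behind an OpenID login.",
    acceptance_criteria=["Anonymous users are redirected to the login page"],
    requirement_ids=["REQ-001"],
)
story.notes.append("Friday: redirect works locally, callback handling still to do")
print(readiness_problems(story))
```

The useful part is the last check: a story with no link back to a requirement is a prompt to ask why the work exists at all.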
This is my take on what makes a good story. I’ve sat in a room listening to Dan North explain that one word on a sticky note is enough, if it explains things. While I agree with him in principle, I think it’s pretty unlikely one word would be enough in most cases. I find it helpful to err on the side of usefulness.
Estimation
The idea of estimation is to give the team a benchmark of how much work can be done in any single iteration. Unfortunately, estimates are always used for planning and become quotes. This isn’t intentional, but the moment estimates are advertised outside of the team, someone is using them for planning – something they are not intended for. Estimates on a story are no use outside of the context of a single sprint and a single team – they include dimensions which are not time related, and they are not particularly stable while a team is transitioning (and if you’re still estimating, it’s likely you’re still transitioning). In a team who are new to Agile ways, there can be some value in the process of estimating, as it highlights where some people have a drastically different view of the work than others, and it highlights when a piece of work is too big to be tackled in a single story. But once the team gets better at grooming, they will cease to depend on these prompts. In general, if a story looks like it’s ready to go, it’ll have been defined well enough, and reduced enough, so it fits whatever has become the team ‘normal’.
So, if estimation isn’t of huge value for the team (and eventually becomes a waste of time) how is planning done? Planning requires time estimates – these should not come from the wider team. Instead, these come from a combination of the most technically senior developers, architects, and QA. They are based on the growing knowledge of how long the team takes to do certain things and may even be timeboxes rather than estimates.
We will work on this feature for 3 sprints, review where we are and pick the best set of priorities for another 2 sprints, then we’re done.
The important thing is to keep the wider team out of this process – the wider team commit to delivering each iteration, that’s as far as they should be focusing. Larger timelines are estimated by those in charge of delivery, and it’s their responsibility to make good estimates, and plan in a flexible manner.
A truth which gets overlooked far too often is that work flows through a system fastest when the system is only at 50% capacity. This is misinterpreted and shunned by most managers, because it sounds like a contradiction: if 100% capacity is 50 points of work per iteration, how can only assigning 25 points of work to an iteration make the team move faster? This paradox is due to confusion over what capacity is. 100% capacity is the amount of work a team can get through if there are no unresolved dependencies, no need for collaboration, zero complexity, no unexpected work, no bugs, no distractions, no last-minute changes, no sick days, no exhaustion, and each story description conveys absolute clarity of intent. If you think your team can function like this, you need to take a long hard look at the furrowed brows and tired faces you’re surrounded by. By only assigning half that amount of work, you leave room for the inevitable imperfections within teams of human beings. You leave room for unplanned work, uncertainty, people being unavailable, reworking bugs, and thoroughly proving quality, along with all the other unexpected turns any given iteration brings. You are feeding work through a system of people, not machines. This is the 50% – it’s busy, all the time, but not so busy that you can’t help each other out or take the time to do a good job.
Every team will have a different point of equilibrium, but by initially halving the amount of work, you will give room for that point to be hit. Then you can start working with the team to move the point in a positive direction while maintaining a healthy workplace and generating top class software.
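The arithmetic is simple enough to sketch. The first function below halves the ideal-world capacity to get the amount of work to plan; the second is a purely hypothetical rule of my own for nudging that percentage towards the team's equilibrium, one small step per iteration. The step size and the "did we finish what we planned" test are assumptions, not a recommendation.

```python
def planned_points(ideal_capacity, load_percent=50):
    """Points to plan for an iteration, given the ideal-world capacity."""
    if not 0 < load_percent <= 100:
        raise ValueError("load_percent must be between 1 and 100")
    return ideal_capacity * load_percent // 100


def next_load_percent(load_percent, planned, completed, step=5):
    """Hypothetical adjustment: raise the load a little after a comfortable
    iteration, lower it a little after one where planned work was left over."""
    if completed >= planned:
        return min(100, load_percent + step)
    return max(step, load_percent - step)


# The example from the text: an ideal capacity of 50 points per iteration.
print(planned_points(50))  # 25
```

In practice the decision to move the number should come out of a conversation with the team, not from a formula; the sketch only shows that the starting point is half, and that any movement from there is gradual.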
It’s still easy for a manager to baulk at the idea of halving the planned work. On a project plan, that looks crazy – instantly doubling the time to deliver. What experience shows is that the plan is already wrong. If you are trying to push more work through than can be comfortably completed, then you are already failing, and the short cuts people are taking will cost you time in future. Look at your plan and assume you have a missing 50% of work over the next couple of years – that will give you a more realistic idea of where you will be (at best) if you keep going like that. When you halve the amount of work being planned, you instantly free up the space for the team to self-organise and improve – without that space there is no improvement, no matter how much people want it. With improvement comes more efficient ways of working, better approaches, more ingenuity; you quickly find you’ll be making ground back on your original plans (and that won’t come with the hidden 50%) and you won’t be grinding the team into dust to get there.
If you have embraced the 50% concept and given your team the space to grow, then you can safely take advantage of the lean principle of building just enough. This is a dangerous idea if your team is under constant time pressure because it relies on there being room to rebuild, change direction, refactor, and grow the software to support the next piece of functionality. If your team doesn’t have the room to do this, then you just end up with poorly designed systems with insufficient resilience, growing in complexity with every iteration in an uncontrolled fashion, making additional changes harder and more expensive.
If your team is unable to commit to refactoring as they go, due to time pressures, then you are not agile, so trying to apply agile working patterns will ultimately cause your team’s failure – you will create huge stress, exhaustion, and begin to lose team members. If you can’t let your team be agile, don’t expect them to succeed without absolute certainty in what is being delivered. You will need significant up-front design, and excellent documentation of every aspect of the intended solution. This will leave you unable to reach the high performance of other teams in other enterprises, but this is the choice you make.
I’ll focus on the ‘good’ path – let’s talk about how to prioritise these stories based on lean principles.
What should we build next?
What you should be building next depends on a combination of: the highest priority requirements, technical dependencies, and what you are ready to deliver. Of course, the starting point is always going to be the highest priority requirements – if you can’t draw a clear line between a piece of work and a high priority requirement, then you are heading in the wrong direction; but that line isn’t always a straight line.
A system which contains sensitive data might have a high priority requirement for users to log in using an OpenID service. In order to fulfil that requirement, you’d need to build something to protect with a login, some kind of user interface, and pipelines to build, test, and deploy it. This gives us a great example of how technical dependencies can drive ahead of high priority requirements.
A system which requires complicated logic, perhaps to generate quotes or to plan required materials for a manufacturing job, could have a high priority of a rules engine. It’s possible that this core business concept is hard to define, with shifting ideas based on the involvement of different parts of the business. Until the ideas have become more stable, it might be that you are just not ready to deliver this, and instead should focus on the next priority.
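The two examples above can be combined into a small sketch of the selection logic: walk the requirements in priority order, skip anything that isn't ready, and build unfinished dependencies before the thing that depends on them. The item names, the priorities, and the `ready` flag are hypothetical; this is one way to express the idea, not a scheduling algorithm I'd claim is complete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class WorkItem:
    name: str
    priority: Optional[int] = None  # None for purely technical enablers
    depends_on: List[str] = field(default_factory=list)
    ready: bool = True              # False while the idea is still shifting


def next_to_build(items: Dict[str, WorkItem], done: Set[str]) -> Optional[str]:
    """Highest priority first, but dependencies before their dependants."""

    def first_buildable(name, visiting):
        item = items[name]
        # Finished, unstable, or circular items cannot be picked.
        if name in done or not item.ready or name in visiting:
            return None
        visiting = visiting | {name}
        for dep in item.depends_on:
            if dep not in done:
                # An unfinished dependency has to be built first.
                return first_buildable(dep, visiting)
        return name

    prioritised = sorted(
        (i for i in items.values() if i.priority is not None),
        key=lambda i: i.priority,
    )
    for item in prioritised:
        choice = first_buildable(item.name, frozenset())
        if choice is not None:
            return choice
    return None


# Hypothetical backlog based on the two examples in the text.
items = {
    "rules-engine": WorkItem("rules-engine", priority=1, ready=False),
    "openid-login": WorkItem("openid-login", priority=2, depends_on=["web-ui"]),
    "web-ui": WorkItem("web-ui", depends_on=["pipeline"]),
    "pipeline": WorkItem("pipeline"),
}
print(next_to_build(items, done=set()))
```

With nothing built yet, the rules engine is skipped because its definition is still moving, and the login requirement pulls the pipeline to the front of the queue even though nobody in the business asked for a pipeline.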
By allowing technical designers to focus on not just the highest priority requirement, but also what is logistically possible to do right now, you will create an ecosystem where the big-ticket items get easier to deliver over time. By maintaining a reasonable workload within the team, you guarantee a good flow of features getting released at a high standard – this engenders confidence within the business in your team’s ability to deliver. You become better positioned to dictate direction based on continuous success.
This approach also focuses you on the requirements.
It might surprise you to hear that the business doesn’t care how many tickets/story points/work items/lines of code/etc your team can deliver per iteration – they care about fulfilling requirements. It’s what the business have asked for, and every other artifact of delivery created after that first list is only there to help you deliver – no-one else cares.
Once you know what things you want to focus on, you can start creating stories with your team, in grooming sessions. You need to include your team because you can’t create a story which conveys everything everyone needs without those people being involved. You can’t write acceptance criteria without a QA explaining what will be tested (and what won’t be). You can’t decide whether a story is too big without the people who will be doing the work and know the current state of the codebase.
What information is required in a story will differ from person to person, but without including everyone in the process, you will exclude some. Those excluded are the people who will need to go over the whole thing again, often raising questions which should have been fielded in the grooming session, and which might have led to other work items being created. Take the time to define work well.
Develop, test, deploy
I’ve talked about agile software development in almost every article I’ve written. The concepts for short turnaround and high quality are always the same: make it easy to get work into production, make it difficult for defects to be released into production. A huge source of outages and bugs is the complexity of managing a big bang release of multiple changes. If you can hide changes behind a feature flag, so they aren’t even on an execution path, then you can push them right out to production – proving your deployment code and getting changes from one developer shared among everyone else as quickly as possible. In an experienced team, I’d expect each developer to cause between 2 and 10 deployments into production every day.
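For anyone who hasn't used one, a feature flag can be as small as the sketch below. Reading the flag from an environment variable named `FEATURE_<NAME>` is an assumed convention for this example, and the pricing functions are made up; many teams use a flag service or configuration store instead.

```python
import os


def flag_enabled(name, environ=None):
    """True if the named flag is switched on (assumed env var convention)."""
    environ = os.environ if environ is None else environ
    value = environ.get(f"FEATURE_{name.upper()}", "off")
    return value.lower() in {"on", "true", "1"}


def standard_total(net_pence):
    """Existing behaviour: add 20% tax, working in whole pence."""
    return net_pence + net_pence * 20 // 100


def total_with_bulk_discount(net_pence):
    """New behaviour under development: 10% off orders of 10000 pence or more."""
    if net_pence >= 10000:
        net_pence -= net_pence * 10 // 100
    return standard_total(net_pence)


def quote_total(net_pence, environ=None):
    # The new code ships to production but stays off the execution path
    # until the flag is switched on.
    if flag_enabled("bulk_discount", environ):
        return total_with_bulk_discount(net_pence)
    return standard_total(net_pence)


print(quote_total(10000, environ={}))
print(quote_total(10000, environ={"FEATURE_BULK_DISCOUNT": "on"}))
```

The half-finished discount logic can be merged and deployed many times a day without any user seeing it, and the flag check is deleted once the feature is permanently live.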
Of course, with a high frequency of deployments, testing can’t be manual. As changes are made, automated tests are written which prove the new code in environments where those features are enabled. Then in another environment, the set of features which are currently live in prod are verified with existing automated tests. If those all pass, then the code hits production (with zero down time). Any manual verification of new features can be carried out before a feature is enabled, or at any time after the functionality requiring test has hit the test environment.
The key concept here is that a developer doesn’t have to consider the order of changes across dependencies. If something works for them, it has hit the pipeline and will get to production. If there is a bug which prevents a dependency getting through, then that feature will fail – the pipeline will prevent anything broken getting into production.
If a defect does get into production, it’s easy and fast to fix. There is no need for a change request or emergency meeting – just push the fix with verification tests and let the pipeline get the change into production.
If you need to hear more about agile software delivery, please read through some of my other articles, or download the State of DevOps report.
Summarising
I believe the key to successfully delivering great software can be distilled down to balancing these 6 concepts:
- Start with requirements.
- Link work items to the requirements they help fulfil.
- Allow your team to be agile – give them the room to grow and help them do that.
- Allow learnings to feed back into the design by making decisions as late as possible.
- Keep the designing ‘just far enough’ ahead of the development work.
- Do agile software development – devops, automated testing, continuous delivery, etc.
If you are experiencing troubles, it’s likely one or more of these things needs attention.
If you get these 6 things right and work hard to maintain them, it’s hard to imagine anything else getting in the way.