I get so frustrated when I see perfectly talented DevOps engineers building pipelines which drive big bang thinking, and calling it CI/CD. Can you all please stop? Continuous integration and continuous deployment are two very special principles which drive high quality, prevent bugs from reaching production, and generally help things get delivered more quickly. Automation alone does not necessarily drive nor support CI/CD practices; CI and CD, although almost always implemented with automation, are principles which exist apart from it.
I haven’t yet written about CI/CD because there are a lot of great resources on the subject. If you are interested in how to do it right, you absolutely should go and read what Martin Fowler has to say about Continuous Integration, or what he has to say about Continuous Delivery (he’s a better writer than I am). While this post will undoubtedly cover some of what he has to say, it’s inspired by my own experiences.
CI vs CD
So what’s the difference between CI and CD?
Well, Continuous Integration is a practice where code from multiple developers is merged into the main branch and verified repeatedly, at intervals ranging from every few minutes to a few times a day; as often as possible, the more often the better.
Continuous Delivery is where we take small, incremental changes and deploy them right through all the non-production environments, carry out all appropriate testing, and get them released to production, one at a time; as often as possible, the more often the better.
The two practices are obviously strongly related, which is why we so often talk about CI/CD together as a unified practice. Because both deal with small changes, repeatedly, they seem like they should fit together well, and in practice they most certainly do.
Let’s first take a look at the difference Continuous Integration makes over more traditional approaches.
Many traditional approaches to integrating work from more than one developer are based on the idea that merges are error prone and a source of bugs. This isn’t completely wrong, but traditional approaches try to solve this problem by merging as seldom as possible; kicking the can down the road. The idea is to let a developer write a significant amount of code, uninterrupted, before eventually trying to merge it into whatever changes have come from other engineers. With this approach, merging truly is a painful source of errors. Until the merge, there might not be any bugs; after the merge, it’s likely there are at least a few. It’s also quite possible that whole pieces of functionality are lost during the merge, as the individual trying to stitch everything together has to make some difficult choices about whose code wins when there’s a conflict.
CI overcomes this problem in a different way. Rather than hiding from merges, they are embraced and happen as often as possible. Whenever there is a piece of code which is at least ‘not broken’, the source can be merged. This means merges are tiny, conflicts are few, and bugs are rare.
CI gives developers fast feedback if there are any problems: changes trigger builds on a build server, followed by automated tests – all of which generally run in a couple of minutes at most. After each push, it should be possible for a developer to wait for the build and test phases to complete before moving on. By waiting to be informed of any problems, a developer can fix them right away without having to context switch. CI is so successful because the build and test process fits into the developer’s cycle of writing code, writing test code, rebasing with concurrent changes from other developers, and pushing to source control.
Well implemented CI practices act as a safety net for developers. Well conceived stories can be implemented a bit at a time; put through several integrations and builds before the final piece of code is pushed. Every few minutes, the developer is informed that their code works – from their point of view, it’s welcome feedback.
CI, taken to its logical conclusion, leads us to a working practice known as trunk based development: a practice where developers commit directly to the master branch, integrating with every push.
“This is how people work when they first start writing code, before they learn about branching strategies, and how to do it ‘properly’.”
– An engineer working for a client, not so long ago
I remember being surprised by this statement when I heard it, and then quite pleased. Yes, trunk based development is how you start – it’s the simplest approach to source control. The added complexities of different branching strategies are there to fix problems which only exist once something slows the process of delivery down. Within the confines of a single team, working toward a common goal, work can (and should) be organised so trunk based development will work.
This approach can sound dangerous to those who haven’t experienced the results first hand, but the truth is that developers need to know right away if they’ve broken someone else’s code – the only way to do this is to put everyone’s code together and run the tests. Any delay in integrating code is a delay in finding problems, which makes fixing those problems harder.
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
– The Agile Manifesto
The Agile Manifesto places people over process. It’s sometimes easy to forget that developers are people. CI doesn’t just reduce the size of merges, it also encourages some fantastic and healthy behaviours, such as actually talking to each other.
It’s no secret that developers like to spend their time writing code. If I’m honest, I vastly prefer the familiarity and predictability of software systems over the messy, surprising flow of social interaction; but avoiding people is terribly unhealthy (especially in remote scenarios) and no way to foster good working relationships within a team. CI gives us a nudge to reach out to others and plan how to help each other. If we discover two or more devs are working in the same area and their changes simply can’t be applied simultaneously, the work must be carried out sequentially – delaying integration simply puts off the problem and makes things much more complicated when you finally come to deal with it! Instead, why not assist each other?
Code review is sometimes held up as proof that code shouldn’t be integrated right away; that someone other than the developer writing it is responsible for ensuring code quality before it is integrated with other people’s work. This opinion comes from a place of mistrust; a culture of passing the buck and refusing responsibility, driven by the practices which CI replaces. It’s also plain wrong. A developer may write beautiful code, or they might write something ugly as sin. Ask the person paying for the software delivery if they care enough about that one change to delay delivery. The benchmark for software hitting production is that it works. Well structured code does increase the quality of the overall product, but not immediately – there is plenty of opportunity to forward fix some dirty code, taking the time for the entire team to learn how to do better in future.
Any experience of growth within the team is of far greater value than immediately calling someone out for inelegant code.
So we have two alternative ideas: with CI and without. With CI, we talk to each other often, we minimize a major source of stress and bugs, we foster behaviour where we consider how we are impacting each other, and any problems are seen early and fixed before they impact anyone else. This sounds wonderful to me.
On the other hand, we have without CI. Changes are saved up and dropped on the rest of the team in big, difficult to manage chunks. Sometimes a full day can be lost across multiple developers while they try to integrate new work (I’ve seen a merge take more than a day, with several attempts to get it right). Gates are placed to prevent code reaching the master branch before being approved by other developers, showing a lack of trust and creating a bottleneck. Problems are dealt with in the world of code review comments, going backwards and forwards asynchronously, instead of just having a conversation. More bugs are found in higher environments, causing QA to become hypersensitive to risk. Roles are pitted against each other instead of working together. Fingers begin to be pointed. I’d rather pass.
CD is very much the continuation of CI. If you have embraced CI, then you already understand that smaller changes are less risky than larger changes, and any problems caused by small changes can be easily overcome. CD takes this to the next level and asserts that it’s safer to constantly deploy small changes right through into production than it is to queue changes up into one big release event.
Nirvana. Won’t work here.
So many of the arguments I hear against CD boil down to a reluctance to change, and I think that’s a cop out. It doesn’t matter how long you’ve been doing something, if there’s a way it could be done better, at least try!
The regulators won’t allow it.
The QA team won’t allow it.
It sounds too hard.
That’s just a pipe dream – real development work isn’t like that.
I’ve heard all of these and more. They always turn out to be wrong.
For a long time, I believed that regulated scenarios might be the one place where Continuous Delivery just couldn’t work, but I was proven wrong; it turns out many regulators are generally unwilling to tell businesses how to operate (it opens them up to liability when something goes wrong). Instead, they assert that there must be sufficient controls in place, and want proof. There is nothing in the practice of Continuous Delivery which runs contrary to this. There are most certainly controls in place. These controls are so well designed that the quality, security, and usability of a system are proven over and over. As for the question of knowing what is being deployed (watching for malicious code), it is far easier to watch what is being built, and apply reviews at the point at which code is introduced, rather than waiting until the end of the process and having to address the safety of an entire system.
For a practice which is meant to benefit everyone involved in the delivery of a software solution, CD often seems like such a hard sell. So what makes it worth pursuing?
Continuous Delivery relies heavily on the ability to repeatedly prove the software works as required. We generally automate as much testing as possible. The pipeline might run to production once or twice a day, or maybe every few minutes – whatever the heartbeat, it’s far quicker than waiting for specific project phases of ‘Test’, ‘UAT’, ‘Pre-Prod practice release’, and finally ‘release to production’. Don’t have many automation engineers? That’s fine – your developers should really be writing most of these tests, with help and input from QA and the business. The delivery of each user story should include tests which prove the functionality in each of the environments the application gets deployed into. All of these tests run for all subsequent stories, proving the lack of regression (this is why we aren’t too worried if unfinished code goes to production – it obviously hasn’t broken anything).
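As a rough sketch of what ‘the same tests in every environment’ might look like, here is a minimal Python example. The environment names, URLs, and the injected `fetch` client are all hypothetical – in a real pipeline the environment under test would come from configuration, and `fetch` would be a real HTTP client:

```python
# Sketch of one smoke test running unchanged in every environment the
# pipeline deploys to. Environment names and URLs are hypothetical.
ENVIRONMENTS = {
    "dev": "https://dev.example.com",
    "staging": "https://staging.example.com",
    "prod": "https://www.example.com",
}

def health_url(stage: str) -> str:
    """The health-check endpoint the smoke test should hit for a stage."""
    return ENVIRONMENTS[stage] + "/health"

def run_smoke_test(stage: str, fetch=None) -> bool:
    """Run the same check against any environment.

    `fetch` is injected so the pipeline can supply a real HTTP client
    while tests pass a stub; here a placeholder stands in for it.
    """
    fetch = fetch or (lambda url: "ok")  # placeholder client, always healthy
    return fetch(health_url(stage)) == "ok"
```

Because the test itself never changes between environments, a green run in ‘dev’ and a green run in ‘prod’ are proving exactly the same thing.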
The CI/CD pipeline becomes both the gateway to production and the primary enabler to moving quickly. Making a small change, getting it into the pipeline with appropriate tests in place, means that change will undoubtedly get to production far quicker than with any other approach.
A fun benefit of this is that when something does go wrong (because something always does – you won’t avoid problems completely) a fix can get pushed out just as quickly. There’s no such thing as an emergency fix; it’s just a fix which goes out quickly, like any other change.
Code which is not intended for release, such as part of a change which will take several stories to implement, can still go to production. Martin Fowler, again, has what I think is the best explanation of branching by abstraction. It’s far preferable to do this than delay the integration of the change. This strategy also provides a mechanism for toggling a change on and off without affecting the source code.
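As an illustration only (the flag name and the pricing functions are invented for this sketch), a feature toggle can be as simple as a conditional behind a stable abstraction:

```python
# Minimal feature-toggle sketch: unfinished code can ship to production
# but stays dormant until the flag is flipped. The flag name and the
# pricing functions are invented for this illustration.
FEATURE_FLAGS = {"new_pricing_engine": False}

def legacy_price(basket):
    return sum(item["price"] for item in basket)

def new_price(basket):
    # Work in progress – safe to merge and deploy because the flag is off.
    return sum(item["price"] for item in basket) * 0.9  # placeholder discount

def price(basket):
    """The stable abstraction both implementations sit behind."""
    if FEATURE_FLAGS["new_pricing_engine"]:
        return new_price(basket)
    return legacy_price(basket)
```

Callers only ever see `price`, so the half-finished implementation can be integrated and deployed continuously, then switched on (and off again) when it’s ready.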
Well crafted stories
If there’s one thing which I’ve found helps make CI/CD run smoothly, it’s well considered stories, where acceptance criteria can be easily converted into automated tests by developers. Focusing stories on things which are easily testable enables developers to work with clarity, moving much quicker than they otherwise would. I have had some resistance to stories which are deemed ‘too technical’, but if it works, it works. These focused stories can be grouped together into features which include more classical, user focused acceptance criteria, which can be addressed in a specific UAT effort.
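For example, an acceptance criterion like “baskets totalling over £50 ship free” (a hypothetical story, invented for this sketch) translates almost word for word into automated tests:

```python
# Hypothetical acceptance criterion, invented for this sketch:
# "Given a basket totalling over £50, shipping is free."
FREE_SHIPPING_THRESHOLD = 50.0
STANDARD_SHIPPING = 4.99

def shipping_cost(basket_total: float) -> float:
    """The production code under test."""
    return 0.0 if basket_total > FREE_SHIPPING_THRESHOLD else STANDARD_SHIPPING

def test_baskets_over_fifty_ship_free():
    assert shipping_cost(50.01) == 0.0

def test_baskets_at_or_under_fifty_pay_standard_rate():
    assert shipping_cost(50.0) == STANDARD_SHIPPING
```

When the criteria are written this precisely, the tests become the definition of done, and the pipeline can verify them in every environment without human intervention.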
At the end of the day, every team wants to move quickly. CI/CD makes that happen whilst raising the quality bar – no other approach achieves both. It’s no wonder the phrase CI/CD gets plastered around so much; it’s just a shame that it’s so often hung on something which simply isn’t CI/CD.