Testing Times

Although I see developers writing more and more tests, their efforts are often ignored by QA and not taken into account by the test strategy. It’s common for developers to involve QA in their work, but this is not the full picture. To maximise the efficiency of test coverage, I think developer tests should be accounted for as part of the test approach.

The role of QA as the gateway of quality, rather than the implementors of quality, allows for good judgement to be used when deciding if a developer’s tests provide suitable coverage to count toward the traditional QA effort. This requires QA to include some development skills, so the role is capable of seeing both the benefits and the flaws in developer-written tests.

What follows is a breakdown of how I categorise different types of test, followed by a “bringing it all together” section where I hope to outline a few approaches for streamlining the amount of testing done.


Unit Tests

Unit testing comes in a few different flavours, and I’ve noticed some differences depending on language and platform. There are also different reasons for writing unit tests.

Post-development tests

These are the unit tests developers would write before TDD became popular. They do have a lot of value. They are aimed at ensuring business logic does what it’s supposed to at the time of development, and keeps on working at the time of redevelopment five years later.

TDD

I fall into the crowd who believe TDD is more about code design than it is about functional correctness. Taking a TDD approach will help a developer write SOLID code, and hopefully make it much easier to debug and read. Having said that, it’s a fantastic tool for writing complex business logic, because generally you will have had a very kind analyst work out exactly what the output of your complex business logic is expected to be. Tackling the development with a TDD approach will make translating that logic into code much easier, as it’s immediately obvious when you break something you wrote two minutes ago.
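As a sketch of what that looks like in practice – assuming an NUnit test project, and with the DeliveryChargeCalculator and its £50 free-delivery rule being purely hypothetical – the tests are written first, straight from the analyst’s expected outputs, and the implementation follows to turn them green:

```csharp
using NUnit.Framework;

// Hypothetical rule from the analyst: orders of £50 or more ship free,
// everything under that costs £4.99. These tests are written before the
// calculator exists, so they fail first.
[TestFixture]
public class DeliveryChargeCalculatorTests
{
    [Test]
    public void Orders_under_fifty_pounds_pay_the_standard_charge()
    {
        var calculator = new DeliveryChargeCalculator();
        Assert.That(calculator.ChargeFor(49.99m), Is.EqualTo(4.99m));
    }

    [Test]
    public void Orders_of_fifty_pounds_or_more_ship_free()
    {
        var calculator = new DeliveryChargeCalculator();
        Assert.That(calculator.ChargeFor(50.00m), Is.EqualTo(0m));
    }
}

// The implementation written afterwards to make the tests pass.
public class DeliveryChargeCalculator
{
    public decimal ChargeFor(decimal orderTotal) => orderTotal >= 50m ? 0m : 4.99m;
}
```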

BDD

I’m a big believer in BDD tests being written at the unit level. For me, these are tests which are named and namespaced in a way to indicate the behaviour being tested. They will often have most of the code in a single setup method and test the result of running the setup in individual test methods, named appropriately. These can be used in the process of TDD, to design the code, and they can also be excellent at making a connection between acceptance criteria and business logic. Because the context of the test is important, I find there’s about a 50/50 split of when I can usefully write unit tests in a BDD fashion vs working with a more general TDD approach. I’ve also found that tests like these can encourage the use of domain terminology from the story being worked on, as a result of the wording of the ACs.

Without a doubt, BDD style unit tests are much better at ensuring important behaviours remain unbroken than more traditional class level unit tests, because the way they’re named and grouped makes the purpose of the tests, and the specific behaviour under test, much clearer. I can tell very quickly if a described behaviour is important or not, but I can’t tell you whether the method GetResult() should always return the number 5. I would encourage this style of unit testing where business logic is being written.
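To make the naming and grouping concrete, here’s a rough sketch of the style I mean, again assuming NUnit; the namespace carries the context, a single setup method does all the work, and each test asserts one behaviour from the ACs (Basket, Item, Order, and OrderService are hypothetical names, not from any real codebase):

```csharp
using NUnit.Framework;

namespace Orders.Behaviours.WhenAnOrderIsPlacedWithAnExpiredDiscountCode
{
    [TestFixture]
    public class TheOrder
    {
        private Order _order;

        [OneTimeSetUp]
        public void BecauseOf()
        {
            // All the arrange/act code lives here; the tests below each
            // assert a single observable behaviour from the ACs.
            var basket = new Basket();
            basket.Add(new Item("book", 12.50m));
            _order = new OrderService().Place(basket, discountCode: "SUMMER-EXPIRED");
        }

        [Test]
        public void Is_still_accepted() => Assert.That(_order.Accepted, Is.True);

        [Test]
        public void Does_not_apply_the_discount() => Assert.That(_order.Discount, Is.EqualTo(0m));
    }

    // Minimal illustrative domain types so the sketch compiles in isolation.
    public class Item { public Item(string name, decimal price) { } }
    public class Basket { public void Add(Item item) { } }
    public class Order { public bool Accepted = true; public decimal Discount; }
    public class OrderService
    {
        public Order Place(Basket basket, string discountCode) => new Order();
    }
}
```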

However! BDD is not Gherkin. BDD tests place emphasis on the behaviour being tested rather than just on the correctness of the results. Don’t be tied to arbitrarily writing ‘given, when, then’ statements.

Hitting a database

The DotNet developer in me screams “No!” whenever I think about this, but the half of me which loves Ruby on Rails understands that because MySQL is deployed with a single ‘apt’ command and because I need to ensure my dynamically typed objects save and load correctly, hitting the db is a really good idea. My experience with RoR tells me that abstracting the database is way more work than just installing it (because ‘apt install mysql2’ is far quicker to write than any number of mock behaviours). In the DotNet world, you have strongly typed objects, so checking that an int can be written to an integer column in a database is a bit redundant.

Yes, absolutely, this is blatantly an integration test. When working with dynamic languages (especially on Linux) I think the dent in the concept is worth the return.

Scope of a unit test

This is important, because if we are going to usefully select tests to apply toward overall coverage, we need to know what is and isn’t being tested. There are different views on unit testing, on what does and doesn’t constitute a move from unit testing to integration testing, but I like to keep things simple. My view is that if a test crosses application processes, then it’s an integration test. If it remains in the application you are writing (or you’re hitting a db in RoR) and doesn’t require the launching of the application prior to test, then it’s a unit test. That means unit tests can cover a single method, a single class, a single assembly, or any combination of these. Pick whatever definition of ‘a unit’ works to help you write the test you need. Don’t be too constrained by terminology – if you need a group of tests to prove some functionality which involves half a dozen assemblies in your application, write them and call them unit tests. They’ll run after compile just fine. Who cares whether someone else’s idea of a ‘unit’ is hurt?

Who writes these?

I really hope that it’s obvious to you that developers are responsible for writing unit tests. In fact, developers aren’t only responsible for writing unit tests, they’re responsible for realising that they should be writing unit tests and then writing them. A developer NEVER has to ask permission to write a unit test, any more than they need permission to turn up to work. This is a part of their job – does a pilot need to ask before extending the landing gear?

How can QA rely on unit tests?

Firstly, let’s dispel the idea that QA might ‘rely’ on a unit test. A unit test lives in the developer’s domain and may change at any moment or be deleted. As is so often the case in software development, the important element is the people themselves. If a QA and a dev work together regularly and the QA knows there are unit tests, has even seen them and understands how the important business logic is being unit tested, then that QA has far greater confidence that the easy stuff is probably OK. Hopefully QA have access to the unit test report from each build, and the tests are named well enough to make some sense. In this scenario, it’s easier to be confident that the code has been written to a standard that is ready for a QA to start exposing it to real exploratory “what if” testing, rather than just checking it meets the acceptance criteria. Reasons for story rejection are far less likely to be simple logic problems.


Component Tests

I might upset some hardware people, but in my mind a component test is run against your code while it is running, but without crossing application boundaries downstream. So if you have written a DotNet Web API service, you would be testing the running endpoint of that service while intercepting downstream requests and stubbing the responses. I’ve found Mountebank to be an excellent tool for this, but I believe there are many more to choose from.
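As a hedged sketch of how that might look against a running service (assuming NUnit, mountebank already listening on its default port 2525, the service under test on localhost:5000, and a pricing dependency pointed at localhost:4545 – all illustrative choices, not requirements):

```csharp
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using NUnit.Framework;

[TestFixture]
public class ProductEndpointComponentTests
{
    private static readonly HttpClient Http = new HttpClient();

    [SetUp]
    public async Task StubTheDownstreamPricingService()
    {
        // Create a mountebank imposter on the port the application expects
        // its pricing dependency to be listening on.
        const string imposter = @"{
            ""port"": 4545,
            ""protocol"": ""http"",
            ""stubs"": [{
                ""predicates"": [{ ""equals"": { ""method"": ""GET"", ""path"": ""/prices/123"" } }],
                ""responses"": [{ ""is"": { ""statusCode"": 200, ""body"": { ""price"": 9.99 } } }]
            }]
        }";
        await Http.PostAsync("http://localhost:2525/imposters",
            new StringContent(imposter, Encoding.UTF8, "application/json"));
    }

    [Test]
    public async Task Returns_the_price_supplied_by_the_stubbed_dependency()
    {
        // Exercise the real, running endpoint of the service we built.
        var response = await Http.GetAsync("http://localhost:5000/api/products/123");

        Assert.That(response.IsSuccessStatusCode, Is.True);
        Assert.That(await response.Content.ReadAsStringAsync(), Does.Contain("9.99"));
    }

    [TearDown]
    public async Task RemoveAllImposters() =>
        await Http.DeleteAsync("http://localhost:2525/imposters");
}
```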

TDD and BDD

Component tests can be run on a developer’s machine, so it’s quite possible for these to be useful in a TDD/BDD fashion. The downside is that the application needs to be running in order to run the tests, so if they are to be executed from a build server, then the application needs to be started and the stubbing framework set up – this can be trickier to orchestrate. As with any automation, this is only really tricky the first time it’s done. After that, the code and patterns are in place.

In my experience, component tests have limited value. I’ve found that this level of testing is often swallowed by the combination of good unit testing and good integration testing. From one point of view, they are actually integration tests, as you are testing the integration between the application and the operating system.

Having said that, if the downstream systems are not available or reliable, then this approach allows the application functionality to be tested separately from any integrations.

Who writes these?

Developers write component tests. They may find that they are making changes to an older application which they don’t fully understand. Being able to sandbox the running app and stub all external dependencies can help in these situations.

How can QA rely on component tests?

This again comes down to the very human relationship between the QA and the developer. If there is a close working relationship, and the developer takes the time to show the QA the application under test and explain why they’ve tested it in that way, then it increases the confidence the QA has that the code is ready for them to really go to town on it. It might be that in a discussion prior to development, the QA had suggested that they would be comfier if this component test existed. The test could convey as much meaning as the ACs in the story.

Again, quality is ensured by having a good relationship between the people involved in building it.


Application Scoped Integration Tests

Integration tests prove that the application functions correctly when exposed to the other systems it will be talking to once in production. They rely on the application being installed and running, and they rely on other applications in ‘the stack’ also running. Strictly speaking, any test that crosses application boundaries is an integration test, but I want to focus on automated tests. We haven’t quite got to manual testing yet.

TDD and BDD

With an integration test, we are extending the feedback time a little too far for it to be your primary TDD strategy. You may find it useful to write an integration test to show a positive and a negative result from some business logic at the integration level before that logic is deployed, just so you can see the switch from failing to passing tests, but you probably shouldn’t be testing all boundary results at an integration level as a developer. It’s definitely possible to write behaviour-focussed integration tests. If you’re building an API and the acceptance criteria include a truth table, you pretty much have your behaviour tests already set out for you, but consider what you are testing at the integration level – if you have unit tests proving the logic, then you only need to test that the business logic is hooked in correctly.

The difficult part of this kind of testing often seems to be setting up the data in downstream systems for the automated tests. I find it difficult to understand why anyone would design a system where test data can’t be injected at will – this seems an obvious part of testable architecture; a non-functional requirement that keeps getting ignored for no good reason. If you have a service from which you need to retrieve a price list, that service should almost certainly be capable of saving price list data sent to it in an appropriate way; allowing you to inject your test data.
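A sketch of what that allows, assuming NUnit and purely illustrative URLs for a pricing service and the application under test – the test seeds the price list through the downstream service’s own API before exercising the application:

```csharp
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using NUnit.Framework;

[TestFixture]
public class QuoteIntegrationTests
{
    private static readonly HttpClient Http = new HttpClient();

    [OneTimeSetUp]
    public async Task InjectTestDataIntoThePricingService()
    {
        // The pricing service accepts price lists through the same API the
        // business uses, so the test simply saves the data it needs up front.
        var priceList = new StringContent(
            @"{ ""sku"": ""TEST-SKU-1"", ""price"": 19.99 }",
            Encoding.UTF8, "application/json");

        var seeded = await Http.PostAsync("https://pricing.test.example/price-lists", priceList);
        Assert.That(seeded.IsSuccessStatusCode, Is.True, "could not seed the price list");
    }

    [Test]
    public async Task Quotes_use_the_price_held_by_the_pricing_service()
    {
        var response = await Http.GetAsync("https://orders.test.example/quotes?sku=TEST-SKU-1");

        Assert.That(await response.Content.ReadAsStringAsync(), Does.Contain("19.99"));
    }
}
```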

Scope

The title of this section is ‘Application Scoped Integration Tests’ – my intention with that title is to draw a distinction between tests which are intended to exercise the entire stack and tests which are intended to exercise the application or service you are writing at the time. If you have 10 downstream dependencies in your architecture, these tests would hit real instances of those dependencies, but you are doing so to test the one thing you are building (even though you will generally catch errors from further down the stack as well).

Who writes these?

These tests are still very closely tied with the evolution of an application as it’s being built, so I advocate for developers to write these tests.

How can QA rely on application scoped integration tests?

Unit and component tests are written specifically to tell the developer that their code works; integration tests are higher up the test pyramid. This means they are more expensive and their purpose should be considered carefully. Although I would expect a developer to write these tests alongside the application they are building, I would expect significant input from a QA to help with what behaviours should and shouldn’t be tested at this level. So we again find that QA and dev working closely gives the best results.

Let’s consider a set of behaviours defined in a truth table which has different outcomes for different values of an enum retrieved from a downstream system. The application doesn’t have control over the values in the enum; it’s the downstream dependency that is returning them, so they could conceivably change without the application development team knowing it. At the unit test level, we can write a test to prove every outcome of the truth table. At the integration level, we don’t need to re-write those tests, but we do need to verify that the enum contains exactly the values we are expecting, and what happens if the enum can’t be retrieved at all.
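A hedged sketch of that integration-level check, assuming NUnit, System.Text.Json, and an illustrative CRM endpoint that lists its customer status values (the failure case would be covered by a separate test against the application itself):

```csharp
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;
using NUnit.Framework;

[TestFixture]
public class CustomerStatusIntegrationTests
{
    private static readonly HttpClient Http = new HttpClient();

    // The truth table itself is proven by unit tests; here we only verify the
    // assumption those unit tests rely on - that the downstream enum still
    // contains exactly the values we handle.
    [Test]
    public async Task The_downstream_status_enum_contains_exactly_the_values_we_handle()
    {
        var json = await Http.GetStringAsync("https://crm.test.example/customer-statuses");
        var statuses = JsonSerializer.Deserialize<string[]>(json);

        Assert.That(statuses, Is.EquivalentTo(new[] { "Active", "Lapsed", "Suspended" }));
    }
}
```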

Arriving at this approach through discussion between QA and Dev allows everyone to understand how the different tests at different levels complement each other to prove the overall functionality.


Consumer Contracts

These are probably my favourite type of test! Generally written for APIs and message processors, these tests can prevent regression issues caused by services getting updated beyond the expectation of consumers. When a developer is writing something which must consume a service (whether via synchronous or asynchronous means) they write tests which execute against the downstream service to prove it behaves in a way that the consumer can handle.

For example: if the consumer expects a 400 HTTP code back when it POSTs an address with no ‘line 1’ field, then the test will intentionally POST the invalid address and assert that the resulting HTTP code is 400. This gets tested because subsequent logic in the consumer relies on having received the 400; if the consumer didn’t care about the response code then this particular consumer contract wouldn’t include this test.
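A minimal sketch of such a contract test, assuming NUnit and an illustrative address service URL (tools such as Pact formalise this pattern, but a plain HTTP test makes the idea clear):

```csharp
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using NUnit.Framework;

// A consumer contract handed to the address service team and run in their CI.
// The consumer branches on receiving a 400 when 'line 1' is missing, so that
// expectation is pinned down here. URL and payload are illustrative.
[TestFixture]
public class AddressServiceConsumerContract
{
    private static readonly HttpClient Http = new HttpClient();

    [Test]
    public async Task Posting_an_address_without_line_1_returns_400()
    {
        var invalidAddress = new StringContent(
            @"{ ""line2"": ""Flat 3"", ""postcode"": ""LS1 1AA"" }",
            Encoding.UTF8, "application/json");

        var response = await Http.PostAsync("https://addresses.test.example/addresses", invalidAddress);

        Assert.That(response.StatusCode, Is.EqualTo(HttpStatusCode.BadRequest));
    }
}
```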

The clever thing about these tests is when they are run: the tests are given to the team who develop the consumed service and are run as part of their CI process. They may have similar tests from a dozen different consumers, some testing the same thing; the value is that it’s immediately obvious to the service developers who relies on what behaviour. If they break something then they know who will be impacted.

Scope

This is the subject of some disagreement. There is a school of thought which suggests nothing more than the shape of the request, the shape of the response, and how to call the service should be tested; beyond that should be a black box. Personally, I think that while this is probably correct most of the time, there should be some wiggle room, depending on how coupled the services are. If a service sends emails, then you might want to check for an ‘email sent’ event being raised after getting a successful HTTP response (even though the concept of raising the event belongs to the emailer service). The line is a fine one – testing too deep increases coupling, but all situations are different.

Who writes these?

These are written by developers and executed in CI.

How can QA rely on consumer contracts?

Consumer contracts are one of the most important classes of tests. If the intention is ever to achieve continuous delivery, these types of test will become your absolute proof that a change to a service hasn’t broken a consumer. Because they test services, not UI, they can be automated and all-encompassing, but they might not test at a level that QA would normally think about. To get a QA to understand these tests, you will probably have to show how they work and what they prove. They will execute well before any code gets to a ‘testing’ phase, so it’s important for QA to understand the resilience that the practice brings to distributed applications.

Yet again we are talking about good communication between dev and QA as being key to proving the tests are worth taking into consideration.


Stack Scoped Integration Tests

These are subtly different from application scoped integration tests.

There are probably respected technologists out there who will argue that these are the same class of test. I have seen them combined and I have seen them adopted individually (or more often not adopted at all) – I draw a distinction because of the different intentions behind writing them.

These tests are aimed at showing the interactions between the entire stack are correct from the point of view of the main entry point. For example, a microservice may call another service which in turn calls a database. A stack scoped test would interact with the first microservice in an integrated environment and confirm that relevant scenarios from right down the stack are handled correctly.

TDD and BDD

You would be forgiven for wondering how tests at such a high level can be included in TDD or BDD efforts; the feedback loop is pretty long. I find these kinds of tests are excellent for establishing behaviours around NFRs, which are known up front so failing tests can be put in place. These are also great at showing some happy path scenarios, while trying to avoid full-on boundary testing (simply because the detail of boundary values can be tested much more efficiently at a unit level). It might be worth looking at the concept of executable specifications and tools such as Fitnesse – these allow behaviours to be defined in a hierarchical wiki and linked directly to both integration and unit tests to prove they have been fulfilled. It’s an incredibly efficient way to produce documentation, automated tests, and functioning code at the same time.

Scope

Being scoped to the stack means that there is an implicit intention for these tests to be applied beyond the one service or application. We are expecting to prove integrations right down the stack. This also means that it might not be just a developer writing these. If we have a single suite of stack tests, then anyone making changes to anything in the stack could be writing tests in this suite. For new features, it would also be efficient if QA were writing some of these tests; this can help drive personal integration between dev and QA, and help the latter get exposure to what level of testing has already been applied before it gets anywhere near a classic testing phase.

These tests can be brittle and expensive if the approach is wrong. Testing boundary values of business logic at a stack level is inefficient. That isn’t to say that you can’t have an executable specification for your business logic, just that it possibly shouldn’t be set up as an integration test – perhaps the logic could be tested at multiple levels from the same suite.

How can QA rely on stack scoped integration tests?

These tests are quite often written by a QA cooperating closely with a BA, especially if you are using something like Fitnesse and writing executable specifications. A developer may write some plumbing to get the tests to execute against their code in the best way. Because there is so much involvement from QA, it shouldn’t be difficult for these tests to be trusted.

I think this type of testing applied correctly demonstrates the pinnacle of cooperation between BA, QA, and dev; it should always result in a great product.


Automated UI Tests

Many user interfaces are browser based, and as such need to be tested in various different browsers and different versions of each browser. Questions like “does the submit button work with good data on all browsers?” are inefficient to answer without automation.

Scope

This is a tricky class of test to scope correctly. Automated UI tests are often brittle and hard to change, so if you have too many it can start to feel like the tests are blocking changes. I tend to scope these to the primary sales pipelines and calls to action: “can you successfully carry out your business?” – anything more than this quickly becomes more pain than it’s worth. It’s far more efficient to look at how quickly a small mistake in a less important part of your site or application could be fixed.
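For example, a single critical journey might be covered with something like the following Selenium sketch (assuming NUnit and the Selenium WebDriver package; the URL and element ids are made up, and a real test would need explicit waits):

```csharp
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class CheckoutJourneyTests
{
    private IWebDriver _driver;

    [SetUp]
    public void StartBrowser() => _driver = new ChromeDriver();

    [TearDown]
    public void StopBrowser() => _driver.Quit();

    // One test per critical call to action, nothing more.
    [Test]
    public void A_customer_can_add_an_item_to_the_basket_and_reach_the_payment_page()
    {
        _driver.Navigate().GoToUrl("https://shop.test.example/products/123");
        _driver.FindElement(By.Id("add-to-basket")).Click();
        _driver.FindElement(By.Id("checkout")).Click();

        Assert.That(_driver.FindElement(By.Id("payment-form")).Displayed, Is.True);
    }
}
```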

This is an important problem to take into consideration when deciding where to place business logic. If you have a web application calling a web service, business logic can be tested behind the service far more easily than in the web application.

Who writes these?

I’ve usually seen QAs writing these, although I have written a handful myself in the past. They tend to get written much later in the development lifecycle than other types of test, as they rely on attributes of the actual user interface to work. This is the very characteristic which often makes them brittle: when the UI changes, they tend to break.

How can QA rely on automated UI tests?

Automated UI tests are probably the most brittle and most expensive tests to write, update, and run. They are one of the most expensive ways to find a bug, unless the bug is a regression issue detected by a pre-existing test (and your UI tests are running frequently). To rely on these tests, they need to be used carefully; just test the few critical journeys through your application which can be tested easily. Don’t test business logic this way, ever. These tests often sit solely in the QA domain and are written by a QA, so trusting them shouldn’t be a problem.


Exploratory Testing

This is one of the few types of testing which human beings are built for. It’s generally applied to user interfaces and is executed by a QA who tries different ways to break the application, either by acting ‘stupid’ or malicious. It simply isn’t practical yet to carry out this kind of testing in an automated fashion; it requires the imagination of an actual person. The intention is to catch problems which were not thought of before the application was built. These might be things which were missed, or they may be a result of confusing UX which couldn’t be foreseen without the end result in place.

Who does this?

This is (IMO) the ‘traditional’ QA effort.


User Acceptance Testing

Doesn’t UAT stand for “test everything all over again in a different environment”?

I’ve seen the concept of UAT brutalised by more enterprises than I can count. User acceptance testing is meant to be a last, mostly high level, check to make sure that what has been built is still usable when exposed to ‘normal people’ (aka. the end users).

Things that aren’t UAT:

  1. Running an entire UI automation suite all over again in a different environment.
  2. Running pretty much any automation tests (with the possible exception of some core flows).
  3. Blanket re-running of the tests passing in other environments in a further UAT environment.

If you are versioning your applications and tests in a sensible way, you should know what combinations of versions lead to passing tests before you start UAT. UAT should be exactly what it says on the tin: give it to some users. They’re likely to immediately try to do something no-one has thought about – that’s why we do UAT.

Any new work coming out of UAT will likely not be fixed in that release – don’t rely on UAT to find stuff. If you don’t think your pre-UAT test approach gives sufficient confidence then change your approach. If you feel that your integration environment is too volatile to give reliable test results for features, have another environment with more controls on deployments.

Who runs these tests?

It should really be end users, but often it’s just other QAs and BAs. I recommend these not being the same people who have been involved right through the build, although some exposure to the requirements will make life easier.

How can QA rely on User Acceptance Testing?

QA should not rely on UAT. By the time software is in a UAT phase, QA should already be happy for whatever has been built to hit production. That doesn’t mean outcomes from UAT are ignored, but the errors found during UAT should be more fundamental gaps in functionality which were either not considered or poorly conceived before a developer ever got involved, or (more often than not) yet more disagreement on the colour of the submit button.


Smoke Testing

Smoke testing originated in the hardware world, where a device would be powered up and if it didn’t ‘smoke’, it had passed the test. Smoke testing in the software world isn’t a huge effort. These are a few, lightweight tests which can confirm that your application deployed properly; slightly more in-depth than a simple healthcheck, but nowhere near as comprehensive as your UI tests.
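A sketch of what a smoke suite might contain, assuming NUnit and an illustrative base URL that the deployment platform would supply for the environment it has just deployed to:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using NUnit.Framework;

// Run by the deployment platform immediately after each deploy: slightly
// deeper than a healthcheck, nowhere near a full regression pack.
[TestFixture]
public class SmokeTests
{
    private static readonly HttpClient Http = new HttpClient();
    private const string BaseUrl = "https://orders.test.example"; // supplied per environment

    [Test]
    public async Task The_healthcheck_reports_healthy()
    {
        var response = await Http.GetAsync($"{BaseUrl}/health");
        Assert.That(response.IsSuccessStatusCode, Is.True);
    }

    [Test]
    public async Task The_service_can_read_from_its_database()
    {
        // A slightly deeper check: an endpoint that performs a trivial read.
        var response = await Http.GetAsync($"{BaseUrl}/api/products/123");
        Assert.That(response.IsSuccessStatusCode, Is.True);
    }
}
```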

Who runs these tests?

These should be automated and executed from your deployment platform after each deploy. They give early feedback of likely success or definite failure without having to wait for a full suite of integration tests to run.

How can QA rely on Smoke Testing?

QA don’t rely on smoke tests; these are really more for developers, to surface fundamental deployment issues early. QA benefit from that early feedback to the developer, because it means they don’t waste their time trying to test something which won’t even run.


Penetration Testing

Penetration testing is a specific type of exploratory test which requires some specialist knowledge. The intention is to gain access to servers and/or data maliciously via the application under test. There are a few automated tools for this, but they only cover some very basic things. This is again better suited to a human with an imagination.

Who runs these tests?

Generally a 3rd party who specialises in penetration testing is brought in to carry out these tests. That isn’t to say that security shouldn’t be considered until then, but keeping up with new attacks and vulnerabilities is a full time profession.

I haven’t yet seen anyone take learnings from penetration testing and turn them into a standard automation suite which can run automatically against new applications of a similar architecture (e.g. most web applications could be tested for the same set of vulnerabilities), but I believe this would be a sensible thing to do; better to avoid the known issues than to repeatedly fall foul of them and have to spend time rewriting.

How can QA rely on Penetration Testing?

Your QA team will generally not have the expert knowledge to source penetration testing anywhere other than from a 3rd party. The results often impact Ops as well as Dev, so QA are often not directly involved, as they are primarily focussed on the application, not how it sits in the wider enterprise architecture.


Bringing It All Together

There are so many different ways to test our software and yet I see a lot of enterprises completely ignoring half of them. Even when there is some knowledge of the different classes of test, it’s deemed too difficult to use more than a very limited number of strategies.

I’ve seen software teams write almost no unit tests or developer-written integration tests and then hand over stories to a QA team who write endless UI automation tests. Why is this a bad thing? I think people forget that the test pyramid is about where business value is maximised; it isn’t an over-simplification taught to newbies and university students – it reflects something real.

Here is my list of test types in the pyramid:

[Figure: my test pyramid]

Notice that I haven’t included Consumer Contracts in my list. This is because Consumer Contracts can be run at Unit, Integration, or Component levels, so they are cross-cutting in their own way.

In case you need reminding: the higher the test type is on the pyramid, the more expensive it is to fix a bug which is discovered there.

The higher levels of the pyramid are often over-inflated because the QA effort is focused on testing, and not on assuring quality. In an environment where a piece of work is ‘thrown over the fence’ to the QA team, there is little trust (or interest) in any efforts the developer might have already gone to in the name of quality. This leads to inefficiently testing endless combinations of request properties of APIs, or endless possibilities of paths a user could navigate through a web application.

If the software team can build an environment of trust and collaboration, it becomes easier for QA to work closer with developers and combine efforts for test coverage. Some business logic in an API being hit by a web application could be tested with 100% certainty at the unit level, leaving integration tests to prove the points of integration, and just a handful of UI tests to make sure the application handles any differing responses correctly.

This is only possible with trust and collaboration between QA and Developers.

Distrust and suspicion leads to QA ignoring the absolute proof of a suite of passing tests which fully define the business logic being written.

What does it mean?

Software development is a team effort. Developers need to know how their code will be tested, QA need to know what testing the developer will do, even Architects need to pay attention to the testability of what they design; and if something isn’t working, people need to talk to each other and fix the problem.

Managers of software teams all too often focus on getting everyone to do their bit as well as possible, overlooking the importance of collaborative skills; missing the most important aspect of software delivery.

My First Release Weekend

At the time of writing this post, I am 41 years old, I’ve been in the business of writing software for over 20 years, and I have never ever experienced a release weekend. Until now.

It’s now nearly 1 pm. I’ve been here since 7 am. There are a dozen or so different applications which are being deployed today, which are highly coupled and maddeningly unresilient. For my part, I was deploying a web application and some config to a security platform. We again hit a myriad of issues which hadn’t been seen in prior environments and spent a lot of time scratching our heads. The automated deployment pipeline I built for the change takes roughly a minute to deploy everything, and yet it took us almost 3 hours to get to the point where someone could log in.

The release was immediately labelled a ‘success’ and everyone started singing its praises, even as subsequent deployments of other applications started to fail.

This is not success!

Success is when the release takes the 60 seconds for the pipeline to run and it’s all working! Success isn’t having to intervene to diagnose issues in an environment no-one’s allowed access to until the release weekend! Success is knowing the release is good because the deploy status is green!

But when I look at the processes being followed, I know that this pain is going to happen. As do others, who appear to expect it and accept it, with hearty comments of ‘this is real world development’ and ‘this is just how we roll here’.

So much effort and failure thrown at releasing a fraction of the functionality which could have been out there if quality was the barrier to release, not red tape.

And yet I know I’m surrounded here by some very intelligent people, who know there are better ways to work. I can’t help wondering where and why progress is being blocked.

Scale or Fail

I’ve heard a lot of people say something like “but we don’t need huge scalability” when pushed for a reason why their architecture is straight out of the 90s. “We’re not big enough for DevOps” is another regular excuse. But while it’s certainly true that many enterprises don’t need to worry so much about high loads and high availability, there are some other, very real benefits to embracing early 21st century architecture principles.

Scalable architecture is simple architecture

Keep it simple, stupid! It’s harder to do than it might seem. What initially appears to be the easy solution can quickly turn into a big, unmanageable ball of tightly coupled dependencies, where one bad line of code can affect a dozen different applications.

In order to scale easily, a system should be simple. When scaling, you could end up with dozens or even hundreds of instances, so any complexity is multiplied. Complexity is also a recipe for waste. If you scale a complex application, the chances are you’re scaling bits which simply don’t need to scale. Systems should be designed so hot functions can be scaled independently of those which are under-utilised.

Simple architecture takes thought and consideration. It’s decoupled for good reason – small things are easier to keep ‘easy’ than big things. An array of small things, all built with the same basic rules and standards, can be easily managed if a little effort is put into working out an approach which works for you. Once you have a few small things all being managed in the same way, growing to lots of small things is easy, if it’s needed.

Simple architecture is also resilient, because simple things tend not to break. And even if you aren’t bothered about a few outages, it’s better to only have the outages you plan for.

Scalable architecture is decoupled

If you need to make changes in anything more than a reverse proxy in order to scale one service, then your architecture is coupled and shows signs of inelasticity. Other than being scalable, decoupled architecture is much easier to maintain, and keeps a much higher level of quality because it’s easier to test.

Decoupled architecture is scoped to a specific few modules which can be deployed together repeatedly as a single stack with relative ease, once automated. Outages are easy to fix, as it’s just a case of hitting the redeploy button.

Your end users will find that your decoupled architecture is much nicer to use as well. Rather than making dozens of calls to load and save data in a myriad of different applications and databases, a decoupled application would make only one or two calls to load or save the data to a dedicated store, then raise events for other systems to handle. It’s called eventual consistency and it isn’t difficult to make work. In fact it’s almost impossible to avoid in an enterprise system, so embracing the principle wholeheartedly makes the required thought processes easier to adopt.
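As a sketch of that save path (the IOrderStore and IEventPublisher abstractions are illustrative; the publisher could sit over RabbitMQ, SNS, Kafka, or anything similar):

```csharp
using System;
using System.Threading.Tasks;

public record Order(Guid Id);
public record OrderPlaced(Guid OrderId);

public interface IOrderStore { Task Save(Order order); }
public interface IEventPublisher { Task Publish<T>(T @event); }

public class OrderPlacementService
{
    private readonly IOrderStore _store;
    private readonly IEventPublisher _events;

    public OrderPlacementService(IOrderStore store, IEventPublisher events)
    {
        _store = store;
        _events = events;
    }

    public async Task<Guid> PlaceOrder(Order order)
    {
        // One call to this service's own dedicated store...
        await _store.Save(order);

        // ...then a single event for everyone else to handle in their own
        // time, instead of a dozen synchronous calls into other systems.
        await _events.Publish(new OrderPlaced(order.Id));

        return order.Id;
    }
}
```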

Scalable architecture is easier to test

If you are deploying a small, well understood stack with very well known behaviours and endpoints, then it’s going to be a no-brainer to get some decent automated tests deployed. These can be triggered from a deployment platform with every deploy. As the data store is part of the stack and you’re following micro-architecture rules, the only records in the stack come from something in the stack. So setting up test data is simply a case of calling the APIs you’re testing, which in turn tests those APIs. You don’t have to test beyond the interface, as it shouldn’t matter (functionally) how the data is stored, only that the stack functions correctly.

Scalable architecture moves quicker to market

Given small, easily managed, scalable stacks of software, adding a new feature is a doddle. Automated tests reduce the manual test overhead. Some features can get into production in a single day, even when they require changes across several systems.

Scalable architecture leads to higher quality software

Given that in a scaling situation you would want to know your new instances are going to function, you need to attain a high standard of quality in what’s built. Fortunately, as it’s easier to test, quicker to deploy, and easier to understand, higher quality is something you get. Writing test-first code becomes second nature, even writing integration tests up front.

Scalable architecture reduces staff turnover

It really does! If you’re building software with the same practices which have been causing headaches and failures for the last several decades, then people aren’t going to want to work for you for very long. Your best people will eventually get frustrated and go elsewhere. You could find yourself in a position where you finally realise you have to change things, but everyone with the knowledge and skills to make the change has left.

Fringe benefits

I guess what I’m trying to point out is that I haven’t ever heard a good reason for not building something which can easily scale. Building for scale helps focus solutions on good architectural practices: decoupled, simple, easily testable micro-architectures. Are there any enterprises where these benefits are seen as undesirable? Yet, when faced with the decision of either continuing to build the same tightly coupled monoliths which require full weekends (or more!) just to deploy, or building something small, lightweight, easily deployed, easily maintained, and ultimately scalable, there are plenty of people claiming “Only in an ideal world!” or “We aren’t that big!”.

Bonkers!

Avoiding Delivery Hell

Some enterprises have grown their technical infrastructure to the point where DevOps and continuous deployment are second nature. The vast majority of enterprises are still on their journey, or don’t even realise there is a journey for them to take. Businesses aren’t generally built around great software development practices – many businesses are set up without much thought to how technical work even gets done, as this work is seen as only a supporting function, not able to directly increase profitability. This view of technical functions works fine for some time, but eventually stresses begin to form.

Failing at software delivery.

Each area of a young business can be easily supported by one or two primary pieces of software. They’re probably off the shelf solutions which get customised by the teams who use them. They probably aren’t highly integrated; information flows from department to department in spreadsheets. You can think of the integration points between systems as being manual processes. 

While the flow of work is funnelled through a manual process such as sales staff on phones or shop staff, this structure is sufficient. The moment the bottleneck of sales staff is removed (in other words, once an online presence is built where customers can be serviced automatically) things need to run a bit quicker. Customers online expect to get instant feedback and delivery estimates. They expect to be able to complete their business in one visit and only expect to receive a follow up communication when there is a problem. Person to person interaction can be very flexible; a sales person can explain why someone has to wait in a way which sounds perfectly fine to a customer. The self-service interaction on a website is less flexible – a customer either gets what they want there and then, or they go somewhere else.

And so businesses start to prioritise new features by how many sales they will bring in, either by reducing the number of customers jumping out of the sales flow or by drawing additional customers into the sales flow.

Problems arise as these features require more and more integration with each of the off the shelf solutions in place throughout the various areas of the business. Tight coupling starts to cause unexpected (often unexplained) outages. Building and deploying new features becomes harder. Testing takes longer – running a full regression becomes so difficult and covers so many different systems that if there isn’t a full day dedicated to it, it won’t happen. The online presence gets new features, slowly, but different instabilities make it difficult for customers to use. The improvements added are off-set by bugs.

Developers find it increasingly difficult to build quality software. The business puts pressure on delivery teams to build new features in less time. The few movements among the more senior developers to try to improve the business’ ability to deliver are short lived, because championing purely technical changes to the business is a much more complicated undertaking than getting approval from technical peers. It’s not long before enough attempts to make things better have been shot down or simply ignored that developers realise there are other companies who are more willing to listen. The business loses a number of very talented technologists over a few months. They take with them the critical knowledge of how things hang together which was propping up the development team. So the rate of new features to market plummets, as the number of unknown bugs deployed goes through the roof.

It’s usually around this point that the management team starts talking about off-shoring the development effort. The same individuals also become prime targets for sales people peddling their own off the shelf, monolithic, honey trap of a system – promising stability, fast feature development, low prices, and high scalability. The business invests in some such platform without properly understanding why they got into the mess they are in. Not even able to define the issues they need to overcome, never mind able to determine if the platform delivers on them.

By the time the new system is plumbed in and feature parity has been achieved with existing infrastructure, the online presence is a long way behind the competition. Customers are leaving faster than ever, and a decision is made to patch things up for the short term so the business can be sold.

Not an inevitability.

Ok, so yes this is a highly pessimistic, cynical, and gloomy prognosis. But it’s one that I’ve seen more than a few times, and I’m guessing that if you’re still reading then it had a familiar ring for you too. I’m troubled by how easy it is for a business to fall into this downwards spiral. Preventing this is not just about being aware that it’s happening, it’s about knowing what to do about it.

Successfully scaling development effort requires expertise in different areas: architecture, development, operations, whatever it is the business does to make money, people management, leadership, change management, and others I’m sure. It’s a mix of technical skills and soft skills which can be hard to come by. To make things even harder, ‘good’ doesn’t look the same everywhere. Different teams need different stimuli in order to mature, and there are so many different tools and approaches which are largely approved of that one shoe likely won’t fit all. What’s needed is strong, experienced, and inspirational leadership, with buy in from the highest level. That and a development team of individuals who want to improve.

Such leaders will have experience of growing other development teams. They will probably have an online presence where they make their opinions known. They will be early adopters of technologies which they know make sense. They will understand how DevOps works. They will understand CI and CD, and they will be excited by the idea that things can get better. They’ll want to grow the team that’s there, and be willing to listen to opinion.

Such leaders will make waves. Initially, an amount of effort will be directed away from directly delivering new functionality. Development effort will no longer be solely about code going out the door. They will engage with technology teams at all levels, as comfy discussing the use of BDD as they are talking Enterprise Architecture. Strong opinions will emerge. Different technologies will be trialled. To an outsider (or senior management) it might appear like a revolt – a switch in power where the people who have been failing to deliver are now being allowed to make decisions, and they’re not getting them all right. This appearance is not completely deceptive; ownership and empowerment are huge drivers for team growth. Quality will come with this, as the team learn how to build it into their work from architecture to implementation, and learn how to justify what they’re doing to the business.

Software delivery becomes as much a part of the enterprise as, for example, Human Resources, or Accounts. The CEO may not understand all aspects of HR, but it is understood that the HR department will do what needs to be done. The same level of respect and trust has to be given to software delivery, as it is very unlikely the CEO or any other non-technical senior managers will understand fully the implications of the directions taken. But that’s fine – that’s the way it’s meant to be.

In time to make a difference.

Perhaps the idea of strong technical leadership being critical to technical success is no surprise, it seems sensible enough. So why doesn’t this happen? 

There are probably multiple reasons, but I think it’s very common for senior managers to fear strong technical leadership. There seems to be a belief that devolving responsibility for driving change among senior technicians can bring about similar results as a single strong leader while avoiding the re-balancing of power. I see this scenario as jumping out of the same plane with a slightly larger parachute – it’ll be a gentler ride down, but you can only pretend you’re flying. By the time the business makes its mind up to try to hire someone, there’s often too much of a car crash happening to make an enticing offering.

If we accept that lifting the manual sales bottleneck and moving to web based sales is the catalyst for the explosion of scale and complexity (which I’m not saying is always the case) then this would be the sweet spot in time to start looking for strong technology leadership. Expect to pay a lot more for someone capable of digging you out of a hole than for someone who has the experience to avoid falling in it to begin with. And other benefits include keeping your customers, naturally.

What’s Slowing Your Business?

There are lots of problems that prevent businesses from responding to market trends as quickly as they’d like. Many are not IT related, some are. I’d like to discuss a few problems that I see over and over again, and maybe present some useful solutions. As you read this, please remember that there are always exceptions. But deciding that you have one of these exceptional circumstances is always easier when starting from a sensible basic idea.

Business focused targeting.

For many kinds of work, quicker is better. For software development, quicker is better. But working faster isn’t the same thing as delivering faster.

I remember working as technical lead for a price comparison site in the UK, where once a week each department would read out a list of the things they had achieved in the last week and how that had benefited the business. For many parts of the business there was a nice and easy line that could be drawn from what they did each week to a statistic of growth (even if some seemed quite contrived). But the development team was still quite inexperienced, and struggling to do CI, never mind CD. For the less experienced devs, being told to “produce things quicker” had the opposite effect. Traditional stick and carrot doesn’t have the same impact on software development as on other functions, because a lot of the time what speeds up delivery seems counter-intuitive.

  • Have two people working on each task (pair programming)
  • Focus on only one feature at a time
  • Write as much (or more) test code as functional code
  • Spend time discussing terminology and agreeing a ubiquitous language
  • Decouple from other systems
  • Build automated delivery pipelines

These are just a few examples of things which can be pushed out because someone wants the dev team to work faster. But in reality, having these things present is what enables a dev team to work faster.

Development teams feel a lot of pressure to deliver, because they know how good they can be. They know how quickly software can be written, but it takes mature development practices to deliver quickly and maintain quality. Without the required automation, delivering quick will almost always mean a reduction in quality and more time taken fixing bugs. Then there are the bugs created while fixing other bugs, and so on. Never mind the huge architectural spirals because not enough thought went into things at the start. In the world of software, slow and steady may lose the first round, but it sets the rest of the race up for a sure win.

Tightly coupling systems.

I can’t count how often I’ve heard someone say “We made a tactical decision to tightly couple with <insert some system>, because it will save us money in the long run.”

No.

Just no.

Please stop thinking this.

Is it impossible for highly coupled systems to be beneficial? No. Is yours one of these cases? Probably not.

There are so many hidden expenses incurred due to tightly coupled designs that it almost never makes any sense. The target system is quite often the one thing everything ends up being coupled with, because it’s probably the least flexible ‘off the shelf’ dinosaur which was sold to the business without any technical review. There are probably not many choices for how to work with it. Well, the bottom line is: find a way, or get rid of it. Otherwise you end up with dozens of applications all tightly bound to one central monster app. Changes become a nightmare of breaking everyone else’s code. Deployments take entire weekends. License fees for the dinosaur go through the roof. Vendor lock-in turns into shackles and chains. Reality breaks down. Time reverses, and mullets become cool.

Maybe I exaggerated with the mullets.

Once you start down this path, you will gradually lose whatever technical individuals you have who really ‘get’ software delivery. The people who could make a real difference to your business will gradually go somewhere their skills can make a difference. New features will not only cost you more to implement but they’ll come with added risk to other systems.

If you are building two services which have highly related functionality, i.e. they’re in the same sub-domain (from a DDD perspective), then you might decide that they should be aware of each other on a conceptual level, and have some logic which spans both services, depends on both being ‘up’, and gets versioned together. This might be acceptable and might not lead to war or famine, but I’m making no promises.

It’s too hard to implement DevOps.

No, it isn’t.

Yes, you need at least someone who understands how to do it, but moving to a DevOps approach doesn’t mean implementing it across the board right away. That would be an obscene way forwards. Start with the next thing you need to build. Make it deployable, make it testable with integration tests written by the developer. Work out how to transform the configuration for different environments. Get it into production. Look at how you did it, decide what you can do better. Do it better with the next thing. Update the first thing. Learn why people use each different type of technology, and whether it’s relevant for you.

Also, it’s never too early to do DevOps. If you are building one ‘thing’ then it will be easier to work with if you are doing DevOps. If you have the full stack defined in a CI/CD pipeline and you can get all your changes tested in pre-production environments (even infra changes) then you’re winning from the start. Changes become easy.

If you have a development team who don’t want to do DevOps then you have a bigger problem. It’s likely that they aren’t the people who are going to make your business succeed.

Ops do routing, DBAs do databases.

Your developers should be building the entire stack. They should be building the deployment pipeline for the entire stack. During deployment, the pipeline should configure DNS, update routing tables, configure firewalls, apply WAF rules, deploy EC2 instances, install the built application, run database migration scripts, and run tests end to end to make sure the whole lot is done correctly. Anything other than this is just throwing a problem over the fence to someone else.

The joke of the matter is that the people doing the developer’s ‘dirty work’ think this is how they support the business. When in reality, this is how they allow developers to build software that can never work in a deployed state. This is why software breaks when it gets moved to a different environment.

Ops, DBAs, and other technology specialists should be responsible for defining the overall patterns which get implemented, and the standards which must be met. The actual work should be done by the developer, if for no other reason than the fact that when the developer needs a SQL script writing, there will never be a DBA available. The same goes for any out-of-team dependencies – they’re never available. This is one of the biggest blockers to progress in software development: waiting for other people to do their bit. It’s another form of tight coupling, building inter-dependent teams. It’s a people anti-pattern.

If your developers need help to get their heads around routing principles or database indexing, then get them in a room with your experts. Don’t get those people to do the dirty work for everyone else; that won’t scale.

BAU handle defects.

A defect found by a customer should go straight back to the team which built the software. If that team is no longer there, then whichever team was ‘given’ responsibility for that piece of software gets to fix the bug.

Development teams will go a long way to give themselves an easy life. That includes adding enough error handling, logging, and resilient design practices to make bug fixing a cinch, but only if they’re the ones who have to deal with the bugs.

Fundamental design flaws won’t get fixed unless they’re blocking the development team.

Everything else.

This isn’t an exhaustive list. Even now there are more and more things springing to mind, but if I tried to shout every one out then I’d have a book, not a blog post. The really unfortunate truth is that 90% of the time I see incredibly intelligent people at the development level being ignored by the business, by architects, even by each other, because even though a person hears someone saying ‘this is a good/bad idea’, being able to see past their own preconceptions to understand that point of view is often incredibly difficult. Technologists all too often lack the soft skills required to make themselves heard and understood. It’s up to those who have made a career from their ‘soft skills’ to recognise that and pay extra attention. A drowning person won’t usually thrash about and make a noise.