Don’t Stream JSON Data (Addendum)

Over the last few weeks, I’ve talked about how cool streaming JSON data can be, and why it might not always be the best idea. Have a read here:

After the last instalment I thought that the code I listed for generating a SQL select query based on a large quantity of values in an ‘IN’ clause could probably be generalised. So I had a bash at that here: Helpful.SQL.

I hope it’s useful to someone.

Don’t Stream JSON Data (Part 2)

I’ve discussed the merits of JSON streaming in two prior posts: Large JSON Responses and Don’t Stream JSON Data. If you haven’t read those yet then take a quick look first; they’re not long reads.

I’m attracted to the highly scalable proposition of scaling out at the consumer, so many small requests can be made individually rather than one huge result being returned from the service. It places the complexity with the consumer rather than with the service, which really shouldn’t be concerned with how it’s being used. On the service side, scaling happens with infrastructure, which is a pattern we should be embracing. Making multiple simultaneous requests from the consumer is reasonably straightforward in most languages.
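
To illustrate the consumer side (a minimal sketch that isn’t from the original posts; the per-customer endpoint URL, the reuse of the Customer type and the lack of throttling are assumptions), fanning out requests might look something like this:

using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

public class CustomerClient
{
    private static readonly HttpClient Http = new HttpClient();

    // Fetch many customers concurrently instead of asking the service for one huge response.
    // In practice you would probably throttle this (e.g. with a SemaphoreSlim) so you don't hammer the service.
    public async Task<IList<Customer>> GetCustomersAsync(IEnumerable<int> customerIds)
    {
        IEnumerable<Task<Customer>> requests = customerIds.Select(GetCustomerAsync);
        return await Task.WhenAll(requests);
    }

    private static async Task<Customer> GetCustomerAsync(int id)
    {
        // Assumes a hypothetical 'get by id' endpoint on the same service.
        string json = await Http.GetStringAsync($"http://somedomain.com/customers/{id}");
        return JsonConvert.DeserializeObject<Customer>(json);
    }
}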

But let’s say our service isn’t deployed somewhere which is easily scalable and that simultaneous requests at a high enough rate to finish in a reasonable time would impact the performance of the service for other consumers. What then?

In this situation, we need to make our service go as fast as possible. One way to do this would be to pull all the data in one huge SQL query and build our response objects from that. It would certainly run as quickly as possible, but there are some issues with this approach:

  1. Complexity in embedded SQL strings is hard to manage.
  2. From a service developer’s point of view, SQL is hard to test.
  3. We’re using completely new logic to generate our objects which will need to be tested. In our example scenario (in Large JSON Responses) we already have tested, proven logic for building our objects but it builds one at a time.

Complexity and testability are pretty big issues, but I’m more interested in issue 3: ignoring and duplicating existing logic. APIs in front of legacy databases are often littered with crazy, unknowable logic tweaks; “if property A is between 3 and 10 then override property B with some constant, otherwise set property C to the value queried from some other table but just include the last 20 characters” – I’m sure you’ve seen the like, and getting this right the first time around was probably pretty tough going, so do you really want to go through that again?

We could use almost the same code as for our chunked response, but parallelise the querying of each record. Now our service method would look something like this:

public IEnumerable<Customer> GetByBirthYear(int birthYear)
{
    IEnumerable<int> customerIds = _customersRepository.GetIdsForBirthYear(birthYear);
    // List<T> is not thread safe, so collect the results in a ConcurrentBag instead.
    var customerList = new ConcurrentBag<Customer>();
    Parallel.ForEach(customerIds, id =>
    {
        Customer customer;
        try
        {
            customer = Get(id);
        }
        catch (Exception e)
        {
            customer = new Customer
            {
                Id = id,
                CannotRetrieveException = e
            };
        }
        customerList.Add(customer);
    });
    return customerList;
}

public Customer Get(int customerId)
{
    ...
}

Firstly, the loop through the customer IDs is no longer a plain foreach loop; we’ve switched to Parallel.ForEach. This method of parallelisation is particularly clever in that it gradually increases the degree of parallelism to a level determined by available resources – it’s one of the easiest ways to achieve parallel processing. Secondly, we’re now populating a full collection of customers (a thread-safe one, since several threads add to it at once) and returning the whole result in one go, because it’s simply not possible to yield inside the parallel lambda expression. It also means that responding with a chunked response is pretty redundant and probably adds a bit of extra complexity unnecessarily.

This strategy will only work if all code called by the Get() method is thread safe. Be particularly careful with the database connection: SqlConnection is not thread safe.

Don’t keep your SqlConnection objects hanging around; create a new one each time you want to query the database, unless you need to continue the current transaction. No matter how many SqlConnection objects you create, the number of underlying connections is limited by the server and by what’s configured in the connection string. A new connection will be requested from the pool but will only be handed over when one is available.
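
A minimal sketch of that pattern (the repository name, connection string and query are placeholders, not from the original post):

using System.Data.SqlClient;

public class AddressRepository
{
    private readonly string _connectionString;

    public AddressRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public string GetPostcode(int addressId)
    {
        // A new SqlConnection per call is cheap because the underlying connection
        // comes from the pool, and it keeps each thread working with its own connection.
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("SELECT Postcode FROM Address WHERE AddressId = @id", connection))
        {
            command.Parameters.AddWithValue("@id", addressId);
            connection.Open();
            return (string)command.ExecuteScalar();
        }
    }
}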

So now we have an n+1 scenario where we’re querying the database possibly thousands of times to build our response. Even though we may be making these queries on several threads and the processing time might be acceptable, given all the complexity is now in our service we can take advantage of the direct relationship with the database to make this even quicker.

Let’s say our Get() method needs to make 4 separate SQL queries to build a Customer record, each taking one integer value as an ID. It might look something like this:

public Customer Get(int customerId)
{
    var customer = _customerRepository.Get(customerId);
    customer.OrderHistory = _orderRepository.GetManyByCustomerId(customerId);
    customer.Address = _addressRepository.Get(customer.AddressId);
    customer.BankDetails = _bankDetailsRepository.Get(customer.BankDetailsId);
    return customer;
}

To stop each of these .Get() methods hitting the database, we can cache the data up front, one SQL query per repository class. This preserves our logic but presents a problem: assuming we are using Microsoft SQL Server, there is a practical limit to the number of items we can put in an ‘IN’ clause, so we can’t just stick thousands of customer IDs in there (https://docs.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql). If we can select by multiple IDs, then we can turn our n+1 scenario into just 5 queries.

It turns out that we can specify thousands of IDs in an ‘IN’ clause if we use a sub-query. So our problem shifts to how to create a temporary table containing all our customer IDs to use in that sub-query. Unless you’re using a very old version of SQL Server, you can insert multiple rows with a basic ‘INSERT’ statement. For example:

INSERT INTO #TempCustomerIDs (ID)
VALUES
(1),
(2),
(3),
(4),
(5),
(6)

This will result in 6 rows in the table, with the values 1 through 6 in the ID column. However, we will once again hit a limit – it’s only possible to insert 1000 rows this way with each insert statement.

Fortunately, we’re working one level above raw SQL, and we can work our way around this limitation. An example is in the code below.

public void LoadCache(IEnumerable<int> customerIds)
{
    string insertCustomerIdsQuery = string.Empty;
    foreach (IEnumerable<int> customerIdList in customerIds.ToPagedList(500))
    {
        insertCustomerIdsQuery +=
            $" INSERT INTO #TempCustomerIds (CustomerId) VALUES ('{string.Join("'),('", customerIdList)}');";
    }
    string queryByCustomerId =
        $@"IF OBJECT_ID('tempdb..#TempCustomerIds') IS NOT NULL DROP TABLE #TempCustomerIds;
CREATE TABLE #TempCustomerIds (CustomerId int);

{insertCustomerIdsQuery}

{CustomerQuery.SelectBase} WHERE c.CustomerId IN (SELECT CustomerId FROM #TempCustomerIds);

IF OBJECT_ID('tempdb..#TempCustomerIds') IS NOT NULL DROP TABLE #TempCustomerIds;";
    var customers = _repo.FindAll(queryByCustomerId);
    foreach (var customer in customers)
    {
        Cache.Add(customer.CustomerId, customer);
    }
}

A few things from the code snippet above:

  • ToPagedList() is an extension method that breaks a sequence down into multiple lists of at most the page size passed in, so .ToPagedList(500) returns batches of 500 items. The idea is to use a number below the 1000 row limit for inserts; you could achieve the same thing in different ways (a possible implementation is sketched just after this list).
  • The string insertCustomerIdsQuery is the result of concatenating all the insert statements together.
  • CustomerQuery.SelectBase is the select statement that would have had the ‘select by id’ predicate, with that predicate removed.
  • The main SQL statement first drops the temp table if it already exists, then creates it. We then insert all the IDs into that table, select all matching records where the IDs are in the temp table, and finally drop the temp table again.
  • Cache is a simple dictionary of customers by ID.
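
The post doesn’t list ToPagedList() itself; a minimal implementation consistent with the usage above (my assumption, not the author’s code) could be:

using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Splits a sequence into consecutive pages of at most pageSize items each.
    public static IEnumerable<IEnumerable<T>> ToPagedList<T>(this IEnumerable<T> source, int pageSize)
    {
        var page = new List<T>(pageSize);
        foreach (T item in source)
        {
            page.Add(item);
            if (page.Count == pageSize)
            {
                yield return page;
                page = new List<T>(pageSize);
            }
        }
        if (page.Any())
        {
            yield return page;
        }
    }
}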

Using this method, each repository can have the data we expect to be present loaded into it before the request is made. It’s far more efficient to load these thousands of records in one go rather than making thousands of individual queries.

In our example, we are retrieving addresses and bank details by the IDs held on the Customer objects. To support this, we need to retrieve the bank details IDs and address IDs from the cache of Customers before loading those caches. Then all the subsequent logic runs unchanged, but blindingly fast, since it’s only accessing memory rather than making calls to the database.
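
As a sketch of that ordering (the repository fields, LoadCache methods and Cache properties are assumed to follow the customer example above; the names are illustrative only):

public void WarmCaches(IEnumerable<int> customerIds)
{
    List<int> ids = customerIds.ToList();

    // Customers first, because they carry the address and bank details IDs.
    _customerRepository.LoadCache(ids);
    _orderRepository.LoadCache(ids); // order history is keyed by customer ID

    IEnumerable<Customer> customers = _customerRepository.Cache.Values;
    _addressRepository.LoadCache(customers.Select(c => c.AddressId).Distinct());
    _bankDetailsRepository.LoadCache(customers.Select(c => c.BankDetailsId).Distinct());
}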

Summing Up

The strategy for the fastest response is probably to hit the database with one big query, but there are downsides to doing this. Specifically, we don’t want to have lots of logic in a SQL query, and we’d like to re-use the code we’ve already written and tested for building individual records.

Loading all the ID’s from the database and iterating through the existing code one record at a time would work fine for small result sets where performance isn’t an issue, but if we’re expecting thousands of records and we want it to run in a few minutes then it’s not enough.

Caching the data using a few SQL queries is far more efficient and means we can re-use any logic easily. Even most of the SQL is refactored out of the existing queries.

Running things asynchronously will speed things up even more. If you’re careful with your use of database connections, then the largest improvement can probably be found by running the queries in parallel, as these will probably be your longest running processes.

 

Large JSON Responses

The long slog from a 15 year old legacy monolith to an agile, microservice based system will almost inevitably include throwing some APIs in front of a big old database. Building a cleaner view of the domain allows for some cleaner lines to be drawn between concerns, each with their own service. But inside those services there’s usually a set of ridiculous SQL queries building the nice clean models being exposed. These ugly SQL queries add a bit of time to the responses and can lead to a bit of complexity, but this is the real world; often we can only do so much.

So there you are after a few months of work with a handful of services deployed to an ‘on premises’ production server. Most responses are pretty quick, never more than a couple of seconds. But now you’ve been asked to build a tool for generating an Excel document with several thousand rows in it. To get the data, a web app will make an http request to the on prem API. So far so good. But once you’ve written the API endpoint and requested some realistic datasets through it, you realise the response takes over half an hour. What’s more, while that response is being built, the API server runs pretty hot. If more than a couple of users request a new Excel doc at the same time then everything slows down.

Large responses from API calls are not always avoidable, but there are a couple of things we can do to lessen the impact they have on resources.

Chunked Response

Firstly, let’s send the response a bit at a time. In .NET Web API, this is pretty straightforward to implement. We start with a simple HttpMessageHandler:

public class ChunkedResponseHttpHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = await base.SendAsync(request, cancellationToken);
        response.Headers.TransferEncodingChunked = true;
        return response;
    }
}

Now we need to associate this handler with the controller action which is returning our large response. We can’t do that with attribute based routing, but we can be very specific with a routing template.

config.Routes.MapHttpRoute("CustomersLargeResponse",
                "customers",
                new { controller = "Customers", action = "GetByBirthYear" },
                new { httpMethod = new HttpMethodConstraint(HttpMethod.Get)  },
                new ChunkedResponseHttpHandler(config));

In this example, the url ‘customers’ points specifically at CustomersController.GetByBirthYear() and will only accept GET requests. The handler is assigned as the last parameter passed to MapHttpRoute().

The slightly tricky part comes when writing the controller action. Returning everything in chunks won’t help if you wait until you’ve loaded the entire response into memory before sending it. Also, streaming results isn’t something that many database systems natively support. So you need to be a bit creative about how you get data for your response.

Let’s assume you’re querying a database, and that your endpoint returns a collection of resources for which you already have a pretty ugly ‘select by ID’ SQL query. The scenario is not as contrived as you might think. Dynamically modifying the ‘where’ clause of the ‘select by ID’ query so it returns all the results you want would probably give the fastest response time. It’s a valid approach, but if you know you’re going to have a lot of results then you’re risking memory issues which can impact other processes, plus you’re likely to end up with some mashing of SQL strings to share the bulk of the select statement and add different predicates, which isn’t easily testable.

The approach I’m outlining here is best achieved by breaking the processing into two steps. First, query for the IDs of the entities you’re going to return. Second, use your ‘select by ID’ code to retrieve them one at a time, returning them through an enumerator rather than a fully realised collection type. Let’s have a look at what a repository method might look like for this.

public IEnumerator<Customer> GetByBirthYear(int birthYear)
{
    IEnumerable<int> customerIds = _customersRepository.GetIdsForBirthYear(birthYear);
    foreach (var id in customerIds)
    {
        Customer customer;
        try
        {
            customer = Get(id);
        }
        catch (Exception e)
        {
            customer = new Customer
            {
                Id = id,
                CannotRetrieveException = e
            };
        }
        yield return customer;
    }
}

public Customer Get(int customerId)
{
    ...
}

The important things to notice here are:

  1. The first call is to retrieve the customer ID’s we’re interested in.
  2. Each customer is loaded from the same Get(int customerId) method that is used to return customers by ID.
  3. We don’t want to terminate the whole process just because one customer couldn’t be loaded. Equally, we need to do something to let the caller know there might be some missing data. In this example we simply return an empty customer record with the exception that was thrown while loading. You might not want to do this if your API is public, as you’re leaking internal details, but for this example let’s not worry.

The controller action which exposes this functionality might look a bit like this:

public IEnumerable<Customer> GetByBirthYear(int birthYear)
{
    IEnumerator<Customer> iterator;
    iterator = _customersServices.GetForBirthYear(birthYear);
    while (iterator.MoveNext())
    {
        yield return iterator.Current;
    }
}

Things to notice here are:

  1. There’s no attribute based routing in use here. Because we need to assign our HttpHandler to the action, we have to use convention based routing.
  2. At no point are the results loaded into a collection of any kind. We retrieve an enumerator and return one result at a time until there are no more results.

JSON Stream

Using this mechanism is enough to return the response in a chunked manner and start streaming it as soon as there’s a single result ready to return. But there’s still one more piece to the puzzle for our service. Depending on what language the calling client is written in, it can either be straightforward to consume the JSON response as we have it here, or easier to consume what’s become known as a JSON stream. For a DotNet consumer, sending our stream as a comma delimited array is sufficient. If we’re expecting calls from a Ruby client then we should definitely consider converting our response to a JSON stream.

For our customers response, we might send a response which looks like this (but hopefully much bigger):

{"id":123,"name":"Fred Bloggs","birthYear":1977}
{"id":133,"name":"Frank Bruno","birthYear":1961}
{"id":218,"name":"Ann Frank","birthYear":1929}

This response is in a format called Line-Delimited JSON (LDJSON). There’s no opening square bracket to say this is a collection, because it isn’t a collection. This is a stream of individual records which can be processed without having to wait for the entire response to be evaluated. Which makes a lot of sense; just as we don’t want to have to load the entire response on the server, we also don’t want to load the entire response on the client.

A chunked response is something that most HTTP client packages will handle transparently. Unless the client application is coded specifically to receive each parsed object in each chunk, then there’s no difference on the client side to receiving an unchunked response. LDJSON breaks this flexibility, because the response is not valid JSON – one client will consume it easily, but another would struggle. At the time of writing, DotNet wants only standard JSON whereas it’s probably easier in Ruby to process LDJSON. That’s not to say it’s impossible to consume the collection in Ruby or LDJSON in DotNet, it just requires a little more effort for no real reward. To allow different clients to still consume the endpoint, we can add a MediaTypeFormatter specifically for the ‘application/x-json-stream’ media type (this isn’t an official media type, but it has become widely used). So any consumer can either expect JSON or an LDJSON stream.

public class JsonStreamFormatter : MediaTypeFormatter
{
    public JsonStreamFormatter()
    {
        SupportedMediaTypes.Add(new MediaTypeHeaderValue("application/x-json-stream"));
    }

    public override async Task WriteToStreamAsync(Type type, object value, Stream writeStream, HttpContent content,
        TransportContext transportContext)
    {
        using (var writer = new StreamWriter(writeStream))
        {
            var response = value as IEnumerable;
            if (response == null)
            {
                throw new NotSupportedException($"Cannot format for type {type.FullName}.");
            }
            foreach (var item in response)
            {
                string itemString = JsonConvert.SerializeObject(item, Formatting.None);
                await writer.WriteLineAsync(itemString);
            }
        }
    }

    public override bool CanReadType(Type type)
    {
        return false;
    }

    public override bool CanWriteType(Type type)
    {
        Type enumerableType = typeof(IEnumerable);
        return enumerableType.IsAssignableFrom(type);
    }
}

This formatter only works for types implementing IEnumerable, and uses Newtonsoft’s JsonConvert object to serialise each object in turn before pushing it into the response stream. Enable the formatter by adding it to the Formatters collection:

config.Formatters.Add(new JsonStreamFormatter());

DotNet Consumer

Let’s take a look at a DotNet consumer coded to expect a normal JSON array of objects delivered in a chunked response.

public class Client
{
    public IEnumerator<Customer> Results()
    {
        var serializer = new JsonSerializer();
        using (var httpClient = new HttpClient())
        {
            httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
            using (var stream = httpClient.GetStreamAsync("http://somedomain.com/customers?birthYear=1977").Result)
            {
                using (var jReader = new JsonTextReader(new StreamReader(stream)))
                {
                    while (jReader.Read())
                    {
                        if (jReader.TokenType != JsonToken.StartArray && jReader.TokenType != JsonToken.EndArray)
                            yield return serializer.Deserialize<Customer>(jReader);
                    }
                }
            }
        }
    }
}

Here we’re using the HttpClient class and requesting the response as “application/json”. Instead of working with a string version of the content, we’re working with a stream. The really cool part is that we don’t have to do much more than throw that stream into a JsonTextReader (part of Newtonsoft.Json). We can yield a deserialised object for each JSON token as long as we ignore the first and last tokens, which are the opening and closing square brackets of the JSON array. Each call to serializer.Deserialize then consumes the whole content of the current token, which is one full object in the JSON response.

This method allows each object to be returned for processing while the stream is still being received. The client can save on memory usage just as well as the service.

Ruby Consumer

I have only a year or so experience with Ruby on Rails, but I’ve found it to be an incredibly quick way to build services. In my opinion, there’s a trade off when considering speed of development vs complexity – because the language is dynamic the developer has to write more tests which can add quite an overhead as a service grows.

To consume our service from a Ruby client, we might write some code such as this:

def fetch_customers(birth_year)
  uri  = "http://somedomain.com/customers"
  opts = {
    query: { birthYear: birth_year },
    headers: {'Accept' => 'application/x-json-stream'},
    stream_body: true,
    timeout: 20000
  }

  parser.on_parse_complete = lambda do |customer|
      yield customer.deep_transform_keys(&:underscore).with_indifferent_access
  end

  HTTParty.get(uri, opts) do |chunk|
     parser << chunk
  end
end

private
def parser
  @parser ||= Yajl::Parser.new
end

Here we're using HTTParty to manage the request, passing ‘stream_body: true’ in the options and setting the ‘Accept’ header to ‘application/x-json-stream’. The option tells HTTParty that the response will be chunked; the header tells our service to respond with LDJSON.

From the HTTParty.get block, we see that each chunk is being passed to a JSON parser Yajl::Parser, which understands LDJSON. Each chunk may contain a full JSON object, or several, or partial objects. The parser will recognise when it has enough JSON for a full object to be deserialized and it will send it to the method assigned to parser.on_parse_complete where we’re simply returning the object as a hash with indifferent access.

The Result

Returning responses in a chunked fashion is more than just a neat trick: the amount of memory used by a service returning data in this fashion, compared to loading the entire result set into memory before responding, is tiny. This means more consumers can request these large result sets and other processes on the server are not impacted.

From my own experience, the Yajl library seems to become unstable after handling a response which streams for more than half an hour or so. I haven’t seen anyone else having the same issue, but on one project I’ve ended up removing Yajl and just receiving the entire response with HTTParty and parsing the collection fully in memory. It isn’t ideal, but it works. It also doesn’t stop the service from streaming the response, it’s just the client that waits for the response to load completely before parsing it.

It’s a nice strategy to understand and is useful in the right place, but in an upcoming post I’ll be explaining why I think it’s often better to avoid large JSON responses altogether and giving my preferred strategy and reasoning.

Getting FitNesse to Work

Sample code here.

Recently I’ve been looking into Specification by Example, which people keep defining to me as BDD done the right way. Specification by Example fully implemented includes the idea of an executable specification. A concept that has led me back to FitNesse having given it the cold shoulder for the last six or seven years.

I’ve always thought of FitNesse as a great idea, but I struggled to see how to use it correctly when, as a developer, I was mainly focused on continuous delivery and DevOps. I didn’t see where in the development cycle it fit, or how tests in a wiki could also be part of a CD pipeline. Revisiting FitNesse with a focus on Specification by Example gave me the opportunity to work some of this out, and I think I’m very likely to suggest using this tool on future projects.

Before it’s possible to talk about FitNesse in a CI environment, there are a few basics to master. I don’t want to go into a full breakdown of all functionality, but I do want to communicate enough detail to allow someone to get started. In this post I’ll concentrate on getting FitNesse working locally and using it to execute different classes of test. In a later post, I’ll cover how to make FitNesse work with CI/CD pipelines and introduce a more enterprise level approach.

This post will focus on using FitNesse with the Slim test runner against .NET code but should be relevant for a wider audience.

Installing FitNesse

Installing and running FitNesse locally is really easy and even if you’re intending to deploy to a server, getting things running locally is still important. Follow these steps:

  1. Install Java
  2. Follow these instructions to install and configure FitNesse
  3. Create a batch file to run FitNesse when you need it. My command looks like:
    java -jar .\fitnesse-standalone.jar -p 9080
    

Once you’re running the FitNesse process, hit http://localhost:9080 (use whichever port you started it on) and you’ll find a lot of material to help you get to grips with things, including a full set of acceptance tests for FitNesse itself.

What FitNesse Isn’t

Before I get into talking about how I think FitNesse can be of use in agile development, I’d like to point out what it isn’t useful for.

Tracking Work

FitNesse is not a work tracking tool; it won’t replace a dedicated ticketing system such as Jira, Pivotal Tracker or TFS. Although it may be possible to see what has not yet been built by seeing what tests fail, that work cannot be assigned to an individual or team within FitNesse.

Unit Testing

FitNesse can definitely be used for unit testing and some tests executed from FitNesse will be unit tests, so this might seem a bit contradictory. When I say FitNesse isn’t for unit testing, I mean that it isn’t what a developer should be using for many unit tests. A lot of unit testing is focused on a much smaller unit than I would suggest FitNesse should be concerned with.

[TestFixture]
public class TestingSomething
{
    [Test]
    [ExpectedException(typeof(ArgumentNullException))]
    public void Constructor_NullConnectionString_ThrowsExpectedException()
    {
        var something = new Something(null);
    }
}

This test should never be executed from FitNesse. It’s a completely valid test but other than a developer, who cares? Tests like this could well be written in a TDD fashion right along with the code they’re testing. Introducing a huge layer of abstraction such as FitNesse would simply kill the developer’s flow. In any case, what would the wiki page look like?

Deploying Code

FitNesse will execute code, and code can do anything you want it to, including deploying other code. It is entirely possible to create a wiki page around each thing you want to deploy and have the test outcomes driven by successful deployments. But really, do you want to do that? Download Go or Ansible – they’re far better at this.

What FitNesse Is

OK, so now we’ve covered a few things that FitNesse is definitely not, let’s talk about how I think it can most usefully fit into the agile development process.

FitNesse closes the gap between specification and executable tests, creating an executable specification. This specification will live through the whole development process and into continued support processes. Let’s start with creating a specification.

Creating a Specification

A good FitNesse based specification should be all about behaviour. Break down the system under test into chunks of behaviour in exactly the same way as you would when writing stories for a SCRUM team. The behaviours that you identify should be added to a FitNesse wiki. Nesting pages for relating behaviours makes a lot of sense. If you’re defining how a service endpoint behaves during various exception states then a parent page of ‘Exception States’ with one child page per state could make some sense but you aren’t forced to adhere to that structure and what works for you could be completely different.

Giving a definition in prose is great for communicating a generalisation of what you’re trying to define and gives context for someone reading the page. What prose doesn’t do well is define specific examples – this is where specification by example can be useful. Work as a team (not just the ‘3 amigos’) to generate sufficient specific examples of input and output to cover the complete behaviour you’re trying to define. The more people you have involved, the less likely you are to miss anything. Create a decision table in the wiki page for these examples. You now have the framework for your executable tests.

Different Types of Test

In any development project, there are a number of different levels of testing. Different people refer to these differently, but in general we have:

  • Unit Tests – testing a small unit of code ‘in process’ with the test runner
  • Component Tests – testing a running piece of software in isolation
  • Integration Tests – testing a running piece of software with other software in a live-like environment
  • UI Tests – testing how a software’s user interface behaves; can be either a component test or an integration test
  • Load Tests – testing how a running piece of software responds under load, usually an integration test
  • Manual Tests – testing things that aren’t easily quantifiable within an automated test or carrying out exploratory testing

Manual tests are by their nature not automated, so FitNesse is probably not the right tool to drive them. Also, FitNesse does not natively support UI tests, so I won’t go into those here. Load tests are important but may require resources that would become expensive if run continuously, so although these might be triggered from a FitNesse page, they would probably be classified in such a way that they aren’t run constantly – perhaps under a top level wiki page called ‘Behaviour Under Load’.

In any case, load tests are a type of integration test, so we’re left with three different types of test we could automate from FitNesse. So which type of test should be used for which behaviour?

The test pyramid was conceived by Mike Cohn and has become very familiar to most software engineers. There are a few different examples floating around on the internet; this is one:

This diagram shows that ideally there should be more unit tests than component tests, and more integration tests than system tests, and so on. This assertion comes from trying to keep things simple (which is a great principle to follow); some tests are really easy to write and run, whereas others take a long time to execute or even require someone to manually interact with the system. The preference should always be to test behaviour in the simplest way that proves success, but when we come to convince ourselves of a specific piece of behaviour, a single class of test may not suffice. There could be extensive unit test coverage for a class, but unless our tests also prove that class is being called, we still don’t have our proof. So we’re left with the possibility that any given behaviour could require a mix of unit tests, component tests and integration tests to prove things are working.

Hooking up the Code

So, how do we get each of our different classes of tests to run from FitNesse? Our three primary types of test are unit, component and integration. Let’s look at unit tests first.

Unit Tests

A unit test executes directly against application code, in process with the test itself. External dependencies to the ‘unit’ are mocked using dependency injection and mocking frameworks. A unit test depends only on the code under test, calls will never be made to any external resources such as databases or file systems.
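
For example (a sketch only – the IEmailSender and CommentNotifier types and the use of Moq are my own illustration, not part of the ticketing example that follows):

using Moq;
using NUnit.Framework;

public interface IEmailSender
{
    void Send(string message);
}

public class CommentNotifier
{
    private readonly IEmailSender _emailSender;

    public CommentNotifier(IEmailSender emailSender)
    {
        _emailSender = emailSender;
    }

    public void Notify(string comment)
    {
        _emailSender.Send("New comment: " + comment);
    }
}

[TestFixture]
public class CommentNotifierTests
{
    [Test]
    public void Notify_SendsExactlyOneEmail()
    {
        // The external dependency is mocked, so no real email is ever sent.
        var emailSender = new Mock<IEmailSender>();
        var notifier = new CommentNotifier(emailSender.Object);

        notifier.Notify("A comment text");

        emailSender.Verify(s => s.Send(It.IsAny<string>()), Times.Once());
    }
}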

Let’s assume we’re building a support ticketing system. We might have some code which adds a comment onto a ticket.

namespace Ticketing
{
    public class CommentManager
    {
        private readonly Ticket _ticket;

        public CommentManager(Ticket ticket)
        {
            _ticket = ticket;
        }

        public void AddComment(string comment, DateTime commentedOn, string username)
        {
            _ticket.AddComment(new Comment(comment, username, commentedOn));
        }
    }
}
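
The Ticket and Comment classes aren’t listed in the post; a minimal sketch that fits the way they’re used throughout (the parameterless constructor, the setters and the Number property are my assumptions) might be:

using System;
using System.Collections.Generic;

namespace Ticketing
{
    public class Comment
    {
        public Comment() { } // parameterless constructor so a comment can be deserialised later

        public Comment(string text, string username, DateTime commentedOn)
        {
            Text = text;
            Username = username;
            CommentedOn = commentedOn;
        }

        public string Text { get; set; }
        public string Username { get; set; }
        public DateTime CommentedOn { get; set; }
    }

    public class Ticket
    {
        public string Number { get; set; }

        public IList<Comment> Comments { get; } = new List<Comment>();

        public void AddComment(Comment comment)
        {
            Comments.Add(comment);
        }
    }
}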

To test this, we can use a decision table which would be defined in FitNesse using Slim as:

|do comments get added|
|comment|username|commented on|comment was added?|
|A comment text|AUser|21-Jan-2012|true|

This will look for a class called DoCommentsGetAdded, set the properties Comment, Username and CommentedOn, and call the method CommentWasAdded(), from which it expects a boolean. There is only one test row in this table, covering the happy path, but others can be added. The idea should be to add enough examples to fully define the behaviour, but not so many that people can’t see the wood for the trees.

We obviously have to create the class DoCommentsGetAdded and allow it to be called from FitNesse. I added Ticketing.CommentManager to a solution called FitSharpTest; I’m now going to add another class library to that solution called Ticketing.Test.Unit, containing a single class called DoCommentsGetAdded.

using System;

namespace Ticketing.Test.Unit
{
    public class DoCommentsGetAdded
    {
        public string Comment { get; set; }
        public string Username { get; set; }
        public DateTime CommentedOn { get; set; }

        public bool CommentWasAdded()
        {
            var ticket = new Ticket();
            var manager = new CommentManager(ticket);
            int commentCount = ticket.Comments.Count;
            manager.AddComment(Comment, CommentedOn, Username);
            return ticket.Comments.Count == commentCount + 1;
        }
    }
}

To reference this class from FitNesse you’ll have to do three things:

  1. Install FitSharp to allow Slim to recognise .NET assemblies. If you use Java then you won’t need to do this part, FitNesse natively supports Java.
  2. Add the namespace of the class either in FitSharp’s config.xml or directly into the page using a Slim Import Table (an example import table is shown after this list). I updated the config to the following:
    <suiteConfig>
      <ApplicationUnderTest>
        <AddNamespace>Ticketing.Test.Unit</AddNamespace>
      </ApplicationUnderTest>
    </suiteConfig>
  3. Add an !path declaration to your FitNesse Test page pointing at your test assembly.
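
If you go the import table route instead of editing config.xml (mentioned in step 2 above), the wiki markup added to the test page is just:

|import|
|Ticketing.Test.Unit|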

To get your Test page working with FitSharp, the first four lines in edit mode should look like this:

!define TEST_SYSTEM {slim}
!define COMMAND_PATTERN {%m -r fitSharp.Slim.Service.Runner -c D:\Programs\Fitnesse\FitSharp\config.xml %p}
!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}
!path C:\dev\FitSharpTest\Ticketing.Test.Unit\bin\Debug\Ticketing.Test.Unit.dll

Notice the !path entry (line 4). The last step is to add your decision table to the page, so the whole page in edit mode looks like this:

!define TEST_SYSTEM {slim}
!define COMMAND_PATTERN {%m -r fitSharp.Slim.Service.Runner -c D:\Programs\Fitnesse\FitSharp\config.xml %p}
!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}
!path C:\dev\FitSharpTest\Ticketing.Test.Unit\bin\Debug\Ticketing.Test.Unit.dll

|do comments get added|
|comment|username|commented on|comment was added?|
|A comment text|AUser|21-Jan-2012|true|

Now make sure your solution is built (if you have referenced the debug assemblies then make sure you build in debug), save your changes in FitNesse and hit the ‘Test’ button at the top of the page.

 

There are of course lots of different types of tables you can use to orchestrate different types of tests. The full list is beyond the scope of this post but if you dip into the documentation under ‘Slim’ then you’ll find it quite easily.

The mechanism always remains the same, however – the diagram below outlines the general pattern.

[Sequence diagram: Slim (via the FitSharp runner) drives the intermediary test assembly, which in turn calls the assembly under test]

This is not really any different to what you would see if you were to swap Slim for NUnit. You have the system driving the test (Slim or NUnit), the test assembly and the assembly under test. The difference here is that rather than executing the test within your IDE using Resharper (or another test runner) you’re executing the test from a wiki page. This is a trade off, but like I said at the beginning: not all tests should be in FitNesse, we’re only interested in behaviour which a BA might specify. There will no doubt be other unit tests executed in the usual manner.

Integration Tests

Integration tests run against a shared environment which contains running copies of all deployables and their dependencies. This environment is very like production but generally not geared up for high load or the same level of resilience.

The sequence diagram for unit tests is actually still relevant for integration tests. All we’re doing for the integration tests is making the test assembly call running endpoints instead of calling classes ‘in process’.

Let’s set up a running service for our ‘add comment’ logic and test it from FitNesse. For fun, let’s make this a Web API service running in OWIN and hosted with TopShelf. Follow these steps to get running:

  1. Add a new command line application to your solution. Call it TicketingService (I’m using .NET 4.6.1 for this).
  2. Add NuGet references for OWIN-hosted Web API and for Topshelf. The code below relies on the Microsoft.AspNet.WebApi.Owin, Microsoft.Owin.Hosting and Microsoft.Owin.Host.HttpListener packages (plus their dependencies) and on Topshelf, so those are the entries your packages.config should end up containing.
  3. Add a Startup class which should look like this:
    using System.Web.Http;
    using Owin;
    
    namespace TicketingService
    {
        public class Startup
        {
            public void Configuration(IAppBuilder appBuilder)
            {
                HttpConfiguration httpConfiguration = new HttpConfiguration();
                httpConfiguration.Routes.MapHttpRoute(
                       name: "DefaultApi",
                       routeTemplate: "api/{controller}/{id}",
                       defaults: new { id = RouteParameter.Optional }
                   );
                appBuilder.UseWebApi(httpConfiguration);
            }
        }
    }
    
  4. Add a Service class which looks like this:
    using System;
    using Microsoft.Owin.Hosting;
    
    namespace TicketingService
    {
        public class Service
        {
            private IDisposable _webApp;
    
            public void Start()
            {
                string baseAddress = "http://localhost:9000/";
    
                // Start OWIN host
                _webApp = WebApp.Start<Startup>(url: baseAddress);
            }
    
            public void Stop()
            {
                _webApp.Dispose();
            }
        }
    }
    
  5. Modify your Program class so it looks like this:
    using System;
    using Topshelf;
    
    namespace TicketingService
    {
        class Program
        {
            static void Main(string[] args)
            {
                HostFactory.Run(
                    c =>
                    {
    
                        c.Service<Service>(s =>
                        {
                            s.ConstructUsing(name => new Service()); 
    
                            s.WhenStarted(service => service.Start());
                            s.WhenStopped(service => service.Stop());
                        });
    
                        c.SetServiceName("TicketingService");
                        c.SetDisplayName("Ticketing Service");
                        c.SetDescription("Ticketing Service");
    
                        c.EnablePauseAndContinue();
                        c.EnableShutdown();
    
                        c.RunAsLocalSystem();
                        c.StartAutomatically();
                    });
    
                Console.ReadKey();
            }
        }
    }
    
  6. Make sure your TicketingService project is referencing your Ticketing project.
  7. Add a fake back end service to provide some persistence:
    using Ticketing;
    
    namespace TicketingService
    {
        public static class BackEndService
        {
            public static Ticket Ticket { get; set; } = new Ticket();
        }
    }
    
  8. And finally, add a controller to handle the ‘add comment’ call and return a count (yes, I know this is a bit of a hack, but it’s just to demonstrate things)
    using System.Web.Http;
    using Ticketing;
    
    namespace TicketingService
    {
        public class CommentController : ApiController
        {
    
            public void Post(Comment comment)
            {
                BackEndService.Ticket.AddComment(comment);
            }
    
            public int GetCount()
            {
                return BackEndService.Ticket.Comments.Count;
            }
        }
    }
    

Right, run this service and you’ll see that you can add as many comments as you like and retrieve the count.

Going back to our sequence diagram, we have a page in FitNesse already, and now we have an assembly to test. What we need to do is create a test assembly to sit in the middle. I’m adding Ticketing.Test.Integration as a class library in my solution.

It’s important to be aware of what version of .NET FitSharp is built on. The version I’m using is built against .NET 4, so my Ticketing.Test.Integration will be a .NET 4 project.

I’ve added a class to the new project called DoCommentsGetAdded which looks like this:

using System;
using RestSharp;

namespace Ticketing.Test.Integration
{
    public class DoCommentsGetAdded
    {
        public string Comment { get; set; }
        public string Username { get; set; }
        public DateTime CommentedOn { get; set; }

        public bool CommentWasAdded()
        {
            var client = new RestClient("http://localhost:9000");
            var request = new RestRequest("api/comment", Method.POST);
            request.AddJsonBody(new {Text = Comment, Username = Username, CommentedOn = CommentedOn});
            request.AddHeader("Content-Type", "application/json");
            client.Execute(request);

            var checkRequest = new RestRequest("api/comment/count", Method.GET);
            IRestResponse checkResponse = client.Execute(checkRequest);

            return checkResponse.Content == "1";
        }
    }
}

I’ve also added RestSharp via NuGet.

There are now only two things I need to do to make my test page use the new test class.

  1. Add the namespace Ticketing.Test.Integration to FitSharp’s config file:
    <suiteConfig>
      <ApplicationUnderTest>
        <AddNamespace>Ticketing.Test.Unit</AddNamespace>
        <AddNamespace>Ticketing.Test.Integration</AddNamespace>
      </ApplicationUnderTest>
    </suiteConfig>
  2. Change the !path property in the test page to point at the right test assembly:
    !define TEST_SYSTEM {slim}
    !define COMMAND_PATTERN {%m -r fitSharp.Slim.Service.Runner -c D:\Programs\Fitnesse\FitSharp\config.xml %p}
    !define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}
    !path C:\dev\FitSharpTest\Ticketing.Test.Integration\bin\Debug\Ticketing.Test.Integration.dll
    
    |do comments get added|
    |comment|username|commented on|comment was added?|
    |A comment text|AUser|21-Jan-2012|true|
    

Now, make sure the service is running and hit the ‘Test’ button in FitNesse.

Obviously, this test was configured to execute against our localhost, but it could just as easily have been configured to execute against a deployed environment. Also, the code in the test class isn’t exactly what I would call first class; it’s just to get stuff working.

The important thing to take from this is that the pattern in the original sequence diagram still holds true, so it’s quite possible to test this behaviour as either a unit test or as an integration test, or even both.

Component Tests

A component test is like an integration test in that it requires your code to be running in order to test it. But instead of your code executing in an environment alongside other real services, it’s tested against stubs which can be configured to respond with specific scenarios for specific tests – something which is very difficult to do in an integration environment where multiple sets of tests may be running simultaneously.

I recently wrote a short post about using Mountebank for component testing. The same solution can be applied here.

I’ve added the following controller to my TicketingService project:

using System.Net;
using System.Web;
using System.Web.Http;
using RestSharp;
using Ticketing;

namespace TicketingService
{
    public class TicketController : ApiController
    {
        public void Post(Ticket ticket)
        {
            var client = new RestClient("http://localhost:9999");
            var request = new RestRequest("ticketservice", Method.POST);
            request.AddHeader("Content-Type", "application/json");
            request.AddJsonBody(ticket);

            IRestResponse response = client.Execute(request);

            if (response.StatusCode != HttpStatusCode.Created)
            {
                throw new HttpException(500, "Failed to create ticket");
            }
        }
    }
}

There are so many things wrong with this code, but remember this is just for an example. The functioning of this controller depends on an HTTP endpoint which is not part of this solution. It cannot be tested by an integration test without that endpoint being available. If we want to test this as a component test, then we need something to pretend to be that endpoint at http://localhost:9999/ticketservice.

To do this, install Mountebank by following these instructions:

  1. Make sure you have an up to date version of node.js installed by downloading it from here.
  2. Run the Command Prompt for Visual Studio as an administrator.
  3. Execute:
    npm install -g mountebank
    
  4. Run Mountebank by executing:
    mb
    
  5. Test it’s running by visiting http://localhost:2525.

To create an imposter which will return a 201 response when data is POSTed to /ticketservice, use the following json:

{
    "port": 9999,
    "protocol": "http",
    "stubs": [{
        "responses": [
          { "is": { "statusCode": 201 }}
        ],
        "predicates": [
              {
                  "equals": {
                      "path": "/ticketservice",
                      "method": "POST",
                      "headers": {
                          "Content-Type": "application/json"
                      }
                  }
              }
            ]
    }]
}

Our heavily hacked TicketController class will function correctly only if this imposter returns 201, anything else and it will fail.

Now, I’m going to list the code I used to get this component test executing from FitNesse from a Script Table. I’m very certain that this code is not best practice – I’m trying to show the technicality of making the service run against Mountebank in order to make the test pass.

I’ve added a new project to my solution called Ticketing.Test.Component and updated the config.xml file with the correct namespace. I have two files in that project: one is called imposters.js, which contains the JSON payload for configuring Mountebank; the other is a new version of DoCommentsGetAdded.cs, which looks like this:

using System;
using System.IO;
using System.Net;
using RestSharp;

namespace Ticketing.Test.Component
{
    public class DoCommentsGetAdded
    {
        private HttpStatusCode _result;

        public string Comment { get; set; }
        public string Username { get; set; }
        public DateTime CommentedOn { get; set; }

        public void Setup(string imposterFile)
        {
            var client = new RestClient("http://localhost:2525");
            var request = new RestRequest("imposters", Method.POST);
            request.AddHeader("Content-Type", "application/json");
            using (FileStream imposterJsFs = File.OpenRead(imposterFile))
            {
                using (TextReader reader = new StreamReader(imposterJsFs))
                {
                    string imposterJs = reader.ReadToEnd();
                    request.AddParameter("application/json", imposterJs, ParameterType.RequestBody);
                }
            }
            client.Execute(request);
        }

        public void AddComment()
        {
            var client = new RestClient("http://localhost:9000");
            var request = new RestRequest("api/ticket", Method.POST);
            request.AddJsonBody(new { Number = "TicketABC" });
            request.AddHeader("Content-Type", "application/json");
            IRestResponse restResponse = client.Execute(request);
            _result = restResponse.StatusCode;
        }

        public bool CommentWasAdded()
        {
            return _result != HttpStatusCode.InternalServerError;
        }

        public void TearDown(string port)
        {
            var client = new RestClient("http://localhost:2525");
            var request = new RestRequest($"imposters/{port}", Method.DELETE);
            request.AddHeader("Content-Type", "application/json");
            client.Execute(request);
        }
    }
}

I’ve updated my FitNesse Test page to the following:

!define TEST_SYSTEM {slim}
!define COMMAND_PATTERN {%m -r fitSharp.Slim.Service.Runner -c D:\Programs\Fitnesse\FitSharp\config.xml %p}
!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}
!path C:\dev\FitSharpTest\Ticketing.Test.Component\bin\Debug\Ticketing.Test.Component.dll

|Script:do comments get added|
|setup;|C:\dev\FitSharpTest\Ticketing.Test.Component\bin\Debug\imposters.js|
|add comment|
|check|comment was added|true|
|tear down;|9999|

A Script Table basically allows a number of methods to be executed in sequence and with various parameters. It also allows for ‘checks’ to be made (which are your assertions) at various stages – it looks very much like you would expect a test script to look.

In this table, we’re instantiating our DoCommentsGetAdded class, calling Setup() and passing the path to our imposters.js file, calling AddComment() to add a comment and then checking that CommentWasAdded() returns true. Then we’re calling TearDown().

Setup() and TearDown() are there specifically to configure Mountebank appropriately for the test and to destroy the Imposter afterwards. If you try to set up a new Imposter at the same port as an existing Imposter, Mountebank will throw an exception, so it’s important to clean up. Another option would have been to set up the Imposter in the DoCommentsGetAdded constructor and add a ~DoCommentsGetAdded() destructor to clean up – this would mean the second and last lines of our Script Table could be removed. I am a bit torn as to which approach I prefer, or whether a combination of both is appropriate. In any case, cleaning up is important if you want to avoid port conflicts.

Ok, so run your service, make sure Mountebank is also running and then run your test from FitNesse.

Again, this works because we have the pattern of our intermediary test assembly sat between FitNesse and the code under test. We can write pretty much whatever code we want here to call any environment we like or to just execute directly against an assembly.

Debugging

I spent a lot of time running my test code from NUnit in order to debug and the experience grated because it felt like an unnecessary step. Then I Googled to see what other people were doing and I found that by changing the test runner from:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}

to:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\RunnerW.exe}

FitSharp helpfully popped up a .NET dialog box and waited for me to click ‘Go’ before running the test. This gives an opportunity to attach the debugger.

Summary

A high level view of what has been discussed here:

  • Three types of test which can be usefully run from FitNesse: Unit, Integration and Component.
  • FitNesse is concerned with behaviour – don’t throw NUnit out just yet.
  • The basic pattern is to create an intermediary assembly to sit between FitNesse and the code under test. Use this to abstract away technical implementation.
  • Clean up Mountebank Imposters.
  • You need FitSharp to run FitNesse against .NET code.
  • Debug with FitSharp by referencing the RunnerW.exe test runner and attaching a debugger to the dialog box when it appears.

Not Quite Enterprise

In this first post on FitNesse, I’ve outlined a few different types of test which can be executed and listed code snippets and instructions which should allow someone to get FitNesse running on their machine. This is only a small part of the picture. Having separate versions of the test suite sat on everyone’s machine is not a useful solution. A developer’s laptop can’t be referenced by a CI/CD platform. FitNesse can be deployed to a server. There are strategies for getting tests to execute as part of a DevOps pipeline.

This has been quite a lengthy post and I think these topics along with versioning and multi-environment scenarios will be best tackled in a subsequent post.

I also want to take a more process oriented look at how tests get created, who should be involved and when. So maybe I have a couple of new entries to work on.

User Secrets in ASP.NET 5

Accidentally pushing credentials to a public repo has never happened to me, but I know a few people for whom it has. AWS has an excellent workaround for this in the form of credential stores that can be configured via the CLI or IDE, but that technique only works for IAM user accounts; it doesn’t cover credentials for anything outside of the AWS estate.

Welcome to User Secrets in ASP.NET 5 – and they’re pretty cool.

User Secrets are part of the new ASP.NET configuration mechanism. If you open Visual Studio 2015 and create a new Web API project, for example, you’ll be presented with something somewhat different to previous versions. Configuration is carried out in Startup.cs, where we can conditionally load configuration from one or many sources, including .config and .json files, environment variables and the User Secret store. To access User Secrets, you want to modify the constructor like so:

public Startup(IHostingEnvironment env, IApplicationEnvironment appEnv)
{
    var builder = new ConfigurationBuilder(appEnv.ApplicationBasePath)
        .AddJsonFile("config.json")
        .AddUserSecrets()
        .AddEnvironmentVariables();

    Configuration = builder.Build();
}

In this example, the order of the calls to AddJsonFile(), AddUserSecrets() and AddEnvironmentVariables() makes a difference. If the property ‘Username’ is defined in config.json and also as a secret, then the value in config.json will be ignored in favour of the secret. Similarly, if there is a ‘Username’ environment variable set, that would win over the other two. The source loaded last wins.
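
To make the precedence concrete (a sketch – the key and values are made up, and the indexer shown is the beta-era configuration API):

// config.json          : { "Username": "FromJson" }
// user secret          : Username = "FromSecret"
// environment variable : Username = "FromEnvironment"

// AddEnvironmentVariables() was registered last, so it wins:
string username = Configuration["Username"]; // "FromEnvironment"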

To create a secret, first open a Developer Command Prompt for VS2015. This is all managed via the command line tool ‘user-secret’. To check if you have everything installed, at the prompt, type ‘user-secret -h’.

C:\Program Files (x86)\Microsoft Visual Studio 14.0>user-secret -h

If user-secret isn’t recognised then you may need to install the SecretManager command in the .NET Development Utilities (DNU). Do this by typing ‘dnu command install SecretManager’.

C:\Program Files (x86)\Microsoft Visual Studio 14.0>dnu command install SecretManager

In my case, this was again not recognised, even though I had just completed a full install of every component of Visual Studio 2015 Professional. If this is still not working for you, then you need to update the .NET Version Manager (DNVM). Do this by typing ‘dnvm upgrade’.

C:\Program Files (x86)\Microsoft Visual Studio 14.0>dnvm upgrade

Hopefully, you should get a similar response to this:

C:\Program Files (x86)\Microsoft Visual Studio 14.0>dnvm upgrade
Determining latest version
Downloading dnx-clr-win-x86.1.0.0-beta6 from https://www.nuget.org/api/v2
Installing to C:\Users\Peter\.dnx\runtimes\dnx-clr-win-x86.1.0.0-beta6
Adding C:\Users\Peter\.dnx\runtimes\dnx-clr-win-x86.1.0.0-beta6\bin to process PATH
Adding C:\Users\Peter\.dnx\runtimes\dnx-clr-win-x86.1.0.0-beta6\bin to user PATH
Native image generation (ngen) is skipped. Include -Ngen switch to turn on native image generation to improve application startup time.
Setting alias 'default' to 'dnx-clr-win-x86.1.0.0-beta6'

Now try installing the command. You should see all of your registered NuGet sources being queried for updates and then a whole host of System.* packages being installed. The very end of the response should look something like this:

Installed:
    10 package(s) to C:\Users\Peter\.dnx\bin\packages
    56 package(s) to C:\Users\Peter\.dnx\bin\packages
The following commands were installed: user-secret

Now when you run ‘user-secret -h’ you should get this:

Usage: user-secret [options] [command]

Options:
  -?|-h|--help  Show help information
  -v|--verbose  Verbose output

Commands:
  set     Sets the user secret to the specified value
  help    Show help information
  remove  Removes the specified user secret
  list    Lists all the application secrets
  clear   Deletes all the application secrets

Use "user-secret help [command]" for more information about a command.

You can see five possible commands listed, and getting help on any particular one is also explained. As an example, if you want to set a property ‘Username’ to ‘Guest’ then type this:

C:\Program Files (x86)\Microsoft Visual Studio 14.0>cd \MyProjectFolder
C:\MyProjectFolder>user-secret set Username Guest

Where MyProjectFolder is the location of a project.json file.
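
Once set, the secret comes back through the same configuration object as any other setting. A quick sketch, assuming the Startup shown earlier (the exact read API shifted a little between the betas, so treat this as illustrative):

// 'Username' resolves to whichever source won the ordering above: the secret store,
// unless a later-loaded source overrides it.
string username = Configuration["Username"];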

So there you have it. You’re ready to create secrets that can never be accidentally pushed into a public repo or shared anywhere they shouldn’t be. Just remember that emailing them to the dev sitting next to you might not be much better.

Useful links:

https://github.com/aspnet/Home/wiki/DNX-Secret-Configuration

http://stackoverflow.com/questions/30106225/where-to-find-dnu-command-in-windows

http://typecastexception.com/post/2015/05/17/DNVM-DNX-and-DNU-Understanding-the-ASPNET-5-Runtime-Options.aspx

Code Libraries and Dependencies

Nuget has made it really straightforward to share libraries across multiple applications: just add a nuspec file and run ‘nuget pack’. But before you do that next time, spare a thought for the poor dev who’s trying to fit your library into their project alongside a dozen others, when any of them may make use of the same 3rd parties you’ve referenced as dependencies.

Breaking Changes

Breaking changes happen. Sometimes intentionally, sometimes not. A breaking change in a popular 3rd party library can be pretty tricky to deal with.

Not so long ago RestSharp made a breaking change. They changed the way a value was set from being a property setter to a function call. The two things are not the same and no amount of assembly redirection will make one version work in place of the other.

My client at the time had a large application with probably about two dozen references to the old version of RestSharp. When people started to use the new version (often without even realising that it was a new version) in their libraries, it took a while before someone hit on the problem of referencing one of these libraries from code that uses the older version.

Chain of Incompatibility

So some app uses two libraries, each of which uses a different, incompatible version of RestSharp. Ok, so let’s just upgrade the older version to the newer one in the library and everyone’s happy.

Then we find out that it isn’t that library which references RestSharp at all; it’s a further library referenced by it. So we open that library, upgrade it, build a nuget package, re-reference it in our original library, and do the same again to get the change into our app. Great.

Then, a couple of days later, someone has trouble after extending the library we just changed with some new functionality. Because the library now depends on the updated version of RestSharp, it breaks when it’s referenced from another app which still uses the old version.

And so the dance continues…

Just Avoid It

The best way to deal with this? Just avoid it.

A library should only provide the logic that it relates to. Trying to make a library responsible for everything is a mistake.

3rd party dependencies are also a bad thing: they can change, and different versions are not always backwards compatible.

A library is not well encapsulated if it depends on 3rd parties: those 3rd parties have to be versioned carefully, and it’s all too easy for every developer in your enterprise to decide to just stick on the old version. Legacy code in real time, nice.

To avoid it, simply create an interface in your library that defines the contract you need, then leave it up to the people using your library to decide how it should work. You can always provide some sample code in a readme.md or even an additional nuget package with a preferred implementation. Give the consumer some options.

An example interface for RestSharp functionality could be as simple as:

public interface IRestClient
{
    void SetBaseUrl(string url);
    dynamic Send(string path, string verb, string payload);
}

Although it could be much better.
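
For example, a consuming application could satisfy that contract with whatever HTTP stack it already uses. Here’s a rough sketch of an adapter built on System.Net.Http.HttpClient rather than RestSharp; the class is purely illustrative and not part of any library:

using System;
using System.Net.Http;
using System.Text;

// Hypothetical adapter owned by the consuming application, not by the shared library.
public class HttpClientRestAdapter : IRestClient
{
    private readonly HttpClient _httpClient = new HttpClient();

    public void SetBaseUrl(string url)
    {
        _httpClient.BaseAddress = new Uri(url);
    }

    public dynamic Send(string path, string verb, string payload)
    {
        var request = new HttpRequestMessage(new HttpMethod(verb), path);
        if (payload != null)
        {
            request.Content = new StringContent(payload, Encoding.UTF8, "application/json");
        }

        // Blocking for brevity; a real adapter would probably be async.
        HttpResponseMessage response = _httpClient.SendAsync(request).GetAwaiter().GetResult();
        return response.Content.ReadAsStringAsync().GetAwaiter().GetResult();
    }
}

Another consumer might back the same interface with RestSharp, or with a stub for testing; either way, the shared library never takes the dependency itself.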

A Helpful Circuit Breaker in C#

Introduction

With the increasing popularity of SOA in the guise of ‘microservices’, circuit breakers are now a must-have weapon in any developer’s arsenal. Services are rarely 100% reliable; outages happen, network connections get pulled, memory gets filled, routing tables get corrupted. In an environment where multiple services are each calling multiple other services, the result of an outage in a small, seemingly unimportant service can be a random slow down in response times in your web application that gradually leads to complete server lock up. (If you don’t believe me, read Release It! by Michael Nygard from the Pragmatic Bookshelf.)

The idea of a circuit breaker is to detect that a service is down and then fail immediately for subsequent calls, in an expected manner that your application can handle gracefully. Then, every so often, the breaker will attempt to close and allow a call through to the troubled service. If that call is successful then the breaker starts allowing calls through again; if it fails then the breaker remains open and continues to fail with an expected exception.

Helpful.CircuitBreaker is a simple implementation that allows a developer to be proactive about the way their code handles failures.

Usage

There are 2 primary ways that the circuit breaker can be used:

  1. Exceptions thrown from the code you wish to break on can trigger the breaker to open.
  2. A returned value from the code you wish to break on can trigger the breaker to open.

Here are some basic examples of each scenario.

In the following example, exceptions thrown from _client.Send(request) will cause the circuit breaker to react based on the injected configuration.

public class MakeProtectedCall
{
    private ICircuitBreaker _breaker;
    private ISomeServiceClient _client;

    public MakeProtectedCall(ICircuitBreaker breaker, ISomeServiceClient client)
    {
        _breaker = breaker;
        _client = client;
    }

    public Response ExecuteCall(Request request)
    {
        Response response = null;
        _breaker.Execute(() => response = _client.Send(request));
        return response;
    }
}

In the following example, exceptions thrown by _client.Send(request) will still trigger the exception handling logic of the breaker, but the lambda applies additional logic to examine the response and can trigger the breaker without ever receiving an exception. This is particularly useful when using an HTTP based client that may return failures as error codes and strings instead of thrown exceptions.

public class MakeProtectedCall
{
    private ICircuitBreaker _breaker;
    private ISomeServiceClient _client;

    public MakeProtectedCall(ICircuitBreaker breaker, ISomeServiceClient client)
    {
        _breaker = breaker;
        _client = client;
    }

    public Response ExecuteCall(Request request)
    {
        Response response = null;
        _breaker.Execute(() =>
        {
            response = _client.Send(request);
            // Returning a failure result opens the breaker without an exception being thrown
            return response.Status == "OK" ? ActionResult.Good : ActionResult.Failure;
        });
        return response;
    }
}

Initialising

The scope of a circuit breaker must be considered first. When the breaker opens, subsequent calls will fail fast, but if your breaker is scoped to a single HTTP request then there may never be a subsequent call hitting that open breaker: the next request would hit a newly built, closed breaker.

The following code will initialise a basic circuit breaker which, once open, will not try to close until 1 minute has passed (60 seconds is the default open period, so there’s no need to specify it).

CircuitBreakerConfig config = new CircuitBreakerConfig
{
    BreakerId = "Some unique and constant identifier that indicates the running instance and executing process"
};
CircuitBreaker circuitBreaker = new CircuitBreaker(config);

To inject a circuit breaker into class TargetClass using Ninject, try code similar to this:

Bind<ICircuitBreaker>().ToMethod(c => new CircuitBreaker(new CircuitBreakerConfig
{
    BreakerId = string.Format("{0}-{1}-{2}", "Your breaker name", "TargetClass", Environment.MachineName)
})).WhenInjectedInto(typeof(TargetClass)).InSingletonScope();

The above code will reuse the same breaker for all instances of the given class, so the breaker reports a consistent state across different threads. When opened by one call, all instances of TargetClass will see an open breaker.

Tracking Circuit Breaker State

The suggested method for tracking the state of the circuit breaker is to handle the breaker events. These are defined on the CircuitBreaker class as:

/// <summary>
/// Raised when the circuit breaker enters the closed state
/// </summary>
public event EventHandler ClosedCircuitBreaker;

/// <summary>
/// Raised when the circuit breaker enters the opened state
/// </summary>
public event EventHandler OpenedCircuitBreaker;

/// <summary>
/// Raised when trying to close the circuit breaker
/// </summary>
public event EventHandler TryingToCloseCircuitBreaker;

/// <summary>
/// Raised when the breaker tries to open but remains closed due to tolerance
/// </summary>
public event EventHandler ToleratedOpenCircuitBreaker;

/// <summary>
/// Raised when the circuit breaker is disposed
/// </summary>
public event EventHandler UnregisterCircuitBreaker;

/// <summary>
/// Raised when a circuit breaker is first used
/// </summary>
public event EventHandler RegisterCircuitBreaker;

Attach handlers to these events to send information about the event to a logging or monitoring system. In this way, sending state to Zabbix or logging to log4net is trivial.
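
As a rough sketch, wiring the events up to whatever logging or monitoring you already use might look like this (the Console calls are just placeholders, and circuitBreaker and config are the instances created in the Initialising example above):

// Illustrative wiring only - swap Console for log4net, a Zabbix sender, etc.
circuitBreaker.OpenedCircuitBreaker += (sender, args) =>
    Console.WriteLine("Breaker {0} opened", config.BreakerId);

circuitBreaker.ClosedCircuitBreaker += (sender, args) =>
    Console.WriteLine("Breaker {0} closed", config.BreakerId);

circuitBreaker.TryingToCloseCircuitBreaker += (sender, args) =>
    Console.WriteLine("Breaker {0} trying to close", config.BreakerId);

circuitBreaker.ToleratedOpenCircuitBreaker += (sender, args) =>
    Console.WriteLine("Breaker {0} tolerated a failure without opening", config.BreakerId);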

Configuration Options

Make sure each circuit breaker has its own configuration injected using the CircuitBreakerConfig class.

using System;
using System.Collections.Generic;
using Helpful.CircuitBreaker.Events;

namespace Helpful.CircuitBreaker.Config
{
    /// <summary>
    ///
    /// </summary>
    [Serializable]
    public class CircuitBreakerConfig : ICircuitBreakerDefinition
    {
        /// <summary>
        /// Initializes a new instance of the <see cref="CircuitBreakerConfig"/> class.
        /// </summary>
        public CircuitBreakerConfig()
        {
            ExpectedExceptionList = new List<Type>();
            ExpectedExceptionListType = ExceptionListType.None;
            PermittedExceptionPassThrough = PermittedExceptionBehaviour.PassThrough;
            BreakerOpenPeriods = new[] { TimeSpan.FromSeconds(60) };
        }

        /// <summary>
        /// The number of times an exception can occur before the circuit breaker is opened
        /// </summary>
        /// <value>
        /// The open event tolerance.
        /// </value>
        public short OpenEventTolerance { get; set; }

        /// <summary>
        /// Gets or sets the list of periods the breaker should be kept open.
        /// The last value will be what is repeated until the breaker is successfully closed.
        /// If not set, a default of 60 seconds will be used for all breaker open periods.
        /// </summary>
        /// <value>
        /// The array of timespans representing the breaker open periods.
        /// </value>
        public TimeSpan[] BreakerOpenPeriods { get; set; }

        /// <summary>
        /// Gets or sets the expected type of the exception list. <see cref="ExceptionListType"/>
        /// </summary>
        /// <value>
        /// The expected type of the exception list.
        /// </value>
        public ExceptionListType ExpectedExceptionListType { get; set; }

        /// <summary>
        /// Gets or sets the expected exception list.
        /// </summary>
        /// <value>
        /// The expected exception list.
        /// </value>
        public List<Type> ExpectedExceptionList { get; set; }

        /// <summary>
        /// Gets or sets the timeout.
        /// </summary>
        /// <value>
        /// The timeout.
        /// </value>
        public TimeSpan Timeout { get; set; }

        /// <summary>
        /// Gets or sets a value indicating whether [use timeout].
        /// </summary>
        /// <value>
        ///   <c>true</c> if [use timeout]; otherwise, <c>false</c>.
        /// </value>
        public bool UseTimeout { get; set; }

        /// <summary>
        /// Gets or sets the breaker identifier.
        /// </summary>
        /// <value>
        /// The breaker identifier.
        /// </value>
        public string BreakerId { get; set; }

        /// <summary>
        /// Sets the behaviour for passing through exceptions that won't open the breaker
        /// </summary>
        public PermittedExceptionBehaviour PermittedExceptionPassThrough { get; set; }
    }
}
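
As a fuller example, here’s a sketch of a configuration built from the properties above: tolerate a couple of failures before opening, escalate the open period on repeated opens, and optionally enforce a timeout. The breaker id and the timeout values are my own assumptions about sensible settings, not prescriptions from the library.

// Sketch only: allow two failures before opening, then keep the breaker open for
// 30s, then 60s, then 5 minutes for every subsequent open (the last period repeats).
var config = new CircuitBreakerConfig
{
    BreakerId = string.Format("OrderService-{0}", Environment.MachineName), // hypothetical name
    OpenEventTolerance = 2,
    BreakerOpenPeriods = new[]
    {
        TimeSpan.FromSeconds(30),
        TimeSpan.FromSeconds(60),
        TimeSpan.FromMinutes(5)
    },
    UseTimeout = true,                   // assumption: treat slow calls as failures
    Timeout = TimeSpan.FromSeconds(10)
};

var breaker = new CircuitBreaker(config);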

Conclusion

This library has helped me build resilient microservices that have remained stable when half the internet has been falling over. I hope it can help you as well.