Flexible Test Doubles in Go

It’s been a bit since I’ve written on this blog about anything other than containers, but I’ve been reading a lot of new (to me) Go code lately and wanted to discuss unit testing.

I’m pretty firmly in the camp that testing is critical for building reliable, maintainable systems, and unit testing is an important component of that (though I do not believe it is sufficient on its own; integration, functional, or end-to-end testing is also often just as important). Unit testing is a somewhat special form of testing though, since the goal is to test the smallest functional unit of a system. One tool used as a part of unit testing that has been popular for as long as I’ve been employed as a software engineer is that of a test double.

Test doubles are objects that are used as stand-ins for a real object (such as a dependency) for testing a particular unit of code. There are different names for test doubles, and the one I first encountered when I started my career was a “mock” (in the form of the Mockito mocking framework for Java). I now refer to them as test doubles following Martin Fowler’s retelling of the concept from Gerard Meszaros’s book, which introduced a few different categories of test doubles (which I’ve reproduced below):

Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.

Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an InMemoryTestDatabase is a good example).

Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what’s programmed in for the test.

Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.

Mocks are pre-programmed with expectations which form a specification of the calls they are expected to receive. They can throw an exception if they receive a call they don’t expect and are checked during verification to ensure they got all the calls they were expecting.

Martin Fowler has an essay diving much deeper into the differences, which is worth reading, but isn’t really what I’m going to focus on here.

When I moved from being primarily a Java programmer to being primarily a Go programmer (2015ish), I searched for a Go equivalent of Mockito and ended up using gomock for a couple of years. However, there were some ergonomic issues that eventually pushed me away from gomock, such as the reliance on code generation via its mockgen tool. Meanwhile, I started making some of my first pull requests to Docker and one of the maintainers (who I now consider a friend) asked me to avoid a mocking framework and instead hand-write the mock (at least my memory is that this feedback came from Brian; it doesn’t appear to exist on GitHub so possibly we had been talking on Slack?). I ended up hand-writing a mock that used channels to pass expectations from the unit test to the mock, but that was fairly brittle and I was never happy with it.

Fast forward to a few years later, and I connected the dots that Go had first-class functions and I could build a test double using those instead. Since then, I’ve used that as the primary pattern whenever I needed a mock or a test double, and even went back to rip out the old channel-based mock and replace it in Docker. I’m sure others have also used this pattern, but I don’t believe I’ve seen it anywhere else or read about it elsewhere; if someone finds another example I’m happy to update this post and give credit where it is due!

Without further ado:

Using first-class functions for test doubles in Go

This pattern assumes that any dependencies used in the unit of code that we’re testing are modeled with Go interfaces. Using an interface allows us to inject a different implementation, such as our test double.

The general idea here is to create a struct that fulfills the interface of a dependency of the code you’re testing. This struct needs to have every method defined on the interface in order to fulfill it, but does not need to share any behavior with the real dependency. Instead, for each method it should have a field that is a func with the same signature as that method, and the method simply calls that func. This lets the test author provide the implementation as a part of the test body.

That description is fairly abstract, so let’s look at an example. Suppose we have a dependency responsible for computing Pi to some degree of precision specified by the caller, modeled as an interface like this:

type PiComputer interface {
    Calculate(precision int) float64
}

(For the sake of this example, let’s ignore all the fun things around this, like the fact that fixed-width floating point numbers have limited precision, or that it might be useful to only compute these things once and store the result, and so forth. This is just for fun as an example.)

We might then have a function that calls this method, perhaps as part of a pretty printer.

import (
	"fmt"
	"io"
	"strconv"
)

type myFormatter struct {
	computer PiComputer
}

var colors = map[int]string{
	0: "black",
	1: "red",
	2: "aqua",
	3: "yellow",
	4: "green",
	5: "blue",
	6: "purple",
	7: "lime",
	8: "silver",
	9: "fuchsia",
}

func (m *myFormatter) Pi(w io.Writer, precision int) {
	// error-checking omitted
	pi := m.computer.Calculate(precision)
	str := strconv.FormatFloat(pi, 'f', -1, 64)
	for _, c := range str {
		if c == '.' {
			fmt.Fprint(w, string(c))
			continue
		}
		color := colors[int(c-'0')]
		fmt.Fprintf(w, `<font color="%s">%s</font>`, color, string(c))
	}
}

Given a function like this, we might want to write a unit test that just tests the Pi function logic, without depending on the real PiComputer implementation. This is where a test double comes in handy.

For our test double, we can build a struct like this:

type TestDoublePiComputer struct {
    // CalculateFn is used to fulfill Calculate(int) float64
    CalculateFn func(int) float64
}

// assert that we fulfill PiComputer
var _ PiComputer = (*TestDoublePiComputer)(nil)

// Calculate is the method we replace
func (t *TestDoublePiComputer) Calculate(precision int) float64 {
    // call t.CalculateFn to use user-supplied behavior
    return t.CalculateFn(precision)
}

There are three things to note about the code above:

We define a struct with a member that is a func. For convenience, it’s named the same thing as the function we want to replace, with Fn on the end, but the name isn’t really important.
We have a compile-time assertion (assigning a nil *TestDoublePiComputer to a blank variable of type PiComputer) so that the build fails if the double no longer satisfies the interface. This will help us remember to update the test double in the future if the PiComputer interface changes.
The function body of the Calculate method just invokes our passed-in CalculateFn. This makes it so we can replace the body at runtime.
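
The same shape extends to interfaces with more than one method: one func field per method, and one thin method that forwards to it. The sketch below assumes a two-method KeyValueStore interface that I made up purely for illustration (it is not part of the Pi example), along with a hypothetical newReadOnlyStub helper that fills in both fields.

```go
package main

import "errors"

// KeyValueStore is a hypothetical two-method dependency, used only to
// illustrate the pattern on a larger interface.
type KeyValueStore interface {
	Get(key string) (string, error)
	Put(key, value string) error
}

// TestDoubleKeyValueStore has one func field per interface method.
type TestDoubleKeyValueStore struct {
	GetFn func(key string) (string, error)
	PutFn func(key, value string) error
}

// Compile-time check that the double still satisfies the interface.
var _ KeyValueStore = (*TestDoubleKeyValueStore)(nil)

// Each method forwards to the matching func field.
func (t *TestDoubleKeyValueStore) Get(key string) (string, error) {
	return t.GetFn(key)
}

func (t *TestDoubleKeyValueStore) Put(key, value string) error {
	return t.PutFn(key, value)
}

// errReadOnly is the canned error returned by the stubbed Put.
var errReadOnly = errors.New("store is read-only")

// newReadOnlyStub (hypothetical) serves Get from a map and rejects every Put.
func newReadOnlyStub(data map[string]string) *TestDoubleKeyValueStore {
	return &TestDoubleKeyValueStore{
		GetFn: func(key string) (string, error) {
			v, ok := data[key]
			if !ok {
				return "", errors.New("not found: " + key)
			}
			return v, nil
		},
		PutFn: func(key, value string) error {
			return errReadOnly
		},
	}
}
```

The boilerplate grows linearly with the number of methods, but each test still only fills in the fields it cares about. The rest of this post sticks with the single-method PiComputer.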

Let’s look at a test using TestDoublePiComputer, following the fairly common xUnit structure of setup, execution, verification, and teardown.

func TestPi(t *testing.T) {
	// 1. Setup
	// Construct our test double
	computer := &TestDoublePiComputer{
		CalculateFn: func(int) float64 {
			return 0
		},
	}
	// Construct the struct we're testing, and inject the test double
	f := &myFormatter{computer: computer}

	// 2. Execution
	buf := &bytes.Buffer{}
	f.Pi(buf, 0)

	// 3. Verify
	if buf.String() != `<font color="black">0</font>` {
		t.Errorf("wrong output: %q", buf.String())
	}

	// 4. Teardown - nothing to do
}

The example above shows how a test double function body can be written in-line with the test, and is an example of a stub, since it provides a canned response to the test input.
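
Because the stub is just a function value, it also slots into Go’s table-driven test style, with one stub per case. Here is a minimal sketch of that idea. The declarations from earlier in the post are repeated (slightly condensed) so the block compiles on its own; the render helper, the TestPiTable name, and the case names are my own additions for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strconv"
	"testing"
)

// The declarations down to Pi are repeated from the article so that this
// sketch compiles on its own.
type PiComputer interface {
	Calculate(precision int) float64
}

type TestDoublePiComputer struct {
	CalculateFn func(int) float64
}

func (t *TestDoublePiComputer) Calculate(precision int) float64 {
	return t.CalculateFn(precision)
}

type myFormatter struct {
	computer PiComputer
}

var colors = map[int]string{
	0: "black", 1: "red", 2: "aqua", 3: "yellow", 4: "green",
	5: "blue", 6: "purple", 7: "lime", 8: "silver", 9: "fuchsia",
}

func (m *myFormatter) Pi(w io.Writer, precision int) {
	pi := m.computer.Calculate(precision)
	for _, c := range strconv.FormatFloat(pi, 'f', -1, 64) {
		if c == '.' {
			fmt.Fprint(w, string(c))
			continue
		}
		fmt.Fprintf(w, `<font color="%s">%s</font>`, colors[int(c-'0')], string(c))
	}
}

// render is a hypothetical helper: it injects calculate as a stub and
// returns whatever Pi writes.
func render(calculate func(int) float64, precision int) string {
	f := &myFormatter{computer: &TestDoublePiComputer{CalculateFn: calculate}}
	buf := &bytes.Buffer{}
	f.Pi(buf, precision)
	return buf.String()
}

// TestPiTable runs the same assertion against a different stub per case.
func TestPiTable(t *testing.T) {
	cases := []struct {
		name string
		stub func(int) float64
		want string
	}{
		{"zero", func(int) float64 { return 0 }, `<font color="black">0</font>`},
		{"three", func(int) float64 { return 3 }, `<font color="yellow">3</font>`},
		{"one decimal", func(int) float64 { return 3.1 }, `<font color="yellow">3</font>.<font color="red">1</font>`},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := render(tc.stub, 0); got != tc.want {
				t.Errorf("wrong output: got %q, want %q", got, tc.want)
			}
		})
	}
}
```

Each row carries its own canned behavior, so adding a case means adding one line rather than another double.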

There are a few reasons I like this pattern:

It keeps the test double logic close to the test body. I can look directly above the execution or verification sections to see what exactly the double does, rather than guessing.
It’s just Go. A future developer familiar with Go can read the code without having to learn a new test framework or library. Similarly, it can be added to a new codebase without adding any dependencies; no generator is even needed. (It does have some boilerplate, but the boilerplate itself is fairly straightforward.)
Any of the five test double categories can be implemented this way. The test above already showed a stub; let’s look at the other four.

Dummies

A dummy is mostly about the usage (an object passed around but never used). Using the double from the example above, a dummy would be:

// no need to define any funcs if they're not called
dummy := &TestDoublePiComputer{}
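
One thing worth knowing about a dummy built this way: if the code under test does call a method whose func field was left nil, the call panics with a nil pointer dereference. For a dummy that is arguably the right outcome, since an unexpected call fails loudly instead of silently returning a zero value. The sketch below demonstrates the behavior; panicsWhenCalled is a hypothetical helper that exists only for this demonstration.

```go
package main

// Declarations repeated from the article so the sketch compiles on its own.
type PiComputer interface {
	Calculate(precision int) float64
}

type TestDoublePiComputer struct {
	CalculateFn func(int) float64
}

func (t *TestDoublePiComputer) Calculate(precision int) float64 {
	return t.CalculateFn(precision)
}

// panicsWhenCalled (hypothetical) reports whether calling Calculate on c
// panics. It recovers the panic so the caller can inspect the result.
func panicsWhenCalled(c PiComputer) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	c.Calculate(0)
	return false
}
```

With this helper, panicsWhenCalled(&TestDoublePiComputer{}) reports true, while a double with CalculateFn set reports false.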

Fakes

A fake has a working implementation, but that implementation doesn’t need to be complete. We might have a fake PiComputer like this:

func TestPi2(t *testing.T) {
	fake := &TestDoublePiComputer{
		CalculateFn: func(i int) float64 {
			switch i {
			case 0:
				return 0
			case 1:
				return 3
			case 2:
				return 3.1
			case 3:
				return 3.14
			}
			return 3.141
		},
	}
	f := &myFormatter{computer: fake}

	buf := &bytes.Buffer{}
	f.Pi(buf, 2)

	if buf.String() != `<font color="yellow">3</font>.<font color="red">1</font>` {
		t.Errorf("wrong output: %q", buf.String())
	}
}

Spies

A spy records information about how it was called so that the calls can be verified later in the test. An example of a spy could be:

func TestPi3(t *testing.T) {
	calls := []int{}
	spy := &TestDoublePiComputer{
		CalculateFn: func(i int) float64 {
			calls = append(calls, i)
			return 0
		},
	}
	f := &myFormatter{computer: spy}

	buf := &bytes.Buffer{}
	f.Pi(buf, 0)
	f.Pi(buf, 1)
	f.Pi(buf, 3)

	// Fatalf stops the test here so the loop below cannot index past the end of calls
	if len(calls) != 3 {
		t.Fatalf("wrong number of calls: %d, expected: %d", len(calls), 3)
	}
	expected := []int{0, 1, 3}
	for i := range expected {
		if calls[i] != expected[i] {
			t.Errorf("wrong value for call %d. got: %d, expected: %d", i, calls[i], expected[i])
		}
	}
}

This example is interesting to look at because the calls variable was defined in the scope of the test body, not in the scope of the spy itself. Because Go function literals are closures that can read and modify variables from the enclosing scope, the spy can append to calls without it being passed in explicitly. This allows the test double itself to communicate with the test body, and for interesting verification patterns to be used.
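
If several tests need the same recording behavior, the closure can be produced by a small helper instead of being written out each time. This is only a sketch: newSpy is a hypothetical name, and returning an accessor func for the recorded calls is one choice among several.

```go
package main

// Declaration repeated from the article so the sketch compiles on its own.
type TestDoublePiComputer struct {
	CalculateFn func(int) float64
}

func (t *TestDoublePiComputer) Calculate(precision int) float64 {
	return t.CalculateFn(precision)
}

// newSpy (hypothetical) returns a double that records every precision it is
// asked for and always answers with result, plus a func that returns the
// calls recorded so far. Both closures share the same calls variable.
func newSpy(result float64) (*TestDoublePiComputer, func() []int) {
	var calls []int
	spy := &TestDoublePiComputer{
		CalculateFn: func(precision int) float64 {
			calls = append(calls, precision)
			return result
		},
	}
	return spy, func() []int { return calls }
}
```

The trade-off is that the recording logic moves away from the test body, so I would only reach for a helper like this once the same spy shows up in several tests.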

Mocks

A mock has pre-programmed expectations. We can write one using this same pattern, and expand on the idea of passing variables between the test body and the mock logic here too:

func TestPi4(t *testing.T) {
	callCount := 0
	expected := []int{0, 1, 3}
	mock := &TestDoublePiComputer{
		CalculateFn: func(i int) float64 {
			callCount++
			if callCount > len(expected) {
				t.Errorf("called too many times: %d", callCount)
				return 0
			}
			if i != expected[callCount-1] {
				t.Errorf("wrong argument, got: %d, expected: %d", i, expected[callCount-1])
			}
			return 0
		},
	}
	f := &myFormatter{computer: mock}

	buf := &bytes.Buffer{}
	f.Pi(buf, 0)
	f.Pi(buf, 1)
	f.Pi(buf, 3)

	if callCount != len(expected) {
		t.Errorf("call count incorrect, got: %d, expected: %d", callCount, len(expected))
	}
}

In this example, the mock participates in validation both through the expected number of calls and through argument validation.
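
The expectation logic can also be pulled into a helper when more than one test needs it. The following is a sketch under my own assumptions: expectCalls is a hypothetical name, and it takes the failure reporter as a plain func so that a test can pass t.Errorf directly.

```go
package main

// Declaration repeated from the article so the sketch compiles on its own.
type TestDoublePiComputer struct {
	CalculateFn func(int) float64
}

func (t *TestDoublePiComputer) Calculate(precision int) float64 {
	return t.CalculateFn(precision)
}

// expectCalls is a hypothetical helper. It returns a CalculateFn that checks
// each call against want, in order, and a verify func to run once the code
// under test has finished. Failures are reported through fail, which in a
// real test would be t.Errorf.
func expectCalls(fail func(format string, args ...interface{}), want ...int) (fn func(int) float64, verify func()) {
	calls := 0
	fn = func(precision int) float64 {
		calls++
		if calls > len(want) {
			fail("called too many times: %d", calls)
			return 0
		}
		if precision != want[calls-1] {
			fail("wrong argument, got: %d, expected: %d", precision, want[calls-1])
		}
		return 0
	}
	verify = func() {
		if calls != len(want) {
			fail("call count incorrect, got: %d, expected: %d", calls, len(want))
		}
	}
	return fn, verify
}
```

In a test this would look like fn, verify := expectCalls(t.Errorf, 0, 1, 3), followed by building the double with CalculateFn: fn and calling verify() after the code under test has run. Passing the reporter as a func keeps the helper itself easy to exercise with a recording closure.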

I really like this technique as I think it makes tests more readable and maintainable to have the test double logic easily controlled and close to the test body itself. I hope this article has been helpful, and I look forward to seeing this pattern show up in more codebases.