Refactoring Legacy Code In Practice – Iteration 2 – Building a Golden Master

Building a test suite for legacy code can be daunting, so most of us usually approach legacy code in one of two ways:

  1. If it ain’t broke don’t fix it
  2. Refactor without automated tests and hope for the best

But there is a third option that gives us the safety net of a good test suite without the slowness of writing characterization tests by hand: using the Golden Master technique to build them.

Iteration Objective

In the earlier post I discussed the first iteration of the legacy code retreat, in which you have to explore the code and understand its purpose. As you may have guessed, the purpose of the second iteration is to build a test suite using the Golden Master technique. It’s a smaller iteration, as it should last just 20 minutes, but I ran into some issues and it took me longer to figure out how to build the tests, probably because I’m doing these exercises alone and not pair programming as I should. Anyway, here is the procedure I followed to build the Golden Master:

  1. Find all possible inputs, both direct and indirect
  2. Find all possible outputs, both direct and indirect
  3. Find a way to record every output for any given input
  4. Play all inputs and verify that the outputs correspond to what was previously recorded.

If you want to follow along you can find the code here.

Find all the possible inputs

All inputs to the Game object are produced by the System.Random class, so we have two ways to approach this. We can either record all the values generated by the Random object, or we can find a way to make it generate the same sequence of random numbers every time. The former approach has two disadvantages:

  1. Too many inputs to log
  2. I have to exclude the call to rand.Next() when testing

These two aren’t really huge impediments, but there is a way to solve the problem with fewer changes to the code: seeding the random number generator. How does this work? The System.Random class is a pseudo-random number generator, and one of its properties is that, given a seed, the sequence of numbers it returns is deterministic. So if during our tests we seed the generator with the same number used while recording the data, we can be sure that the Random object will return the same sequence, and thus the roll() method will be called with exactly the same values.

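// GoldenMaster.cs
// The call to GameRunner: run the game once per seed
for (int i = 0; i < 100; i++)
{
    args[1] = i.ToString();
    GameRunner.Main(args);
}

// GameRunner.cs
// Read the seed from the second argument and seed the generator with it
var seed = int.Parse(args[1]);
var rand = new Random(seed);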

The for loop in the code above is where I build the input sequence, calling the Main function 100 times with a different seed for the random generator each time. This way I have to log just the seed and not every number generated.
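To make the seeding argument concrete, here is a minimal sketch showing that two generators built from the same seed produce the same sequence. The SeedDemo class and the 1 to 6 roll range are assumptions made for illustration and are not part of the kata's code.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the kata: shows that a seeded
// System.Random always replays the same sequence.
public static class SeedDemo
{
    // Returns the first `count` rolls produced by a generator seeded
    // with `seed`. The 1..6 range is an assumption for illustration.
    public static int[] Rolls(int seed, int count)
    {
        var rand = new Random(seed);
        return Enumerable.Range(0, count)
                         .Select(_ => rand.Next(1, 7)) // upper bound is exclusive
                         .ToArray();
    }
}
```

Calling SeedDemo.Rolls(42, 20) twice should return two identical arrays. One caveat worth keeping in mind: the .NET documentation does not guarantee that the algorithm behind Random stays the same across major versions, so it is probably safest to record and verify the golden master on the same runtime.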

Find all possible direct or indirect outputs

To be able to verify that the program creates the same output for any given input, I need to record all the outputs of the system. Luckily, the program doesn’t call any external libraries and doesn’t write anything to a database or hard drive. It just prints strings to the console, so it has just one type of output: Console.WriteLine.

Find a way to record every output for any given input

Knowing that all possible outputs are just calls to Console.WriteLine greatly simplifies the step of recording the outputs for every given input. As we know, Console.WriteLine is a static method, and static methods are hard to stub. Normally, when you encounter a static method that you want to stub, you wrap it in an isolated class or in a protected method of the same class and then you stub it in your tests. But isolating all the calls to Console.WriteLine can introduce bugs into the program if you don’t have any tests to check what you are doing.
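For reference, the wrapping approach described above might look roughly like this. Greeter and RecordingGreeter are hypothetical names invented for the sketch; the real Game class is not structured this way.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a class that prints directly to the console.
public class Greeter
{
    public void Greet(string name)
    {
        Print(name + " was added");
    }

    // The static call is isolated here so that a subclass can replace it.
    protected virtual void Print(string message)
    {
        Console.WriteLine(message);
    }
}

// Test double that records the messages instead of printing them.
public class RecordingGreeter : Greeter
{
    public List<string> Messages { get; } = new List<string>();

    protected override void Print(string message)
    {
        Messages.Add(message);
    }
}
```

A test would create a RecordingGreeter, call Greet("Chet") and inspect Messages. The catch is that every Console.WriteLine call in the production code has to be edited by hand first, which is exactly the untested change we are trying to avoid.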

Luckily there is a better way to stub the call to Console.WriteLine: changing the default output stream (Console.SetOut in .NET) so that it writes to a log file instead of printing to the console:

var file = File.CreateText("gold.txt");
Console.SetOut(file); // Console.WriteLine now writes to gold.txt
// call to GameRunner.Main
file.Flush();
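When redirecting the console it is usually worth restoring the original writer afterwards, so that later output does not end up in the file or hit a closed writer. A minimal sketch, assuming a hypothetical ConsoleCapture helper that is not part of the original code:

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the kata.
public static class ConsoleCapture
{
    // Runs `action` with Console.Out redirected to the file at `path`,
    // then restores the previous writer even if the action throws.
    public static void ToFile(string path, Action action)
    {
        var original = Console.Out;
        using (var file = File.CreateText(path))
        {
            Console.SetOut(file);
            try
            {
                action();
            }
            finally
            {
                Console.SetOut(original);
            }
        } // disposing the writer flushes and closes the file
    }
}
```

It could be used as ConsoleCapture.ToFile("gold.txt", () => GameRunner.Main(args)), with args built as in the loop shown earlier.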

Play all inputs and check the results

After we change the output to write to a text file, running our test records all the inputs and outputs of 100 runs of the game, and the resulting file looks like this:

Chet was added
They are player number 1
Pat was added
They are player number 2
Sue was added
They are player number 3
Chet is the current player
They have rolled a 4
Chet's new location is 4
The category is Pop
Pop Question 0
Question was incorrectly answered
Chet was sent to the penalty box
Pat is the current player
They have rolled a 4
Pat's new location is 4
The category is Pop
Pop Question 1
Answer was corrent!!!!
The category is Science
Science Question 4
Answer was corrent!!!!
Sue now has 6 Gold Coins.

You can notice that each game log starts with a series of asterisks, then our seed for the random number generator, then a series of dashes, and then it starts recording all the outputs of the game. When the game ends I close the block with a series of underscores (which are probably useless but, who cares at this point?)
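The block layout just described could be produced by something like the following sketch. The GoldenMasterRecorder name, the delimiter lengths and the runGame callback (standing in for the call to GameRunner.Main) are all assumptions; the post only says "a series of" each character.

```csharp
using System;
using System.IO;

// Hypothetical recorder, not the code from the post.
public static class GoldenMasterRecorder
{
    // Delimiter lengths are assumptions made for this sketch.
    public const string BlockStart = "**********";
    public const string SeedEnd = "----------";
    public const string BlockEnd = "__________";

    // Writes one delimited block per seed to `output`. `runGame` stands in
    // for the call to GameRunner.Main and is expected to print to Console.
    public static void Record(TextWriter output, int runs, Action<int> runGame)
    {
        var original = Console.Out;
        Console.SetOut(output);
        try
        {
            for (int seed = 0; seed < runs; seed++)
            {
                Console.WriteLine(BlockStart);
                Console.WriteLine(seed);
                Console.WriteLine(SeedEnd);
                runGame(seed);
                Console.WriteLine(BlockEnd);
            }
        }
        finally
        {
            Console.SetOut(original);
            output.Flush();
        }
    }
}
```

Keeping the seed inside each block means the verification step can read its inputs from the same file that holds the expected outputs.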

Testing the recorded data

After building our golden master we have to build a test suite that uses the recorded data to check that the system behaves correctly. The test built during this step is going to help us refactor the code, giving us a safety net that ensures we didn’t break the current behavior while refactoring: we have a good amount of data representing what the behavior of the system should be, and at every run, if the tests pass, we can be confident that the software still behaves that way.

As when building the golden master, during this test we want to access all the outputs for every possible input, so we change the console’s output stream. This time we don’t need to log anything, so we just use a memory stream.

var stream = new MemoryStream();
var writer = new StreamWriter(stream) { AutoFlush = true };
Console.SetOut(writer); // capture Console.WriteLine output in memory

Then we read the first input from the log and execute the program:

GameRunner.Main(new[] { "false", currentSeed });

Then we iterate through the log entries, checking that the output recorded in the memory stream equals the one recorded in the file:

stream.Position = 0;
var reader = new StreamReader(stream);
foreach (var line in file)
{
    // ...
    Assert.AreEqual(line, reader.ReadLine(), "Seed " + currentSeed);
    // ...
}

We iterate through the whole file, and if we get no errors the creation of the Golden Master was successful.
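Putting the verification steps together, a self-contained version might look like the sketch below. It makes several assumptions: the delimiters match whatever the recording step wrote, runGame stands in for GameRunner.Main, a StringWriter replaces the MemoryStream for brevity, and mismatches are collected in a list rather than raised through a test framework's Assert.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical checker, not the test from the post.
public static class GoldenMasterCheck
{
    // Delimiters are assumed to match whatever the recording step wrote.
    const string BlockStart = "**********";
    const string BlockEnd = "__________";

    // Replays every recorded seed through `runGame` and returns a message
    // for each line that differs from the recording (empty list = pass).
    public static List<string> Verify(IEnumerable<string> golden, Action<int> runGame)
    {
        var failures = new List<string>();
        var original = Console.Out;
        using (var lines = golden.GetEnumerator())
        {
            while (lines.MoveNext())
            {
                if (lines.Current != BlockStart) continue;
                lines.MoveNext();
                int seed = int.Parse(lines.Current);
                lines.MoveNext(); // skip the line of dashes after the seed

                // Run the game for this seed with the console captured.
                var captured = new StringWriter();
                Console.SetOut(captured);
                try { runGame(seed); }
                finally { Console.SetOut(original); }

                // Compare the recorded lines with the fresh output.
                var actual = new StringReader(captured.ToString());
                while (lines.MoveNext() && lines.Current != BlockEnd)
                {
                    var got = actual.ReadLine();
                    if (got != lines.Current)
                        failures.Add($"Seed {seed}: expected '{lines.Current}', got '{got}'");
                }
                if (actual.ReadLine() != null)
                    failures.Add($"Seed {seed}: extra output after the recorded lines");
            }
        }
        return failures;
    }
}
```

A test could then assert that GoldenMasterCheck.Verify(File.ReadLines("gold.txt"), runGame) returns an empty list, where runGame wraps the call to GameRunner.Main for a given seed. Reporting the seed with each mismatch makes it easy to replay just the failing game.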

The next step

In the next iteration I’m going to have some real fun, as I can finally refactor with a little more confidence. I think I’m going to practice Subclass to Test as suggested in this blog post.

In the meantime you can download the source here.


Author: Daniele Pozzobon

Daniele is an aspiring software craftsman and Scrum Master with more than ten years of experience in the software industry.
He is currently working on amazing solutions in the manufacturing industry helping with the development of a DevOps culture.
He constantly annoys his friends by talking about software and is passionate about Agile methodologies and DevOps, which gives him more opportunities to annoy his friends even more.
When there are no friends around to annoy, he blogs on CodeCleaners, and in his free time he loves to go hiking with his wife and two daughters.