Recently I’ve been maintaining a project with poorly written code that, to make it worst, is the core to most of the activities of its users. The project follows the big ball of mud pattern, almost every method is a huge static method that sometimes take even 20 parameters and that uses the session as a global variable container.
When I started working with this thing I decided to read Working Effectively With Legacy Code by Michael Feathers (which I HIGHLY recommend if you have been working with bad code lately) and it gave me a very good guidance on how to tackle this beast. Thanks to the suggestions on the book I’ve developed a workflow that have given me a way to clean the code one step at a time and reduce the huge methods into more manageable classes, without breaking too many things.
I may need to read again the book to freshen up the ideas and to tweak my refactoring process but, for now, this is how I’m curing this patient.
1. Isolate the Megamoth
As Uncle bob suggests, a huge method oftentimes is just a class in disguise, the idea being that classes should do one thing, and mega methods normally do one or more things.
The first step then is to extract the megamoth to a class with one public function to have all the code in one smaller and more manageable file. Obviously, I have to give a name to this new class but, at this point of the refactoring phase, it doesn’t really matters if the name is ugly, long or not really descriptive: it’s just a placeholder. Of course, I try my best to come up with a good name, but if I don’t understand the class really well I use the name of the method it was extracted from, maybe rewording it a little, just to get things going, I’ll rename it later when I have a better understanding of the purpose of the class.
Another point to notice is that the constructor will not have any parameters: all parameters are going to be passed to the only public method of the class
2. Group together functionality as protected methods
The second step I do is to extract protected methods from the main method in order to group together pieces of functionality. The reason to do so is to have a clear view of what the public method does.
How do I decide on how to group functionality?
- first, I look for uses of the keyword region in the code. The previous developers used regions, instead of method extraction, to group functionality inside a method.
- second I look for the comments that were used to try to explain what the code does
- finally I use try, for and if blocks.
This procedure also helps me when naming the extracted methods. Regions already have a name that explains what they are grouping.
Comments too help me understand the intent of the block of code, or the reason why it’s there, or the subject and date of the email that requested the change… or the name of the guy that made the request…. ok, comments tend to be pretty dumb but at least, sometimes, they give me some help in coming up with a name.
At this point the method signature will have a huge parameter list with both output and reference parameters. I don’t care, this parameters will be helpful later, so keep on reading.
Finally all method are extracted as protected virtual to be able to override them because of reasons…
3. Isolate dependencies
Once again I extract protected virtual methods but this time I extract just one or two lines of code wrapping calls to methods exposed by external libraries. The end goal of this step is to isolate all the code over which I don’t have any control that hurts the testability of the code.
If I’m very lucky the dependency wont be a call to a static method but a call to a method of an actual object. In this case, I extract the object and inject it in the constructor. If I’m very very lucky, the class implements an interface, whenever this happens I celebrate with a glass of whisky when I get back home… sadly I haven’t drank any whisky lately.
4. Start building a test harness
Now that I have divided all the code into manageable sizes I start building a test harness for the whole class. By having covered the class with tests I’m able to refactor it later knowing that I didn’t introduce any regression.
I start by extending the class I’m going to test and exposing the method that I’m willing to test with a public version of it. Then I override all of its dependencies and use this overridden methods as stubs or spies depending on the need. I build the characterization tests for the chosen method by forcing the execution of all its branches.
A mocking framework like Moq is really helpful at this stage
5. Improving the methods
After I covered a good part of the class and am confident of not breaking functionality I start cleaning it up a little.
First of all I try to remove all duplication by, you guessed it, extracting private methods and using them in all the parts of the code that were clearly built copy pasting wildly.
Secondly, I start renaming variables and method name to explain better the intent of the code, now that I have built a test harness I also have a better understanding of what the code does and thus am able to come up with better names.
Then I try to improve the readability and the structure of the code by limiting its level of indentation and using LINQ whenever I feel it’s more readable this way.
Output parameters are grouped into DTOs that become the return type of the method.
One final step is to group together methods that have a lot of input parameters in common. If two methods share most of their parameters this means that they share the same state and they belong in a class that encloses this state.
This step has the added benefit of allowing you to delete the parameters from the methods signature and promote them to public fields.
6. Removing Globals
Using global variables has been considered harmful for a long time now, but you can still find some legacy code that uses them and this is certainly the case for the codebase I’m currently maintaining.
Every time the extracted class uses a global variable I extract it so that the refactored code doesn’t directly depends on it. If the new class only uses the value held by the global variable I create a public field in the class and assign this value in the caller method after the class gets initialized.
If the refactored class sets the value of the global variable, i tend to wrap the global variable inside a class whose purpose is to manage crud operations on it and then I pass this wrapper to the extracted class. In this way the dependency on the global variable is hidden and managed in a higher level of abstraction and eventually, over time, I’m going to be able to get rid completely of the them.
7. Renaming everything and commenting
Now that I have a better understanding of the extracted class I can rename the methods and the classes accordingly.
During this phase I also add the documentation comments to help future maintainers of this code base.
Working with really huge methods
Sometimes I find myself dealing with a really big method that is so huge that just one small section of it could become a class all by itself. Or maybe worst: a single method that have the same 200 lines of code repeated over and over again.
In this cases I tend to extract classes from just this pieces of functionality and apply to then all the steps above. This give me the advantage of a more focused and faster refactoring, meaning that I can refactor smaller pieces of code more often maintaining a good rate between implementing new stuff and cleaning the old one.
Other quick wins
If I don’t have time to invest in a complex refactoring like the one I described earlier, I try, anyways, to make little improvements to the code by checking in cleaner code that the one I checked out. Some simple refactoring are:
- renaming variables and methods to clarify its behavior
- deleting commented out code
Working with messy code can be difficult, but if you were condemned by your manager to maintain a legacy application you have nothing to do but to bite the bullet and try you best to simplify your future work. The steps described above can help you to cut the complexity of the system safely but if you have no time at all do a complete refactoring of a method you could still do the first steps and leave the tests and more aggressive refactoring to another moment when you’ll have more time.
The important thing to remember is that all new code should be tested and that it’s important to clean the code constantly and, if necessary, without asking permission.
If you share the pain of working with megamoths and want to get tips on how to clean up your code don’t lose a second more and subscribe to our mailing list!
Author: Daniele Pozzobon
He constantly annoys his friends by talking about software and is passionate about Agile methodologies, which gives him more opportunities to talk annoy his friends even more.
When there are no friends around to annoy, he blogs on CodeCleaners and in his free he time loves go hiking with his wife and two daughters.