Working with legacy code

How to work effectively with legacy code

I finished reading the book Working Effectively with Legacy Code by Michael Feathers.

In this article, I cover the approach I would take when working with legacy code, along with a few ideas from the book that I found useful.

What is legacy code?

In the industry, the term legacy code is often used for code that is hard to work with, difficult to understand, and hasn't been touched in a while.

Eli Lopian, CEO of Typemock, has defined it as "code that developers are afraid to change".

Michael Feathers defines legacy code simply as code without tests. You may wonder what tests have to do with whether code is good or bad.

Let him enlighten us:

Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.

I agree with this statement. Tests give us evidence that our software does what it should when people use it, and we can only refactor the existing code with confidence if tests are in place.

Refactoring is a powerful technique for improving existing code. I suggest you read my last article, which is about refactoring and its importance. I also recommend the book "Refactoring: Improving the Design of Existing Code"; the 2nd edition is phenomenal and even includes examples in JavaScript.

Legacy code and customers

Users depend on the behavior of our software. They may be happy if we add behavior that solves yet another problem they have, but if we unintentionally change or remove behavior they rely on (that is, introduce bugs), they stop trusting us.

Preserving the existing behavior of software is one of the greatest challenges in software development, and one of the most important.

How do we work with legacy code?

As someone who works as a Frontend Developer, working heavily with JavaScript and the UI library ReactJS, I wouldn't take the approach to legacy code that the book introduces.

If I had to work with code that is difficult to understand, has no tests, and is hard to work with, for example to add a new feature to it, this is how I would approach it:

  1. Understand the code and architecture

  2. Preserve existing behavior

  3. Improve existing code

  4. Add the new feature

Let's go over each step, one by one.

Understanding code and architecture

It can be difficult at first to understand the code and the architecture of the software.

Various things can help with that, but I want to mention two techniques that I personally found very useful.

Notes/Sketching

If the code you are reading is difficult to understand, it pays off to start drawing pictures and making notes.

If you see a function, write it down. If the function calls another function, check what that function does and draw a line between the two to represent their relationship.

By sketching and making notes, we can often see things in another way. It’s also a good way of maintaining our mental state when we are trying to understand something particularly complicated.

A great site you can use to sketch is Excalidraw.

Scratch refactoring

One of the best techniques to start understanding code is just to play around with it. Get in there and start refactoring: move code around, extract functions, and try to make the code clearer.

The primary goal here is to better understand the code. Eventually we will have to undo our changes, because without tests we don't know whether we have broken something somewhere in the software.

Just keep refactoring the code and trying to make it clearer, and don't worry about tests. With Git, we can easily discard all of the changes we've made.

Preserving existing behavior

Higher-level tests

As mentioned above, preserving the existing behavior of software is extremely important; we don't want to lose the trust of our customers.

As a Frontend Developer, I would personally start with higher-level tests that resemble how users interact with the software, meaning integration and end-to-end (E2E) tests. I wouldn't add tests for everything, only around the area where I want to add the new feature.

Depending on how small or large the software is, that can sometimes give us enough confidence, but in other cases it is not enough.

Characterization Tests

In order to preserve the existing functionality of a piece of code, I would write characterization tests.

The book describes such tests as:

The tests that we need when we want to preserve behavior are what I call characterization tests. A characterization test is a test that characterizes the actual behavior of a piece of code. There’s no “Well, it should do this” or “I think it does that.” The tests document the actual current behavior of the system.

Now, you may think that one way to preserve the existing behavior is to look at the user stories that were previously written and write proper tests for the desired outcome. There is a problem with that.

If we write tests based on our assumption of what the system is supposed to do, we’re back to bug finding again.

Our goal is not to find bugs but to preserve the existing behavior, so that we can refactor the code with confidence (improve the existing code) before adding the new feature.

The book introduces an algorithm for writing characterization tests:

  1. Use a piece of code in a test harness.

  2. Write an assertion that you know will fail.

  3. Let the failure tell you what the behavior is.

  4. Change the test so that it expects the behavior that the code produces.

  5. Repeat.

When we can see what the pieces do, we can use that knowledge along with our knowledge of what the system is supposed to do to make changes.

If you find behavior that looks like a bug, mark the test as suspicious. Later, once you know what the software is supposed to do, you can fix it. As mentioned, our current tests are only there to preserve how the software currently behaves.

The code itself can give us ideas about what it does, and if we have questions, tests are an ideal way of asking them.

Through characterization tests, we will also have a better understanding of how the software currently works.

The Method/Function Use Rule

As a Frontend Developer, I'm not a fan of this rule, because it leads to testing implementation details rather than how our software is actually used, which gives us less confidence than tests that resemble our users.

Nevertheless, I still want to mention it: not everyone is a Frontend Developer, and I think this rule can be useful for many other types of developers.

The rule as described in the book:

Before you use a method in a legacy system, check to see if there are tests for it. If there aren’t, write them. When you do this consistently, you use tests as a medium of communication. People can look at them and get a sense of what they can and cannot expect from the method. The act of making a class testable in itself tends to increase code quality. People can find out what works and how; they can change it, correct bugs, and move forward.

Improving existing code

For this step, I refer you to my previous article on refactoring, the art of improving existing code without changing the behavior of the software.

Adding the new feature

When it comes to adding the new feature, I love TDD (Test-Driven Development). I especially love how it allows us to focus on one thing at a time: we are either adding the new feature or refactoring, not both at once. Trying to do multiple things at the same time and keeping them all in your head wastes mental energy, and you will likely not do any of them well.

As a Frontend Developer, I love Cypress-driven development: I write a test in Cypress that resembles the user, watch it fail, add the feature, and once the test passes, refactor the code I've added.

Conclusion

Working with legacy code is tough, but it is not impossible!

I hope this article was helpful. I certainly benefited from writing it, and I will likely come back to it if I find myself working with code that has no tests and is difficult to understand.