Code cleaning - waste of time or good investment?

Refactoring

Before we get into code cleaning, I should talk a bit about refactoring.

So what is refactoring? It's a way of modifying your code so that it meets new standards or APIs, or is simply structured differently, while still doing the same thing. When you refactor, the behavior of your program should not change - the only difference should be in the code itself.

The main reason for refactoring is to make your code future-proof. It can open up more possibilities by providing an additional API for your classes, or do the complete opposite: remove unused parts of the API to make your code better encapsulated.

Aging of code

Some developers may say that you should write your code in such a way that you never need to refactor it. They are right, but that does not mean your code will always be that perfect. We are only human, and humans make mistakes. As we know, problems have a tendency to grow bigger and bigger - a small mistake that seems trivial during the early stages of development may come back to bite you a few years later, when it's too late to make changes.

Another thing you should remember is that people's needs change over time. You may be making a simple shop application that only accepts credit cards, but that does not mean it will stay that way forever. As the app grows, your client may also want to support bank transfers or PayPal - and then you'll face the problem of how to make this work. Most people will just apply a dirty patch that adds the new payment system, but as I said, you might regret it in the future.

Future proofing

So what is the solution to that problem? One of the solutions is “future-proof programming”. When you plan ahead and consider what features your app may need in the future, you can prepare the required APIs up front. But let's be honest - you can't plan everything, and you are definitely not able to predict everything that the future brings.
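To make the idea of "preparing the required APIs" more concrete, here is a minimal sketch based on the payment scenario above. The names PaymentMethod, CreditCardPayment, BankTransferPayment and checkout() are hypothetical and chosen only for illustration; the sketch assumes PHP 7.4 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: every payment method exposes the same operation.
interface PaymentMethod
{
    public function pay(int $amountInCents): string;
}

// The only payment method the shop started with.
class CreditCardPayment implements PaymentMethod
{
    public function pay(int $amountInCents): string
    {
        return "Paid {$amountInCents} cents by credit card";
    }
}

// Added later, when the client asked for it; nothing else has to change.
class BankTransferPayment implements PaymentMethod
{
    public function pay(int $amountInCents): string
    {
        return "Paid {$amountInCents} cents by bank transfer";
    }
}

// The checkout depends only on the interface, not on a concrete payment type.
function checkout(PaymentMethod $method, int $amountInCents): string
{
    return $method->pay($amountInCents);
}

echo checkout(new CreditCardPayment(), 1999), PHP_EOL;
echo checkout(new BankTransferPayment(), 1999), PHP_EOL;
```

Because checkout() only knows about the interface, a new payment method is an addition rather than a patch. Whether such an interface is worth writing up front still depends on how likely the change really is.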

Code cleaning

So what can we do? The answer to that question is “code cleaning”.

Let's have a look at a few examples and think about what to do in specific situations:

Example 1: A car shop

Let's say we are working on a car shop. We started by making a Car class and a few classes that extend Car:

// Method bodies are omitted for brevity.
class Car
{
    public function getColor() { /* ... */ }
    public function getEngine() { /* ... */ }
    public function getNumberOfDoors() { /* ... */ }
}

class SportsCar extends Car
{
    public function getMaxSpeed() { /* ... */ }
}

class Truck extends Car
{
    public function getCapacity() { /* ... */ }
}

Most programmers will call that a very decent piece of code. Let's say that after a year our shop started selling motorcycles as separate products.

class Motorcycle
{
    public function getColor() { /* ... */ }
}

And after one more year, they also started to sell bicycles:

class Bicycle
{
    public function getColor() { /* ... */ }
    public function hasBicycleCarrier() { /* ... */ }
}

During each of those steps, we made additions to the code that are nice and clean on their own, but is the quality of the code as a whole as good as it was before? Let's say that we now need to add getBrand() to every product, which can cause us some problems.

This is just a simple example, but I am sure most programmers have already been in situations where their code kept changing. Over time they surely found out that their once perfect code had become fragile.

How can we solve it? Well, one way is to add this function to each class and do three times more work - and the next time something similar happens it will cost us even more time. Or we can just do some code cleaning:

// Method bodies are omitted for brevity.
class Vehicle
{
    // Now we can easily add this function here
    public function getBrand() { /* ... */ }

    // Each of the vehicles had the same function, so why not make it shared?
    public function getColor() { /* ... */ }
}

class Car extends Vehicle
{
    public function getEngine() { /* ... */ }
    public function getNumberOfDoors() { /* ... */ }
}

class SportsCar extends Car
{
    public function getMaxSpeed() { /* ... */ }
}

class Truck extends Car
{
    public function getCapacity() { /* ... */ }
}

class Motorcycle extends Vehicle
{
}

class Bicycle extends Vehicle
{
    public function hasBicycleCarrier() { /* ... */ }
}

As you can see, we haven't changed the functionality at all. What we did was future-proof our code. Thanks to that, we were able to put getBrand() in one place, which reduced the amount of work we had to do. As a side effect, we also merged all 3 getColor() functions into one, which leaves less duplicated code to maintain.

Example 2: Templates

Let's use another example.

We are making a website that has 3 sections, each with two different pages. This is how its file structure might look (">" means extends):

core.php
section1/template.php > ../core.php
section1/page1.php > section1/template.php
section1/page2.php > section1/template.php
section2/template.php > ../core.php
section2/page1.php > section2/template.php
section2/page2.php > section2/template.php
section3/template.php > ../core.php
section3/page1.php > section3/template.php
section3/page2.php > section3/template.php

Someone decided to slightly modify template 2 and template 3, so now both look just like template 1. Even though we have not copied any code or rebuilt anything that already existed, we are now violating the DRY (Don't Repeat Yourself) rule.

The solution is to remove templates 2 and 3 and then refactor the pages in sections 2 and 3 to point to template 1. What's more, you can remove those sections completely, because they no longer make any difference.

Situations like the ones described in my examples happen quite often during programming, so it's a good idea to know how to deal with them.

When to clean code?

The examples described here are simplified, so it may seem obvious how and when to clean the code in those scenarios. In real programs it won't be that easy - you will often notice issues only when it is already too late to do much about them. To prevent that, it's a good idea to clean your code on a regular basis. I typically do it after approximately 50 to 100 hours of programming. If you work in an Agile style, you can treat code cleaning as an extra phase between programming and testing.

How to clean code?

Now it's time for the hardest questions - what exactly should be cleaned, and how?

A typical code cleaning process can be split into five phases:

Remove unused code

The first step in every code cleaning should be the removal of everything you don't need. Every line of code that does nothing will only cause confusion, both during development and in the later phases of code cleaning. If you use unit tests together with a tool that reports code coverage, it may help you pinpoint candidates for removal: code that is never executed is worth a closer look.

Be aware that when code is no longer used by your application but is part of your software's public API, removing it may break third-party applications that rely on it. Instead of removing that code, mark it as deprecated. If your programming language allows it, move it to a separate place (a file, a folder, or the bottom of the class) and use IDE markers to hide it (you don't need to see it during development).
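One way to mark code as deprecated in PHP is sketched below. It combines the @deprecated docblock tag, which IDEs and documentation tools commonly recognize, with a runtime notice raised through trigger_error() and E_USER_DEPRECATED. The Vehicle class and its colour() method are hypothetical; the sketch assumes PHP 7.4 or newer.

```php
<?php
declare(strict_types=1);

class Vehicle
{
    private string $color;

    public function __construct(string $color)
    {
        $this->color = $color;
    }

    public function getColor(): string
    {
        return $this->color;
    }

    // ---- Deprecated API, kept at the bottom of the class ----

    /**
     * @deprecated Kept only for third-party callers; use getColor() instead.
     */
    public function colour(): string
    {
        // Raises a deprecation notice; execution continues afterwards
        // unless an error handler decides otherwise.
        trigger_error('Vehicle::colour() is deprecated, use getColor()', E_USER_DEPRECATED);

        return $this->color;
    }
}

$vehicle = new Vehicle('red');
echo $vehicle->getColor(), PHP_EOL;
```

Old callers of colour() keep working, but they are told at runtime that it is time to migrate.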

Find and extract interfaces

Now that we have dealt with dead code, we need to find fragments of code that do the same thing, just like in the previous examples. Typically you will create a common parent for those classes, either an interface or an abstract class. Other places worth searching are methods and functions. You can often find parts that do the same thing and could be extracted into a separate Utils class, or simply moved to a more convenient place where the same code can be reused instead of repeated.
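As a small illustration of extracting repeated code into a Utils class, consider two classes that used to format amounts of money on their own. The names MoneyUtils, Invoice and Receipt are made up for this sketch, and it assumes non-negative amounts stored in cents.

```php
<?php
declare(strict_types=1);

// Before cleaning, Invoice and Receipt each carried a copy of this logic.
final class MoneyUtils
{
    // Assumes a non-negative amount expressed in cents.
    public static function format(int $amountInCents): string
    {
        return sprintf('%d.%02d USD', intdiv($amountInCents, 100), $amountInCents % 100);
    }
}

class Invoice
{
    public function total(int $amountInCents): string
    {
        return 'Invoice total: ' . MoneyUtils::format($amountInCents);
    }
}

class Receipt
{
    public function total(int $amountInCents): string
    {
        return 'Receipt total: ' . MoneyUtils::format($amountInCents);
    }
}

echo (new Invoice())->total(1999), PHP_EOL;
echo (new Receipt())->total(500), PHP_EOL;
```

If the format ever has to change, for example to support another currency, there is now only one place to touch.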

Stabilize API

After moving fragments of code around, look at the result of your work and plan the new API. More often than not, the changes you made are a great base for it. If you extracted a common class, make your functions accept it (instead of each of the subclasses). If you extracted a method or a function, it will probably have to become part of the new API.
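Here is a rough sketch of a function that accepts the extracted common class, using the Vehicle hierarchy from Example 1. The constructor, the typed properties and the describe() function are assumptions added for illustration (PHP 7.4 or newer), and the brand names are arbitrary sample data.

```php
<?php
declare(strict_types=1);

class Vehicle
{
    private string $brand;
    private string $color;

    public function __construct(string $brand, string $color)
    {
        $this->brand = $brand;
        $this->color = $color;
    }

    public function getBrand(): string
    {
        return $this->brand;
    }

    public function getColor(): string
    {
        return $this->color;
    }
}

class Car extends Vehicle
{
}

class Motorcycle extends Vehicle
{
}

// One function for every vehicle, instead of a separate one per subclass.
function describe(Vehicle $vehicle): string
{
    return $vehicle->getColor() . ' ' . $vehicle->getBrand();
}

echo describe(new Car('Volvo', 'red')), PHP_EOL;
echo describe(new Motorcycle('Ducati', 'black')), PHP_EOL;
```

Any class added under Vehicle later, such as Bicycle, works with describe() without further changes.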

Prepare for backward compatibility

Remember the code you hid in the first phase? Now it's time to look at it again and rework it so that it uses the new functions while staying compatible with the old API. This way you will not break any older applications that use your API. This is called “being backward compatible”.
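A common way to do this is to turn the old entry point into a thin wrapper around the new one. In the sketch below, carLabel() stands for a hypothetical old API function and vehicleLabel() for its hypothetical replacement; neither comes from a real library.

```php
<?php
declare(strict_types=1);

class Vehicle
{
    private string $brand;

    public function __construct(string $brand)
    {
        $this->brand = $brand;
    }

    public function getBrand(): string
    {
        return $this->brand;
    }
}

class Car extends Vehicle
{
}

// New API: works for every vehicle.
function vehicleLabel(Vehicle $vehicle): string
{
    return 'Brand: ' . $vehicle->getBrand();
}

/**
 * Old API, kept so that existing callers do not break.
 *
 * @deprecated Use vehicleLabel() instead.
 */
function carLabel(Car $car): string
{
    // No logic of its own any more; it only delegates to the new function.
    return vehicleLabel($car);
}

echo carLabel(new Car('Volvo')), PHP_EOL;
```

The old function keeps its signature and its result, but there is only one implementation left to maintain.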

Test and seal

After all that work it's time to check that the behavior of your app has not changed. First, remove the unit tests for code that was eliminated in phase 1 (but keep the ones that cover the deprecated parts). Then write new unit tests for the code that was just created. Make sure all unit tests pass - this is the first point at which you can expect a fully green run, because during the previous phases your code was unstable and some tests could have failed. Finally, do some manual testing, and when everything works well you can release your code as a new version.
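The sketch below shows the kind of check this phase is about: it compares a cleaned-up function with the version it replaces on a few inputs. A real project would normally use a test framework such as PHPUnit; plain PHP is used here only to keep the example self-contained, and legacyTotal() and total() are hypothetical functions.

```php
<?php
declare(strict_types=1);

// Old implementation, kept here only so that both versions can be compared.
function legacyTotal(array $prices): int
{
    $sum = 0;
    for ($i = 0; $i < count($prices); $i++) {
        $sum = $sum + $prices[$i];
    }

    return $sum;
}

// Cleaned-up implementation with the same intended behavior.
function total(array $prices): int
{
    return array_sum($prices);
}

// A few sample inputs, including the empty edge case.
$cases = [[], [100], [100, 250, 5]];

foreach ($cases as $prices) {
    if (legacyTotal($prices) !== total($prices)) {
        throw new RuntimeException('Behavior changed for ' . json_encode($prices));
    }
}

echo 'All cases behave the same', PHP_EOL;
```

If the loop finishes without an exception, the cleaned-up function agrees with the old one on those inputs, which is evidence of unchanged behavior rather than proof.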

Is code cleaning worth the effort?

Now let's get to the money question - is it worth the effort? The answer is: it depends. A typical code cleaning can increase future programming speed. Personally, I am able to work up to 4 times faster afterwards - changes that would normally take me an hour can be made in about 15 minutes. But that depends on the project itself, how complicated it is, and how messy it has become since the last code cleaning. So if you are working on an app that you plan to finish in 3 hours and will never see again, spending 5 hours on refactoring and code cleaning is hardly worthwhile. On the other hand, if you are in the middle of a 400-hour project, that time is a very good investment.
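The trade-off can be put into numbers with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that a cleaning costs 5 hours and that all remaining work becomes 4 times faster afterwards (the best case mentioned above). The function hoursWithCleaning() is hypothetical.

```php
<?php
declare(strict_types=1);

// Total time when we clean first and then do the remaining work at the faster pace.
function hoursWithCleaning(float $remainingHours, float $cleaningHours, float $speedup): float
{
    return $cleaningHours + $remainingHours / $speedup;
}

foreach ([3.0, 400.0] as $remaining) {
    printf(
        "%.0f h of remaining work: %.2f h in total with cleaning\n",
        $remaining,
        hoursWithCleaning($remaining, 5.0, 4.0)
    );
}
```

Under these assumptions the 3-hour app ends up taking 5.75 hours, while 400 hours of work shrink to 105. The break-even point is at 5 / (1 - 1/4), or roughly 6.7 hours of remaining work. A smaller speedup moves that point further out.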

Conclusion

Even if it doesn't directly increase your programming performance, good-looking code will be perceived better by your clients. Keeping your code clean is a very good way of improving not only the quality of your application, but also your programming performance. If you want your code to work and look good, you should keep it in good shape. Code cleaning, alongside unit tests, is a way of achieving that.
