OVO Tech Blog

Good housekeeping: Graceful handling of legacy services

Posted by Andy Summers on .

In this post I'm going to roughly group together a bunch of common scenarios from software engineering with one common theme: what happens when you are responsible for code that is running in production but not in active development. I have used the blanket term Legacy for this, but there are scenarios where the code might not be regarded as legacy.

Some example scenarios

  • You worked on a service for a project some time ago, deployed it to production with the necessary monitoring, then moved on to other exciting projects.
  • Your team is responsible for dozens of microservices. You have neither the resources nor the need to work on all of them in each development iteration. Several of them haven't required any changes for the last few months.
  • Another team in your company implemented a service, then moved on to other things. It has been decided that the service sits closest to your team's domain, so you have been assigned ownership.
  • A third-party consultancy implemented the service, then the decision was made to bring the codebase in-house and terminate the maintenance contract to save money.

The danger lurking within

The service may trundle along happily in production for a significant length of time whilst you work on other exciting features. During this time the last developers who worked on it may have left the company, and the documentation and test environments for it may have moved around.

Suddenly one day - and I trust you can picture the scene as if this were an action movie - your CTO or someone equally important bursts into the room in a panic and at the same time your monitoring dashboards catch fire. The service has fallen over in production, or there's been a security breach, or your best customer has unearthed an old bug and it has made them angry. Whatever the cause, after the initial scramble you discover that turning it off and on again doesn't solve it. You're going to have to make a change to the service and deploy it as soon as possible.

The code change itself

Let's say for argument's sake that the fix is just a one-liner, as easy to implement as fixes get. Marvellous.

But before you and your team could confidently arrive at that conclusion, you had to refamiliarise yourself with the code. What if the original developers are no longer at the company, or the documentation from the handover (there was a handover, right?) took you ages to find? Is the service even written in a language or technology you are familiar with?

OK, we think we fixed it. How do we deploy?

So you have your fix in place and you want to test it with a view to deploying it to production. It would be prudent to ask and answer the following questions.

  • Was there a CI pipeline in place for this codebase?
  • How long ago did it last run?
  • Does the service have automated tests? Unit, integration, etc?
  • There are tests, great. Do they rely on fixtures or infrastructure in order to pass and does all of that still exist?
  • If we have a dev/UAT/staging environment, is it in a usable state?
  • Is there any manual testing required? Are there testers with the necessary knowledge available or at least a document of test scenarios?
  • Where do build artifacts get stored? Is the version currently in production available here?
  • Where is production hosted?
  • If we manage to release our fix but need to roll back, can we do so easily?

If you had difficulty answering any of these, chances are your one-line fix took several times longer to get into production than it would have for a service you worked on only last week. But eventually you managed to jump through these hoops and get your fix deployed, and everything is peaceful once again.
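
One way to stop those answers going stale is to write them down next to the code while things are calm. The following is only a minimal sketch in Python, not a prescribed tool: the class name, the field names and the choice of five questions are all assumptions made for illustration. Anything left as None means "nobody knows yet", which is treated the same as "no".

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DeployReadiness:
    """Answers to some of the questions above (hypothetical field names).

    None means the question has not been answered yet.
    """
    ci_pipeline_exists: Optional[bool] = None
    automated_tests_pass: Optional[bool] = None
    test_environment_usable: Optional[bool] = None
    production_artifact_available: Optional[bool] = None
    rollback_is_easy: Optional[bool] = None


def open_questions(readiness: DeployReadiness) -> list[str]:
    """Return the questions that are unanswered or answered 'no'."""
    return [
        f.name
        for f in fields(readiness)
        if getattr(readiness, f.name) is not True
    ]


if __name__ == "__main__":
    # Example: we know CI exists and that the tests currently fail.
    legacy = DeployReadiness(ci_pipeline_exists=True, automated_tests_pass=False)
    for name in open_questions(legacy):
        print("needs attention:", name)
```

If the list is non-empty for a service nobody has touched in months, that is a hint about how long the next emergency fix will take.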

How could we have avoided that pain?

It's pretty clear by now that leaving code running in production for prolonged periods without any change is a dangerous practice. It goes against the ideas of continuous delivery, but it is an easy trap to fall into if there is simply no business requirement to change the code continuously.

Here are some actions you could take to make this process easier. If this is a service you inherited from someone else, you may want to do these at the point of handover. If it's something you worked on yourself a while ago, you may want to do some of them periodically.

Smallest change with minimum impact

Periodically, commit to making one small change regardless of business need - upgrade a dependency, write some more tests, or just refactor a method - and when you are happy with it deploy the change. At least then you build some confidence in the change and deployment process. Once you are in the habit of doing this, try to do it as often as possible.
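
To make "periodically" harder to forget, you could flag services that have gone too long without a deploy. Here is a rough sketch, assuming you can get a last-deployed date for each service from somewhere (your CI system, a spreadsheet, anything). The 90-day threshold, the function name and the service names are all invented for the example.

```python
from datetime import date

# Hypothetical threshold: flag anything not deployed in the last 90 days.
MAX_DAYS_WITHOUT_DEPLOY = 90


def stale_services(
    last_deployed: dict[str, date],
    today: date,
    max_days: int = MAX_DAYS_WITHOUT_DEPLOY,
) -> list[str]:
    """Return services whose last deploy is more than max_days old, oldest first."""
    overdue = [
        (deployed, name)
        for name, deployed in last_deployed.items()
        if (today - deployed).days > max_days
    ]
    # Sorting the (date, name) pairs puts the most neglected service first.
    return [name for _, name in sorted(overdue)]


if __name__ == "__main__":
    # Made-up service names and dates, for illustration only.
    deploys = {
        "billing-api": date(2024, 1, 10),
        "meter-reader": date(2024, 5, 20),
        "old-reporting": date(2023, 6, 1),
    }
    for name in stale_services(deploys, today=date(2024, 6, 1)):
        print(f"{name} is overdue a small change and a deploy")
```

Run on a schedule, something like this turns "we should really touch that service" into a visible, recurring prompt.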

Make it conform

If the whole CI/CD process for this service differs from everything else that you work on, then it might be beneficial to make it consistent with the rest. For example, it may build on a random Jenkins box with its test environment hosted on a server in the cupboard, whereas your more recent stuff builds on CircleCI and is deployed to GCP. Take the necessary steps to migrate it so that you are more comfortable with making changes.

Modernise the testing

If this was something that required a bunch of manual testing whenever any change was made, then find ways to add automated tests as far as possible. Fully automating it might be a large task, but you can do this iteratively.
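
One iterative way in is a characterisation test: run the existing code with a handful of inputs, record what comes out, and assert that it stays that way. It is one option among several, not the only approach. The sketch below uses Python's unittest, and legacy_format_reference is a made-up stand-in for whatever untested function you have inherited.

```python
import unittest


def legacy_format_reference(account_id: int, suffix: str) -> str:
    """Stand-in for an untested legacy function (hypothetical)."""
    return f"{account_id:08d}-{suffix.strip().upper()}"


class CharacterisationTest(unittest.TestCase):
    """Pins down what the code does today, not what it ought to do."""

    # Each expected value was captured by running the current code once.
    RECORDED = [
        ((42, "ab"), "00000042-AB"),
        ((7, " x "), "00000007-X"),
        ((123456789, ""), "123456789-"),
    ]

    def test_behaviour_is_unchanged(self):
        for args, expected in self.RECORDED:
            # subTest reports every failing input rather than only the first.
            with self.subTest(args=args):
                self.assertEqual(legacy_format_reference(*args), expected)
```

Saved as a test module, this runs with python -m unittest. Each manual test scenario you convert into a recorded case is one less thing to check by hand during the next emergency.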

Modernise the release process

If, after an initial assessment, you find that changes cannot be deployed without significant customer impact (downtime, for example), then make the changes needed so that it can be deployed continuously.
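
What those changes look like depends entirely on your platform, but one common way to avoid downtime is to upgrade instances one at a time and back out at the first sign of trouble. The sketch below shows only that control flow. The upgrade, is_healthy and rollback callables are placeholders you would have to supply, and nothing here talks to a real deployment tool.

```python
from typing import Callable, Sequence


def rolling_deploy(
    instances: Sequence[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Upgrade one instance at a time; undo everything on the first failure.

    Returns True if every instance ended up upgraded and healthy.
    """
    upgraded: list[str] = []
    for instance in instances:
        upgrade(instance)
        upgraded.append(instance)
        if not is_healthy(instance):
            # Revert in reverse order so the newest change is undone first.
            for done in reversed(upgraded):
                rollback(done)
            return False
    return True


if __name__ == "__main__":
    # Simulated fleet: instance name -> running version.
    versions = {"a": 1, "b": 1, "c": 1}
    ok = rolling_deploy(
        ["a", "b", "c"],
        upgrade=lambda i: versions.__setitem__(i, 2),
        is_healthy=lambda i: True,
        rollback=lambda i: versions.__setitem__(i, 1),
    )
    print("deployed" if ok else "rolled back", versions)
```

Because the instances not yet reached keep serving the old version, a bad release affects at most one instance before it is reverted.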

Your product manager or CTO may ask why any of this is necessary when everything has been running fine and you have other priorities. Show them the earlier sections of this post on making and deploying an emergency fix, and explain that those other priorities will have to stop completely if this situation happens. Much like with the concept of technical debt, you are just borrowing time from the future by continuing to do nothing.

Conclusion

In many walks of life, your ability to deal with change is inversely proportional to the amount of time since the last one. It is very easy to simply leave something behind because it appears to be fine and your focus is on newer, more exciting projects. However, you are likely to regret this when the time comes that you do need to make a change.

There are several tactics you can use to make incremental changes that, whilst they may not have any direct business benefit, will save you in the long run. If you find yourself in any of the scenarios listed at the start of this post, be nice to your future self and invest a little time in them.
