A Bug Report
Picture the following scenario:
You are in charge of maintenance for a system that has been deployed in production at many different locations for over a year. A new bug report comes in saying that one location is having difficulties with a specific request. This particular site has been running well for several months up to this point, but one new request has started causing issues. After a little investigation, you realize that the problem is due to a longstanding bug in your software that no one has encountered before. Your server application is sending an ampersand ('&') character that is getting misinterpreted by the client application. The client is simply incapable of handling this message and crashes every time.
What do you do?
Well, you can start by fixing the client application so that this is not an issue going forward for any new installations and for anybody who is capable of updating their application software. Sometimes, this is not good enough.
(By the way, why does software always go months without issue before receiving three bug reports in the same week?)
In certain circumstances the client application is outside of your control and may be impossible to update at all the sites for various reasons: the corporate policymakers have frozen the solution and will not sign off on any updates; the software is installed at a remote location and nobody has the necessary permissions or skills to perform an update; some hardware governing body requires a drawn-out review process for you to release your client application; or many other absurd reasons.
Now what do you do?
Fixing Your Own Bugs
The key to designing for backwards compatibility is to embrace the fact that you cannot change the past. Since you cannot change the software that already exists in the field, you need to update the software that you do control to compensate.
Before releasing your updated client application, add a new version indicator to the message format (if it doesn't exist already). Modify the server application to distinguish between the old client application (version 1) and the new client application (version 2). If the indicator is new, clients already in the field will not send it, so the server must treat a missing version as version 1. When the server is ready to send a version 1 response, it will replace any ampersands with a suitably safe substitute. When it encounters a version 2 client, no substitution is required.
The following example C++ code snippet demonstrates this logic:
// Generate a new response object for the given request
Response* ServerApp::handleRequest( const Request& request ) {
    Response* response = new Response();
    ...
    string content = this->getResponseContent( request );
    // Remove ampersands for clients below version 2
    // See bug 19243
    if( request.getVersion() < 2 ) {
        content = this->replaceAmpersands( content );
    }
    response->setContent( content );
    ...
    return response;
}
This update allows the existing client applications to continue operating in a limited way (sans ampersand) while allowing new clients to use the full features of the application.
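The snippet above leaves getVersion() and replaceAmpersands() undefined. Here is a minimal, self-contained sketch of how they might look. It assumes the version arrives as a text field that older clients simply omit, and that the word "and" is an acceptable substitute for the ampersand. The names parseClientVersion and prepareContent, and the choice of substitute, are my own inventions for illustration; your message format will dictate the real details.

```cpp
#include <string>

// Hypothetical helper: interpret the version field of an incoming request.
// Clients released before the field existed send nothing, so an empty or
// malformed field is treated as version 1.
int parseClientVersion(const std::string& versionField) {
    if (versionField.empty()) {
        return 1;
    }
    int version = 0;
    for (char c : versionField) {
        if (c < '0' || c > '9') {
            return 1;  // not a number: assume the oldest client
        }
        version = version * 10 + (c - '0');
        if (version > 1000000) {
            return 1;  // implausibly large: treat as malformed
        }
    }
    return version < 1 ? 1 : version;
}

// Replace every ampersand with a substitute that old clients can handle.
// The default substitute ("and") is an assumption for this sketch.
std::string replaceAmpersands(const std::string& content,
                              const std::string& substitute = "and") {
    std::string result;
    result.reserve(content.size());
    for (char c : content) {
        if (c == '&') {
            result += substitute;
        } else {
            result += c;
        }
    }
    return result;
}

// Apply the workaround only for clients below version 2.
std::string prepareContent(const std::string& content, int clientVersion) {
    if (clientVersion < 2) {
        return replaceAmpersands(content);
    }
    return content;
}
```

With this in place, a request with no version field gets "Tom and Jerry" while a version 2 client gets "Tom & Jerry" untouched. Defaulting to the oldest version when in doubt is the safe choice here, since the worst case is a new client receiving slightly degraded content rather than an old client crashing.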
Fixing Other People's Bugs
Not to spark a religious operating system debate, but I feel that I have to mention the fantastic efforts of Microsoft to maintain backwards compatibility when developing Windows 95. As described by Microsoft employee and blogger Raymond Chen, the Windows 95 development team went to great lengths to make sure old software still ran seamlessly on their new operating system. They knew that most businesses and even individual users have at least one deal-breaker application that must work before they will agree to upgrade their software. If those essential applications failed, Windows 95 would not fly.
The engineers at Microsoft came up with a clever way to build backwards compatibility directly into Windows. They created Application Compatibility Shims, named, as I understand, after door shims used in construction. Just as a door shim allows for adjustments between the framing wall and the door frame, these software shims allow for adjustments between Windows and incompatible software applications. Microsoft wrote many custom shims to compensate for subtle and not-so-subtle bugs found when they tested old software running on their new operating system.
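To make the shim idea concrete, here is a toy sketch of the general pattern. This is emphatically not how Windows implements its shims; it only illustrates the concept of a per-application adjustment layered over a platform call. Every name in it (PlatformVersion, realGetVersion, ShimRegistry) and the version numbers are made up for the example. The scenario: an old application refuses to run unless the platform reports a version it recognizes.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Illustrative version record; field names and values are hypothetical.
struct PlatformVersion {
    int majorVersion;
    int minorVersion;
};

// Stand-in for the real platform call with its new behavior.
PlatformVersion realGetVersion() {
    return {4, 0};
}

// A shim receives the real answer and returns an adjusted one.
using VersionShim = std::function<PlatformVersion(PlatformVersion)>;

// Maps application names to the adjustment each one needs.
class ShimRegistry {
public:
    void add(const std::string& appName, VersionShim shim) {
        shims_[appName] = std::move(shim);
    }

    // Applications without a registered shim see the unmodified result.
    PlatformVersion getVersionFor(const std::string& appName) const {
        PlatformVersion actual = realGetVersion();
        auto it = shims_.find(appName);
        if (it == shims_.end()) {
            return actual;
        }
        return it->second(actual);
    }

private:
    std::map<std::string, VersionShim> shims_;
};
```

Registering a shim for a hypothetical "oldapp.exe" that always reports version 3.10 would let that one program keep running, while every other application continues to see the real version. The appeal of the design is that the fix lives in a small table entry instead of in the core of the platform.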
When Windows 95 finally launched, there were surprisingly few programs with serious compatibility issues. When issues were discovered, there was rarely a need to dig into the scary underbelly of the operating system. Instead, Microsoft could just configure an existing shim or whip up a new one to fix the bug. This design was so successful that very few people ever gave a second thought to upgrading Windows. This gave Microsoft market dominance of both the corporate and personal desktop - a position that other platforms are still struggling to pry away.
Something to Think About
As I explained earlier, the key to backwards compatibility is embracing the fact that you cannot always change what already exists. By cleverly designing new software to compensate for old mistakes, we can mitigate the problems they cause and sometimes even pretend that the old mistakes never existed. All of this comes at a cost, but that's a topic that I will have to address in a future post.
Cheers,
Joshua Ganes
This post was the second part of a series on the topic of backwards compatibility. My next article on this topic: Backwards Compatibility III - Planning For The Future.