What are best practices for versioning in SOA governance? What features should versioning tools have?
Versioning is a challenge that everyone will face over the lifetime of a service. The first and most important best practice is to assume from the outset that things will change, and then to define processes and standards from the very beginning to handle those changes when they occur. Gartner Inc.'s Roy Schulte, talking about SOA back in the early to mid-2000s, pointed out that systems used to be built to last; nowadays, we need to build systems for change.
For example, will you allow multiple versions of a service to run in production? If your answer is yes (and it should be), how will consumers get directed to the right version? Will consumers explicitly specify the version they desire, or will it be implicitly determined? Where I've been involved with SOA, we always wound up with major versions (e.g., 1.0 versus 2.0) explicitly specified on the request in one way or another, while minor versions (1.1 versus 1.2) were not.
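As a rough sketch of that explicit-major/implicit-minor approach (the service names, endpoints and data here are hypothetical), a gateway can map the major version the consumer asks for to the latest deployed minor release of that line:

```python
# Hypothetical sketch of version-based routing at a service gateway.
# Consumers explicitly request a major version (e.g., in the URL path or a
# header); the gateway implicitly resolves it to the latest minor release.

DEPLOYED_ENDPOINTS = {
    # major version -> (latest minor release, backend address) -- illustrative data
    1: ("1.3", "http://orders-svc-v1.internal:8080"),
    2: ("2.1", "http://orders-svc-v2.internal:8080"),
}

def route_request(requested_major: int) -> str:
    """Map an explicitly requested major version to a backend address.

    The minor version is implicit: consumers always get the latest
    backward-compatible minor release of the major line they asked for.
    """
    try:
        _latest_minor, backend = DEPLOYED_ENDPOINTS[requested_major]
    except KeyError:
        raise ValueError(f"Major version {requested_major} is not in production")
    return backend

route_request(2)  # -> "http://orders-svc-v2.internal:8080"
```

Because consumers never pin a minor version, the provider can roll out backward-compatible fixes without touching any consumer.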
The difference between a major version and a minor version was that minor versions were always backward compatible, while major versions might not be. We normally strive for backward compatibility because it's the least impactful change. If we've chosen to break backward compatibility, there is likely a good reason for doing so. Trying to get around it via transformations is likely to introduce significant complexity in the middle and give consumers an excuse not to change, which is exactly the behavior we don't want. Systems will change for both consumers and providers.
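One simple way to reason about that distinction is to model a message schema by its required fields (a deliberately reduced sketch; real compatibility checks also cover types, semantics and response shapes): a change that only adds optional fields stays backward compatible and warrants a minor bump, while a change that demands something old consumers don't send forces a major bump.

```python
# Illustrative backward-compatibility check between two message schemas,
# modeled simply as sets of required field names. All names are hypothetical.

def is_backward_compatible(old_required: set, new_required: set) -> bool:
    """A new version stays backward compatible with existing consumers as
    long as it does not require fields that old requests don't send."""
    return new_required <= old_required

v1_0 = {"order_id", "customer_id"}
v1_1 = {"order_id", "customer_id"}               # only optional fields added -> ship as 1.1
v2_0 = {"order_id", "customer_id", "currency"}   # new required field -> ship as 2.0

is_backward_compatible(v1_0, v1_1)  # True: a minor release
is_backward_compatible(v1_0, v2_0)  # False: a breaking, major release
```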
The next best practice when it comes to versioning is to set a limit on how many versions you will run in production. My recommendation is no more than three major versions, and only one minor version of each. So we could have 1.0, 2.0 and 3.0, but we should never have both 1.1 and 1.2 in production, except possibly for a short duration after 1.2 is released, in case there is a need to roll back or perform A/B testing.
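That policy is simple enough to enforce mechanically. A minimal sketch of a deployment check (the function name and version-string format are assumptions) flags any violation of the "at most three majors, one minor per major" rule:

```python
# Sketch of a deployment-policy check: at most three major versions in
# production, and only one minor release of each major line.
from collections import defaultdict

def version_policy_violations(deployed, max_majors=3):
    """Given deployed version strings like "2.0", return a list of
    policy violations, or an empty list if the policy holds."""
    minors_per_major = defaultdict(set)
    for version in deployed:
        major, minor = version.split(".")
        minors_per_major[major].add(minor)

    problems = []
    if len(minors_per_major) > max_majors:
        problems.append(
            f"{len(minors_per_major)} major versions deployed (max {max_majors})")
    for major, minors in sorted(minors_per_major.items()):
        if len(minors) > 1:
            problems.append(
                f"major version {major} has {len(minors)} minor releases deployed")
    return problems

version_policy_violations(["1.0", "2.0", "3.0"])  # [] -- compliant
version_policy_violations(["1.1", "1.2", "2.0"])  # flags two minors of major 1
```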
Why three versions?
Part of the answer depends on the rate of change. If the rate of change of your services differs from the rate of change of the consuming systems, things may not sync up. You have to decide what a realistic expectation is for each consuming system to get modified. If every consuming application has a major release once a year, you should be able to fold service upgrades into those releases and get by with no more than three versions in production. And if you're releasing more than three major versions in a year, continually breaking backward compatibility, that's a sign of poor interface design.
You also can't put a restriction on the service provider without a corresponding requirement for the service consumer. Once a new version is put in place, be sure to have policies that dictate how long service consumers have before they must upgrade.
It's very easy to get an organization into the habit of doing some foundational work with every release in addition to the feature enhancements required by the business. If consumers don't accept this as standard practice, the service provider is left maintaining many, many versions of the service, which raises the cost of maintaining it.
You'll probably need to have different policies for internal versus external consumers. Internal consumers are under the control of one organization, while it's very difficult to tell external consumers how to spend their money. This means the time required to upgrade will likely need to be a lot longer, making backward compatibility all the more important.
Speaking of backward compatibility, regression testing must be at the heart of any versioning strategy. All it takes is a consumer getting burned once by a change that was supposed to be transparent to cause major trust issues. In order to build trust, service providers should do everything they can to show a service consumer whether they will be affected by a change.
This begins with collecting regression tests from consumers when they begin using the service. Then, with every release, the service provider can execute the tests provided by each consumer to demonstrate whether the change impacts them or not. Getting tests from each consumer is very important.
If the tests were created solely by the service provider, the service consumer has no skin in the game and can always claim, "You didn't execute my tests, so I still don't trust that you won't break my app." For a service with a lot of consumers, these test suites can get large, so it's important to have solid automated testing to keep the costs of this regression testing low.
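The replay step described above can be sketched as a small harness (everything here, including the toy pricing service, is hypothetical): each consumer contributes request/expected-response pairs, and the provider runs them against a candidate release to show each consumer, individually, whether the change affects them.

```python
# Hypothetical regression harness: replay consumer-supplied test cases
# against a candidate release and report impact per consumer.

def run_consumer_regressions(service, suites):
    """suites maps consumer name -> list of (request, expected_response)
    pairs. Returns, per consumer, the requests whose response changed."""
    report = {}
    for consumer, cases in suites.items():
        failures = [req for req, expected in cases if service(req) != expected]
        report[consumer] = failures
    return report

# Candidate release of a toy pricing service.
def candidate_service(request):
    return {"total": request["qty"] * 10}

suites = {
    "web-store": [({"qty": 2}, {"total": 20})],
    "kiosk-app": [({"qty": 3}, {"total": 31})],  # this consumer's expectation breaks
}

run_consumer_regressions(candidate_service, suites)
# -> {"web-store": [], "kiosk-app": [{"qty": 3}]}
```

Because the failing cases came from the consumer's own suite, the impact report is evidence the consumer is inclined to trust, rather than the provider's word alone.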
Finally, if you're not tracking service consumers, you're in a big hole. Whether you do this in an Excel spreadsheet, an Access database or a formal SOA repository, you have to track the consumers and always have a representation of identity on your messages. When you try to decommission earlier versions, if you can't look at run-time traffic and easily determine which consumers requests are coming from, it's going to be very difficult.
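A minimal sketch of that idea (the class, field names and consumer IDs are illustrative): reject any message that lacks a consumer identity, record who calls which version, and answer the decommissioning question directly from observed traffic.

```python
# Sketch of consumer tracking: require an identity on every message,
# record it per version, and report who still depends on each version.
from collections import defaultdict

class ConsumerTracker:
    def __init__(self):
        self._calls = defaultdict(set)  # version -> set of consumer ids

    def record(self, message):
        consumer = message.get("consumer_id")  # identity stamped on every message
        if consumer is None:
            raise ValueError("Message rejected: no consumer identity")
        self._calls[message["version"]].add(consumer)

    def consumers_of(self, version):
        """Who still depends on this version? Empty set -> safe to retire."""
        return self._calls[version]

tracker = ConsumerTracker()
tracker.record({"version": "1.0", "consumer_id": "billing-app"})
tracker.record({"version": "2.0", "consumer_id": "web-store"})
tracker.consumers_of("1.0")  # {"billing-app"}
tracker.consumers_of("3.0")  # set() -- no traffic, a candidate for retirement
```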
There are additional roles that technology can play, especially if you use an intermediary such as an ESB, XML appliance or a home-grown gateway. You can use these to apply transformations and perform message-based routing for particular scenarios, but I've found that explicit versioning policies and expectations keep things under control better than letting everyone do what is best for their particular system and then trying to manage the chaos with some technology in the middle.
That's not to say these technologies aren't useful; they are. Use versioning tools to support the policies, rather than as a replacement for them.
This was first published in April 2013