Building a software system de novo is the baseline way that software engineering is taught and understood. Use cases are identified, architectures and patterns are designed, and then software is implemented and deployed. Users are onboarded. This kind of green-field development can be an exhilarating opportunity to create anew. Upgrading an existing system is a second and more frequent type of development, as for any given system there is only one initial release but many subsequent releases. While upgrades primarily focus on incremental improvements, they are arguably the more complex case: each upgrade risks outages and functional regressions, whereas in the baseline case there is nothing else in place at the time of initial release.
But what if there is a prior operational system in place? Specifically, one that is being replaced.
System conversions represent a third distinct type of development. The project scope now includes all of the effort of an initial software release plus an entirely new set of complexities. The prior system is often caught in a downward spiral: technical constraints make upgrades difficult, which in turn can diminish the organizational will to improve the system, which in turn reduces system performance and viability. The prior system, however, must be kept alive long enough to transition the functionality and to support the data conversion to a new platform. This can become an anxiety-inducing software "race against time." As an example of life imitating art, the 1994 action movie Speed with Keanu Reeves offers some surprisingly insightful lessons on how this situation can be managed.
Lesson #1: The Bus Couldn't Slow Down
In the movie, a transit bus is wired with a bomb and cannot go below 50 MPH without dire consequences. From a software standpoint, an existing system that is highly utilized and still running critical functions, but not well maintained, can feel like this. There may be multiple factors all pulling on the existing system to slow it down: an outdated and non-scalable architecture, an outdated codebase, and perhaps even a lack of developers to support either. Ignoring the current system, though, only makes the problem worse.
Lesson #2: A Second Bus Was Required
To save the initial bus, a second bus had to be obtained. In the software world, the "second bus" represents the new system and the development team to create that system. This could either be managed as one team with two major responsibilities (support old system, build new system) or two teams, but one thing is clear: there is effectively twice as much work. A key mistake of system conversion development is only budgeting for the "new" development.
Lesson #3: The Second Bus Was Accelerated To Catch The First Bus
Achieving functional parity is one of the most difficult aspects of system conversions, especially when the first bus has a 100-mile head start, metaphorically speaking. The "chasing system" needs a long enough runway in terms of both time and budget, complicated by the fact that the prior system may continue to evolve at the same time and so is not a static target. Even the most well-intentioned projects can get tripped up on this. This type of development could take multiple fiscal quarters or years, and one of the biggest issues is executive expectation management.
Lesson #4: The Passengers Are Rescued
In the movie, the passengers are rescued in dramatic fashion, and anyone who has lived through a large system conversion will recognize that this is pretty much what it feels like. To rescue the passengers, both buses must be operating not just at high speed but also in close proximity, re-emphasizing the importance of feature parity. Having a second bus running 50 MPH but 5 miles distant and receding doesn't help.
Additionally, software to assist in conversions – particularly large-scale data migrations – is required and is a special art. Such software still needs to adhere to software engineering best practices, but also needs to be fast (as conversion windows are always under a time crunch), explainable (as conversion teams are always being asked to explain exactly what happened), and automatable (as the best conversions are always heavily practiced).
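As a rough illustration of those three properties, here is a minimal Python sketch using the standard-library sqlite3 module. The table names (legacy_customer, customer), the column names, and the "missing name" rejection rule are all hypothetical, invented for this example; a real conversion would have its own schema and rules. Batching stands in for "fast", the returned audit record for "explainable", and the idempotent write for "automatable", since a run that can be safely repeated is a run that can be rehearsed.

```python
import sqlite3


def migrate_customers(old_db, new_db, batch_size=500):
    """Copy rows from a hypothetical legacy table into a new table in batches.

    Returns an audit record so the run can be explained afterwards.
    """
    audit = {"read": 0, "written": 0, "rejected": []}
    new_db.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    cursor = old_db.execute(
        "SELECT cust_id, cust_name FROM legacy_customer ORDER BY cust_id"
    )
    while True:
        rows = cursor.fetchmany(batch_size)  # bounded memory per batch
        if not rows:
            break
        audit["read"] += len(rows)
        clean = []
        for cust_id, name in rows:
            if name is None or not name.strip():
                # Record why a row was skipped instead of dropping it silently.
                audit["rejected"].append((cust_id, "missing name"))
            else:
                clean.append((cust_id, name.strip()))
        # INSERT OR REPLACE makes reruns idempotent, so the conversion
        # can be practiced as many times as needed before the real window.
        new_db.executemany(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)", clean
        )
        new_db.commit()
        audit["written"] += len(clean)
    return audit


# Tiny in-memory rehearsal with made-up data.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE legacy_customer (cust_id INTEGER, cust_name TEXT)")
old_db.executemany(
    "INSERT INTO legacy_customer VALUES (?, ?)",
    [(1, " Ada "), (2, None), (3, "Grace")],
)
new_db = sqlite3.connect(":memory:")

first = migrate_customers(old_db, new_db, batch_size=2)
second = migrate_customers(old_db, new_db, batch_size=2)  # rehearsal rerun
print(first)
```

Running the migration twice leaves the target in the same state, and the audit record gives a concrete answer when someone asks what happened to customer 2.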
The management of conversions is an important aspect of software engineering and not for the faint of heart. The process represents the bridge from the old to the new.
Lesson #5: The First Bus Was Retired
In the movie, the first bus exploded spectacularly after the passengers were rescued. In real life, such kinetic outcomes are not generally desirable. Shutdown processes informed by contractual or regulatory provisions are important considerations, such as saving the existing system state for a required period of time and potentially leaving the system online in a read-only state. If a system state is saved as a backup, confirming that the backup can actually be restored is advised.
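Confirming a restore can be as simple as reopening the backup and comparing it against the live data before the old system is switched off. The following is a minimal sketch under stated assumptions: it uses Python's standard-library sqlite3 backup API as a stand-in for whatever database the retired system actually runs on, and the claim table, the file name, and the helper functions are hypothetical. Hashing every row is only practical for modest table sizes.

```python
import hashlib
import os
import sqlite3
import tempfile


def table_digest(db, table):
    """Hash every row of a table, ordered by its first column.

    The table name is interpolated directly, so it must be trusted input.
    """
    digest = hashlib.sha256()
    for row in db.execute(f'SELECT * FROM "{table}" ORDER BY 1'):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def backup_and_verify(live_db, backup_path, tables):
    """Write a backup file, reopen it as a restore would, and compare digests."""
    target = sqlite3.connect(backup_path)
    live_db.backup(target)  # sqlite3.Connection.backup, Python 3.7+
    target.close()

    restored = sqlite3.connect(backup_path)  # fresh connection, as in a restore
    try:
        return {
            table: table_digest(live_db, table) == table_digest(restored, table)
            for table in tables
        }
    finally:
        restored.close()


# Made-up final state of the retired system.
live_db = sqlite3.connect(":memory:")
live_db.execute("CREATE TABLE claim (id INTEGER PRIMARY KEY, amount REAL)")
live_db.executemany("INSERT INTO claim VALUES (?, ?)", [(1, 10.5), (2, 99.0)])
live_db.commit()

with tempfile.TemporaryDirectory() as tmp:
    result = backup_and_verify(
        live_db, os.path.join(tmp, "final_state.db"), ["claim"]
    )
print(result)
```

The point of the design is that verification reads the backup through a brand-new connection rather than trusting that the write succeeded.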
System conversions are a hard problem and will be ever-present in the software world as today's blue-sky development efforts become tomorrow's legacy code. Reasons for system rot are myriad: technological obsolescence of frameworks or languages is one set of causes, but more than a few systems with reasonably current architectures have been undercut by boom-and-bust budgeting, in which a system is deployed with an enthusiastic initial release and then left to lie fallow. Technology leaders must actively manage every system in a portfolio. It's a lot of work to do this, but the alternative is worse.
Doug Meil is a software architect in healthcare data management and analytics. He also founded the Cleveland Big Data Meetup in 2010. More of his BLOG@CACM posts can be found at https://www.linkedin.com/pulse/publications-doug-meil