Way back in 2006, when I was a younger and prettier fella, I was tasked with being the lead in several large Disaster Recovery projects and tests for EDS. In those ancient times an actual disaster recovery, or even a test, was a real nightmare for all participants. DR was really expensive and really hard. The typical recovery involved some really weird stuff: magnetic tapes that had to be dragged from an abandoned mine (seriously kids – we used to store our most critical data on a tape that we shoved about a mile underground); a very expensive room where smart people could type on keyboards while sitting next to one another; a truckload of rented hardware; and more shell-shocked techies than you can shake a stick at. Oh, and fun fact: Most DRs, when cost and actual success were considered, were a huge failure. That was then – this is now.
DR has come a long way, baby. The world of cloud and automation and the advent of DevOps tools and methodologies have drastically changed the way organizations implement disaster recovery. One of the most significant but unsung benefits of DevOps is how it can be used as the new model for DR in the 21st century. Implementing DevOps provides practice and experience with moving files and applications among different environments, including testing and development. These skills can be put to use for quick and efficient response to downtime and service interruptions.
#1 Automation Tools
Tools that automate the DevOps lifecycle are also useful in disaster recovery. Today’s open-source tools can automatically create, launch, and deploy virtual machine instances with the proper configuration, and they can cross security boundaries to function within the cloud, data centers, and even personal laptops. That sounds a lot like a recovery plan to me.
Not only do these tools automatically deploy code written by developers, but they also allow backup environments to be deployed quickly. These same tools allow software processes like building and testing to be automated. This provides major benefits in failover and disaster recovery.
The old way of doing a recovery was to rebuild a bunch of environments, reinstall a ton of applications, restart dozens of services, and then pray. With new technologies like OS containers, microservices, and API usage, the DevOps world has accidentally created a workflow that could be fairly described as “DR-on-demand”!
#2 Rethinking the Recovery Workspace
Historically, larger companies have invested significantly in backup infrastructure that is similar or identical to their production environments. This infrastructure exists primarily in standby mode and is kept idle except for occasional testing or planned failovers, which occur rarely.
This doesn’t make sense with today’s ever-shrinking IT hardware budgets. Why do cost-conscious businesses pay this enormous safety tax? The reasoning was based on the idea of protection against a regional failure that is immediate and catastrophic. Those kinds of failures have always been rare, and in the new Cloud and DevOps world our servers, applications, and services are probably distributed across several geographic regions, if not continents, so a single regional failure is even less likely to take everything down. This old-school spending should be considered cost-cutting target #1 in the new-school way of doing business.
So what do you do with that hardware? Well, disaster recovery environments represent an ideal workspace for developers to test code with the help of virtualization technologies and advanced networking (see #3).
#3 Ability to Tap that Idle Capacity
Developers are kind of like teenagers: they are always hungry for more resources, and they like to work, talk, and create stuff while everyone else wants to sleep. This is a terrible combination when trying to get support from a typical IT or Operations team. But if you adopt a DevOps/Cloud model into your DR plan, you can maximize your infrastructure investment by finding ways to use the backup capacity for development purposes without bothering those guys and their spreadsheets.
When needed for an actual disaster or failover, the development virtual machines can be shut down and traffic can be rerouted automatically, allowing the disaster recovery environment to handle a real-life disruption. Once the service disruption has been resolved, automated scripts can restore access for production purposes.
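Here is a minimal sketch of that sequence, just to make the ordering concrete. Everything in it is hypothetical: the `SharedDrSite` class, the VM names, and the "primary"/"dr" traffic targets are stand-ins, and the log entries take the place of whatever calls your own cloud or orchestration tooling would make. It also assumes that once the disruption is over, the freed capacity is handed back to the developers.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDrSite:
    """Hypothetical DR site that lends its idle capacity to developers."""

    dev_vms: list[str]
    running_dev_vms: set[str] = field(default_factory=set)
    traffic_target: str = "primary"
    log: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Under normal conditions every dev VM runs on the idle DR capacity.
        self.running_dev_vms = set(self.dev_vms)

    def fail_over(self) -> None:
        """Free the borrowed capacity, then send production traffic to the DR site."""
        for vm in sorted(self.running_dev_vms):
            self.log.append(f"stop {vm}")
        self.running_dev_vms.clear()
        self.traffic_target = "dr"
        self.log.append("route traffic -> dr")

    def fail_back(self) -> None:
        """Send traffic back to primary, then return the capacity to developers."""
        self.traffic_target = "primary"
        self.log.append("route traffic -> primary")
        for vm in self.dev_vms:
            self.running_dev_vms.add(vm)
            self.log.append(f"start {vm}")


site = SharedDrSite(dev_vms=["dev-build-01", "dev-test-01"])
site.fail_over()   # disaster: dev VMs stop, traffic moves to the DR site
site.fail_back()   # all clear: traffic returns, dev VMs come back
print("\n".join(site.log))
```

The point of the ordering is that development work gets out of the way before production traffic arrives, and only comes back after production has gone home.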
#4 Forces an Implementation Method
Continuous improvement (CI) is not just a DevOps buzzword; it is a necessity for competing in today’s business landscape. The effort that is put into making CI a reality for your business should not be limited to just the development teams. Use that methodology and many of the same tools to implement continuous disaster recovery, first within your most critical business units and then eventually across your whole company.
Start your Journey to Continuous DR with a few questions:
• Who will control procedures?
• Who is responsible for automation?
• How can failure of everything from applications to infrastructure be simulated?
• How much does our current process cost? Is it more than we would lose to a fine or a failure?
• What really matters? How can we rebuild, re-factor, or redesign our apps and services to be more easily recovered?
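The cost question in that list is really just arithmetic, and it can be roughed out in a few lines. This is only a back-of-the-envelope sketch: the function names are made up for illustration, every dollar figure and the outage probability are placeholder numbers rather than real data, and it assumes a simple "probability times impact" view of risk.

```python
def expected_annual_loss(
    outage_probability: float,
    cost_per_outage: float,
    fine_per_outage: float = 0.0,
) -> float:
    """Expected yearly loss: chance of an outage times what one outage costs."""
    return outage_probability * (cost_per_outage + fine_per_outage)


def dr_costs_more_than_risk(annual_dr_cost: float, expected_loss: float) -> bool:
    """True when the DR process costs more per year than the risk it covers."""
    return annual_dr_cost > expected_loss


# Placeholder figures only; substitute your own estimates.
loss = expected_annual_loss(
    outage_probability=0.05,    # assumed 5% chance of a major outage per year
    cost_per_outage=2_000_000,  # assumed cost of downtime and recovery
    fine_per_outage=250_000,    # assumed regulatory fine
)
print(f"expected annual loss: {loss:,.0f}")
print("DR costs more than the risk:", dr_costs_more_than_risk(500_000, loss))
```

If the answer comes back True, that is not an argument for dropping DR; it is an argument for finding a cheaper way to do it.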
#5 Map the Minefield
In the recent past, some applications were created from an amalgam of “different.” A myriad of teams, people, and tools were the building blocks of most “mission critical” applications. No one knew how the “thing” worked or even what it needed to thrive (or survive). In most organizations those vital services are left to run – “don’t touch it, it’s working!” – with little or no management or improvement.
This is always the largest problem in DR and DevOps and it needs to be solved before anything else can be improved. By adding a layer of thought to these projects based on the question of what happens if this all goes KaBlooey, we can create a method that essentially maps these internal minefields. With this new minefield map you will have the requisite knowledge to shift the application into the cloud and eventually re-factor or redesign it as needed.
The future is now. Advancements in DevOps, cloud, and automation have led us to a new world where anything is possible and the problems of the past are finally beginning to fall like dominoes. (DR is only one example.) In stark contrast to the silos and diverse methodologies that built yesterday’s applications, the new DevOps model is built upon the melding of development and operations into one delivery team. This shouldn’t be a change that only benefits a specific application or business unit; this should be a roadmap to “Tomorrowland” for your entire company.
Let me know what you think about this list, and feel free to share your own DevOps or Cloud stories with me… TheCloudGuy@Coda.Global.