You Guys Can Just Restore My VMs to Your Environment for Disaster Recovery, Right?
Quick Answer: Sure we can! But…
This question often comes up when a client is looking for offsite backup but also wants the ability to do offsite disaster recovery (DR) without paying for our DR as a Service (DRaaS). They want DR capabilities without the associated costs (unless a major event happens). The question is logical.
Since Managecast would hold a copy of the client’s data in the form of backups, it would be possible to restore these backups into our DR environment. However, there are some cautions that should be clearly understood.
The first is that without actually going through the process as a test, the end result is unknown. Maybe it all works great, maybe it does not. This is why we recommend DR testing. However, using backups to perform DR is a manual and time-intensive process, so testing is expensive. The client is asking this question precisely because they are trying to minimize cost, so most who ask are not looking to pay for a DR test. Understandable.
Then a disaster hits, maybe a year or two down the road, and the boss is screaming to get the systems operational. Everyone is on edge and demanding the systems be up ASAP. An actual DR event is definitely a bad time to be testing DR, yet that is exactly what is happening: testing is being performed when time is of the essence. Bad combination!
Here is an example outcome:
Say it takes 12 hours to restore the client’s VMs. They turn on, but the main application does not work properly. After spending a few hours investigating, we find that a dependency, maybe a DNS server, is not in the customer’s backups (maybe it was running on an onsite router), and its absence keeps the application from running. It takes another few hours to set up a DNS server and re-populate it with the entries the application needs.
We are now at 20 hours. We got the main application to run, but a secondary application is still not running. We spend more time figuring out why, and finally get that application running too. Then we find there is a web server that needs to be accessed from the public side. We spend time on the firewall, adding the necessary public IPs, NAT rules, and firewall exceptions. OK, most core applications are now running, and we have been working non-stop for 30 hours.
So, how are users going to access this environment? Let’s try a client-to-site VPN. This results in very poor performance as users wait for their applications to download over a relatively slow link. Users are upset because it takes forever to do anything. Maybe a client VPN is not the best solution, but it is the only thing we can quickly deploy. Maybe we can manually set up a terminal server to help, then instruct users on how to access it? Then we need to install the applications on the terminal server…
How is the boss going to react to getting a bill for 30+ hours at time and a half (because of course the disaster hit on a Friday and we had to work all weekend)? The boss gets a significant bill for something that only halfway works for their users. Will they gladly pay it?
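To put a rough number on that bill, here is a minimal sketch of the arithmetic. The 30 hours and the time-and-a-half multiplier come from the example above; the hourly rate is purely an assumed figure for illustration, not an actual Managecast rate.

```python
# Hypothetical figures for illustration only. The example gives the hours and
# the overtime multiplier; the base hourly rate is an assumption.
HOURLY_RATE = 150.0        # assumed base rate, in dollars per hour
OVERTIME_MULTIPLIER = 1.5  # "time and a half"
HOURS_WORKED = 30          # hours spent in the example outcome


def emergency_bill(hours: float, rate: float, multiplier: float) -> float:
    """Return the labor cost for hours billed at an overtime multiplier."""
    return hours * rate * multiplier


bill = emergency_bill(HOURS_WORKED, HOURLY_RATE, OVERTIME_MULTIPLIER)
print(f"Estimated emergency labor: ${bill:,.2f}")
```

Swap in your own rate and hours; the point is that every unplanned hour of troubleshooting is billed at the overtime multiplier.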
Of course, some simple environments may work great on the first try, but more often they do not, so doing your DR testing during an actual DR event is NOT recommended for the above reasons. It leads to disappointment, frustration, and unexpectedly high costs.
Does it ever make sense to test DR during an actual DR event? Sure. It makes a lot of sense if you are willing to live with the risks and pay for it no matter the result, because the alternative is far worse. Just have expectations set properly!
It is for these reasons we encourage clients to perform DR tests at least annually. These DR tests can be planned for a convenient time and typically do not require working after hours or on weekends. When significant issues are encountered, it is good to re-test until the issues are worked out and a higher confidence level is achieved. Testing also allows user access to the DR environment to be tried out to ensure it is feasible.
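One cheap check that can be done before a DR test is comparing what the applications depend on against what is actually in the backups, which is the gap that cost hours in the DNS example above. The sketch below assumes you keep two simple lists; the service names and the missing_dependencies helper are hypothetical and not part of any backup product.

```python
def missing_dependencies(required: set[str], backed_up: set[str]) -> list[str]:
    """Return the required services that have no corresponding backup, sorted."""
    return sorted(required - backed_up)


# Hypothetical inventory: everything the applications need in order to run.
required_services = {"app-server", "database", "dns", "web-server"}

# Hypothetical inventory: systems actually protected by offsite backup.
# DNS is absent because, as in the example, it ran on an onsite router.
backed_up_systems = {"app-server", "database", "web-server"}

gaps = missing_dependencies(required_services, backed_up_systems)
print("Not covered by backups:", gaps)
```

A list like this does not replace a real DR test, but it can surface missing pieces before anyone is on the clock.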