Your client calls in a panic. Something’s gone wrong with a server and their web store is down. You get there fast, run to the server and determine that it has suffered a hard drive failure. You collect your thoughts and think carefully about the procedure for restoring this piece of equipment quickly, but you draw a blank. The clock is ticking. Downtime is piling up, and your client’s face is reddening with anger because she’s not sure you know what you’re doing. You don’t tell her, but you’re not sure you know either.
This is the last situation you want to find yourself in. As a client’s frustration mounts, her patience thins, her wallet empties, and her trust in you erodes. There’s only one thing that can stop this from happening, and it goes beyond having a backup and recovery plan. You need to make sure your plans actually work, and you can only do this by testing them. Remember, you’re not just testing a backup; you’re testing your own ability to recover so you don’t end up testing your client’s patience.
To make backup and recovery testing effective, there are some questions you will want to ask yourself. The answers should help you build a testing strategy that’s a regular part of your process, so when the time comes, you’re not just “kind of sure” you can recover. You’re absolutely positive.
1. Why should I test?
Testing is crucial if you want to deliver on the promises laid out in your service agreement and meet client recovery time objectives (RTOs). Testing gives you the knowledge you need to recover quickly: once you have practiced the steps involved in a recovery, you can carry them out swiftly when a real problem pops up. Testing also builds trust with clients, because it lets you show them that you don’t just think you can recover them; you know you can, and you can prove it. Seeing is believing, especially when it comes to a recovery.
2. What processes and equipment should I test?
Ideally you would test every backup thoroughly, but that is rarely possible when IT providers have so much to do in a day. Instead, think about which equipment is most crucial and how it fits within the whole infrastructure. Every failure is different: one server will have dependencies another does not, and some are far more critical to operations. Decide which equipment is essential and which is less so, then think about exactly how you’ll want to test each.
3. How do I intend to test?
Many backup solutions automatically verify the integrity of a backup image for each piece of equipment. This is something you can rely on to an extent, but it’s not a testing method that will allow you to really determine how quickly you can recover. Consider whether you ought to do something simple like a file and folder recovery to see that a backup image functions, or spin up a backup as a virtual machine to verify that it can actually boot up and run as a full VM. Testing methods depend on how crucial the equipment is and how difficult it would be to recover it if it failed. As we’ve noted, the circumstances of each failure will be different.
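For the simple file and folder recovery test, one possible check is to compare what you restored against a known-good copy. The sketch below assumes you have restored a folder to a scratch location and that the reference copy has not changed since the backup was taken; the function names are made up for illustration and are not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(reference_dir, restored_dir):
    """List (relative_path, problem) pairs for files that did not restore cleanly."""
    reference_dir, restored_dir = Path(reference_dir), Path(restored_dir)
    problems = []
    for ref in sorted(reference_dir.rglob("*")):
        if not ref.is_file():
            continue
        rel = ref.relative_to(reference_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(ref) != sha256_of(restored):
            problems.append((rel.as_posix(), "mismatch"))
    return problems
```

An empty list means every reference file came back with identical contents. This only shows that files can be restored; it says nothing about whether a full machine would boot, which is what the VM test is for.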
4. Who is ultimately responsible for testing clients’ backup and recovery plans?
If you’re an MSP, it’s likely you have a number of employees. It can be wise to dedicate one person to managing the who, what, when, and why of testing because it’s easy to lose track of which client has been tested thoroughly and which may have been neglected. It’s also possible that multiple people will be responsible for tests, but whatever the case, it’s important to make sure no client is being overlooked and that essential equipment is being tested thoroughly.
5. When was the last time I tested and when should I test again?
Answering this might involve consulting information you already have, such as client maintenance records (testing can be a part of these), or creating a simple spreadsheet that tracks which client was tested, how they were tested, when they were last tested, and what the results were. Whatever the case, keeping testing records is as important as documenting any other maintenance procedure, so be sure to carefully document what’s been tested and follow the strategy you create for testing in the future.
Get to it
As we noted in our recent ebook, the last time you want to test your mettle (and your backups) is during your first disaster. The more you can familiarize yourself with recovery processes and scenarios, the better equipped you’ll be to help clients avoid downtime, which should be a top priority for any savvy service provider. Once backup and recovery plans are in place, test them so you know recovery inside and out. Don’t be caught off guard. Be ready to recover.