Only one way to know how restores will perform: Complete them

Backups are only done for recovery, but when we are measured against a backup success rate, it's easy to lose sight of that. Most backup environments are designed to write as much data to tape as possible in as short a time as possible. But will a backup that takes 10 hours to complete take 10 hours to restore?

It's impossible to provide a generic answer to that question. Some jobs will restore much faster than it took to back them up; others will take much longer. When a restore is taking longer than you expected, things tend to get a little hot under the collar. Recently, while at a customer site, I witnessed a backup that took eight hours to complete and two days to recover. The backup administrator was not the man of the moment!

The restore target server, the network and disk it is attached to, the backup software, the operating systems involved, the backup strategy, the tape, the drive and the type of data all play a part in determining not only backup speed but also restore speed. The only sure way of knowing how your restores will perform is to complete the restores.
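As a rough illustration of what "completing the restore" gives you, the sketch below times a test restore and compares it with the backup duration. The function names are hypothetical and nothing here is tied to a particular backup product; it assumes only that a restore can be started from a command line.

```python
import subprocess
import time


def time_command(command):
    """Run a command and return the wall-clock time it took, in hours."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return (time.perf_counter() - start) / 3600.0


def restore_ratio(backup_hours, restore_hours):
    """Return how many times longer the restore took than the backup."""
    if backup_hours <= 0:
        raise ValueError("backup_hours must be positive")
    return restore_hours / backup_hours


if __name__ == "__main__":
    # The customer example above: an eight-hour backup that took
    # two days (48 hours) to recover.
    print(f"restore took {restore_ratio(8, 48):.1f}x the backup time")
```

To measure a real test restore, pass time_command the argument list for whatever restore command your backup software provides, then feed the result into restore_ratio alongside the recorded backup duration.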

Regular restore testing is a good idea. I doubt anyone would argue with that in principle, but when was the last time you performed a full system restore? In many organisations the answer is worrying, and there are several reasons for it.

First, backup administrators are often overworked and simply don't have the time to perform restore tests routinely and systematically. They are too busy trying to achieve the 99.5% backup success rate that management said they had to maintain.

Secondly, test restores require test hardware. Test hardware that provides no direct return on investment is not always easy to find. Take it one step further: to truly test your restore performance, you need to test on hardware that matches, or nearly matches, your production environment. That includes disk of the same tier, extra NICs, HBAs, switch ports, etc. All of this hardware is an expense, and obtaining it can depend on the storage and server teams being willing to help.

Thirdly, several organisations send their media offsite to a remote location, and retrieving tapes can involve third parties or travel, adding cost and time. Furthermore, as stated above, backups are only done for recovery, not just restore. It is not sufficient to simply restore if what is being restored isn't good enough. A recovery isn't complete until the application owners have tested it and proved that it works as expected, and that takes more commitment from people who are already overworked.

For the above reasons, regular restore testing is often not high on the agenda, and if managerial buy-in to the process does not exist, it simply will not happen. So how can you protect yourself from being caught in a restore trap if the willpower is not there? Internal auditors are always a good start, and quoting regulatory rules can help. However, obtaining sufficient hardware and human resources requires commitment from the top down.

Putting together the necessary components so that you can regularly perform recovery tests on all your applications sounds like an ordeal in itself. However, when the heat is on and your restore looks like it won't complete until a week on Tuesday, you might find that the buck has stopped with you, and when that happens you will wish that you had shouted louder.

About the author: David Boyd is a senior consultant at GlassHouse Technologies (UK), a global provider of IT infrastructure services. He has more than seven years' experience in backup and storage, with a major focus on designing and implementing backup solutions for blue chip companies.
