Backup and data protection: The right mix

Backup is no longer the only data protection game in town – now, with replication, snapshots and CDP, you need to choose the right combination of approaches.

Does Doctor Who keep backups in the Tardis? The concept certainly wouldn't be strange to him because data protection is a journey in time. At one end there's the archive view of how your data looked years ago, in the middle is the backup of the system from last weekend, and at the other end is a copy of how it is right now.

With replication, snapshots, deduplication, continuous data protection (CDP) and more, few areas of information technology have changed as much as data protection. Where once the tape drive reigned supreme, there's now a sometimes confusing mass of different technologies jostling for recognition.

Backup software must now deal with different targets, including virtual tape, disk and Internet-attached, third-party remote storage. In addition, newer methods of data protection, such as replication, have expanded to answer calls for ever higher system availability and to deal with the challenges presented by the likes of server virtualisation.

For most users, replication and backup will be complementary, not competitive. That's because schemes such as replication and mirroring, which keep a second native copy of your system (backups tend to be in a proprietary format), can get your applications back online more quickly after a failure. However, if primary data is corrupted or deleted, the secondary copy will be as well.

By contrast, a backup or snapshot is taken at a point in time, so you can recover an older version of a file or system image from before it was damaged. Also in the backup camp is CDP, which logs all changes to the protected data as they occur, allowing you to roll back to almost any point in time.

Another relevant technology is deduplication, which can save space by removing repeated elements, whether from file-based backups or system image-based snapshots. Some data sets contain more duplication than others, but it's not unusual for users to report data reduction ratios of 10:1, 20:1 or more.

Deduplication has proven particularly effective when used on snapshots of virtual servers, as these typically have a lot of elements in common.
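To illustrate why snapshots with a lot in common deduplicate so well, here is a minimal sketch in Python. It assumes fixed-size 4 KB chunks identified by SHA-256 hashes; the function name `dedup_ratio`, the chunk size and the two toy "images" are hypothetical choices for illustration, not how any product mentioned in this article works.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return logical size divided by the size of the unique chunks kept."""
    if not data:
        return 1.0
    unique = {}  # chunk hash -> chunk length, stored once per distinct chunk
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return len(data) / sum(unique.values())


# Two hypothetical server images that share ten blocks and differ in one.
base = b"".join(bytes([i]) * 4096 for i in range(10))
image_a = base + b"A" * 4096
image_b = base + b"B" * 4096

# 22 blocks are written, but only 12 distinct blocks need to be stored.
ratio = dedup_ratio(image_a + image_b)
print(f"{ratio:.2f}:1")
```

The more images that share the same base blocks, the higher the ratio climbs, which is consistent with the effect reported for virtual server snapshots.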

"You need a range of different types of data protection, depending on what you're trying to do," explains Tony Lock, programme director at Freeform Dynamics Ltd., which is based in Hampshire, U.K.

"The first thing is to make sure you know what systems and data you have out there," he says, adding that virtual server sprawl is a growing issue. "Server virtualisation makes it easy to create new systems, but they might not be on your backup schedule."

The second thing to ask, he says, is "How important is the data to your business? This affects the frequency of backup you'll require and how quickly you need to recover. It's a balance between those two. And thirdly, how much can you afford to spend?"

Fast recovery is why London-based financial services company Matrix Group Limited uses Double-Take Software's Double-Take for VMware Infrastructure to protect its VMware systems by mirroring the virtual servers to other machines, according to group IT manager Laurence Duff.

"The whole point of Double-Take isn't elegant long-term protection. It's so that if we have a system failure we can be back up and running in 30 minutes," he says. "It's like having a spare car in the garage instead of having to go out and buy another one."

Only the Matrix Group's critical systems are replicated to its disaster recovery site, which is based at IT services provider Oncore IT's data and network operations centre. Its back-office systems are backed up overnight by Oncore IT using backup tools from Asigra Inc. Duff notes that for the Matrix Group, as with so many other businesses, where "critical" once meant the accounting system, it now means email, too.

Duff adds that his critical systems are also backed up online, in case his team needs to do a bare-metal restore or retrieve a deleted file. "We've had online backup for around four years. It's mainly for convenience and the speed of restores," he says.

Online backup is particularly useful for distributed and remote users, says Scott Kelly, IT helpdesk engineer at Coventry, U.K.-based flooring supplier Amtico International Limited, which recently signed up with Iron Mountain Digital for its Connected Backup for PC Web-based backup service.

More significantly, the online service means that Amtico's laptop-reliant sales reps and directors, as well as the staff in its smaller branch offices, don't need to connect backup drives or change tapes.

"Most of our sales reps aren't in the office regularly, so it's difficult to back up their data in the traditional manner," says Kelly. "The branch offices have their own backup servers. But if there are only four people with the minimum of local infrastructure, as there is in our Stockholm office, say, we'd go with Iron Mountain. It's a case of how much data they have to protect."

He adds that it's only recently that services like this have become a practical proposition. "Five years ago, you were lucky to get 256kbps at home which, while it's technically broadband, isn't exactly quick."

However, Kelly warns that there are still issues to consider before you commit to an online backup service. The biggest one is that, while most backups will be quite short (as they send only the changed data), a full system restore over the Web can be a big task.

"We can restore files here from tape within 15 minutes, but if you have to download a gigabyte folder over the Web [to rebuild a lost laptop] that can take a while," he explains. "You need enough bandwidth from your third party for restores."
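Kelly's point about restore bandwidth can be checked with simple arithmetic. The sketch below is a rough estimate only: it assumes a gigabyte of 10^9 bytes, a link running at its full rated speed and no protocol overhead. The 256kbps figure comes from the quote above, while the 8 Mbps link is a hypothetical comparison.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Best-case time to move size_bytes over a link of the given speed."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / link_bits_per_second


GIGABYTE = 10**9  # assumption: decimal gigabyte

slow = transfer_seconds(GIGABYTE, 256_000)    # 256kbps link
fast = transfer_seconds(GIGABYTE, 8_000_000)  # hypothetical 8 Mbps link

print(f"256kbps: {slow / 3600:.1f} hours")
print(f"8 Mbps:  {fast / 60:.1f} minutes")
```

Under these assumptions a gigabyte takes 31,250 seconds (well over eight hours) at 256kbps and 1,000 seconds at 8 Mbps, so the link available for restores matters at least as much as the link used for nightly incremental backups.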

Increasingly, backup -- whether local or online -- goes first and foremost to disk, with tape relegated to a second tier and used only when a long-term backup or archive is needed. A big part of this is speed of recovery, says Mark Parsons, IT server manager at the Wolverhampton City Primary Care Trust (PCT), which uses BakBone Software Inc.'s NetVault:Backup to back up first to disk and then to tape.

"Tape is still our long-term data protection, but the disk element allows recoveries to be much quicker. In fact, we're now looking at expanding our disk capacity with this year's budget," says Parsons. "We get eight days on the disk array, and we hope to get a month with the new kit."

He adds that backups are much faster -- even on the worst weekday nights, the daily backups now finish with four hours to spare.

Parsons says he'd like to use the new kit to store snapshots of the Wolverhampton City PCT's VMware servers, which are currently backed up traditionally via an agent running on each virtual machine. Snapshots allow a server to be imaged instantaneously, with the image then mounted and restarted when needed.

"We're also looking at using deduplication on the snapshots," he says. Wolverhampton City PCT replicates various key systems, such as its SQL database and Exchange servers, says Parsons, and it plans to use backup's close cousin, archiving, on its file servers to reduce the load on its main storage.

Archiving is another technique which, thanks to disk storage, has grown significantly in popularity. A key reason for this is the ability it brings to access archived data online.

Archiving has enabled London courier company Addison Lee Plc to use a disk-based NearPoint archive from Mimosa Systems Inc. to protect its email systems, says IT manager Paul Caney. As well as saving a copy of everything that comes in (to guard against accidental deletion), the archive allows older messages to be removed from Addison Lee's three Exchange servers, dramatically cutting storage and backup costs.

Caney had initially proposed a short-term fix of imposing a limit on everyone's mailbox size. "This approach just wouldn't work, since no one wanted to delete their emails," he says.

Users had also complained about their mailboxes running slowly, especially remote workers and those who had been with the company longest and therefore had larger mailboxes, often more than 2 GB in size. "Once a user's mailbox goes over a certain limit, Outlook keeps disconnecting from the Exchange server, leading to performance degradation," says Caney.

Archiving to disk rather than tape means the Mimosa store can be accessed seamlessly from Exchange, he says. "Archived items simply appear as another folder in the same familiar Outlook client. Because the archive is accessible by each employee, the administrative burden of recovering past messages is negligible."

As well as hardware, software and management savings, Addison Lee expects to shrink its Exchange server mailboxes from 75 GB to as little as 5 GB each. Caney says that should cut backup time from approximately 12 hours to just one hour; and because there's so much less to restore, recovering from server crashes will also be quicker.

So with online backups, disk-based archives, deduplicated snapshots and replicated virtual servers already here, what's likely to be the next big trend in data protection?

"We're starting to see CDP coming in; it's making a note of every time a file or database changes, so you can always rebuild the original. Essentially, it's a backup that's always up to date," says Lock at Freeform Dynamics.

Snapshots can do some of this, he says, but they need the system to be paused or frozen briefly to ensure that the image or copy is consistent and mountable. By contrast, CDP works at the file-system level.

"With CDP, it's just a change log, so you should always be able to come back to a consistent copy -- in theory, at least," he explains. However, he warns that "in practice, it might not quite be ready for some complex database applications."
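The change-log idea Lock describes can be sketched in a few lines of Python. This is a toy, in-memory illustration under loud assumptions: the `ChangeLog` class, its method names and the integer timestamps are all hypothetical, and a real CDP product would also have to deal with durability, ordering across volumes and application consistency, which is where the caveat about complex databases comes in.

```python
from typing import Dict, List, Optional, Tuple


class ChangeLog:
    """Toy change journal: every write is appended, nothing is overwritten."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, str, Optional[bytes]]] = []

    def record(self, timestamp: int, path: str, content: Optional[bytes]) -> None:
        """Append a change; content=None marks a deletion."""
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        self._entries.append((timestamp, path, content))

    def state_at(self, timestamp: int) -> Dict[str, bytes]:
        """Rebuild the protected files as they were at the given time."""
        state: Dict[str, bytes] = {}
        for ts, path, content in self._entries:
            if ts > timestamp:
                break  # later changes are ignored, which is the roll-back
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
        return state


log = ChangeLog()
log.record(1, "report.doc", b"good version")
log.record(2, "report.doc", b"corrupted")
log.record(3, "report.doc", None)  # file deleted

before_damage = log.state_at(1)
latest = log.state_at(3)
print(before_damage, latest)
```

Replaying the log up to time 1 recovers the file from before it was corrupted, whereas a replica or mirror would hold only the latest state, in which the file has been deleted.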
