What you will learn from this tip: Many experts say CDP will replace traditional backup. But before you take the plunge into CDP, here are some key points to consider.
At some point you will find yourself considering a continuous data protection (CDP) product for critical application data. CDP promises a recovery point objective (RPO) of essentially zero data loss and a recovery time objective (RTO) of essentially instant recovery. Defined another way, CDP is a time-stamped backup stored on secondary disk. The appeal of CDP is its ability to quickly rewind applications to any point in time to find a consistent image of the data.
Don't confuse CDP with mirroring or "fine-grain" snapshots. Mirroring provides data protection only from hardware failures. If data is corrupted or deleted on the primary system, it will be on the mirrored copy, too. CDP provides protection from both hardware and data failures. Snapshot products capture changes as points in time, with every snapshot checked for consistency before the next one is taken. There are time gaps between snapshots, but CDP products capture changes continuously -- without any gaps or missing data.
CDP avoids gaps because it captures every file, block or table change as it occurs. And while it's possible to restore data to any point in time with CDP, it's often hard to determine which point in time to rewind to. Many CDP products aren't able to identify the most recent point in time when the data was verifiably consistent.
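To make "any point in time" concrete, here is a minimal sketch that assumes a CDP journal is nothing more than a list of time-stamped changes. The `Change` record and the `state_at` function are hypothetical names chosen for illustration; they aren't drawn from any vendor's product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One journaled write (hypothetical record layout)."""
    timestamp: float      # when the write happened
    key: str              # the file, block or table row that changed
    value: Optional[str]  # new contents; None means the item was deleted


def state_at(journal: list[Change], point_in_time: float) -> dict[str, str]:
    """Rebuild the data as it looked at point_in_time by replaying the journal."""
    state: dict[str, str] = {}
    for change in sorted(journal, key=lambda c: c.timestamp):
        if change.timestamp > point_in_time:
            break  # later changes are ignored, which is the "rewind"
        if change.value is None:
            state.pop(change.key, None)
        else:
            state[change.key] = change.value
    return state


journal = [
    Change(1.0, "orders.db", "v1"),
    Change(2.0, "orders.db", "v2"),
    Change(3.0, "orders.db", "corrupted"),
]
print(state_at(journal, 2.5))  # {'orders.db': 'v2'}
```

The replay can land on any timestamp, but nothing in this journal says whether the data at that timestamp was consistent. That is the gap the rest of this tip deals with.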
This can make CDP restores a trial-and-error process; an administrator must guess a point in time to restore from. If the guess turns out to be a recovery point after the corruption occurred, the data must be recovered again from an earlier point in time -- greatly increasing the recovery time. If the administrator plays it safe and chooses a recovery point too far back before the known corruption, the CDP recovery time can be worse than with current backup methods. This guessing game can nullify the fast recovery CDP is supposed to provide.
Five CDP vendors -- Asempra Technologies, Asigra Inc., Mendocino Software, Atempo/Storactive and XOsoft Inc. -- offer a feature known as enhanced recovery management to address this problem. Mendocino (including its OEM vendors EMC Corp. and Hewlett-Packard [HP] Co.) inserts event markers into the collection process, monitored by a policy engine. Making CDP aware of business processes and events (such as the quarterly close) eliminates the guesswork in finding a consistent recovery point.
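The event-marker idea can be sketched as follows, assuming markers are simply labeled timestamps recorded at moments when the data is known to be consistent. `Marker` and `last_marker_before` are hypothetical names used for illustration, not Mendocino's or any other vendor's API.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Marker:
    timestamp: float
    label: str  # the business event, e.g. "quarterly close"


def last_marker_before(markers: list[Marker],
                       corruption_time: float) -> Optional[Marker]:
    """Return the most recent marker strictly before corruption_time, if any."""
    ordered = sorted(markers, key=lambda m: m.timestamp)
    times = [m.timestamp for m in ordered]
    # bisect_left gives the number of markers strictly earlier than the corruption
    index = bisect_left(times, corruption_time)
    return ordered[index - 1] if index > 0 else None


markers = [
    Marker(100.0, "nightly batch complete"),
    Marker(200.0, "quarterly close"),
    Marker(300.0, "nightly batch complete"),
]
target = last_marker_before(markers, corruption_time=250.0)
print(target.label if target else "no marker available")  # quarterly close
```

With markers in place, the administrator picks a recovery point by lookup rather than by guessing and re-restoring.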
XOsoft's CDP continuously captures application- or database-specific writes and update events. XOsoft creates a journal entry for each write and event for recovery purposes. When corruptions occur, damaged data is rewound to the last consistent state. Because only changes since the latest consistent state are rolled back from the journal, recovery is fast. Storactive's CDP is specific to Windows environments (Microsoft Exchange and file systems) and uses technology similar to XOsoft's.
Asempra's CDP is transaction-aware and application-specific. Asempra's CDP technology communicates directly with Exchange, SQL Server and Windows file systems (including CIFS). Before forwarding a transaction to the recovery server, it checks the integrity of all of the data. This allows data corruption to be detected as it occurs, providing a marker for determining the best recovery point.
Asigra's CDP methodology is a two-stage continuous backup that agentlessly backs up any changes on Windows servers to a local collector as they occur. The local collector aggregates the changes, deduplicates, compresses, encrypts and then sends them to the central collector. The central collector automatically checks and verifies data for consistency and recoverability. If it determines a file can't be recovered, it automatically asks for it again from the local collector. This provides a known consistency point for all recoveries, again providing a quick recovery time.
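The check-and-re-request loop between the two collectors might look like the sketch below. It assumes a SHA-256 digest travels with each compressed payload. The class and function names are hypothetical, deduplication is reduced to skipping content that was already sent, and encryption is left out because Python's standard library has no cipher. It is not a description of Asigra's actual protocol.

```python
import hashlib
import zlib
from typing import Optional


class LocalCollector:
    """Holds changed files and packages them for the central collector."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}
        self.sent_digests: set[str] = set()

    def record_change(self, name: str, data: bytes) -> None:
        self.files[name] = data

    def package(self, name: str, force: bool = False) -> Optional[tuple[str, bytes]]:
        """Return (digest, compressed payload), or None for duplicate content."""
        data = self.files[name]
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.sent_digests and not force:
            return None  # deduplicated: identical content was already sent
        self.sent_digests.add(digest)
        return digest, zlib.compress(data)


def receive(local: LocalCollector, name: str,
            package: tuple[str, bytes], max_retries: int = 3) -> bytes:
    """Central-collector side: verify the payload, re-request it on failure."""
    for _ in range(max_retries + 1):
        digest, payload = package
        try:
            data = zlib.decompress(payload)
            if hashlib.sha256(data).hexdigest() == digest:
                return data  # verified as recoverable
        except zlib.error:
            pass  # payload was damaged in transit
        package = local.package(name, force=True)  # ask for the file again
    raise RuntimeError(f"{name} could not be verified")


local = LocalCollector()
local.record_change("mail.edb", b"mailbox data")
digest, payload = local.package("mail.edb")
damaged = (digest, payload[:-1])  # simulate a truncated transfer
print(receive(local, "mail.edb", damaged))  # b'mailbox data'
```

Because every stored copy has passed verification, any copy the central collector holds is a usable recovery point.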
Asempra, Asigra, Storactive and XOsoft also allow an application (such as Exchange) to be recovered first, and to be up, running and writing transactions even if all the data, mailboxes and transactions haven't been recovered. As the data is being recovered, the CDP system continues to protect the live application that's running.
CDP and Microsoft Exchange
CDP is designed to provide the highest level of data protection for applications that can't afford to lose any data, such as database management systems, point-of-sale systems, financial transaction systems and email.
The main application driving CDP deployment is Exchange, which is extraordinarily difficult to restore with most data-protection applications. Restoring Exchange is complex and frustrating, and it can be very time-consuming, depending on the number of mailboxes and messages needing restoration. These are the typical steps to restore Exchange:
- Apply last full backup.
- Apply transaction logs (if available).
- Restore messages and transactions to each individual mailbox; this is time-consuming and often skipped, leaving a lot of data that's never restored.
- During the restoration process, Exchange is down and a temporary server is required.
CDP is ideal for restoring Exchange; recoveries are painless and very fast. It first rewinds Exchange back to the last known consistency point and gets it running in seconds or minutes. After Exchange is back up, the CDP application allows point-and-click restoration of messages and transactions back to individual mailboxes.
Evaluating competing CDP products
When evaluating a CDP product, you want to be certain that it can resolve consistency issues. Other capabilities to look for in a CDP product include:
- Support for your server operating systems and critical applications.
- Ability to scale to accommodate three years of data growth.
- Data-retention policies that match your organization's primary data-retention policies.
- Elimination of protected data based on time or policy rules while providing digital certification.
- Capability to automatically roll up older data copies into aggregated master copies, reducing storage space and restore time.
- Ability to encrypt CDP data.
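As a rough illustration of the roll-up capability in the list above, the sketch below collapses journal entries older than a cutoff into one entry per item and leaves newer entries untouched. The tuple layout and the `roll_up` name are assumptions made for this example, not a specific product's behavior.

```python
def roll_up(journal, cutoff):
    """Collapse entries older than cutoff into one "master" entry per key.

    journal: list of (timestamp, key, value) tuples, in any order.
    Returns a new time-ordered journal. Points in time before the cutoff
    can no longer be restored individually.
    """
    master = {}  # key -> latest entry older than the cutoff
    recent = []  # entries at or after the cutoff, kept as they are
    for entry in sorted(journal, key=lambda e: e[0]):
        timestamp, key, _value = entry
        if timestamp < cutoff:
            master[key] = entry  # a later entry overwrites an earlier one
        else:
            recent.append(entry)
    return sorted(master.values(), key=lambda e: e[0]) + recent


journal = [
    (1, "a.doc", "v1"),
    (2, "a.doc", "v2"),
    (3, "b.doc", "v1"),
    (10, "a.doc", "v3"),
]
print(roll_up(journal, cutoff=5))
# [(2, 'a.doc', 'v2'), (3, 'b.doc', 'v1'), (10, 'a.doc', 'v3')]
```

The trade-off is visible in the output: there are fewer entries to store and replay, but the v1 state of a.doc can no longer be recovered.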
The best place to roll out CDP is with Microsoft Exchange. Once CDP for Exchange is successfully deployed, look to expand its use throughout your storage environment to simplify data protection.
About the author: Marc Staimer is president and CDS of Dragon Slayer Consulting in Beaverton, Oregon. He is widely known as one of the leading storage market analysts in the network storage and storage management industries. His consulting practice of six-plus years serves the end-user and vendor communities.