By Danny Bradbury
Remote replication is a well-established practice that can replace traditional backup and enhance disaster recovery planning. But remote replication can be deployed in a variety of ways, so what data replication methods are available?
Remote replication copies data to a secondary site as part of a disaster recovery plan; it traditionally involved backing up application data, but it is now possible to replicate entire virtual machines too. This can be useful to maintain server images with the latest configuration, including operating system and application security patches that are all set to be made live in case of a serious outage at the primary site.
Broadly speaking, remote replication breaks down into two main types: synchronous and asynchronous. Synchronous replication copies data to a secondary site (or sites) as soon as the data is created. Asynchronous replication copies the data after the fact rather than in real time.
The advantage of synchronous replication is that the secondary copy is never behind the primary, which all but eliminates the risk of losing data if the primary site fails. The downside is that it requires low-latency communication, because the secondary site must confirm that each write has been received without error before the application can proceed. The farther the secondary site is from the primary, the harder that is to accomplish.
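To put a rough number on the distance problem, the sketch below estimates the minimum delay that synchronous replication adds to each write. It assumes that light in optical fibre travels about 200 km per millisecond and that each write needs one round trip for its acknowledgement. The function name and the default values are illustrative, and real links add switching and protocol overhead on top.

```python
# Rough lower bound on the latency synchronous replication adds per write.
# Assumption: light in optical fibre covers roughly 200 km per millisecond,
# and each acknowledged write costs at least one full round trip.

FIBRE_KM_PER_MS = 200.0  # approximate propagation speed in fibre


def min_sync_write_penalty_ms(distance_km: float, round_trips: int = 1) -> float:
    """Minimum extra latency per write, ignoring equipment and protocol overhead."""
    if distance_km < 0 or round_trips < 1:
        raise ValueError("distance must be >= 0 and round_trips >= 1")
    # Each round trip covers the distance twice: out and back.
    return round_trips * 2 * distance_km / FIBRE_KM_PER_MS


for km in (10, 100, 1000):
    print(f"{km} km: at least {min_sync_write_penalty_ms(km):.2f} ms per write")
```

On these assumptions a site 100 km away adds at least 1 ms to every write, and one 1,000 km away adds at least 10 ms, which is why synchronous replication is usually kept to relatively short distances.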
Asynchronous replication will theoretically work over any distance, but the risk of the secondary site falling behind the primary grows as the distance between sites increases. Asynchronous replication also risks data loss if the primary site goes down between replication sessions. There are some workarounds, however. These include multi-hop replication, in which an intermediary SAN within a viable distance is written to using synchronous replication, followed by asynchronous replication to a secondary site much farther away.
Another alternative is enhanced physical protection at the local site. For example, Axxana sells a replication appliance hardened to withstand heat, pressure and water. In the event of a disaster, it retains unreplicated data until the owner is ready to recover it. Axxana claims this brings the benefits of synchronous replication to asynchronous networks.
Beyond these two broad models of replication, however, there are other categories. In addition to simply copying data from one site to another, companies can also implement continuous data protection (CDP). CDP tools are much like journalling products. They copy each change to a data set across the WAN rather than copying the data itself. That makes it possible to roll back to any single point in time.
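The journalling idea can be sketched in a few lines of Python. This is a toy model rather than a description of how any CDP product works: the `ChangeJournal` class, its methods and the integer timestamps are all hypothetical, and writes are assumed to arrive in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Toy CDP journal: records every write so any past state can be rebuilt."""

    entries: list = field(default_factory=list)  # (timestamp, key, value) tuples

    def record(self, timestamp: int, key: str, value: str) -> None:
        # Assumption: writes are recorded in non-decreasing timestamp order.
        self.entries.append((timestamp, key, value))

    def state_at(self, timestamp: int) -> dict:
        """Replay every change up to and including `timestamp`."""
        state = {}
        for ts, key, value in self.entries:
            if ts > timestamp:
                break  # later changes are ignored, which is the rollback
            state[key] = value
        return state


journal = ChangeJournal()
journal.record(1, "invoice", "draft")
journal.record(2, "customer", "ACME")
journal.record(3, "invoice", "final")

print(journal.state_at(2))  # state before the invoice was finalised
print(journal.state_at(3))  # latest state
```

Because only the changes cross the WAN, the secondary site can rebuild the data set as it stood at any recorded moment by replaying the journal up to that point.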
Snapshots provide a less functional solution (but again, one that is more workable across long distances). An individual snapshot of a data set is taken at set intervals and sent to the secondary site. This often gives a less granular set of rollback options. Snapshotting can copy the entire data set, but doing so can be bandwidth-intensive. The alternative is to copy the delta between the last snapshot and the next one, capturing only the changed data.
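The delta approach can be illustrated with a minimal sketch in which a snapshot is modelled as a dictionary mapping block numbers to block contents. The function names and this representation are assumptions made for the example; real snapshot tools track changed blocks in their own ways.

```python
def snapshot_delta(previous: dict, current: dict) -> tuple:
    """Return (changed_blocks, deleted_block_ids) between two snapshots."""
    changed = {
        block_id: data
        for block_id, data in current.items()
        if block_id not in previous or previous[block_id] != data
    }
    deleted = set(previous) - set(current)
    return changed, deleted


def apply_delta(base: dict, changed: dict, deleted: set) -> dict:
    """Rebuild the newer snapshot at the secondary site from the older one."""
    result = {k: v for k, v in base.items() if k not in deleted}
    result.update(changed)
    return result


monday = {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}
tuesday = {0: b"aaaa", 1: b"BBBB", 3: b"dddd"}

changed, deleted = snapshot_delta(monday, tuesday)
# Only blocks 1 and 3 need to cross the WAN; block 2 is flagged as deleted.
assert apply_delta(monday, changed, deleted) == tuesday
```

Only the changed and new blocks travel to the secondary site, along with the identifiers of any deleted blocks, rather than the whole data set.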
These different approaches are available in multiple data replication methods, ranging from host-based to array-based to network-based replication.
Host-based replication is specific to a particular server. Tools in this class include Vision Solutions' Double-Take RecoverNow, which offers functionality including CDP. The downside of these solutions is that they can be difficult to scale and can cause orchestration problems if implemented piecemeal inside a large organisation. As Clive Longbottom, co-founder of analyst group Quocirca, said, "Individually replicating 20 Windows 2003 servers can lead to a lot of data duplication. But for single-server branch-office situations, such solutions can be an option."
Host-based replication can be a good way to improve service-level agreements at key points in an organisation. Martin Clarkson, technology lead at integrator Computacenter, recalled a financial services client that relied on array-based replication in which an entire storage array was copied to a remote site at the block level. One division, he said, needed faster recovery options than array-based replication could offer.
"That division's data still sat in the same data centre as the rest of the organisation, but its systems were replicated using host-based tools. That made them available within minutes, whereas using the normal array-based replication, it would have to wait for its place in the queue," he said.
Array-based replication tools, such as EMC's SRDF (Symmetrix Remote Data Facility) and NetApp's SnapMirror, do have their advantages, however. They replicate a whole storage array at once, which can make the replication process easier to manage. The downside of this data replication method is that these tools are often vendor-specific, which reduces customer choice when buying equipment.
The third option is network-based replication, generally using an appliance that sits at the edge of the network. These tools, such as EMC's RecoverPoint, have the advantage of being able to manage heterogeneous arrays and servers. The other advantage of this data replication method is that it makes it easier to orchestrate replication policies that take multiple arrays and servers into account.
Handling replication at this level also makes data deduplication tasks easier. Dieter Orth, operational improvement manager with the UK arm of mortgage asset management firm GMAC-RFC, said his team used deduplication appliances from Data Domain (now part of EMC) to replicate data faster to a DR site.
In early 2008 Orth found himself backing up 10 TB of data to tape, which would then be transported to a disaster recovery data centre 15 miles away. The goal was to complete the backups overnight by the start of the next working day.
"Backups to tape were running slowly, and we had lots of failures and frequently missed our target to move them off-site," he said, recalling that when the company moved its offices and data centre to a new site it had the opportunity to rethink its architecture.
The team opted for an asynchronous replication solution that worked from a local backup. The team takes a full local backup from its 70 TB SAN every Friday, followed by incremental ones throughout the week. "We do the backups on a daily basis in the evening or overnight. Once it's been backed up to disk, then the Data Domain device at the disaster recovery site talks to the other Data Domain device, and, with a 10-minute delay, the data is at the disaster recovery site," Orth said.
Thanks to deduplication, the team is able to support the 70 TB SAN along with hundreds of servers over a 1 Gbps leased line from BT. "When the Data Domain device replicates, we only copy a 10th of what we have as data," he said, adding that even so, the deduplicated data load has risen to 15 TB in three years.
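A back-of-the-envelope calculation shows why a tenfold reduction matters on a 1 Gbps line. The sketch below assumes decimal terabytes and a fully utilised link with no protocol overhead, and the function name is illustrative. The figures show the effect of the ratio only; they do not describe the team's actual nightly transfers, which are mostly incremental.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_tb` decimal terabytes over a `link_gbps` link.

    `efficiency` is the usable fraction of the line rate (1.0 = ideal link).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


full = transfer_hours(70, 1)      # the whole 70 TB SAN, undeduplicated
reduced = transfer_hours(7, 1)    # one tenth of that volume

print(f"70 TB: about {full:.0f} hours; 7 TB: about {reduced:.0f} hours")
```

On these assumptions, sending 70 TB in full would occupy the line for roughly 156 hours, or more than six days, while a tenth of that volume takes under 16 hours.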
Orth's experience shows the importance of layering additional functions into the replication process. Data deduplication at the point of replication can save on bandwidth. Perhaps even more importantly, it can enable companies to fulfil replication requirements within time windows that are increasingly tight as data loads increase and as globalised workforces require access to systems at different times of the day.
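As a generic illustration of how deduplication at the point of replication saves bandwidth, the sketch below splits data into fixed-size chunks and sends only those whose hash the remote side has not already seen. It is not a description of how Data Domain or any other product works: the function name, the 4 KB chunk size and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


def chunks_to_send(data: bytes, known_hashes: set, chunk_size: int = 4096) -> list:
    """Return (digest, chunk) pairs that the remote site does not yet hold.

    `known_hashes` stands in for the remote site's index of stored chunks and
    is updated in place as new chunks are queued for sending.
    """
    to_send = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            to_send.append((digest, chunk))
    return to_send


remote_index = set()
backup = b"A" * 4096 * 3 + b"B" * 4096  # four chunks, only two of them distinct

first_run = chunks_to_send(backup, remote_index)
second_run = chunks_to_send(backup, remote_index)
print(len(first_run), len(second_run))  # distinct chunks sent on each run
```

The first run sends only the two distinct chunks, and a repeat of the same backup sends nothing, which is the effect that shrinks the volume crossing the WAN.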