Cloud DR: Key choices in cloud disaster recovery

Flexibility and low cost make the cloud well-suited to disaster recovery, but there is no one-size-fits-all route to cloud disaster recovery. We look at the key choices.

This article can also be found in the Premium Editorial Download: Computer Weekly: Can China’s Alibaba successfully take on the US public cloud giants?

Only the luckiest IT professionals will go through their careers without experiencing a major systems failure, outage or other incident that forces them to invoke a disaster recovery (DR) plan.

According to Gartner’s 2017 Security and Risk Survey, 80% of IT leaders experienced such an incident in the past two years.

As businesses become less tolerant of outages and demand quicker recovery times, firms small and large have increased spend on disaster recovery and business continuity – and much of that spend is moving to cloud disaster recovery, with 23% of companies now saying the cloud is their preferred DR location.

Organisations often already use cloud-based archiving or cloud-based backup and recovery services. These might be dedicated services, bundled with existing backup software, or set up by local IT teams with generic, public cloud storage. However, true disaster recovery goes beyond backup and archiving.

In a DR scenario, the cloud environment needs to host the operating system, application stack and data. Typically, businesses do this by copying virtual machines (VMs) to the cloud, and IT can then spin up these VMs if local systems fail.

With a well-planned cloud DR system, staff and customers might notice little, if any, service interruption during an outage. Critically, though, firms need to keep their cloud data copies in sync with primary systems, and plan for how to restore to a local computing environment after the incident.

Making in-cloud DR work

The broad scope of suppliers, systems integrators and resellers that offer a cloud-based option for disaster recovery means there is no single, standardised approach.

As with local disaster recovery systems, the starting point is usually server replication and data synchronisation.

Depending on the size of the customer’s IT environment, a software agent might perform this task or it might be done through an appliance or a local staging drive, typically a NAS box that is then shipped to the cloud provider’s datacentre.

Some suppliers offer a straightforward replication of local IT to the cloud – an approach best suited to datacentre environments.

Others, typically services with their origins in data backup, copy deduplicated data. Using deduplicated data can put performance limitations on the replicated cloud environment. As a result, some suppliers – such as Druva – use both deduplicated and non-deduplicated images in a recovery scenario.
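To illustrate why deduplicated copies can slow recovery, here is a minimal Python sketch of block-level deduplication. It assumes fixed-size chunks and SHA-256 fingerprints, and the function names are hypothetical; it is not a description of how Druva or any other supplier implements the technique. The point is that a deduplicated image must be reassembled ("rehydrated") from its unique chunks before a VM can boot from it.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep each unique chunk once.

    Returns (store, recipe): store maps a chunk fingerprint to the chunk
    bytes, and recipe is the ordered list of fingerprints needed to
    rebuild the original data. Fixed-size chunking is an assumption made
    for simplicity.
    """
    store = {}
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # store only the first copy
        recipe.append(fingerprint)
    return store, recipe


def rehydrate(store, recipe) -> bytes:
    """Rebuild the original data by looking up every chunk in order."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

Less data is stored and sent over the wire, but every restore pays the cost of the lookups in rehydrate, which is one reason a supplier might also keep a ready-to-run, non-deduplicated image.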

Regardless of the technical approach to backup, once data replication is complete, locally based DR software then syncs any changes to local data. If the client’s IT team invokes the DR plan, the cloned server becomes the live server.

But suppliers are not the only route to cloud-based DR, as larger enterprises could consider building their own cloud DR environment that uses either public or private cloud storage.

Whether this is viable will depend on the capacity and skills of the IT team, the types of systems and workloads the business runs, and the extent to which the local IT environment is – or can be – virtualised. Private cloud-based DR systems will share many of the same limitations as commercial environments.

Cloud is not for all

Despite its growing popularity among CIOs, cloud disaster recovery does have some limitations. These are technical, practical and financial.

Cloud-based DR will not handle all legacy, custom or bespoke applications. Mainframe and Unix environments are not transferable “as-is” to a virtual machine in the cloud, although data backup is possible. There are providers that can offer recovery options for legacy applications, but the choice is much more limited than for modern, virtualised workloads.

Nor is cloud-based DR well-suited to very data-centric systems – from video storage to large analytics applications. Big data projects are especially hard to accommodate in the cloud, as they involve large data sets which also change frequently. Systems that use local input/output (I/O), such as manufacturing or control systems, might also be hard to recover to the cloud.

The greatest limitations, though, may well be operational, as organisations need to move their operating environments and data to the cloud.

This requires good IT housekeeping, including a good understanding of where data is stored. It also requires time. Some suppliers will “seed” systems by copying data to physical media or through a conventional, on-site, duplicated server. Few providers currently support data ingest to a cloud environment from tape.

Bandwidth considerations might force CIOs to pick suppliers that use seeding or an appliance, to look instead at a private cloud product, or to abandon cloud-based DR altogether.

Bandwidth from the local premises is one factor – another is the bandwidth into a public cloud service. Private clouds do at least give firms the option of picking locations with sufficient bandwidth, even when that comes at the cost of more local management of the DR process. Copying terabytes of data to the cloud can take weeks, and will affect the rest of the business’s IT operations.
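A back-of-the-envelope calculation shows how an initial copy can stretch to weeks. The Python sketch below is only an estimate under stated assumptions: decimal terabytes, a link speed in megabits per second, and a hypothetical utilisation factor for the share of the link the business can spare for replication. It ignores protocol overhead, compression and throttling by the provider.

```python
def transfer_days(data_tb: float, link_mbps: float, utilisation: float = 0.5) -> float:
    """Estimate the days needed to copy data_tb terabytes over a link.

    utilisation is the assumed fraction of the link available for the
    copy (0 < utilisation <= 1); the default of 0.5 is illustrative.
    """
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")

    total_bits = data_tb * 1e12 * 8               # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilisation
    seconds = total_bits / usable_bits_per_second
    return seconds / 86_400                       # seconds in a day
```

For example, transfer_days(20, 100) works out at roughly 37 days for 20TB over a 100Mbps link at half utilisation. The same function can be applied to the return journey, using the full data set rather than incremental changes.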

But the real bandwidth issues arise when restoring local systems. At present, few cloud-based providers can copy files back to a physical drive once the client has invoked the DR plan. Extracting data from the cloud is likely to be a slow process as the company will need all its data back, not just incremental updates.

It is important to note that cloud DR systems work to the level of the virtual machine. Firms will be able to run systems in the cloud after the incident, but they still need to make arrangements to restore systems and data, and in the case of a physical disaster, obtain new datacentre locations and hardware to run them.

Storing large volumes of data in the cloud for long periods of time becomes expensive, and businesses also need to allow for usage charges to run cloud-based VMs. It is easy to underestimate these costs, not least because few companies will know exactly how long they will need to recover their IT systems until they try.
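The two cost components described here, ongoing storage and usage charges while recovered VMs are running, can be separated in a simple model. The Python sketch below is an assumption-laden illustration: the parameter names are hypothetical, any prices passed in are placeholders rather than real provider rates, and it leaves out data egress, licensing and support charges.

```python
def dr_cost(storage_tb: float, storage_price_per_tb_month: float, months: float,
            vm_count: int, vm_price_per_hour: float, failover_hours: float):
    """Return (standby_cost, failover_cost) for a cloud DR set-up.

    standby_cost covers keeping the replicated data in cloud storage;
    failover_cost covers running the recovered VMs for failover_hours.
    All prices are caller-supplied placeholders.
    """
    standby_cost = storage_tb * storage_price_per_tb_month * months
    failover_cost = vm_count * vm_price_per_hour * failover_hours
    return standby_cost, failover_cost


def failover_cost_range(vm_count: int, vm_price_per_hour: float, hours_options):
    """Show how the failover bill grows with the length of the recovery."""
    return {hours: vm_count * vm_price_per_hour * hours for hours in hours_options}
```

Running failover_cost_range with, say, three days and 30 days of failover makes the uncertainty visible: the failover bill scales linearly with a recovery duration that few companies can predict in advance.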

A question of measurement

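Conventional IT disaster recovery is measured by two well-understood metrics: recovery point objective (RPO), the maximum acceptable amount of data loss expressed as a period of time, and recovery time objective (RTO), the maximum acceptable time taken to restore service.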
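As a rough illustration of how the two metrics are checked after a DR test, the Python sketch below compares achieved values with the objectives. The class and function names are hypothetical, and it assumes the achieved recovery point is the gap between the last successful sync and the outage, and the achieved recovery time is the gap between the outage and service being restored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrTestResult:
    last_sync: datetime         # last successful replication to the cloud
    outage_start: datetime      # moment the primary systems failed
    service_restored: datetime  # moment the cloud copy was serving users


def achieved_rpo(result: DrTestResult) -> timedelta:
    """Data written after the last sync is lost, so this gap is the data loss window."""
    return result.outage_start - result.last_sync


def achieved_rto(result: DrTestResult) -> timedelta:
    """Time users were without service."""
    return result.service_restored - result.outage_start


def meets_objectives(result: DrTestResult, rpo: timedelta, rto: timedelta) -> bool:
    """True only if both the recovery point and recovery time objectives were met."""
    return achieved_rpo(result) <= rpo and achieved_rto(result) <= rto
```

A test in which the last sync ran 15 minutes before the outage and service returned three hours later would meet a one-hour RPO and four-hour RTO, but would fail a five-minute RPO.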

Suppliers of on-premises or datacentre-based DR systems have had many years to calculate what is required to meet clients’ needs for how much data they recover, and how quickly.

Cloud service providers might not offer service-level agreements (SLAs) that are clear enough for clients to calculate whether the service can meet their DR needs, and they might not be able to report the readiness of recovery systems at any given point in time. Nor is it always possible to test cloud-based DR environments at will.

Against this, though, is the increasing sophistication of disaster recovery as a service (DRaaS), and the potentially lower costs of DR in the cloud. Like insurance, companies hope they never have to call on DR. And at least with cloud-based DR, costs are relatively low unless the worst does happen.

Cloud disaster recovery and DRaaS providers

Acronis: Acronis Disaster Recovery Service, formerly nScaled DRaaS, is an appliance-based DR system that replicates data to the cloud. It allows file, database and system-level recovery.

Bluelock: Tailored and managed DRaaS services, with managed recovery.

Carbonite: Tiered data protection, including disaster recovery and high availability (business continuity) services, as well as simple in-cloud backup and plans for small businesses.

CommVault: Automated disaster recovery for on-site, datacentre and cloud environments. Failover and failback, as well as recovery testing, from an established DR supplier.

Iland: DRaaS from a secure hosting provider, working with technology from Veeam, Zerto and VMware. Ability to protect physical and virtual environments. Offers DIY testing.

Recovery Point: Mainframe and mid-range DRaaS including IBM mainframe support, as well as conventional disk-to-disk and disk-to-tape for archival and secondary recovery.

Sungard: Support for hybrid (cloud and on-premises) IT systems. Managed services for DRaaS as well as a portfolio of conventional disaster recovery and business continuity services.

Zerto: Virtual replication service that operates at hypervisor level and supports a range of VM technologies. Used by a number of DRaaS services as well as IT departments.

A wide range of service providers and value-added resellers, including firms such as IBM and HPE, and telecoms firms, including Telefonica and NTT, also offer DRaaS.

