VMware disaster recovery using VMware onboard features

Find out how to prepare for VMware disaster recovery using features such as Site Recovery Manager, vSphere Replication, VMware High Availability and VMware Data Recovery.

Server virtualisation changes everything for disaster recovery (DR). The abstraction of virtual servers away from the physical server opens up a range of features and capabilities that make protection against and recovery from unplanned outages a lot easier than it used to be.

VMware disaster recovery products such as Site Recovery Manager offer a number of ways of preparing for and recovering from an unplanned outage. VMware disaster recovery can also be enabled by use of VMware capabilities that protect against hardware failure. These include vmnics -- network cards that are bound to virtual switches in ESX -- as well as features such as VMware High Availability and the backup virtual appliance VMware Data Recovery (vDR).

In this article we’ll start with VMware disaster recovery measures that can be taken in vSphere and using the backup tool vDR, and then we’ll move on to look at the capabilities of Site Recovery Manager and vSphere Replication.

Protect against hardware failures

The first step in limiting the scale of any disaster is to enable the availability features that are part of the vSphere platform.

Let’s start with networking. Virtual machines (VMs) are plugged into virtual switches rather than physical ones. Any vSwitch that carries a load important to your infrastructure should be backed by two physical network adapters, which appear as “vmnics” in the vSphere Client.

Physical network cards (vmnic1 and vmnic2) have been patched to the vSwitch, offering out-of-the-box load balancing and redundancy without the complexity demands of most operating systems.

The great thing about this configuration is that the nightmare of installing the right NIC driver in Windows and juggling different configuration tools is over. Any VM connected to the port groups vlan20, vlan21 or vlan23 is protected by the two physical vmnics backing the virtual switch.

In an ideal world, vmnic1 and vmnic2 would be plugged into two different physical switches to protect from a switch failure and into different network backbones to protect from a backbone failure. But at least this configuration protects you from the more common network card failure or someone accidentally pulling a cable from the back of the server.
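The distinction between the two levels of protection can be made concrete with a small audit script. What follows is only a sketch in plain Python, not a call into any VMware API: the inventory is typed in by hand, and the switch labels (sw-a, sw-b), the vSwitch names and the three result labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uplink:
    vmnic: str            # physical adapter name, e.g. "vmnic1"
    physical_switch: str  # hypothetical label for the upstream switch


def redundancy_level(uplinks):
    """Classify how well a vSwitch's uplinks are protected.

    "none"   - fewer than two vmnics, so one card or cable failure cuts it off
    "nic"    - two or more vmnics, but all patched to the same physical switch
    "switch" - vmnics spread over at least two physical switches
    """
    vmnics = {u.vmnic for u in uplinks}
    switches = {u.physical_switch for u in uplinks}
    if len(vmnics) < 2:
        return "none"
    if len(switches) < 2:
        return "nic"
    return "switch"


# Hand-written example inventory (assumed data, not read from a host).
inventory = {
    "vSwitch1": [Uplink("vmnic1", "sw-a"), Uplink("vmnic2", "sw-b")],
    "vSwitch2": [Uplink("vmnic3", "sw-a"), Uplink("vmnic4", "sw-a")],
    "vSwitch3": [Uplink("vmnic5", "sw-a")],
}

report = {name: redundancy_level(uplinks) for name, uplinks in inventory.items()}
for name, level in report.items():
    print(f"{name}: {level}")
```

In this made-up inventory, vSwitch2 matches the "at least" case described above: it survives a failed card or a pulled cable, but not the loss of the physical switch.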

Many vSphere-related SMB offerings now also include VMware High Availability (HA) as part of their feature sets. If you have shared storage and redundancy in the network, creating a VMware HA cluster is an almost trivial task; right-click your Datacenter icon and select Cluster.

For VMware HA to work, all you need is shared storage presented to the hosts and some network redundancy. These requirements are pretty much standard in most vSphere deployments.

These two technologies -- the use of vmnics and VMware HA -- will protect you from the most common outages: component failure and server failure. However, as we all know, there are outages and threats to the business that go well beyond a faulty network card or server crash.

So, as well as enabling features that are baked into the vSphere product there are two other products -- VMware Data Recovery (vDR) and VMware Site Recovery Manager (SRM) -- that are central to VMware DR.

Backup with VMware Data Recovery (vDR)

Backup is a key consideration when it comes to VMware disaster recovery. There are, of course, many third-party backup products available with VM backup capability. But VMware has for some time had a backup product called vDR, which debuted in vSphere 4.

The vDR view is similar to other views in vCenter, except its UI is focused around the backup and restore of VMs.

vDR is a virtual appliance -- a VM with a dedicated purpose -- so there’s no extra hardware to budget for. Once you have deployed the virtual appliance and installed a plug-in on your management PC, you are ready to go.

vDR supports a number of destinations for backup data, including virtual disks, raw device mappings (RDMs) direct to storage and conventional network-attached storage (NAS). Once configured, vDR presents the administrator with a view of the infrastructure that looks and feels very much like the Hosts and Clusters view in vCenter.

This interface can be used to select every VM in the data centre or, using the resource pool object, you can pick out VMs that reflect particular workloads.

In vDR’s Backup Window module, the default is that the backup process happens only during out-of-office hours.

VMware hasn’t reinvented the wheel with vDR, so many of the options you would expect in backup software, such as the ability to schedule backup jobs, are similar to those features in traditional backup apps.

The restore process is also simple. You can overwrite an existing virtual disk that makes up a VM or restore the virtual disk to a different SCSI controller. 

The yellow exclamation mark in the Destination Selection portion of the Virtual Machine Restore Wizard shows that the restore process would overwrite the existing virtual disk. The pull-down list on the right allows you to set a different SCSI controller identity.

Of course, the need to restore entire VMs is pretty rare, and no one really wants to restore gigabytes of data. The vast majority of restores involve relatively small amounts of data, and for this VMware has a self-service utility called the VMware Data Recovery Client, which ships in Windows and Linux versions. The utility can be copied into the operating system, where it mounts the backed-up virtual disk. You can then copy files from the mounted virtual disk directly back to the location where the lost or corrupted files used to be.

Site Recovery Manager and vSphere Replication

Site Recovery Manager (SRM) is an automation tool that creates a point-and-click method of bringing up VMs in another site. Its primary function is as a VMware disaster recovery tool, but it can serve other use cases too, such as “moving” VMs from one site to another if the business wants to relocate operations.

As with vDR, SRM extends the interface of vCenter with menus and buttons that guide you through the process. Coupled to this core automation is a feature of SRM that debuted last year called vSphere Replication (VR). Like vDR, it takes the form of virtual appliances, in this case a set of them that allow you to replicate your VMs to another site without the need for expensive storage array-based replication.

The virtual appliance sits on an ESX host at each of your sites -- the protected site and the recovery site -- and captures the changes taking place inside your protected VMs and replicates them over to storage in the recovery site.

vSphere Replication sits so high in the stack that the appliance doesn’t care what storage you are using. So, for example, you could be using arrays from the likes of NetApp or EMC at the protected site and replicate to entry-level storage from a lower-tier vendor at the recovery site.

Being vendor-agnostic in this way enables cost-saving approaches to DR that were previously not possible. For example, it makes possible partnering agreements where two organisations agree to act as each other’s secondary site, an approach that some public sector organisations have used. Of course, vSphere Replication can also be used to replicate to rack space in a commercial colocation facility.

vSphere Replication adds the option to protect individual VMs from a simple right-click menu option.

One of the benefits of vSphere Replication is that it can offer per-VM protection. Previously, with storage array-based replication, the smallest unit of replication was the data store.

The problem with replicating at the data store level is that it can lead to situations where VMs are replicated unnecessarily. For instance, you might have 20 VMs on a data store, but only 12 of them need protection. That isn’t particularly flexible. After all, no one wants to replicate eight VMs for no reason, and no one wants to spend time moving those eight VMs around to different data stores using Storage vMotion.
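To put rough numbers on that example, here is a back-of-the-envelope sketch in Python. The 20 and 12 come from the example above; the average VM size of 50GB is purely an assumption made for illustration, and the function name is our own.

```python
def replication_overhead(total_vms, protected_vms, avg_vm_size_gb):
    """Compare data-store-level and per-VM replication footprints.

    Treats every VM as the same size, which is a simplification.
    """
    if not 0 <= protected_vms <= total_vms:
        raise ValueError("protected_vms must be between 0 and total_vms")
    unneeded = total_vms - protected_vms
    return {
        "datastore_level_gb": total_vms * avg_vm_size_gb,
        "per_vm_gb": protected_vms * avg_vm_size_gb,
        "wasted_gb": unneeded * avg_vm_size_gb,
        "wasted_fraction": unneeded / total_vms if total_vms else 0.0,
    }


# 20 VMs on the data store, 12 need protection, 50GB each (assumed size).
result = replication_overhead(total_vms=20, protected_vms=12, avg_vm_size_gb=50)
for key, value in result.items():
    print(f"{key}: {value}")
```

With those assumed sizes, replicating the whole data store would carry 400GB that nobody asked for, or 40% of the total, and the same proportion applies to the storage needed at the recovery site.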

SRM Recovery Plans allow you to test your DR plans during production hours and ensure your VMs start up in the right order to meet any service dependencies.

vSphere Replication's granularity means you can right-click a VM and protect it (or a group of VMs if you wish) without wasting precious bandwidth or storage resources at the DR location.

Of course, vSphere Replication is just the engine room; the whole point of technology like SRM is to aid in the recovery of a VM should a disaster occur. The chief raison d’etre is to allow you to bring up VMs in the correct order for all services to work using SRM recovery plans.
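The idea of starting VMs in dependency order can be sketched with Python's standard graphlib module (Python 3.9 or later). This is not how SRM is configured and it uses no SRM API; the VM names and their dependencies are invented for illustration. It simply shows how a set of service dependencies turns into a start-up sequence.

```python
from graphlib import TopologicalSorter

# Each VM maps to the set of VMs that must already be running.
# Names and dependencies are hypothetical.
dependencies = {
    "dc01": set(),        # directory services, no prerequisites
    "sql01": {"dc01"},    # database needs the directory
    "app01": {"sql01"},   # application tier needs the database
    "web01": {"app01"},   # web front ends need the application tier
    "web02": {"app01"},
}


def startup_waves(deps):
    """Group VMs into waves; every VM in a wave can be powered on together."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = startup_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Grouping into waves rather than a single flat list reflects that VMs with no dependency on each other, such as the two web servers here, can be started at the same time.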

Mike Laverick is a VMware forum moderator and member of the London VMware User Group. He is also the man behind the virtualization website and blog RTFM Education, where he publishes free guides and utilities for VMware customers.
