Traditional disaster recovery provision, in which the disaster recovery site is a physical replica of the primary site, is expensive. Seen from the point of view of a virtualised server environment, it is also far more clunky and prone to problems.
In a traditional disaster recovery setup, a physical copy of the primary environment must be maintained. That means duplicating applications and ensuring that hardware and software are updated simultaneously at both sites. Because of the expense and hassle involved, many companies prioritise the small number of servers they consider most critical.
The physical disaster recovery approach relies upon restoring backups to waiting servers that are already configured. This can involve a lot of people, because numerous hardware and software issues have to be dealt with to get the business up and running again.
When it comes to disaster recovery in a virtualised environment, the key difference is that the recovery procedure depends more on following a process than on knowledge of the underlying hardware and software. That's because the entire primary environment is encapsulated as hardware-independent software and as data that resides in storage. You can copy virtual machines (VMs) to a secondary site using backup tools or, if business criticality dictates or budgets allow, replicate the primary environment to a second site to provide failover cover.
Testing is also far easier in a virtualised environment, as cloned VMs and data can be run up easily at a second site or in a test environment. For example, VMware has made testing a simple matter with its vCenter Site Recovery Manager (SRM) tool, which allows "fire drill" testing of failover between one site and the other.
Server virtualisation and disaster recovery: Key points to remember
1. Developing a comprehensive disaster recovery plan is the most important component.
2. Application dependencies are the main worry, so you should copy dependent systems simultaneously to make sure there are no data inconsistencies when the time comes to restore. Replication technology should be integrated at the hypervisor and application level to ensure that restored systems are consistent.
3. Remember the order in which systems will need to be brought back online. Prioritise them accordingly, and place the VMs that comprise a service in the same LUN so that failover and recovery can be completed within the required recovery time objective (RTO) and recovery point objective (RPO).
4. Beware of server sprawl. New VMs are very easy to create, so make sure the backup team knows about every new VM and adds it to the backup schedule. A VM that isn't backed up isn't covered if a disaster should strike.
5. You may also need to consider your data storage infrastructure when embarking on such a project. Smaller organisations can get away with direct-attached storage (DAS) for small development and test installations, but the larger a virtualised environment is, the more attractive shared storage as a repository for virtual machine system files and data becomes. Basically, it helps if everything you need is held in one place.
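As a rough illustration of points 2 to 4, the checks they describe can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a feature of any particular product: the VM names, the dependency map and the list of backed-up VMs are all hypothetical, and in practice they would come from your own inventory and backup tools. It uses the standard library's graphlib module (Python 3.9 or later) to derive a start-up order in which every VM follows the systems it depends on, and a simple set difference to flag VMs that have slipped through the backup net.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: each VM maps to the set of VMs it depends on.
DEPENDENCIES = {
    "web-01": {"app-01"},
    "app-01": {"db-01"},
    "db-01": set(),
    "test-07": set(),  # a newly created VM nobody told the backup team about
}

# Hypothetical list of VMs that the backup team currently protects.
BACKED_UP = {"web-01", "app-01", "db-01"}


def recovery_order(dependencies):
    """Return VM names ordered so each VM comes after everything it depends on."""
    # TopologicalSorter treats the mapped values as predecessors, so
    # dependencies are emitted before the VMs that rely on them.
    return list(TopologicalSorter(dependencies).static_order())


def unprotected_vms(dependencies, backed_up):
    """Return VMs present in the inventory but missing from the backup list."""
    return sorted(set(dependencies) - set(backed_up))


if __name__ == "__main__":
    print("Bring back online in this order:", recovery_order(DEPENDENCIES))
    print("Not backed up:", unprotected_vms(DEPENDENCIES, BACKED_UP))
```

Run regularly against an up-to-date inventory, a check along these lines gives the recovery plan a documented restart sequence and catches sprawl before a disaster does. A circular dependency in the map makes graphlib raise a CycleError, which is itself a useful warning that the plan needs attention.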
This was first published in March 2010