VMware data protection: An overview of how it works

A variety of tools improve VMware data protection, offering advantages over a traditional server environment in VM backup, replication and snapshots.

This article can also be found in the Premium Editorial Download: IT in Europe: Data protection: Preparing for new EU regulations

If you are set to embark on a review of VMware data protection options, there are a couple of areas you need to consider before approaching management with new plans.

Primarily, you have to recognise that server virtualisation fundamentally challenges the old ways of protecting data.

As you probably know, virtual machines (VMs) are encapsulated into a series of discrete files called “virtual disks.” These are presented to the operating system within the VM as if they were physical, but they are in fact files that reside in the virtualisation vendor’s chosen file system.

As they are just files, they can be copied around the network in archive format, which introduces a whole new way of backing up without the need for in-guest backup agents. This can be extremely advantageous because most legacy backup vendors that historically backed up physical servers charge per agent. In addition, virtualisation has had a big positive influence on other data protection schemes, such as replication and snapshots.


It is fair to say that VM backup in the early days lacked the granularity that legacy systems possessed. A backup of the virtual disk was essentially a “normal” (full) backup every day, regardless of how much data had actually changed during business hours.

Similarly, restoring a VM from virtual disk backup often meant restoring the entire backup just to gain access to a couple of kilobytes of data. Fortunately, these limitations were overcome some time ago.

Most in the industry and community now rate VM backup as being as good as, if not superior to, the agent-based backups of yore; many vendors are able to bring a backed-up VM onto the network in a matter of minutes.

This is achieved by making the storage that acts as the backup target available across the network, thus negating the need to restore data to the original storage. The backup storage is mounted directly to the production virtualisation hosts, and the backup VMs are registered with the system and then powered on.

After this, it’s up to the administrator whether to destroy the original and use the backup copy -- or simply locate the lost data in the backup VM and copy individual files back. Another method adds the backed-up virtual disk to the VM that needs files restored to it, so that lost data can be copied from, say, a temporary T drive to the D data drive. None of these new methods could have been achieved without the power of virtualisation.


Besides backup, another way of protecting data is to replicate VMs from one system to another, which is frequently done as part of a disaster recovery strategy. VMware, as well as other virtualisation vendors, now offers its own built-in replication technologies. Currently, they are available only if you buy into the wider DR automation tools, but it is likely they will be decoupled from those bundles and made available independently.

Even before the virtualisation vendors got into the replication game, the third-party ecosystem was already offering replication technologies, such as Veeam Software’s Veeam Backup & Replication and Quest Software’s vReplicator. The interesting aspect of these VM-aware technologies is that they are storage vendor-independent, so it is possible to replicate VMs from one type of storage or storage vendor to another without worrying about incompatibilities.

That means that a business could choose to replicate data from one vendor’s high-end storage at one location to a totally different vendor’s cheaper, low-end storage at another. This leaves IT administrators free to think about what really matters: how much network bandwidth they have and how that may limit the frequency of replication and affect their recovery point objectives.
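To make the bandwidth constraint concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from any vendor's tool: the function names, the default link-efficiency figure and the sample numbers are all assumptions for illustration, and it ignores compression, deduplication and competing traffic. It estimates how long one replication cycle takes to ship its changed data, and the worst-case recovery point that results.

```python
def transfer_seconds(delta_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Seconds needed to send delta_gb (decimal gigabytes) over the link.

    efficiency is an assumed fraction of nominal bandwidth that is usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    megabits = delta_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    return megabits / (link_mbps * efficiency)


def worst_case_rpo_seconds(change_gb_per_hour: float, interval_s: float,
                           link_mbps: float, efficiency: float = 0.8) -> float:
    """Worst-case data-loss window for periodic replication.

    Data written just after a cycle starts waits a full interval, then
    still has to cross the wire, so the window is interval + transfer time.
    """
    delta_gb = change_gb_per_hour * interval_s / 3600
    transfer = transfer_seconds(delta_gb, link_mbps, efficiency)
    if transfer > interval_s:
        # Each cycle would still be running when the next one is due.
        raise ValueError("link cannot keep up with this change rate and interval")
    return interval_s + transfer


if __name__ == "__main__":
    # Hypothetical figures: 9 GB of change per hour, replicated every
    # 15 minutes over a 100 Mbit/s link.
    rpo = worst_case_rpo_seconds(change_gb_per_hour=9, interval_s=900, link_mbps=100)
    print(f"Worst-case RPO: {rpo / 60:.1f} minutes")
```

If the function raises an error, the options are the familiar ones: replicate less often, replicate fewer VMs, or buy more bandwidth.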


Of course, this doesn’t mean you should forget about the data protection features your storage vendor offers you. For some time, most have been able to snapshot LUNs or volumes at specified intervals. This allows the storage administrator to present the previous state of those storage units to the virtualisation host.

The big advantage of snapshots is that they can generally be taken much more frequently than replication alone allows. Additionally, the functionality can be extended to all VMs regardless of their business criticality. With replication, we normally have to triage the VMs, separating the wheat from the chaff to reduce the load on the network.

Storage vendor snapshots are free of network cost -- their real cost is the storage consumed by redundant copies of data, held in case the business needs to roll back to a previous copy. Fortunately, with the rise of data deduplication and policy systems that roll up out-of-date snapshot data, this burden has been significantly reduced.
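A roll-up policy of the kind mentioned above can be sketched in a few lines of Python. This is a hypothetical policy, not any storage vendor's actual scheme: the function name, the 24-hour and seven-day windows and the keep-the-newest-per-day rule are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(timestamps, now,
                      keep_all_for=timedelta(hours=24),
                      keep_daily_for=timedelta(days=7)):
    """Return the snapshot timestamps to retain, oldest first.

    Assumed policy: keep every snapshot younger than keep_all_for,
    keep only the newest snapshot per calendar day up to keep_daily_for,
    and discard anything older.
    """
    keep = set()
    newest_per_day = {}
    for ts in timestamps:
        age = now - ts
        if age <= keep_all_for:
            keep.add(ts)  # recent: keep at full frequency
        elif age <= keep_daily_for:
            day = ts.date()
            if day not in newest_per_day or ts > newest_per_day[day]:
                newest_per_day[day] = ts  # rolled up to one per day
        # anything older than keep_daily_for is dropped
    keep.update(newest_per_day.values())
    return sorted(keep)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    snaps = [
        datetime(2024, 1, 1, 0, 0),    # older than seven days
        datetime(2024, 1, 8, 9, 0),    # same day as the next one
        datetime(2024, 1, 8, 21, 0),
        datetime(2024, 1, 10, 10, 0),  # within the last 24 hours
        datetime(2024, 1, 10, 11, 0),
    ]
    for ts in snapshots_to_keep(snaps, now):
        print(ts.isoformat())
```

The trade-off is the one described above: a longer full-frequency window gives finer roll-back points at the price of more storage held for redundant copies.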

The real advantage of snapshots is that they can massively reduce restore time should you suffer a catastrophic error, such as LUN or volume corruption or the mass infection of VMs by a virus. With the snapshot only minutes behind production data, you can roll back with minimal data loss and in a fraction of the time a restore from conventional backup would take. That’s not to say that backups aren’t still key to data protection -- but they can be enhanced and augmented by the use of replication and snapshots to offer the business the flexibility and choices it will demand when data loss occurs.

Factors to weigh in VMware data protection decisions

Before making decisions about VMware data protection, you should look at your existing backup vendor and see what integration points it has with your chosen virtualisation platform. Many legacy backup vendors have now started to offer improved integration with the virtualisation layer.

Remember that if you’re new to server virtualisation, you will have a commitment to physical servers for some time to come, and now might not be the optimal moment to switch to a radically new method of backing up. With that said, it could be that your previous backup and data protection strategy left something to be desired and the virtualisation project will offer a chance to question existing procedures. If that’s the case, you should look to your virtualisation vendor for its built-in backup and replication technologies before bringing in yet another vendor to manage, maintain and update.

For example, the VMware vSphere platform includes a free-to-use Data Recovery appliance, which integrates directly into the main management platform (vCenter). It’s a rather modest system currently -- a single appliance is limited to backing up just 100 VMs (you can have multiple appliances) -- but many industry experts expect that in the future it will receive major updates from VMware. 

Mike Laverick is a VMware forum moderator and member of the London VMware User Group. He is also the man behind the virtualization website and blog RTFM Education, where he publishes free guides and utilities for VMware customers. 
