Three ways to back up and restore under VMware

Backing up a virtualised environment is tricky - but we have three ways to make it work!

It's pretty safe to say that server virtualization, specifically in the form of VMware, has been a boon to IT folks. From what we can see, it's resolving issues related to server sprawl, resource consumption, server provisioning, and even power consumption and high availability. It's also allowed us more time to address other more pressing matters with our colleagues, such as that upcoming enterprise resource planning upgrade, that on-again off-again storage migration project or why Star Trek XI was pushed back until 2009.

However, data protection still poses fundamental challenges in spite of the benefits of encapsulation and abstraction that VMware offers. Even with the advent of VMware virtualization, the backup guy is still the surliest of IT dudes. The most prevalent of these challenges are ensuring data consistency and avoiding excessive consumption of VMware's underlying physical resources.

Because VMware can encapsulate physical servers into a handful of large hard disk image files -- virtual machine disk format (VMDK) files -- it's very tempting to think that backing up an entire server should be as easy as backing up these underlying VMDK files (and the handful of associated configuration files, of course).

But in most situations, this isn't the case. Unless the virtual machine (VM) is shut down, backing up a VM in its running state doesn't ensure that all in-flight activity is fully accounted for. In other words, this type of backup doesn't ensure that the data is consistent and, therefore, doesn't ensure that the restored VM contains enough accurate information to declare the restoration of the server as fully successful.

Excessive resource consumption, the second challenge, is a side effect of virtualization. One of the key reasons to virtualize systems using VMware is to concentrate resource consumption onto fewer physical servers, thereby reducing the number of idle cycles that most IT server infrastructures suffer from. However, in doing so, the unfortunate side effect is the inability to find enough resources to allow data backups to run unhindered.

This is compounded by the fact that backups hit VMware's most vulnerable point: its limited ability to handle heavy disk and network I/O. In fact, the decision whether or not to virtualize a physical server is most often based on the intensity of disk and/or network I/O present on that server. Needless to say, a backup load is one of the worst loads for a VMware server to accommodate.

However, methods do exist to address these issues and provide benefits that, in some cases, are superior to standard physical server backup and recovery. But each of these methods is widely misunderstood, as are the implementations chosen by many third-party backup/recovery products. Indeed, the most effective method to back up and recover VMware still escapes many administrators; it's relegated to the realm of frustration and mystery.

Method 1: Local backup agent within each VM

How it works: This is a traditional approach to backups where the backup software agent is installed inside each VM exactly as it would have been on a physical server. As outlined in the diagram below, data flows over the LAN to the backup/recovery infrastructure just as it would if the agent were installed locally on a physical server.

Click here to see a diagram of a local backup agent installed within each VM.

The advantages of this method include:

  • Backup agent installation and configuration are identical to the procedures one would follow on a physical server, so no special skills or procedural changes are required.
  • The restore process is unchanged and would be identical to processes that one would follow for a file-level recovery back onto a physical server.
  • On that note, file-level recovery is possible; the significance of this point will become clearer as we elaborate on the other methods.
  • Full and incremental backups are also possible, again the significance of which will become clearer as we discuss the other methods.
  • If you also use specialized application-aware backup agents, such as agents for SQL or Exchange, this helps preserve the consistency of any application data; the resulting backup is application-consistent.

The disadvantages of this method include:

  • Because all the backups run on a single physical server, you have to be careful not to overtax the VMware host's resources.
  • Although the servers are now encapsulated into a handful of large VMDK files, the backup agent isn't aware of this and, therefore, doesn't take advantage of this to provide a more rapid backup or restore capability, so there's minimal value from a disaster recovery standpoint where rapid full server recovery is desirable.

Deployment tips

Running data backups simultaneously may work well for physical servers, which will likely have an abundance of idle resources, but for VMware virtual infrastructure, where idle resources are purposely consumed, there's a greater likelihood that multiple backup operations will choke the underlying physical server. Thus, after virtualization, backup schedules should be adjusted so that jobs are staggered through the course of your backup window to avoid excessive overlapping of jobs.
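
As a rough illustration of staggering, the sketch below spreads one job per VM evenly across a backup window. The function name, the VM names and the window times are made up for the example, and it assumes your backup software lets you set a start time per job.

```python
from datetime import datetime


def stagger_start_times(vm_names, window_start, window_end):
    """Spread one backup job per VM evenly across the backup window."""
    if not vm_names:
        return {}
    if window_end <= window_start:
        raise ValueError("backup window must end after it starts")
    # The first job starts when the window opens; the gap between
    # consecutive starts is the window length divided by the job count.
    gap = (window_end - window_start) / len(vm_names)
    return {name: window_start + i * gap for i, name in enumerate(vm_names)}


if __name__ == "__main__":
    start = datetime(2008, 1, 1, 22, 0)  # hypothetical window: 22:00 to 06:00
    end = datetime(2008, 1, 2, 6, 0)
    plan = stagger_start_times(["web01", "db01", "mail01", "file01"], start, end)
    for vm, when in plan.items():
        print(vm, when.strftime("%H:%M"))
```

Even spacing only avoids overlap if each job finishes within its slot, so a real schedule would also need to account for how long each VM's backup usually takes.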

Only allow a single data stream per VM. Since the VM's VMDK files usually reside on a single VMFS volume, it's very easy to overwhelm this underlying file system with a multi-streamed job. So unless the VMDKs are spread across individual volumes (RDMs, iSCSI LUNs or separate VMFS volumes), backups should run single-streamed rather than multi-streamed.
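
The rule of thumb above can be expressed as "one stream per underlying volume". The sketch below assumes you already have a mapping from each of a VM's disks to the volume it lives on; the function name, disk names and volume labels are hypothetical.

```python
from collections import defaultdict


def plan_streams(disk_to_volume):
    """Return one backup stream (a list of disks) per underlying volume."""
    streams = defaultdict(list)
    for disk, volume in sorted(disk_to_volume.items()):
        # Disks that share a volume go into the same stream so that the
        # volume is never read by two streams at once.
        streams[volume].append(disk)
    return list(streams.values())


if __name__ == "__main__":
    # Both disks on one VMFS volume: a single stream.
    print(plan_streams({"os.vmdk": "vmfs1", "data.vmdk": "vmfs1"}))
    # Disks on separate volumes: two streams are reasonable.
    print(plan_streams({"os.vmdk": "vmfs1", "logs.vmdk": "rdm1"}))
```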

Method 2: Backup agent in ESX Service Console

How it works: This method involves installing a backup software agent right within the ESX Service Console and backing up each VM's underlying set of VMDK files as outlined in the diagram below. Because the Service Console is a Red Hat Linux OS, a Linux backup agent can be utilized for this.

Click here to see a diagram of a backup agent installed in the ESX Service Console.

The advantages of this method include:

  • A single backup agent is required to back up your VMs, rather than an agent per VM.
  • Through this method, all of your VMs can be backed up in their entirety by simply backing up a few large VMDK files.
  • Fast image-level recovery is possible now because recovery involves streaming back a large image file rather than seeking for many small files.

The disadvantages of this method include:

  • Scripting is required to automate the shutdown, snapshot and starting up of the VMs. This is necessary in order to ensure application consistency during the backup process.
  • No file-level recovery is possible; this is exclusively an image-level backup/recovery method. Additionally, this means that no incremental backups are possible either.
  • VMware has indicated that its developmental roadmap involves removing the Service Console from ESX Server. VMware's ESX Server 3i is the first step in that direction.

Deployment tips

In order to ensure application consistency, the VMs should be shut down before the VMDKs are backed up:

  • VMDK files are static during the backup window.
  • Unfortunately, the VM is down for the duration of the backup.
  • The VMDK files are backed up using the backup agent on the Service Console.

If downtime isn't an option, then utilize VMware snapshots of a running VM to obtain a point-in-time backup:

  • The backup is crash-consistent, so it doesn't guarantee application data consistency.
  • This also requires scripting in order to automate.
  • This method isn't necessarily supported by all backup applications, so you would need to investigate this before attempting it.

For an application-consistent backup, utilize VSS to quiesce supported applications prior to backup. However, this requires very complex scripting.

You can utilize VCB utilities on ESX Service Console to obtain snapshots of a running VM:

vcbMounter utility:

  • Creates a quiesced snapshot of the VM.
  • Exports the snapshot into a set of files to either a directory local to the console or a remote directory over the LAN.
  • The files at this location can be backed up and restored using any backup software supported on the ESX console.

vcbRestore utility:

  • Restores a VM to its original location or an alternate location from an export.
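
For orientation, here is a small sketch that only builds the two command lines and does not run them. The flags (-h, -u, -p, -a, -r, -t, -s) are given as we recall them from VMware's VCB documentation, so treat them as assumptions and check the utilities' own usage output for your version. The host, user, VM name and directory are placeholders.

```python
def vcb_mounter_export(vm_name, export_dir, host, user, password):
    """Build a vcbMounter command line for a full-VM (image-level) export."""
    return [
        "vcbMounter",
        "-h", host,               # VirtualCenter or ESX host to connect to
        "-u", user,
        "-p", password,
        "-a", "name:" + vm_name,  # select the VM by its display name
        "-r", export_dir,         # directory that receives the exported files
        "-t", "fullvm",           # image-level export rather than file-level
    ]


def vcb_restore(export_dir, host, user, password):
    """Build a vcbRestore command line that restores from an export directory."""
    return ["vcbRestore", "-h", host, "-u", user, "-p", password, "-s", export_dir]


if __name__ == "__main__":
    # Placeholder values; nothing is executed, the commands are only printed.
    print(" ".join(vcb_mounter_export("db01", "/backups/db01", "vc.example.com", "backup", "secret")))
    print(" ".join(vcb_restore("/backups/db01", "vc.example.com", "backup", "secret")))
```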

If you decide to venture into scripting, you'll find that error-checking and correct back-out are the most difficult aspects, usually accounting for most of the code.
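
To give a feel for what back-out means in practice, the sketch below wraps a shutdown-style backup so that the VM is restarted whether or not the backup succeeds. The three callables (stop_vm, backup_files, start_vm) are hypothetical stand-ins for whatever commands your environment uses; this is a minimal outline, not a complete script.

```python
def cold_backup(vm, stop_vm, backup_files, start_vm):
    """Shut a VM down, back up its files, and always try to start it again."""
    stop_vm(vm)  # if this raises, the backup is never attempted
    try:
        backup_files(vm)  # the VMDKs are static while the VM is off
    finally:
        # Back-out: runs on success and on failure alike, so a failed
        # backup never leaves the VM powered off.
        start_vm(vm)


if __name__ == "__main__":
    calls = []

    def failing_backup(vm):
        calls.append("backup")
        raise RuntimeError("backup target unavailable")

    try:
        cold_backup("db01", lambda vm: calls.append("stop"),
                    failing_backup, lambda vm: calls.append("start"))
    except RuntimeError as exc:
        print("backup failed:", exc)
    print(calls)  # the VM was restarted even though the backup failed
```

A production script would also have to handle a shutdown that hangs, a VM that refuses to start, and leftover snapshots, which is where most of the code ends up.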

Method 3: VMware Consolidated Backup (VCB-Proxy)

How it works: This method involves a VMware-developed set of utilities collectively known as VMware Consolidated Backup. This method enables LAN-free backups of VMs from a centralized Windows 2003 proxy server connected to the same SAN volumes as the ESX Server. The data is then presented to the proxy for subsequent backup by a supported third-party backup application. This method is more complicated than the first two methods and involves the following components:

Backup proxy server:

  • A server that has SAN access to the same volumes as the VMware host.
  • Image of the VMDK file is mounted/exported onto this proxy system.
  • This mounted/exported image is backed up by the backup application residing on the proxy system.

VCB framework:

  • A "sync driver" on ESX server flushes the file systems and creates the snapshot.
  • A "vLUN driver" on the VCB Proxy Server allows for the presentation of VMDKs to the Proxy.
  • Command-line utilities (vcbMounter/vcbRestore) assist with automation of the VCB workflow.

Backup Software Integration Module:

  • A module that integrates with the VCB Framework's components.
  • This module is developed and supplied either by VMware or by the backup application.
  • Ease of integration and use varies between backup applications.

Click here to see a diagram of the VMware Consolidated Backup with proxy server.

VMware Consolidated Backup with the Proxy Server has the ability to conduct LAN-free file-level backups as well as LAN-free image level backups. However, these two capabilities utilize very different processes to achieve their ends.

VCB file-level backup/restore mounts the VMDK file onto the VCB Proxy Server and follows the subsequent steps:

  1. The backup job calls the VCB Framework to obtain a snapshot of the VM and mounts the VM snapshot from the SAN to C:\mnt on the VCB Proxy Server.
  2. The directory/files are backed up (full, incremental or differential) using the backup application.
  3. The backup application calls the VCB Framework to unmount the VM snapshot and take the virtual machine out of snapshot mode.
  4. Individual files are restored to the original VM over the LAN via a backup agent that's installed within the VM.
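
Because step 1 exposes the VM's files as an ordinary directory tree on the proxy, step 2 can pick out only what has changed. The sketch below shows the idea using file modification times; it is a simplified stand-in, since commercial backup applications track changes in their own ways, and the function name is ours.

```python
import os
import tempfile


def files_changed_since(root, since_epoch):
    """Yield paths under root modified after since_epoch (seconds since 1970)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since_epoch:
                yield path


if __name__ == "__main__":
    # A temporary directory stands in for the mounted snapshot.
    with tempfile.TemporaryDirectory() as mount_point:
        old_file = os.path.join(mount_point, "old.txt")
        new_file = os.path.join(mount_point, "new.txt")
        for path in (old_file, new_file):
            with open(path, "w") as handle:
                handle.write("data")
        os.utime(old_file, (1000, 1000))  # modified before the last backup
        os.utime(new_file, (3000, 3000))  # modified after the last backup
        for path in files_changed_since(mount_point, 2000):
            print(os.path.basename(path))
```

An image-level export offers no such view into individual files, which is why incrementals are tied to the file-level path.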

Click here to see a diagram of a VCB-Proxy workflow for file-level backup and restore.

VCB image-level backup/restore exports the VMDK file onto the VCB Proxy Server and follows the subsequent steps:

  1. The backup job calls the VCB Framework to obtain a snapshot of the VM and exports the VM snapshot from the SAN to C:\mnt on the VCB Proxy Server.
  2. The exported image files, including system files, are backed up using the backup application.
  3. The backup software then calls the VCB Framework to unmount the VM snapshot and take the VM out of snapshot mode.
  4. Restoration is accomplished by using the backup application to restore the exported VM image to a temporary area accessible to the VMware host, such as a temporary location on either the Proxy Server or the ESX Service Console.
  5. The VM image is then imported into the desired location on the ESX host.

Click here to see a diagram of a VCB-Proxy workflow for image-level backup and restore.

The advantages of this method include:

  • You can utilize a single backup agent on the VCB Proxy to back up your VMs vs. an agent per VM.
  • Through this method, all your VMs can be backed up in their entirety by simply backing up a few large VMDK files.
  • Fast image-level recovery is possible now because recovery involves streaming back a large image file rather than seeking for many small files.
  • Offloading the backup process onto the VCB Proxy Server reduces the overhead on the ESX server.
  • This method is a LAN-free SAN-enabled backup approach, which theoretically can provide a faster backup than a strictly LAN-based backup method.

The disadvantages of this method include:

  • Automation and ease of use are heavily dependent on the third-party backup software's capabilities.
  • This can be complex to implement without some form of backup software integration to simplify the process.
  • If you want file-level recovery directly back to the VM, then this will still require a backup software agent installed in the VM.
  • For Windows, without VSS integration, the image-level backup offered by VCB is, at best, crash-consistent.
  • VCB doesn't provide a mechanism for Windows System State backups; although successful full server recovery may be possible, it is by no means guaranteed if the system state was in flux when the VM was snapped.

Deployment tips

  • Remember, VCB isn't a backup/restore application; it's a set of utilities to enable integration to third-party backup applications.
  • The Proxy Server cannot be a virtual machine.
  • VCB can't be licensed and installed on the VirtualCenter server.
  • Windows 2003 Server, SP1 or R2 is required on the Proxy Server.
  • The Proxy Server must be zoned to the same LUN(s) that are zoned for your ESX Servers.
  • Multipathing on the VCB Proxy Server isn't supported.
  • If file-level recovery is desired, but you don't want to install an agent in every VM, you can create a recovery-only VM that includes the backup/restore agent; restore to it and then move the file via a network share to the correct destination VM.

About the author: Ashley D'Costa architects and designs advanced computer solutions and has technical experience with a broad spectrum of IT infrastructures.
