What are the problems associated with virtual server backup, and how can they be mitigated to ensure smooth and efficient data protection?
Data protection is an essential part of all IT operations.
Until recently, it was achieved by backing up physical servers directly over the network.
But the move to virtual server environments has permanently changed the landscape for backing up application data.
This presents a number of challenges.
The move from physical to virtual servers provided many IT organisations with the opportunity to consolidate and reduce the amount of hardware resources needed. This was one of the main selling points of the first wave of virtualisation – consolidation to fewer servers because most of them weren’t fully utilised.
Virtual server backup: The performance problem
But the backup infrastructure is an area that has always struggled with performance issues, even when there is a dedicated backup network. Therefore, backing up virtual servers using physical server infrastructure and methods has often resulted in big problems.
Where once one app on one server was backed up, now multiple virtual servers in a single box require protection. For that reason, virtual server backups can experience severe bottlenecks when using traditional backup methods that copy data from each virtual machine (VM) as if it were a physical server.
The answer here is to avoid backing up data from the guest VM and instead to deploy backup applications that can copy directly from the host using backup-specific application programming interfaces (APIs) such as VMware’s vStorage APIs for Data Protection (VADP). All VM-aware backup products are capable of using these APIs to back up data without having to access each guest.
One benefit of running host-based backups is that traditional backup agents can be eliminated, removing a whole set of maintenance and management tasks needed to keep the agents up to date.
Virtual server backup: The tracking problem
In the physical server world, the server is clearly identifiable and tracked through an IP address and/or DNS name. Servers rarely move or change IP address, so a backup that fails due to an inability to contact the server can be easily resolved.
In the virtual world, things aren’t as simple. While most virtual servers don’t change their IP address, most aren’t backed up directly either, but through the host hypervisor.
Virtual machines can easily be migrated between physical servers and storage, so keeping track of each VM in the backup infrastructure becomes more complex. The result is that a VM migration may well cause the next backup to fail.
The answer is to reference a virtual machine, not through the physical host on which it resides, but via a more abstract reference to the group of physical servers that support the VM, such as the cluster name or, in the case of VMware vSphere, the datacentre object. By abstracting the reference to the VM, both backup and restore processes are no longer dependent on the physical host hardware, which provides operational benefits by reducing the work involved in restores for clusters that have been physically or logically reconfigured.
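The difference between the two ways of referencing a VM can be sketched in a few lines of Python. This is a minimal illustration only: the `Cluster` class, the host and VM names, and both backup functions are hypothetical and do not correspond to any vendor's API. It assumes a backup job either pins a VM to one host or looks the VM up across the whole cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Cluster:
    """Hypothetical inventory: host name -> names of the VMs running on it."""
    name: str
    hosts: Dict[str, Set[str]] = field(default_factory=dict)

    def locate(self, vm: str) -> Optional[str]:
        """Return the host currently running the VM, or None if not found."""
        for host, vms in self.hosts.items():
            if vm in vms:
                return host
        return None

    def migrate(self, vm: str, target: str) -> None:
        """Move a VM to another host in the same cluster."""
        source = self.locate(vm)
        if source is None:
            raise KeyError(f"unknown VM: {vm}")
        self.hosts[source].discard(vm)
        self.hosts.setdefault(target, set()).add(vm)


def backup_via_host(cluster: Cluster, host: str, vm: str) -> bool:
    """Job pinned to one physical host: fails once the VM has moved."""
    return vm in cluster.hosts.get(host, set())


def backup_via_cluster(cluster: Cluster, vm: str) -> bool:
    """Job that references the cluster: finds the VM wherever it runs."""
    return cluster.locate(vm) is not None


cluster = Cluster("prod-cluster", {"esx01": {"mail01"}, "esx02": set()})
before = (backup_via_host(cluster, "esx01", "mail01"),
          backup_via_cluster(cluster, "mail01"))
cluster.migrate("mail01", "esx02")
after = (backup_via_host(cluster, "esx01", "mail01"),
         backup_via_cluster(cluster, "mail01"))
```

After the migration, the host-pinned job no longer finds the VM, while the cluster-level job still does. This is the failure mode that the abstract reference avoids.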
Virtual server backup: The granularity problem
During data recovery, most restore requests are for individual files, a directory, or data within an application, such as an email attachment. It is rare that an entire server needs to be recovered. Most restores are therefore very granular in nature, and require the recovery of a small piece of the data that constitutes a server or application.
Virtual server backups that simply back up the files that comprise the VM may have problems restoring individual pieces of data unless the software is aware of the contents of the backup and is able to understand virtual machine disk formats. Worse still, if the backup software cannot decode the contents of the backup, it may be necessary to restore the entire VM, albeit to a temporary location, to recover a single file, resulting in restore delays and unnecessary network traffic.
Backup software needs to be able to understand the content of the backup and restore objects from within backup files directly, without having to restore more backup data than necessary. Today’s more advanced products are able to understand the format of application data – email systems and databases, for example – and offer restores of individual application objects. Obviously, these technologies need to be used with care, as restoring parts of data into an application could lead to logical corruption.
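As a rough illustration of why content awareness matters, the sketch below models a backup image as a flat byte stream with a catalog that records where each file sits inside it. Everything here is an assumption made for illustration: real virtual machine disk formats are far more complex, and the `build_backup` and `restore_file` functions and the file paths are hypothetical. The point is only that a catalog lets a restore read the bytes for one file instead of the whole image.

```python
import io
from typing import BinaryIO, Dict, Tuple

# Hypothetical catalog: file path inside the VM -> (byte offset, length)
# of that file's data within the backup image.
Catalog = Dict[str, Tuple[int, int]]


def build_backup(files: Dict[str, bytes]) -> Tuple[bytes, Catalog]:
    """Concatenate file contents into one image and index each file."""
    image = bytearray()
    catalog: Catalog = {}
    for path, data in files.items():
        catalog[path] = (len(image), len(data))
        image.extend(data)
    return bytes(image), catalog


def restore_file(image: BinaryIO, catalog: Catalog, path: str) -> bytes:
    """Read only the requested file's bytes from the backup image."""
    offset, length = catalog[path]
    image.seek(offset)
    return image.read(length)


image_bytes, catalog = build_backup({
    "/etc/hosts": b"127.0.0.1 localhost\n",
    "/var/mail/inbox": b"x" * 1000,
})
restored = restore_file(io.BytesIO(image_bytes), catalog, "/etc/hosts")
```

Without the catalog, the only option in this model would be to read back all of `image_bytes` and search it, which is the equivalent of restoring the entire VM to recover a single file.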
Virtual server backup: The media problem
Contemporary backup technology uses techniques such as changed block tracking to back up virtual machines. These systems are well-suited to storing backup data on disk, as they require access to the initial backup plus all data changes to perform restores.
But backup subsystems that rely solely on disk come with some caveats. Disk-based backup targets aren’t necessarily scalable – at least not in a way that is economically desirable – and don’t offer easy portability to take data offsite for full disaster recovery, for example.
The solution is to look at backup systems that are capable of supporting multiple media types, including tape, and those that offer the ability to create synthetic backups, such as a full system backup based on the original backup plus all subsequent incremental block changes.
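The idea of a synthetic full backup can be sketched as follows. This is a simplified model, not any product's implementation: it assumes a backup is a mapping from block number to block contents, and that each incremental holds only the blocks that changed since the previous backup. The `synthesize_full` function and the sample data are hypothetical.

```python
from typing import Dict, List

# Assumed model: block number -> block contents.
BlockMap = Dict[int, bytes]


def synthesize_full(base: BlockMap, incrementals: List[BlockMap]) -> BlockMap:
    """Build a new full backup from the original full plus all
    incrementals, applied in chronological order (oldest first).
    Later changes to a block overwrite earlier ones."""
    full = dict(base)  # copy, so the original full backup is untouched
    for changed_blocks in incrementals:
        full.update(changed_blocks)
    return full


base = {0: b"A", 1: b"B", 2: b"C"}         # initial full backup
monday = {1: b"B1"}                        # block 1 changed
tuesday = {2: b"C2", 3: b"D"}              # block 2 changed, block 3 added
synthetic = synthesize_full(base, [monday, tuesday])
```

The resulting `synthetic` mapping is a self-contained full backup. In this model it could be written to tape or shipped offsite without the chain of incrementals it was built from, and it is produced from existing backup data rather than by reading the virtual machine again.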
Virtual server backup: The process problem
Virtual server backup and data protection can be achieved through various methods and technologies. As well as backup software, there are other ways to secure virtual machines that rely on the fact that VMs are stored as files on disk. This means backups can be made via snapshots or replication on shared storage.
Although array-based replication and snapshot functionality can work well, care has to be taken to ensure that using these methods will result in a consistent and comprehensive backup policy.
For example, snapshots don’t cover total array failure, such as that caused by fire or flood, because they reside on the same array as the data they protect. Replication, meanwhile, may not provide the right level of granularity for recovery when the minimum recovery point is a logical unit number (LUN).
That leads to the conclusion that virtual server data protection is best implemented using a variety of techniques.