Virtual server backup is now a practical necessity. Given the needs and scope of the IT industry, no infrastructure is complete without it. Because it is a top priority, experts C Kajwadkar and Pankaj Nath offer advice to organizations that need to choose a backup approach. Well-versed in the dynamics of IT infrastructure, Kajwadkar and Nath, both from Availability Services at Netmagic Solutions, know that data loss can wreak havoc. Accordingly, they recommend a four-point process for the transition from physical to virtual server backup.
1) Backup tools
Virtualization technology vendors offer tools that help back up virtual servers, but when choosing a third-party tool for virtual server backups, look for one that can differentiate between a physical and a virtualized environment. Pankaj Nath, senior manager (solutions) and an old hand in virtual private clouds, points out that a virtual machine (VM) exists as an image file on a physical server, encapsulating an OS together with its configuration settings, applications and the corresponding data.
“If a VM file is allocated 100 GB on the hard drive and only 50 GB is used, a virtualization-ready backup tool will recognize it as a VM file and back up only that 50 GB of data,” he says. “A traditional backup tool, on the other hand, would just recognize it as any other file and back it up as a 100 GB file. Third-party tools such as Symantec Backup Exec 12.5, EMC Avamar, and vRanger Pro from Quest Software also offer capabilities good enough to meet virtual server backup requirements.”
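The difference Nath describes can be sketched in a few lines of Python. The class and function names here are hypothetical, and the figures come from his example. A real tool works out the used blocks from the hypervisor or file system rather than from a number supplied by hand.

```python
from dataclasses import dataclass


@dataclass
class VMImage:
    name: str
    allocated_gb: int  # size of the image file on the physical disk
    used_gb: int       # data actually written inside the VM


def traditional_backup_size(vm: VMImage) -> int:
    """A file-level tool sees one opaque file and copies all of it."""
    return vm.allocated_gb


def vm_aware_backup_size(vm: VMImage) -> int:
    """A virtualization-ready tool copies only the used portion."""
    return vm.used_gb


vm = VMImage("app-server", allocated_gb=100, used_gb=50)
print(traditional_backup_size(vm))  # 100
print(vm_aware_backup_size(vm))     # 50
```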
2) Allocating storage
While a virtual server environment might have a storage area network (SAN) storing the VM files, it is good virtual server backup practice to perform a disk-to-disk backup to inexpensive storage disks. Backup tools take a point-in-time snapshot of the VM image. Subsequent backups then capture only the changes made to the original, which are synchronized with the backup copy.
“Therefore,” says Kajwadkar, chief architect and vice-president, “storage requirements for backup storage would not increase by big margins over time, and adding more disks would not prove to be a costly affair. Though SAN-to-SAN or LAN-free backup can offer better performance, deploying a SAN just for virtual server backup can be expensive. More disks to meet increasing storage demands can further add to the costs.”
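A rough sketch shows why change-only backups keep storage growth modest. The 2% daily change rate and the 30-day retention period are illustrative assumptions, not figures from Kajwadkar or Nath, and the function names are hypothetical.

```python
def incremental_storage_gb(full_gb: float, daily_change_rate: float, days: int) -> float:
    """One full copy plus one change-only backup per day."""
    return full_gb + full_gb * daily_change_rate * days


def full_copy_storage_gb(full_gb: float, days: int) -> float:
    """One full copy up front plus a fresh full copy every day."""
    return full_gb * (days + 1)


# Assumed: 50 GB of used VM data, 2% of it changing each day, kept for 30 days.
print(round(incremental_storage_gb(50, 0.02, 30), 1))  # 80.0
print(full_copy_storage_gb(50, 30))                    # 1550
```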
3) Compute and network
Most virtual server backup tools create a backup VM or file and synchronize changes to the original file with the backup copy in real time. The first step, therefore, is to determine the resource requirements for handling peak workloads across all VMs. Kajwadkar suggests that the next step is to oversize the CPU allocation so that backups do not eat into the resources allocated to applications. A good compute sizing practice is to configure the physical server with 20% more CPU than is required to handle the total peak workload.
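The 20% headroom rule is simple arithmetic, sketched below. The per-VM peak figures and the choice of GHz as the unit are hypothetical. In practice the peaks would come from monitoring data.

```python
def required_cpu_ghz(peak_per_vm_ghz: list[float], headroom: float = 0.20) -> float:
    """Total peak demand across all VMs, plus headroom reserved for backups."""
    total_peak = sum(peak_per_vm_ghz)
    return total_peak * (1 + headroom)


# Assumed peak CPU demand of three VMs on one physical server, in GHz.
peaks = [2.0, 3.0, 5.0]
print(round(required_cpu_ghz(peaks), 2))  # 12.0
```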
“As opposed to the physical world, wherein each physical server has a dedicated network interface card (NIC) for I/O operations, on a physical server with VMs, I/O from all VMs is expected to flow across a single NIC, creating an I/O bottleneck,” he says. “While the server can have a NIC which supports speeds of up to 1 Gbps, the organization needs to ensure that the LAN between this server and the backup storage can support up to 10 Gbps.”
The best practice is to measure the total I/O throughput required for backing up VMs before starting your virtualization project. If the I/O load is high, it is advisable to go with SAN-based virtual server backup by installing a host bus adapter (HBA) card. Nath notes that although an HBA card requires an additional investment of approximately Rs 50,000, it can provide 2Gbps, 4Gbps or 6Gbps of speed, depending on the HBA and SAN switch being used.
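One way to make that measurement concrete is to convert the volume of data and the available backup window into a required link speed. The sketch below does this under stated assumptions: the 2,000 GB volume, the four-hour window and the 70% usable-bandwidth factor are illustrative, not figures from the experts, and the function names are hypothetical.

```python
def required_gbps(data_gb: float, window_hours: float) -> float:
    """Sustained throughput needed to move data_gb within the backup window."""
    return data_gb * 8 / (window_hours * 3600)


def fits_link(required: float, link_gbps: float, usable_fraction: float = 0.7) -> bool:
    """True if the link can carry the load, assuming only part of it is usable."""
    return required <= link_gbps * usable_fraction


need = required_gbps(2000, 4)   # 2,000 GB to back up in a 4-hour window
print(round(need, 2))           # 1.11
print(fits_link(need, 1.0))     # False: a 1 Gbps NIC is not enough
print(fits_link(need, 4.0))     # True: a 4 Gbps HBA would cope
```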
4) Compression & de-duplication
Both Kajwadkar and Nath maintain that, when performing virtual server backups, data compression should be source-based while de-duplication should be target-based. If the VM data is backed up to a remote SAN over a WAN, bandwidth usage can be optimized by compressing the data at the source. De-duplication, on the other hand, should be destination-based, since data can be classified as original or duplicate only by comparing it with the existing backup copies.
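The division of labour can be sketched with Python's standard zlib and hashlib modules. This is a minimal illustration, not how any particular backup product works: the BackupTarget class, the 4 KB block size and the use of SHA-256 digests to spot duplicates are all assumptions.

```python
import hashlib
import zlib


def send_from_source(block: bytes) -> bytes:
    """Source side: compress before the data crosses the WAN."""
    return zlib.compress(block)


class BackupTarget:
    """Target side: de-duplicate against what is already stored."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}  # digest -> compressed block

    def receive(self, compressed: bytes) -> bool:
        """Store the block if it is new; return False for a duplicate."""
        block = zlib.decompress(compressed)
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.store:
            return False
        self.store[digest] = compressed
        return True


target = BackupTarget()
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]  # third block repeats the first
stored = [target.receive(send_from_source(b)) for b in blocks]
print(stored)             # [True, True, False]
print(len(target.store))  # 2
```

Only the target holds the full set of existing backup copies, which is why the duplicate check lives there in this sketch.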