Symantec reveals data deduplication plans

Symantec will make PureDisk part of NetBackup 7 and Backup Exec 2010 and add client-side data deduplication to its backup wares.

Symantec has laid out its data deduplication strategy, disclosing plans to integrate its PureDisk data deduplication software into its Veritas NetBackup and Backup Exec backup products and add client-side data reduction to PureDisk's capabilities.

Symantec also made NetBackup 6.5.4 available today, with granular backup for Microsoft Hyper-V virtual servers, a new software agent for Symantec Enterprise Vault and support for virtual synthetic full backups on OpenStorage API (OST)-integrated backup target devices.

NetBackup, Backup Exec to meld with PureDisk

Symantec will lay the groundwork for the full integration of PureDisk with NetBackup and Backup Exec later this year with its PureDisk 6.6, the last standalone release of the source-based dedupe product. Version 6.6 will boost the supported capacity from 8 TB to 16 TB per PureDisk server node and offer a virtual appliance option.

While PureDisk already supports VMware, the next version will add content-awareness to make deduplication more efficient. With 6.6, PureDisk will provide visibility into the VMware Virtual Machine Disk File (VMDK) and Virtual Machine File System (VMFS) file structure. That will allow the dedupe algorithm to better identify new data encapsulated by VMware's file format.

PureDisk will be fully integrated with Backup Exec 2010 and NetBackup 7, said Matt Kixmoeller, VP of product management for the NetBackup platform. Backup Exec 2010 is due by the end of this year with NetBackup 7 to follow early next year. Windows-based Backup Exec 2010 will also add support for OST, available now only for NetBackup.

Symantec customer Al Schipani, manager of server engineering for Westchester Medical Center, said the integration will prove helpful. "Having one source for everything just makes our lives that much easier," he said.

Schipani also said support for 16 TB per PureDisk node will help cut down on overhead, but he's eager for Symantec to boost the capacity up to 32 TB – the current size of his PureDisk infrastructure.

"Right now we have two content routers with a third for redundancy," he said, referring to the PureDisk term for dedupe engines. He said he'd like to reduce that number to two to cut down on hardware and administration costs.

When the fully integrated versions of PureDisk begin shipping, the products will also begin to support deduplication from the data source at both the client and the media server level. That will be similar to the architecture of some competing backup software-based data deduplication products, such as EMC Corp.'s Avamar, which performs client-side dedupe and has already been integrated with EMC's NetWorker backup product. Other competing products, such as CA's ARCserve 12.5 and CommVault Systems' Simpana 8, have already integrated dedupe with backup software clients.
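For readers unfamiliar with the approach, client-side (source) deduplication generally means the client fingerprints its data and sends only the pieces the backup store has not already seen. The following is a minimal sketch of that general idea, not Symantec's or any other vendor's implementation. The fixed-size chunking, the SHA-256 fingerprints and every name in it (`DedupeStore`, `client_backup`, `restore`) are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class DedupeStore:
    """Hypothetical stand-in for the server-side chunk store."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, fingerprint):
        return fingerprint in self.chunks

    def put(self, fingerprint, data):
        self.chunks[fingerprint] = data


def client_backup(data, store, chunk_size=CHUNK_SIZE):
    """Split data into chunks and send only chunks the store lacks.

    Returns the ordered list of fingerprints (the "recipe" needed to
    rebuild the data) and the number of bytes actually transferred.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if not store.has(fingerprint):
            store.put(fingerprint, chunk)  # only new chunks cross the network
            bytes_sent += len(chunk)
        recipe.append(fingerprint)
    return recipe, bytes_sent


def restore(recipe, store):
    """Rebuild the original data from its recipe of fingerprints."""
    return b"".join(store.chunks[fp] for fp in recipe)


store = DedupeStore()
first_recipe, first_sent = client_backup(b"AAAABBBBAAAA", store)
second_recipe, second_sent = client_backup(b"AAAABBBBCCCC", store)
```

In this toy run the first backup transfers 8 of its 12 bytes because the repeated chunk is sent once, and the second backup transfers only the 4 bytes that are new. Production products typically use much larger and often variable-size chunks.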

Those products are all available today, while Symantec's integration is still months out. Perhaps that's why Symantec is making its deduplication roadmap public well before customers can take full advantage.

"I've been following Symantec a long time and they always announce the next version. They need to remind people -- don't jump ship just now, we will have what [you] might be looking for by six months, which is how long it would take you to change backup software anyway," backup and recovery expert W. Curtis Preston said. "There may be a perception that they are behind CommVault and CA in the dedupe space, but those products don't have dedupe at the source."

Preston added that PureDisk can dedupe at the client and media server level, but those capabilities haven't been fully integrated with NetBackup.

Kixmoeller said Symantec's differentiation against competing deduplication products will be in performance with scalability. "Today PureDisk supports up to 16 TB per node," he said. "We feel good about the comparison."

NetBackup 6.5.4

The latest NetBackup version adds support for snapshot-based granular recovery technology (GRT) options to Hyper-V backups. NetBackup already has GRT support for VMware and Microsoft Exchange.

GRT has customer prerequisites, such as disk-based backup that uses the NFS file system. These details were disclosed at last year's Symantec Vision, when NetBackup 6.5.2 first integrated GRT. While Microsoft SharePoint and Exchange granular recovery run on top of the LAN-based standard client for NetBackup, the Hyper-V version is based on an integration between Symantec's FlashBackup snapshots and Microsoft's Volume Shadow Copy Services (VSS) to quiesce Windows applications.

The new synthetic full options for OST-integrated target devices cut out some network traffic and processing time previously required to write synthetic full backups off to tape from back-end disk-based backup targets. Previously, data had to be fed back through the NetBackup media server in order to be written to tape in proper format. The new version can rebuild a synthetic full backup from pointers on the back-end device, eliminating the hop back to the media server.
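The idea of rebuilding a synthetic full from pointers can be illustrated with a short sketch. This is a simplified model and not NetBackup's or the OST API's actual mechanism. It assumes each backup is a catalog mapping file paths to references to blocks already on the device, that a deleted file is marked with `None`, and that `synthesize_full` is a hypothetical name.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup catalog with later incrementals.

    Each catalog maps a file path to a reference to data already stored
    on the backup device. Only references are combined, so no file data
    is read back or copied. A value of None marks a deleted file.
    """
    synthetic = dict(full)  # start from the last full backup's pointers
    for incremental in incrementals:  # apply oldest to newest
        for path, block_ref in incremental.items():
            if block_ref is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = block_ref  # file was added or changed
    return synthetic


last_full = {"a.txt": "blk1", "b.txt": "blk2"}
nightly = [
    {"b.txt": "blk3"},                 # b.txt changed
    {"a.txt": None, "c.txt": "blk4"},  # a.txt deleted, c.txt added
]
synthetic_full = synthesize_full(last_full, nightly)
```

Because the merge touches only the pointers, the device can assemble the new full backup without routing the underlying data through another server, which is the saving the paragraph above describes.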

OST-integrated devices will also now be able to write that data directly to tape while keeping the NetBackup catalog updated. This kind of direct tape creation with backup catalog awareness had been a point of contention for disk-based backup products.

Data Domain, EMC, FalconStor Software and Quantum support OST, but Preston pointed out, "The only thing Symantec doesn't mention is that none of the companies that support OST have said anything yet about their plans to support that release."
