Storage data deduplication vendors duke it out

Avamar and Diligent stake their claim to users' datacentres in the deduplication market where competition is growing red hot.

As more and more users deploy data deduplication in secondary storage, competition among vendors in the market is growing intense.

Executives from dedupe players Avamar Technologies Inc., which was acquired by EMC last October for $165 million, and Diligent Technologies hit the road recently to meet with media and analysts, and make their stance known on the major debates surrounding this technology.

There's little these two players agree on, in the end, except that the competition is passionate. "It's difficult to have a 30-second elevator pitch in this market," said Jedidiah Yueh, founder of Avamar and currently a vice president of product management within EMC. "The technologies can be very different between competitors, but you have to peel back a few layers of the onion before you can really start to see it."

"There's a lot of fertile ground here," said Neville Yates, chief technology officer (CTO) for Diligent. "There's a lot in this market to be gained -- and lost."

According to Arun Taneja, founder and analyst with the Taneja Group, dedupe stands to change the face of the entire storage industry over the next several years. "Data protection is changing in a fundamental way for the first time in 25 years," he said. "I think you'll see entirely new leadership in the industry three years from now -- that's how dramatic a change we're talking about."

Right now, as the customer base for dedupe products grows, every company with a product in the game is trying to become one of those leaders -- hence the squabbling among vendors that can ultimately confuse users when it comes to evaluating products for purchase, particularly in the case of Avamar and Diligent.

The two products, though both are addressing the same problem and the same potential customer base, "are apples and oranges," according to Taneja. The primary difference -- and primary battleground between the two -- is where deduplication should live in the overall secondary storage environment.

Target vs. source

Avamar's Axion product deduplicates at the source, through an agent on the application server, sending only changes across the network to the backup target. Diligent, meanwhile, deduplicates data using a proprietary algorithm while the data is on its way into the backup target, which in Diligent's case is a virtual tape library (VTL).
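
To make the source-side idea concrete, here is a minimal Python sketch of an agent that hashes data in chunks and sends only the chunks the backup target has not already seen. It is an illustration under stated assumptions, not either vendor's method: the fixed-size chunking, the SHA-256 fingerprints, and the names BackupTarget, source_side_backup and restore are all hypothetical, and the chunk size is kept tiny so the effect is visible.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for illustration


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (an assumption; real products differ)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupTarget:
    """Hypothetical backup target holding one copy of each unique chunk."""

    def __init__(self):
        self.store = {}  # fingerprint -> chunk bytes

    def missing(self, digests):
        """Return the fingerprints the target does not hold yet."""
        return [d for d in digests if d not in self.store]

    def put(self, digest, chunk):
        self.store[digest] = chunk


def source_side_backup(data: bytes, target: BackupTarget):
    """Run on the application server: send only chunks the target lacks.

    Returns the list of fingerprints needed to rebuild the data and the
    number of chunk bytes that actually crossed the network.
    """
    pieces = split_chunks(data)
    digests = [hashlib.sha256(p).hexdigest() for p in pieces]
    needed = set(target.missing(digests))
    sent = 0
    for digest, piece in zip(digests, pieces):
        if digest in needed:
            target.put(digest, piece)
            needed.discard(digest)  # a repeated chunk is sent only once
            sent += len(piece)
    return digests, sent


def restore(recipe, target: BackupTarget) -> bytes:
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(target.store[d] for d in recipe)


target = BackupTarget()
first_recipe, first_sent = source_side_backup(b"AAAABBBBCCCCAAAA", target)
second_recipe, second_sent = source_side_backup(b"AAAABBBBDDDDAAAA", target)
```

In this toy run the second backup sends only the one chunk that changed. A target-side design would instead ship the full stream across the network and perform the same kind of lookup on arrival.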

Let the games begin.

"We believe that there is a place for deduplication at the target, especially over the next few years," said Yueh, adding that EMC has plans to integrate Avamar's dedupe into its own VTL product, the EMC Disk Library, in the short term.

"There's low-lying fruit there, obviously, since we already have the product," Yueh said.

However, EMC/Avamar is betting that ultimately, the more disruptive approach of deduping at the source will be the one that wins out. Avamar's software necessitates a rip-and-replace of the user's existing backup environment, something Yueh acknowledged can make it a tough sell right now.

"We work with customers, particularly in large environments, to transition them gradually to our product." Doesn't that mean the user ends up managing two backup environments, at least for a while? "Yes, but if you understand tape backup, our interface is very intuitive," Yueh said. More to the point, he added, more than 400 users have already taken the plunge.

"Avamar struggled early on in the game trying to pitch those first 10 or 15 customers," Taneja recalled. "Reception has always been good to their concept, but actual traction took a long time."

And, of course, Diligent is still harping with all its might on Avamar's disruptiveness. "They say they have 400 customers, but how many are in the Fortune 1000?" Yates asked. (Diligent itself has 150 customers; Avamar declined to comment on its number of Fortune 1000 installs.) "How much data do they have under management?"

Two physical petabytes (PB), according to Yueh -- which can restore to over 60 PB. "Just one of my customers purchased 2 PB usable storage from me in the fourth quarter last year," Yates scoffed. "Avamar is not suited to large enterprise environments or large enterprise customers who are not going to uninstall an established backup player."

In addition to going back and forth about backup disruption, the two competitors are also deep in debate over deduplication ratios. Yates claims Avamar doesn't count the "prime" or first full backup against its deduplication ratio. For instance, a 1 terabyte (TB) volume would need to be backed up in full before changes could be compared against it, Yates said. "Avamar doesn't incorporate that when it calculates deduplication ratios -- it's misleading," he said.
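
The arithmetic behind this dispute can be sketched in a few lines of Python. Only the 1 TB volume comes from the example above; the 30 backups, the 0.02 TB of new data per later backup, and the function names are hypothetical values chosen for illustration, not figures from either vendor.

```python
def dedupe_ratio(logical_tb: float, stored_tb: float) -> float:
    """Ratio of data protected (logical) to data physically stored."""
    return logical_tb / stored_tb


def ratio_for(volume_tb: float, backups: int, change_tb: float,
              prime_stored_tb: float, count_prime: bool) -> float:
    """Deduplication ratio over a series of backups of one volume.

    volume_tb        size of the volume being protected
    backups          total number of backups, including the first ("prime")
    change_tb        new data stored for each backup after the first
    prime_stored_tb  physical space the first backup occupies
    count_prime      whether the first backup is included in the ratio
    """
    later_logical = volume_tb * (backups - 1)
    later_stored = change_tb * (backups - 1)
    if count_prime:
        return dedupe_ratio(volume_tb + later_logical,
                            prime_stored_tb + later_stored)
    return dedupe_ratio(later_logical, later_stored)


# Hypothetical scenario: 1 TB volume, 30 backups, 0.02 TB of new data each time.
with_prime = ratio_for(1.0, 30, 0.02, prime_stored_tb=1.0, count_prime=True)
without_prime = ratio_for(1.0, 30, 0.02, prime_stored_tb=1.0, count_prime=False)
smaller_prime = ratio_for(1.0, 30, 0.02, prime_stored_tb=0.5, count_prime=True)
```

Under these assumed numbers, counting a full-size first backup gives roughly 19:1, while leaving it out gives 50:1. If the first backup itself occupied only half the volume's size, the ratio with the prime included would be about 28:1. The same data can therefore produce very different headline ratios depending on what is counted.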

"That's how his solution works," Yueh fired back. "We don't even have a first-time full backup -- even the prime is usually a third to half the size of the original because we eliminate redundant blocks the first time, too."

What's the bottom line?

There are further points of contention, all around the nitty-gritty of performance numbers and the merits of different algorithms. But while dedupe is still in its first-time adoption phase, the first question for users to consider is where it should sit in the environment, Taneja said. "Everything else, from performance to dedupe ratios, is a secondary consideration," he said. (For a deeper analysis of dedupe products, see Storage magazine's feature, The skinny on data deduplication, Jan. 2007.)

"If the majority of my applications are in one centralised, local environment right now, I'd probably go the VTL route," Taneja said. (Other deduping VTLs include FalconStor Software Inc.'s VirtualTape Library and Sepaton's DeltaStor product. Another notable name in this area is Data Domain Inc.)

"Meanwhile, Avamar also reduces the amount of information transferred over a network because it dedupes at the source. If I have lots of remote offices, it can reduce not only backups but bandwidth demands dramatically," making a product like Avamar or Symantec Corp.'s NetBackup PureDisk potentially more appealing, Taneja said.

Ultimately, however, Taneja came down on the side of Avamar when it came to the long-term resting place for dedupe and said that major players, like Symantec and EMC, are proving it by folding dedupe into backup software products. Symantec announced last week that it plans to integrate PureDisk into its main NetBackup product, and "I can guarantee you there are engineers at EMC right now working on swapping out the back end of Legato Networker with Avamar," Taneja said.
