Data deduplication hot despite hard drive explosion

The latest research from TheInfoPro shows data deduplication topping the hot technologies index, but IDC's hard drive forecast says it won't be enough to stem a rising tide.

Data deduplication has shot up the charts on TheInfoPro's hot technology index, released this week, replacing file virtualisation as the storage technology generating the most interest among some 152 respondents from Fortune 1000 companies.

Robert Stevenson, managing director of TheInfoPro's storage sector, said he's rarely seen a technology hit No. 1 so quickly. "It's a very rapid move up -- 10 slots in six months -- and much faster than I anticipated."

File virtualisation, which topped the heat index six months ago, remains in the top three, and block-based virtualisation also "moved up substantially" from No. 8 in the previous survey to the fourth spot this time around, according to Stevenson. "This shows me in general that consolidation activity is at an all-time high."

However, 40% of respondents named backup as one of the largest drains on time in the data center, second only to first-time provisioning of storage arrays. "There's a sense here that there has to be a way to cut out the largest workload, which is often backup," Stevenson said. "There's a pressure cooker -- we have to find a way to innovate [when it comes to backup] or we're going to be stuck."

Yet storage growth continues

The current focus remains on backup and, Stevenson said, on reducing backup to tape in particular. That could explain why a separate study released by IDC today forecast an explosion in shipments of hard disk drives over the next four years, to 675 million units and approximately $37 billion in revenue worldwide by 2011.

"Having deduplication embedded in next-generation storage arrays is a top priority on our respondents' wish lists," Stevenson said. "The direction of interest seems to counter [the forecast of hard disk growth], but the current means of deployment do not."

According to John Rydning, research manager for IDC's Storage Mechanisms: Disk program and author of the IDC disk drive report, data deduplication products have been factored into IDC's hard drive and storage system forecasts, and neither shows the technology making much of a dent. "Even with data deduplication, digital content growth is still explosive," he said. "[Data deduplication] isn't enough to stem that rising tide."

In the enterprise, worldwide units shipped will increase at a compound annual growth rate (CAGR) of 15% through 2011, from 36 million to 72 million units, which doesn't seem like that much, Rydning admitted, until you factor in burgeoning capacity -- in the last year, those shipments accounted for 5.6 million Tbytes. In 2011, those units will represent 42 million Tbytes.
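
To see why the capacity figures matter more than the unit counts, the growth rates can be worked out directly from the numbers quoted above. The following is a minimal Python sketch rather than IDC's methodology: the `cagr` helper is hypothetical, and the five-year span (a 2006 base year running to 2011) is an assumption, since the article does not state the base year explicitly.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.15 means 15%)."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


# Figures quoted in the article (enterprise segment).
UNITS_START_MILLIONS = 36.0      # units shipped in the base year
UNITS_END_MILLIONS = 72.0        # units forecast for 2011
CAPACITY_START_M_TB = 5.6        # million Tbytes shipped in the base year
CAPACITY_END_M_TB = 42.0         # million Tbytes forecast for 2011

# ASSUMPTION: the base year is 2006, giving five years of growth to 2011.
YEARS = 5

unit_growth = cagr(UNITS_START_MILLIONS, UNITS_END_MILLIONS, YEARS)
capacity_growth = cagr(CAPACITY_START_M_TB, CAPACITY_END_M_TB, YEARS)

print(f"Unit shipments: {unit_growth:.1%} per year")
print(f"Shipped capacity: {capacity_growth:.1%} per year")
```

Under that assumption, doubling unit shipments works out to roughly the 15% annual rate IDC cites, while the jump from 5.6 million to 42 million Tbytes implies capacity growing at close to 50% a year.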

According to Dave Reinsel, program director of storage research for IDC and author of the storage systems report, there are several conflicting trends at work that will continue to push storage system growth along at a CAGR of 55% to 60%, in line with growth rates from recent years. These trends include the increased focus on compliance and e-discovery in the storage industry over the last six months.

"The jury is still out on how best to employ deduplication technology where compliance or legal requirements demand data be presented in original format," Reinsel said. "Things like encryption also muddy the waters and demand things be done in the precisely correct order," since data that has already been encrypted looks random and generally cannot be compressed or deduplicated effectively.

Right now, Reinsel said, the only consolation is that it could be worse. "Things like storage virtualisation, thin provisioning and deduplication to some extent, are having some impact. If we didn't have those things on board, we would probably see the numbers increase at an even faster rate." Reinsel said IDC doesn't have exact figures for what that rate might be.

Data Domain, EMC top TIP survey

TheInfoPro uses a variety of indexes to report on survey responses, among them "in use" statistics, which indicate how many of the respondents have a product in use, how many are evaluating it for deployment in the next 30 to 90 days, how many are evaluating it for deployment in six months, and how many are evaluating it for long-term deployment in a year or more.

In this survey, Data Domain Inc. came out as the No. 1 "in use" data deduplication technology. However, just 9.3% of respondents had any deduplication technology in use, and Data Domain accounted for 4%. "It's still very early in the adoption curve," Stevenson said, also pointing out that 42% of those with a deduplication technology in use indicated they were planning to expand their deployment in the next six months. Among those with technology in use, there were just seven or so respondents who qualified to give product ratings, which require that the technology be in use for six or more months.

Meanwhile, 8.6% of respondents said they were planning deployments in 30 to 90 days, 12% had it in their near-term plan (six months), and 25% had it in their long-term plan.

Behind Data Domain, EMC Corp.'s Avamar was the next highest in-use technology at 2%, and Diligent Technologies Corp. came in third with 1.3%. However, another index used by TheInfoPro, which measures the number of unprompted "vendor mentions" by respondents in the course of the survey, has EMC on top, followed by Data Domain and then Diligent in terms of near-term planning. Among those who are planning data deduplication deployments long-term, Network Appliance Inc.'s A-SIS product got the most mentions.
