Data Domain has announced that it's shipping a new model of its
data deduplication array, dubbed the DD580, which boasts new
dual-core Intel processors and support for more capacity, slotting
it into Data Domain's product line between its DD560 model and its
data center-sized DDX array.
The announcement comes as debate continues to rage in the hot
data deduplication market space, where one of the many battles is
around performance. The new Data Domain performance claim of 800 GB
per hour (220 megabytes per second [MBps]) in lab tests now matches
competitor Diligent Technologies' claims for the previous version
of its ProtecTier VTL product. Diligent now claims transfer rates of
up to 400 MBps.
However, the new aggregate throughput applies to a fully stocked
box, which can handle more simultaneous data streams thanks to the
new processors. The rate for a single data stream, according to
Brian Biles, Data Domain co-founder and vice president of product
management, remains the same as in previous products, around 100
MBps.
"There are many variables on deployed speed in a given data center,
including client throughput, network load and media server
capacity," Biles added.
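The relationship between the quoted figures can be checked with a little arithmetic. The sketch below assumes decimal units (1 GB = 1,000 MB) and treats the roughly 100 MBps single-stream rate as a flat per-stream ceiling, which is a simplification. The function names are illustrative and do not come from Data Domain.

```python
import math


def gb_per_hour_to_mbps(gb_per_hour: float) -> float:
    """Convert GB/hour to MB/s, assuming decimal units (1 GB = 1000 MB)."""
    return gb_per_hour * 1000 / 3600


def streams_needed(aggregate_mbps: float, per_stream_mbps: float) -> int:
    """Smallest whole number of streams whose combined rate reaches the aggregate.

    Assumption: every stream runs at the same flat rate, with no overhead.
    """
    return math.ceil(aggregate_mbps / per_stream_mbps)


aggregate = gb_per_hour_to_mbps(800)  # the lab-test claim
print(f"800 GB/hour is about {aggregate:.0f} MBps")
print(f"Streams needed at 100 MBps each: {streams_needed(aggregate, 100)}")
```

Under these assumptions the conversion gives about 222 MBps, in line with the rounded 220 MBps figure, and it takes at least three concurrent streams at 100 MBps to reach that aggregate.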
Currently, according to Biles, Data Domain sees load balancing
for performance as the purview of the backup application, such as
Tivoli Storage Manager (TSM), which allows users to designate
separate storage pools and choose target devices manually to
optimize performance.
"Backup software has been very good at load balancing and
targeting different loads to different devices, and users have had
to do it that way for years with tape," he said. "Our product
slides into that environment nondisruptively."
However, Biles also said that the ability to cluster the
network-attached storage (NAS) heads on the arrays for automated
load balancing is on the Data Domain roadmap, slated for release
sometime next year.
"Right now the market space Data Domain is targeting is the SMB,
particularly the midsize business," according to Curtis Preston,
vice president of data protection services at Glasshouse
Technologies Inc. "Two hundred megabytes per second would be more
than many data centers in that category would need."
That said, Preston added, "anyone looking to purchase a Data
Domain box and needing bigger [performance] numbers isn't going to
get it with a single head. Data Domain is going to have to find a
way to go to multiple heads to reach new customers."
DD580 beta tester Kirk Schoeffel, technology specialist with the
city of Vancouver, British Columbia, had another idea: the ability
to "trunk" Ethernet ports on the box, aggregating their throughput
when all ports are working and providing high availability if one
port fails.
Rumor has it that capability could be coming by the end of the
year, but Data Domain was mum on its roadmap on that front. "Data
Domain is very aware of customer requirements, the core
considerations of our product planning and delivery," wrote Data
Domain officials in an email to SearchStorage.com. "Data Domain
does not preannounce products."
Meanwhile, Schoeffel said the increased performance with the
DD580 is a potential perk, but not his chief reason for upgrading
-- that was the higher capacity. The DD580 supports between 550
terabytes (TB) and 1.25 petabytes (PB) of logical capacity, as
opposed to 400 TB to 900 TB with the DD560. For Schoeffel, it
amounts to an extra tray of disks, or another 5.5 TB of raw physical
capacity, which, with the 22-to-1 data deduplication ratio
Vancouver has seen with its data, can be expected to hold roughly
another 123 logical terabytes.
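The capacity estimate is a straightforward multiplication of raw capacity by the deduplication ratio. The sketch below is a back-of-the-envelope check, not a sizing tool; the function name is illustrative, and it assumes the ratio applies uniformly to all data written to the new tray.

```python
def logical_capacity_tb(raw_tb: float, dedup_ratio: float) -> float:
    """Logical (pre-deduplication) capacity for a given raw capacity.

    Assumption: the deduplication ratio holds evenly across all stored data.
    """
    return raw_tb * dedup_ratio


extra_logical = logical_capacity_tb(5.5, 22)
print(f"5.5 TB raw at 22-to-1 holds about {extra_logical:.0f} TB logical")

# Ratio implied by the 123 TB figure quoted for the same 5.5 TB of raw capacity.
implied_ratio = 123 / 5.5
print(f"Implied ratio for 123 TB: {implied_ratio:.1f}-to-1")
```

With exactly 22-to-1 the result is 121 TB. The quoted 123 TB corresponds to a ratio of about 22.4-to-1, so the small gap is presumably down to rounding in the reported ratio or capacity.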
To speed backups, the city of Vancouver stages some 2 TB of TSM
incremental backups to its IBM DS4000 storage area network (SAN)
before backing them up overnight to the Data Domain box during a
14-hour backup window. Currently, a typical backup from the SAN to
the Data Domain system takes about five hours, roughly equivalent
to what Schoeffel calls the "best-case scenario" for TSM backups to
tape that are also fed by the SAN. However, he said, the tape
backups often run longer because restore requests hold up the
backup process.
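For a rough sense of scale, the figures above imply an average transfer rate and a share of the backup window. The sketch below assumes that the full 2 TB is moved during the five-hour run and that units are decimal (1 TB = 1,000,000 MB); both are assumptions layered on the reported numbers, and the function names are illustrative.

```python
def average_mbps(data_tb: float, hours: float) -> float:
    """Average MB/s needed to move data_tb in the given number of hours.

    Assumption: decimal units, 1 TB = 1,000,000 MB.
    """
    return data_tb * 1_000_000 / (hours * 3600)


def window_fraction(hours: float, window_hours: float) -> float:
    """Share of the backup window consumed by a job of the given length."""
    return hours / window_hours


rate = average_mbps(2, 5)
share = window_fraction(5, 14)
print(f"Average rate: about {rate:.0f} MBps")
print(f"Share of the 14-hour window: {share:.0%}")
```

On those assumptions the job averages about 111 MBps and uses roughly 36 percent of the 14-hour window.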
More importantly, the Data Domain box is far faster
than tape when it comes time for a restore. "We didn't get the Data
Domain array for backup performance -- we got it for restore
performance," Schoeffel said.
There, according to Schoeffel, Data Domain gives him the ability
to specify flexible file sizes and restore multiple data streams at
once, in addition to throttling the transfer rate on restores
according to priority through TSM, none of which is possible with
his tape libraries.
"That kind of efficiency is the huge advantage to the Data
Domain [array]," he said.