In a presentation at the Symantec Vision conference on Tuesday
afternoon, two blue-chip customers said Symantec's NetBackup
PureDisk data deduplication product
allowed them to centralise management of remote office backups and
save on WAN bandwidth, but also said there were some scalability
kinks in the product that have yet to be worked out.
According to Tony Elzinga, director of storage strategy for
JPMorgan Chase & Co., the company decided to deploy PureDisk at
250 remote sites to eliminate tape and centralise management of
remote office data.
The PureDisk clients are connected to one of two "core"
datacentres, one in the southern region of the country (Elzinga
didn't specify where) and one in the north. Each of those cores is
handling data from 125 remote clients, and each has 2 terabytes
(Tbytes) of Fibre Channel disk attached, though so far Elzinga said
the company has only had to use about half of that capacity at each
core.
JPMorgan is moving to completely eliminate tape from its backup and
disaster recovery strategies because of concerns about security, as
well as reliability. "Our retention periods in many cases are far
longer than the useful life of a tape cartridge," Elzinga said.
However, at many of the remote sites, bandwidth is extremely
limited. "We have some links that are as small as 384 Kbps," he
said. A remote office replication product would have required him
to replace many of those links, a cost that would run well into the
millions.
"The big difference here is that the [data deduplication] isn't
just done on one bit -- the product looks to see if it has the same
128 [kilobyte] segment over any of 125 clients," according to
Elzinga, allowing the company to keep 60 days' worth of backups on a
total of 3.5 Tbytes of disk between the two core sites.
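The cross-client matching Elzinga describes can be illustrated with a minimal sketch. The article does not say how PureDisk fingerprints or stores segments, so everything here is an assumption for illustration: the SHA-256 fingerprint, the in-memory dictionary, and the hypothetical names SegmentStore, backup and restore. Only the 128-kilobyte fixed segment size comes from the quote above.

```python
import hashlib

SEGMENT_SIZE = 128 * 1024  # 128 KB segments, the size cited by Elzinga


def split_segments(data, size=SEGMENT_SIZE):
    """Cut a byte string into fixed-size segments (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class SegmentStore:
    """Hypothetical central pool that keeps each unique segment only once."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes

    def backup(self, data):
        """Store a client's data; return the fingerprints needed to rebuild it."""
        recipe = []
        for segment in split_segments(data):
            # SHA-256 is an assumed fingerprint, not a documented PureDisk detail.
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                # Only segments the pool has never seen are stored (or sent).
                self.segments[fingerprint] = segment
            recipe.append(fingerprint)
        return recipe

    def restore(self, recipe):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.segments[fp] for fp in recipe)

    def stored_bytes(self):
        """Total bytes physically held in the pool."""
        return sum(len(s) for s in self.segments.values())


# Two clients whose data shares one identical 128 KB segment.
store = SegmentStore()
client_a = bytes([1]) * SEGMENT_SIZE + bytes([2]) * SEGMENT_SIZE
client_b = bytes([1]) * SEGMENT_SIZE + bytes([3]) * SEGMENT_SIZE
recipe_a = store.backup(client_a)
recipe_b = store.backup(client_b)
```

In this sketch the two clients back up four segments in total, but the pool holds only three, because the shared segment is stored once no matter which client sent it.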
Meanwhile, Jeff Krueger, data protection manager for Qualcomm
Inc., said the PureDisk product was a more appealing approach to
dealing with small remote sites than adding a NetBackup instance at
each one, at a cost Krueger estimated at $50,000 each for servers,
tape libraries and software.
Qualcomm is just beginning to roll out PureDisk with 40 clients
attached to one central repository, which Symantec calls a Storage
Pool, at the company's headquarters in San Diego. The sites are
distributed throughout North America, Europe and Asia, including
locations in Taiwan, the UK and the US. So far those sites have
accumulated about 5 Tbytes of backup data on 1.5 Tbytes of
disk.
Krueger said that another appealing feature of the product was
its upcoming integration with NetBackup 6.5. "We are planning to
export data to tape through NetBackup Enterprise and expire older
backups off disk," Krueger said. The company will also be adding
another Storage Pool node in London, in order to speed up backup
and recovery times, and because London acts as a kind of "secondary
headquarters" to European operations.
Scalability still an issue
Both companies have already identified one limitation in version
6.1 of PureDisk -- the fact that it caps each Storage Pool node at
50 million files. "We're already at 45 million at one of our core
sites," Elzinga said.
That limitation is also the reason JPMorgan chose to go with two
Storage Pools in the first place. "We're hoping to see this
limitation moved up in future releases," Elzinga said, though he
added that having two Storage Pools to manage "is a big improvement
over the 250 [separate remote sites] we were managing before."
Elzinga also said that on smaller WAN links, he found that more
than 250 GB of data took too long to back up because of latency on
the network -- up to a week in some cases. For those environments,
Elzinga said the company is still using Network Appliance Inc.'s
(NetApp) SnapVault.
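A rough calculation shows why data volume matters so much on the smallest links. The sketch below assumes the 384 figure Elzinga quoted earlier refers to kilobits per second, treats 250 GB as 250 billion bytes, and ignores protocol overhead, latency and any savings from deduplication. The function name transfer_days is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_second):
    """Days needed to push data_bytes over a link at full, uninterrupted speed."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


# Assumed inputs: 250 GB of data, a 384 kilobit-per-second link.
raw_days = transfer_days(250 * 10**9, 384_000)
print(f"Undeduplicated transfer: about {raw_days:.0f} days")
```

Under those assumptions, sending the full 250 GB would take roughly two months, so a backup that finishes within a week is already moving only a fraction of the raw data.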
"At our larger remote sites we are able to use our NetApp filers
as primary storage and for backup, but sites smaller than 250 GB
didn't justify the cost of a filer anyway," he said.
Krueger said that he's been backing up remote sites of more than 1
Tbyte without the same problems, thanks to larger WAN links. However,
he said the database limit of 50 million files is a concern,
though "adding more nodes to boost capacity or performance is
typical of data deduplication products. We're already used to
managing over 30 instances of NetBackup," he said. "It's a
philosophical choice for us, we see the appeal of managing fewer
boxes, but don't like to put all our eggs into just one basket
anyway."
Meanwhile, Qualcomm found it had to "pre-seed" the data
on initial backups, sending the initial backup data on tape to the
main datacentre, restoring it to a network share onsite and
performing the initial full backup over the local LAN rather than
via the WAN. "There's still some time involved with each remote
site having to sync its catalog when you do that, but it saved a
lot of time with the initial backup," Krueger said.
According to Wim De Wispelaere, senior product manager for
PureDisk with Symantec, the next release of PureDisk, version 6.2,
which will become generally available in July, will raise the
database limitation to 100 million files.
As for the bandwidth issues, "There's a balance users have to
reach between bandwidth and capacity," De Wispelaere said, but
added that Symantec is working on reducing the number of times
remote sites have to ping the central storage pool to detect
duplicate data segments. "We're planning major improvements there
for version 6.5, which will be available by the end of the year,"
he said.