Times have changed. With increasing regulatory compliance
requirements (worldwide, and escalating especially quickly within the
European Union), plus the glut of stored data, which doubles
every 18 months or so, traditional data protection methodologies
can no longer adequately protect the data.
This problem has led to the development of a plethora of
next-generation data protection (NGDP) technologies, including
deduplication (aka single instance storage and global single
instance storage), virtual tape libraries (VTL),
continuous data protection (CDP), continuous
snapshot (aka small aperture snapshot) and
distributed remote office/branch office (ROBO) backup to
disk.
[Note: Many traditional backup product vendors, and some of their
customers, may argue vehemently that their products are more than
adequate for the challenge. I contend that those that are adequate
to meet the challenge have actually developed and deployed NGDP
technologies as features of their traditional product.]
Deduplication is one of the more exciting data protection
technologies hitting the market. The value comes
from the elimination of duplicate stored data. The amount of that
value varies by application. If the application creates a lot of
duplicate data, such as full backups or full volume snapshots, then
the value can be as great as a 96% reduction (25-times compression)
in the amount of stored data. This is pretty heady stuff. Now, if
the application produces primarily unique data, such as incremental
backups or snapshots, the value drops significantly, to a reduction of
approximately 60% to 80% (roughly three-to-four times compression).
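To make the idea concrete, here is a minimal Python sketch of single instance storage. It is a toy model under stated assumptions, not any vendor's implementation: the `DedupStore` class, the fixed 4 KB chunk size and the sample data are all hypothetical, and real products often use variable-size chunking. The helper `reduction_percent` converts an N-times compression ratio into a percentage reduction using 1 - 1/N.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class DedupStore:
    """Toy single-instance store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}        # SHA-256 digest -> chunk bytes
        self.logical_bytes = 0  # total bytes the backup application has written

    def write(self, data: bytes) -> list:
        """Store data and return the list of chunk digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        self.logical_bytes += len(data)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its chunk digests."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


def reduction_percent(ratio: float) -> float:
    """Convert an N-times compression ratio into a percentage reduction."""
    return (1 - 1 / ratio) * 100


# A made-up volume of 100 distinct chunks, backed up in full 25 times.
volume = b"".join(i.to_bytes(4, "big") * (CHUNK_SIZE // 4) for i in range(100))
store = DedupStore()
recipes = [store.write(volume) for _ in range(25)]

ratio = store.logical_bytes / store.stored_bytes()
print(f"ratio: {ratio:.1f}x, reduction: {reduction_percent(ratio):.1f}%")
```

Repeated full backups of unchanged data are the best case for this approach. Feed the store mostly unique data instead and the ratio falls toward 1, which is why incremental backups benefit less.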
Deduplication can be deployed as a standalone add-on to a data
protection implementation (Data Domain, Diligent and ExaGrid). This
greatly enhances the incumbent product and avoids the
rip-out-and-replace pain. It can also be deployed as an integral
feature of a data protection product (Asigra, Avamar, Data Domain,
Diligent, EMC, FalconStor, Sepaton, Symantec).
VTL technology is software that enables disk to emulate a tape
drive. The value of this product comes from faster backups
(five times), more efficient tape utilization -- a 50% increase for
Windows, Unix and Linux and a 900% increase for z/OS -- and a lot less
stored data if deduplication is integrated as part of the
solution.
VTL products are available from many vendors, including Copan,
Data Domain, Diligent, EMC, FalconStor, Fujitsu/Siemens, HP, IBM,
Neartek, Network Appliance, Quantum, Sepaton, Spectra Logic and
Sun.
CDP is getting a lot of attention as another hot next-generation
data protection technology. CDP has a recovery point objective
(RPO) of zero -- that is, the user can restore data to a single
moment before the data corruption or failure, resulting in no data
loss. Traditional data protection technologies usually have RPOs
ranging from six to 24 hours, which can be far too long, leading to too
much potential data loss for mission-critical applications. Most CDP
products are capable of exceptionally fast data restoration ranging
from a few seconds to several hours, depending on the vendor and
product.
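One way to picture why CDP can restore to any moment is a write journal: every write is logged with its time, and a restore replays the journal up to the chosen instant. The Python sketch below is a simplified model under my own assumptions (integer timestamps, whole-value writes, writes arriving in time order). `CdpJournal` and its methods are hypothetical names, not any product's API.

```python
class CdpJournal:
    """Toy continuous-data-protection journal: every write is recorded with its time."""

    def __init__(self):
        self.entries = []  # (timestamp, key, value), assumed appended in time order

    def write(self, timestamp: int, key: str, value: str) -> None:
        self.entries.append((timestamp, key, value))

    def restore(self, as_of: int) -> dict:
        """Rebuild the data as it stood at time `as_of` by replaying earlier writes."""
        state = {}
        for timestamp, key, value in self.entries:
            if timestamp > as_of:
                break  # everything after the chosen instant is ignored
            state[key] = value
        return state


journal = CdpJournal()
journal.write(100, "orders.db", "v1")
journal.write(160, "orders.db", "v2")
journal.write(175, "orders.db", "corrupted")  # the bad write

# Restore to the moment just before the corruption.
print(journal.restore(as_of=174))
```

Because every write is captured, the restore point can sit immediately before the bad write, which is what an RPO of zero means in practice.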
The value of CDP is that it provides the highest level of data
protection available today. It protects 100% of the data against
loss as a result of hardware failure, human error, malware,
corruption or deletion. Most of the CDP vendors also provide data
and application consistency, meaning database and mail
applications are easily recoverable. This is especially important
for Microsoft Exchange. CDP is available from Asempra, Asigra, Atempo, CA,
CommVault, EMC, FalconStor, FilesX, InMage, IBM, Iron Mountain,
Lucid8, Mendocino, Revivio, SonicWALL, Symantec and
TimeSpring.
Continuous snapshot (aka small aperture snapshot) is very similar
to CDP with two primary differences. First, continuous snapshots
are not exactly continuous. There is a time gap between snapshots
whereas CDP has no time gaps. This time gap ranges from minutes to
days and this gap is the period of time in which data can be lost
between snapshots. Second, because continuous snapshot data capture
has gaps between snapshot captures, most products (Cloverleaf,
Exanet, Network Appliance and Symantec/DCT) do not have the same
application consistency attributes as CDP. This means they have to
quiesce (pause) application operations for the snapshot
data to be in a recoverable form.
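The effect of the time gap is simple arithmetic. The Python sketch below assumes snapshots are taken at a fixed interval starting from time zero. The function names and the sample times are made up for illustration.

```python
def last_snapshot_before(failure_time: int, interval: int) -> int:
    """Time of the most recent snapshot, assuming one every `interval` minutes from 0."""
    return (failure_time // interval) * interval


def data_loss_window(failure_time: int, interval: int) -> int:
    """Minutes of writes lost when restoring from the most recent snapshot."""
    return failure_time - last_snapshot_before(failure_time, interval)


failure_at = 175  # minutes since protection started (hypothetical)

for interval in (15, 60, 24 * 60):
    lost = data_loss_window(failure_at, interval)
    print(f"snapshot every {interval} min -> up to {lost} min of data lost")
```

Shrinking the interval narrows the window, but only capturing every write, as CDP does, closes it entirely.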
Generally speaking, continuous snapshot provides similar, though not
quite as high, value as CDP.
Distributed ROBO backup to disk is
designed from the ground up to protect the data outside the data
center (50% to 90% of the organization's data, depending on
which report you believe). This is a knotty problem: it requires
data center-class performance and recoverability despite little or no
ROBO data protection skills and limited wide area bandwidth.
Distributed ROBO backup to disk solves these problems by utilizing
local and global deduplication,
wide area network (WAN) optimization,
centralized management and control, local and centralized data
recovery, file and data versioning, encryption in flight,
encryption at rest and even CDP in some cases. The value of
distributed ROBO backup to disk is centralized control with
local performance and recoverability. Distributed ROBO backup to
disk is available from Asigra, Avamar, eVault, Iron Mountain,
Signiant and Symantec.
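To illustrate how local and global deduplication together reduce WAN traffic, here is a rough Python sketch. It assumes fixed-size chunks identified by SHA-256 and counts only chunk payload bytes, ignoring the small cost of exchanging hashes. `CentralVault`, `backup_branch` and the sample data are hypothetical, not any vendor's design.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def chunk_by_hash(data: bytes) -> dict:
    """Local dedup: map SHA-256 digest -> chunk, so repeated chunks collapse to one."""
    return {
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest(): data[i:i + CHUNK_SIZE]
        for i in range(0, len(data), CHUNK_SIZE)
    }


class CentralVault:
    """Toy data center store shared by all branch offices."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests) -> list:
        """Global dedup: report which chunks the vault does not already hold."""
        return [d for d in digests if d not in self.chunks]

    def upload(self, chunks: dict) -> None:
        self.chunks.update(chunks)


def backup_branch(data: bytes, vault: CentralVault) -> int:
    """Back up one branch office; return the chunk bytes sent over the WAN."""
    local = chunk_by_hash(data)
    to_send = {d: local[d] for d in vault.missing(local)}
    vault.upload(to_send)
    return sum(len(c) for c in to_send.values())


# Two branches share ten common chunks (say, the same OS image) plus one unique chunk each.
common = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
vault = CentralVault()
sent_a = backup_branch(common + b"A" * CHUNK_SIZE, vault)
sent_b = backup_branch(common + b"B" * CHUNK_SIZE, vault)
print(sent_a, sent_b)
```

In this sketch the second branch sends only its unique chunk, because the vault already holds the shared ones. That is the property that makes limited wide area bandwidth workable.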
I will go into greater detail about each of these NGDP technologies
in future blog posts.
About the author: Marc Staimer is president and founder of
Dragon Slayer Consulting in Beaverton, Ore. He is widely known as
one of the leading storage market analysts in the network storage
and storage management industries. His consulting practice of
six-plus years serves the end user and vendor
communities.