The IT industry has had a love/hate relationship with the cloud.
At the turn of the decade, businesses embraced it. Application
service providers (ASPs) were going to change the world until they
did not. Similarly, a brief flirtation with managed service storage
providers (MSSPs) ended in disaster as many of them went out of
business. But now the cloud is back. Aside from
software-as-a-service (the remarketing of ASP), we are seeing a
gradual increase in the number of services that claim to take
storage out of your hands.
Cloud-based storage broadly divides into two types: offsite
backup and primary storage. Primary storage providers want to take
storage administration out of your hands entirely, leaving you with
little more than a network drive (or URL) mapped to the cloud.
Both of them are enticing propositions in their own way. "The
benefits are that it is replicated, it is hosted, it is redundant,
and it is cheaper. And the way the provider can get that is because
we are deciding where we are going to put your data, and what the
hardware that runs on it will look like. When we make those
decisions we can optimise that infrastructure," says Jonathan
Bryce, co-founder of Mosso. Mosso, a venture from
datacentre specialist Rackspace, runs online storage system
CloudFS.
Cloud-based services targeting primary storage are most likely
to appeal to large-scale Web 2.0-style service providers who have
to cope with very high-volume content. If you are running a photo
hosting site, for example, you might consider something like Amazon
S3, which launched in Europe last September, as an option. Amazon
has taken the economies of scale that it enjoys with its own
ecommerce business and used them to provide storage services to
others. It is a non-specialist service, designed to store anything
for anyone, from web hosting files through to photographs, and it
shows in the price: 100GB with 5GB transfer in and 30GB transfer
out will cost you less than £24, which is consumer-class
pricing.
On the other hand, the web service offers little in the way of
service level agreements, and it has suffered some spectacular
outages that affected a variety of customers, including photo
hosting site SmugMug. Now, other companies are hoping to offer
primary cloud-based storage on the same scale.
"Amazon proved the market," says Jonathan Buckley, chief
marketing officer of Nirvanix, which launched its storage delivery
network (SDN) a year ago. Nirvanix's cloud includes a selection of
nodes designed to get the content closer to where people need it
geographically. Buckley sees one of the main applications as
seeding content delivery networks (CDNs) - systems operated by the
likes of Akamai, which allow high-volume, low-latency content
carriers such as film studios to get their information closer to
the relevant customers. The likes of Akamai capitalise on their
ability to deliver content quickly to consumers while reducing
network congestion. But Nirvanix sees room to provide those
networks with the necessary storage capability.
Buckley is also courting smaller content providers that need a
scalable storage mechanism but cannot afford the scale of
traditional CDNs. Nirvanix provides the storage for multimedia
content delivered by small movie producer Doom Inc, for
example.
However, he also sees a role for Nirvanix when supporting office
environments. Companies that do not want to worry about managing
storage in a scalable environment can use the cloud-based system
instead, he explains. "The task of capacity planning and disk
replacement and data migration every three years goes away."
Nirvanix uses a software tool currently in beta called CloudNAS.
The system, which sits in a local office environment, effectively
maps the company's SDN as a network drive. Your Windows or Linux
server then treats it as just another disk.
This is not the only way that cloud-based storage can be
accessed. When users integrate applications with the system, they
may want to do more than simply map to a network drive, integrating
the system more tightly. Web services APIs are a popular technique
for achieving this integration. Nirvanix provides integration using
SOAP, the original HTTP-based access protocol for web services.
Along with Mosso's CloudFS, it also uses representational state
transfer (Rest), a lightweight access mechanism that uses URIs to
identify and access items held online.
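To make the Rest idea concrete, here is a minimal Python sketch of how an application might address a stored object by URI and choose an HTTP verb for each operation. The host name storage.example.com, the container/object path layout and the helper names are assumptions made for illustration; they are not the actual Nirvanix or CloudFS API.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint: real providers publish their own base URLs.
BASE_URL = "https://storage.example.com/v1"

# In a Rest-style interface the HTTP verb says what to do,
# and the URI says which stored item to do it to.
VERBS = {"upload": "PUT", "download": "GET", "delete": "DELETE"}


def object_uri(container, name):
    """Build the URI that identifies one stored object."""
    return f"{BASE_URL}/{quote(container, safe='')}/{quote(name)}"


def build_request(action, container, name, body=None):
    """Prepare (but do not send) the HTTP request for an action."""
    return Request(object_uri(container, name), data=body, method=VERBS[action])


if __name__ == "__main__":
    req = build_request("upload", "photos", "2008/beach 1.jpg", b"image bytes")
    print(req.get_method(), req.full_url)
```

The sketch only builds the request; sending it, authenticating and handling errors would depend on the provider in question.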
"The Rest API is the standard way to access this kind of
storage, but we also target our products at businesses rather than
just hacker developer types," says Bryce. "So we created
language-specific libraries for developers using .net, Python, PHP,
and Java." Bryce is already working with suppliers that he says are
building support for CloudFS directly into their products using
these techniques.
But are users ready to put their data into the hands of storage
providers like these? IT directors assuming that a provider's
datacentre has full redundancy might be disappointed, and
service level agreements may not always be enough to deal with
the problem. A recent incident at cloud storage firm Flexiscale saw
customers unable to access their data after an engineer
accidentally deleted a main storage volume and a bug in the system
restricted customers to read-only access when dealing with their
data. The company was crediting users as per its SLA, but users may
find some extra cash cold comfort when they need their data and it
is not available.
SLAs are even less useful when the company ceases trading. Take
TheLinkup, for example, which came from the same parent firm as
Nirvanix. This cloud storage company closed down in August after
losing vast amounts of customer data.
With incidents like these undermining confidence in cloud
storage, companies might need to consider disaster recovery options
for cloud storage - perhaps even using two providers. Coding that
into their applications might be more complex still.
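As a rough illustration of what coding for two providers involves, the Python sketch below writes every object to both and reads from whichever one answers. The InMemoryProvider class is a hypothetical stand-in for a real storage client, and the MirroredStore wrapper is an assumption about how an application might be structured, not any supplier's product.

```python
class InMemoryProvider:
    """Stand-in for a cloud storage client; a real one would use the network."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.available = True  # flip to False to simulate an outage

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self.objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.objects[key]


class MirroredStore:
    """Write to every provider; read from the first one that has the data."""

    def __init__(self, providers):
        self.providers = list(providers)

    def put(self, key, data):
        accepted = 0
        for provider in self.providers:
            try:
                provider.put(key, data)
                accepted += 1
            except ConnectionError:
                continue  # carry on with the remaining providers
        if accepted == 0:
            raise ConnectionError("no provider accepted the write")
        return accepted  # number of copies actually stored

    def get(self, key):
        for provider in self.providers:
            try:
                return provider.get(key)
            except (ConnectionError, KeyError):
                continue  # provider down, or it missed this write
        raise KeyError(key)
```

Even this simple version leaves a gap: a provider that was down during a write never receives the missing copy, so a real application would also need to resynchronise after an outage.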
The other problem is that SLAs might cover availability, but
they cannot always cover latency, which depends on how quickly a
separate internet access provider can route the data between you.
With some applications likely to be sensitive about how quickly
their requests are served, how many of them will be suitable for
environments where network speed could dramatically affect
performance?
"A customer support centre with online transactional data should
not be run in the cloud. It should be fibre channel-attached
storage," says Nirvanix's Buckley. "No matter how much time we can
show on our systems the public internet can be tricky. In an
environment requiring 99.99999% uptime, that should not be up
there. And the latency problem is not solvable by Nirvanix."
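The arithmetic behind an availability figure like the one Buckley quotes is worth spelling out. The short Python sketch below converts an uptime percentage into the downtime it permits over a year; the function name and the use of a 365-day year are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def allowed_downtime_seconds(uptime_percent):
    """Seconds of downtime per year permitted by an uptime percentage."""
    return SECONDS_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.99999):
        seconds = allowed_downtime_seconds(target)
        print(f"{target}% uptime allows about {seconds:.1f} seconds a year")
```

At 99.99999%, the budget works out to roughly three seconds a year, against almost nine hours at 99.9%, which helps explain why Buckley rules out the public internet for that class of application.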
That problem may be exacerbated as cloud storage services are
outsourced and subcontracted, warns Hamish MacArthur, specialist
storage analyst at MacArthur Stroud. If a cloud storage provider in
the UK subcontracts to an infrastructure provider in another
country, what legal and technical problems might that create?
"This will be about the way that groups work together, sharing
different projects and tasks, and then it will be about how this
storage is managed," says MacArthur. "Is it my private data that I
am sharing with you? And do you have access to everything I have?
Just what do I put into a federated environment?"
Consequently, companies will want to talk with potential
providers about data governance, but also about encryption. Do not
assume that data encryption is built into the mix. Nirvanix strips
metadata from the data it is storing, for example, but it does not
encrypt data at rest. It relies on its customers to send encrypted
data if that is what they want to store, and also makes it possible
for CloudNAS to integrate with third-party key management systems,
but this puts the onus on the customer to encrypt their own
data.
Similarly, if you are focusing on cloud storage purely for
backup purposes, think about versioning. The Symantec
Protection Network (SPN), for example, focuses purely on cloud-based
backup rather than primary storage. Like many systems, it uses
delta-based incremental backup, meaning that it begins by backing
up everything in the designated set of files (which can take a long
time) but then only backs up changes to the file set.
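The general idea of incremental backup can be sketched in a few lines of Python. This version detects changes at the level of whole files by comparing content hashes against a manifest saved from the previous run; the function names and the manifest structure are assumptions for illustration, and this is not a description of how Symantec's service works internally. Commercial products may well transfer only the changed portions of a file.

```python
import hashlib


def fingerprint(data):
    """Content hash used to tell whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def incremental_backup(files, manifest):
    """Return the files that need uploading and update the manifest.

    files    maps path -> current file contents (bytes)
    manifest maps path -> hash recorded at the previous backup
    """
    changed = {}
    for path, data in files.items():
        digest = fingerprint(data)
        if manifest.get(path) != digest:  # new file, or contents differ
            changed[path] = data
            manifest[path] = digest
    return changed


if __name__ == "__main__":
    manifest = {}
    files = {"report.doc": b"draft one", "budget.xls": b"q3 figures"}
    print(sorted(incremental_backup(files, manifest)))  # first run: everything
    files["report.doc"] = b"draft two"
    print(sorted(incremental_backup(files, manifest)))  # later run: changes only
```

The first run uploads the whole file set because the manifest is empty, which is why the initial backup is the slow one. The sketch does not handle deleted files.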
There are two different offerings for its online backup service.
The basic one offers one full year of versions, while the second
offers up to seven years of versions. "Potentially you have
hundreds of versions to restore from but over time we fade out
those versions so that as data ages, you need fewer restore
points," says Symantec's senior product manager Darren Niller.
"Daily, you could have hundreds of versions available if you are
changing that fast. Then you go to a week, and you have fewer, and
so on."
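The fading-out Niller describes can be pictured as a thinning policy applied to the list of stored versions. The Python sketch below keeps every version from the past day, one per calendar day for the past week, and one per week beyond that. Those thresholds, and the function name, are assumptions chosen for illustration rather than Symantec's actual retention rules.

```python
from datetime import datetime, timedelta


def thin_versions(timestamps, now):
    """Return the version timestamps to keep, newest first."""
    kept = []
    seen_buckets = set()
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        if age < timedelta(days=1):
            bucket = ("all", ts)  # keep every recent version
        elif age < timedelta(days=7):
            bucket = ("day", ts.date())  # newest version of each day
        else:
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])  # newest of each ISO week
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            kept.append(ts)
    return kept


if __name__ == "__main__":
    now = datetime(2008, 10, 15, 12, 0)
    versions = [now - timedelta(hours=h) for h in range(1, 24 * 30, 5)]
    print(len(versions), "versions stored,", len(thin_versions(versions, now)), "kept")
```

Because versions are visited newest first, the survivor in each day or week bucket is always the most recent one in it.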
For many providers, storage is just the start. EMC, which offers
the Mozy Enterprise data backup service, is hoping to layer other
types of service on top of its backup service in the future, says
Vance Checketts, chief operating officer for EMC Mozy. "Online
backup is a really great way to start people down a path to cloud
computing," he says. "Customers will have already paid the price of
getting into the cloud. So we will be able to say how about doing
X, Y and Z?"
X, Y and Z could entail everything from electronic discovery
through to information lifecycle management. And when providers
start offering those services, which lie outside the capabilities
of many customers, that is when cloud-based storage will get really
interesting.
Cloud-based storage systems
Amazon S3
Amazon's online storage system tested the market for cloud-based
storage. It is cheap and cheerful but has little in the way of SLAs.
CloudFS
CloudFS is offered by Mosso, a cloud-based service provided by
Rackspace.
EMC Mozy Enterprise
EMC released the enterprise version of its consumer-focused Mozy
online backup system earlier in the year. It works on a monthly
payment basis.
Box.net
Box's Enterprise version folds collaboration services together with
its online storage offering, giving users the chance to edit and
share documents from within their Box account.
Flexiscale
Flexiscale charges no set-up fee, and provides customers with a
'virtual dedicated server' for storage.
Nirvanix
Nirvanix offers a 'storage delivery network' that replicates
your files between different geographical nodes for redundancy and
high performance.
Symantec Protection Network
Symantec's SPN, launched this February, features a basic online
backup service, and a more advanced service for users of its Backup
Exec product.