Storage in the clouds

The IT industry has had a love/hate relationship with the cloud. At the turn of the decade, businesses embraced it. Application service providers (ASPs) were going to change the world until they did not. Similarly, a brief flirtation with managed service storage providers (MSSPs) ended in disaster as many of them went out of business. But now the cloud is back. Aside from software-as-a-service (the remarketing of ASP), we are seeing a gradual increase in the number of services that claim to take storage out of your hands.

Cloud-based storage broadly divides into two types: offsite backup and primary storage. Primary storage providers want to take storage administration out of your hands entirely, leaving you with little more than a network drive (or URL) mapped to the cloud.

Both of them are enticing propositions in their own way. "The benefits are that it is replicated, it is hosted, it is redundant, and it is cheaper. And the way the provider can get that is because we are deciding where we are going to put your data, and what the hardware that runs on it will look like. When we make those decisions we can optimise that infrastructure," says Jonathan Bryce, co-founder of Mosso. Mosso, a venture from datacentre specialist Rackspace, runs online storage system CloudFS.

Cloud-based services targeting primary storage are most likely to appeal to large-scale Web 2.0-style service providers who have to cope with very high-volume content. If you are running a photo hosting site, for example, you might consider something like Amazon S3, which launched in Europe last September, as an option. Amazon has taken the economies of scale that it enjoys with its own ecommerce business and used them to provide storage services to others. It is a non-specialist service, designed to store anything for anyone, from web hosting files through to photographs, and it shows in the price - 100GB with 5GB transfer in and 30GB transfer out will cost you less than £24 - that is consumer-class pricing.

On the other hand, the web service offers little in the way of service level agreements, and it has suffered some spectacular outages that affected a variety of customers, including photo hosting site SmugMug. Now, other companies are hoping to offer primary cloud-based storage on the same scale.

"Amazon proved the market," says Jonathan Buckley, chief marketing officer of Nirvanix, which launched its storage delivery network (SDN) a year ago. Nirvanix's cloud includes a selection of nodes designed to get the content closer to where people need it geographically. Buckley sees one of the main applications as seeding content delivery networks (CDNs) - systems operated by the likes of Akamai, which allow high-volume, low-latency content carriers such as film studios to get their information closer to the relevant customers. The likes of Akamai capitalise on their ability to deliver content quickly to consumers while reducing network congestion. But Nirvanix sees room to provide those networks with the necessary storage capability.

Buckley is also courting smaller content providers that need a scalable storage mechanism but cannot afford the scale of traditional CDNs. Nirvanix provides the storage for multimedia content delivered by small movie producer Doom Inc, for example.

However, he also sees a role for Nirvanix in supporting office environments. Companies that do not want to worry about managing storage in a scalable environment can use the cloud-based system instead, he explains: "The task of capacity planning and disk replacement and data migration every three years goes away."

Nirvanix uses a software tool currently in beta called CloudNAS. The system, which sits in a local office environment, effectively maps the company's SDN as a network drive. Your Windows or Linux server then treats it as just another disk.

This is not the only way that cloud-based storage can be accessed. Users who integrate applications with the system may want something tighter than a mapped network drive. Web services APIs are a popular technique for achieving this integration. Nirvanix provides integration using SOAP, the original HTTP-based access protocol for web services. Along with Mosso's CloudFS, it also uses representational state transfer (Rest) - a lightweight access mechanism that uses URIs to address items stored online.
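To make the Rest idea concrete, here is a minimal sketch in Python of how an application might address a stored object by URI. The base URL, the container/object path layout and the X-Auth-Token header are all assumptions made for illustration; they are not the actual Nirvanix or CloudFS interfaces. The requests are built but never sent, so the sketch runs without a network connection.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint, for illustration only; not a real provider URL.
BASE_URL = "https://storage.example.com/v1"


def object_uri(container: str, name: str) -> str:
    """Build the URI that identifies one stored object."""
    return f"{BASE_URL}/{quote(container, safe='')}/{quote(name)}"


def build_upload(container: str, name: str, data: bytes, token: str) -> Request:
    """Prepare (but do not send) an HTTP PUT that would store the object."""
    return Request(
        object_uri(container, name),
        data=data,
        method="PUT",
        # The header name is an assumption; real services define their own.
        headers={"X-Auth-Token": token, "Content-Type": "application/octet-stream"},
    )


def build_download(container: str, name: str, token: str) -> Request:
    """Prepare (but do not send) an HTTP GET that would fetch the object."""
    return Request(
        object_uri(container, name),
        method="GET",
        headers={"X-Auth-Token": token},
    )


upload = build_upload("photos", "2008/holiday 01.jpg", b"...image bytes...", "secret-token")
print(upload.get_method(), upload.full_url)
```

Passing either request to urllib.request.urlopen would actually send it. The point of the style is that the same URI serves for storing, fetching and deleting an item, with the HTTP verb saying which operation is wanted.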

"The Rest API is the standard way to access this kind of storage, but we also target our products at businesses rather than just hacker developer types," says Bryce. "So we created language-specific libraries for developers using .net, Python, PHP, and Java." Bryce is already working with suppliers that he says are building support for CloudFS directly into their products using these techniques.

But are users ready to put their data into the hands of storage providers like these? IT directors assuming that a provider's datacentre has full redundancy might be disappointed, and service level agreements may not always be enough to deal with the problem. A recent incident at cloud storage firm Flexiscale left customers unable to access their data after an engineer accidentally deleted a main storage volume, and a bug in the system then restricted customers to read-only access to their data. The company was crediting users as per its SLA, but users may find some extra cash cold comfort when they need their data and it is not available.

SLAs are even less useful when the company ceases trading. Take TheLinkup, for example, which came from the same parent firm as Nirvanix. This cloud storage company closed down in August after losing vast amounts of customer data.

With incidents like these undermining confidence in cloud storage, companies might need to consider disaster recovery options for cloud storage - perhaps even using two providers. Coding that into their applications might be more complex still.
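As a rough illustration of what coding for two providers could involve, the Python sketch below mirrors every write to both and reads from whichever one answers first. The class names are hypothetical, and the in-memory provider is only a stand-in for a real client library, so nothing here reflects any supplier's actual API.

```python
class ProviderUnavailable(Exception):
    """Raised by a storage client when its provider cannot be reached."""


class InMemoryProvider:
    """Stand-in for a real provider client, so the sketch runs offline."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self._objects = {}

    def put(self, key, data):
        if not self.online:
            raise ProviderUnavailable(self.name)
        self._objects[key] = data

    def get(self, key):
        if not self.online:
            raise ProviderUnavailable(self.name)
        return self._objects[key]


class MirroredStore:
    """Write every object to all providers; read from the first that answers."""

    def __init__(self, providers):
        self.providers = list(providers)

    def put(self, key, data):
        failed = []
        for provider in self.providers:
            try:
                provider.put(key, data)
            except ProviderUnavailable:
                failed.append(provider.name)
        if len(failed) == len(self.providers):
            raise ProviderUnavailable("all providers failed")
        return failed  # the caller should queue these for a later re-sync

    def get(self, key):
        for provider in self.providers:
            try:
                return provider.get(key)
            except (ProviderUnavailable, KeyError):
                continue  # try the next provider
        raise ProviderUnavailable(f"no provider could return {key!r}")


primary = InMemoryProvider("provider-a")
secondary = InMemoryProvider("provider-b")
store = MirroredStore([primary, secondary])

store.put("invoice-001.pdf", b"%PDF...")
primary.online = False  # simulate an outage at the first provider
recovered = store.get("invoice-001.pdf")
```

Even this toy version hints at the extra complexity: a write that reaches only one provider has to be replayed later, and the application pays for storage and transfer twice.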

The other problem is that SLAs might cover availability, but they cannot always cover latency, which depends on how quickly a separate internet access provider can route the data between you and the storage provider. Some applications are sensitive to how quickly their requests are served, so how many of them will be suitable for environments where network speed could dramatically affect performance?

"A customer support centre with online transactional data should not be run in the cloud. It should be fibre channel-attached storage," says Nirvanix's Buckley. "No matter how much time we can show on our systems the public internet can be tricky. In an environment requiring 99.99999% uptime, that should not be up there. And the latency problem is not solvable by Nirvanix."

That problem may be exacerbated as cloud storage services are outsourced and subcontracted, warns Hamish MacArthur, specialist storage analyst at MacArthur Stroud. If a cloud storage provider in the UK subcontracts to an infrastructure provider in another country, what legal and technical problems might that create?

"This will be about the way that groups work together, sharing different projects and tasks, and then it will be about how this storage is managed," says MacArthur. "Is it my private data that I am sharing with you? And do you have access to everything I have? Just what do I put into a federated environment?"

Consequently, companies will want to talk with potential providers about data governance, but also about encryption. Do not assume that data encryption is built into the mix. Nirvanix strips metadata from the data it is storing, for example, but it does not encrypt data at rest. It relies on its customers to send encrypted data if that is what they want to store, and also makes it possible for CloudNAS to integrate with third-party key management systems, but this puts the onus on the customer to encrypt their own data.

Similarly, if you are focusing on cloud storage purely for backup purposes, think about versioning. Symantec's Storage Protection Network, for example, focuses purely on cloud-based backup rather than primary storage. Like many systems, it uses delta-based incremental backup, meaning that it begins by backing up everything in the designated set of files (which can take a long time) but then only backs up changes to the file set.
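The principle behind incremental backup can be sketched in a few lines of Python. This is an illustration only, not Symantec's implementation: it assumes change detection at whole-file level using a content hash, whereas commercial products may work on blocks within files, and the function and variable names are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash used to tell whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def plan_backup(files: dict, previous_manifest: dict):
    """Return (paths to upload, new manifest) for one backup run.

    files maps a path to its current contents; previous_manifest maps a
    path to the fingerprint recorded by the last run (empty on first run).
    """
    manifest = {path: fingerprint(data) for path, data in files.items()}
    to_upload = sorted(
        path
        for path, digest in manifest.items()
        if previous_manifest.get(path) != digest
    )
    return to_upload, manifest


files = {"report.doc": b"draft 1", "budget.xls": b"q1 figures"}

# First run: nothing has been recorded yet, so every file is uploaded.
first_upload, manifest = plan_backup(files, {})

# Second run: only the edited file and the new file are uploaded.
files["report.doc"] = b"draft 2"
files["notes.txt"] = b"meeting notes"
second_upload, manifest = plan_backup(files, manifest)
```

The first run is the slow one because everything crosses the wire. After that, the manifest of fingerprints lets each run skip anything that has not changed. The sketch ignores deleted files, which a real service would also need to track.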

There are two different offerings for its online backup service. The basic one offers one full year of versions, while the second offers up to seven years of versions. "Potentially you have hundreds of versions to restore from but over time we fade out those versions so that as data ages, you need fewer restore points," says Symantec's senior product manager Darren Niller. "Daily, you could have hundreds of versions available if you are changing that fast. Then you go to a week, and you have fewer, and so on."
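The fading-out of restore points that Niller describes could work along the following lines. The tiers used here (every version for a day, one per day for 30 days, one per week after that) are assumptions chosen for illustration; the article does not state Symantec's actual schedule.

```python
from datetime import datetime, timedelta


def thin_versions(timestamps, now):
    """Return the version timestamps to keep, newest first.

    Assumed tiers (illustrative only):
      - younger than 1 day: keep every version
      - younger than 30 days: keep the newest version of each calendar day
      - older: keep the newest version of each ISO week
    """
    kept = []
    seen_buckets = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        if age < timedelta(days=1):
            kept.append(ts)
            continue
        if age < timedelta(days=30):
            bucket = ("day", ts.date())
        else:
            bucket = ("week",) + tuple(ts.isocalendar())[:2]  # (year, week)
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            kept.append(ts)
    return kept


now = datetime(2008, 11, 1, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(72)]  # three days of hourly versions
kept = thin_versions(hourly, now)
print(len(hourly), "versions thinned to", len(kept))
```

Because versions are visited newest first, the first one seen in each day or week bucket is the latest for that period, and it is the one that survives as the restore point.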

For many providers, storage is just the start. EMC, which offers the Mozy Enterprise data backup service, is hoping to layer other types of service on top of its backup service in the future, says Vance Checketts, chief operating officer for EMC Mozy. "Online backup is a really great way to start people down a path to cloud computing," he says. "Customers will have already paid the price of getting into the cloud. So we will be able to say how about doing X, Y and Z?"

X, Y and Z could entail everything from electronic discovery through to information lifecycle management. And when providers start offering those services, which lie outside the capabilities of many customers, that is when cloud-based storage will get really interesting.

Cloud-based storage systems

Amazon S3

Amazon's online storage system tested the market for cloud-based storage. It is cheap and cheerful, but has little in the way of SLAs.


Mosso CloudFS

CloudFS is offered by Mosso, a cloud-based service provided by Rackspace.

EMC Mozy Enterprise

EMC released the enterprise version of its consumer-focused Mozy online backup system earlier in the year. It works on a monthly payment basis.

Box Enterprise

Box's Enterprise version folds collaboration services together with its online storage offering, giving users the chance to edit and share documents from within their Box account.


Flexiscale

Flexiscale charges no set-up fee, and provides customers with a 'virtual dedicated server' for storage.


Nirvanix

Nirvanix offers a 'storage delivery network' that replicates your files between different geographical nodes for redundancy and high performance.

Symantec Protection Network

Symantec's SPN, launched this February, features a basic online backup service, and a more advanced service for users of its Backup Exec product.
