Cloud storage is for those organisations that just don't want to manage their own storage mess anymore. It's an option that becomes attractive when in-house storage provisioning starts to look too complex, too costly, and just goes on and on and on.
Cloud storage is also oh-so-seductive. Send your data offsite to a cloud storage service provider and they will provision LUNs, manage RAID levels, migrate data between storage tiers, set up connections, maintain disk drives, manage firmware upgrades and run all the replication, snapshot and backup shenanigans. In short, they do it all on your behalf, with all your capital expenditure (CAPEX) transferred to operating expenditure (OPEX), which makes your numbers look better.
Oh, feel the love. It's a good pitch, and using the cloud for storage is becoming easier to do. Companies like Iron Mountain and Nirvanix are serious suppliers, committed to providing good service-level agreements (SLAs), and being members of a responsible and reliable industry. They deplore data loss in stupidly careless incidents like the T-Mobile-Danger-Microsoft case last year, where blitheringly idiotic procedures in a data centre put data out of reach.
Amazon with S3, Microsoft with Azure, EMC with its Atmos/Mozy operations and Iron Mountain are all building up cloud storage services. But how do you use them? When you have a service contract in place, data sent to the cloud is deduplicated, encrypted and indexed before transfer so that minimal bandwidth is used, it's secure from prying eyes, and the cloud content index can be searched locally before identifying data to be restored from the cloud.
The first bulk load to go to the cloud disk library can consume a lot of bandwidth, but it can instead be done by shipping the data to the provider on a hard disk.
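To make the deduplicate, index and search-locally flow described above a little more concrete, here is a minimal Python sketch. It is not any supplier's actual client: the class name `LocalCloudIndex`, its methods and the fixed-size chunking are all assumptions made for illustration. Encryption is deliberately left out because Python's standard library has no symmetric cipher; a comment marks where it would sit.

```python
import hashlib


class LocalCloudIndex:
    """Hypothetical local index of what has been sent to a cloud store."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.files = {}        # file name -> ordered list of chunk digests
        self.uploaded = set()  # digests of chunks already sent to the cloud

    def prepare_upload(self, name, data):
        """Index a file and return only the chunks the cloud has not seen."""
        digests = []
        to_send = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            digests.append(digest)
            if digest not in self.uploaded:
                # A real client would encrypt the chunk here before transfer.
                self.uploaded.add(digest)
                to_send.append((digest, chunk))
        self.files[name] = digests
        return to_send

    def search(self, term):
        """Search the local index by file name, without touching the cloud."""
        return sorted(name for name in self.files if term in name)

    def chunks_for_restore(self, name):
        """List the chunk digests to fetch from the cloud to rebuild a file."""
        return list(self.files[name])
```

A short usage example: indexing `b"AAAABBBBAAAA"` with a chunk size of 4 sends only two chunks, because the third repeats the first. A later file that shares the `AAAA` chunk sends only its new data, which is where the bandwidth saving comes from.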
Nasuni's move to the cloud
What's cloud storage best suited for? Well, for one, cloud archive is beginning to be thought of as a standard cloud use case. There are no fast access constraints as there are with primary storage data, and read access to archive data is pretty minimal. And by sending deduplicated data, network bandwidth needs are kept to a minimum, keeping costs down.
But what about primary data? That's more of a stretch. Clearly, you need a fast network link, which will cost money. But you also need rocket-like performance in the SLA, and that will edge up the cost of the service, which makes the cost comparison between local storage and cloud storage more finely balanced.
One startup, Nasuni, offers a network-attached storage (NAS) filer in the cloud. What you get from Nasuni is a software NAS appliance packaged as a VMware virtual machine. It runs on 500 GB of disk space locally, and to the servers accessing it, it looks like a local filer. This software appliance then sends read and write data requests to a target cloud storage service supplier – currently Amazon's S3 or Iron Mountain, although Nirvanix and Rackspace will be added in the next few weeks.
Nasuni has written code to handle read and write requests, and will offer its software appliance bundled with the cloud storage service, making it easier to set up. The company said there is a file-versioning problem with the cloud and that it has sorted this out by using periodic snapshots so its filer can be restored back to a desired point in time.
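The idea of restoring a filer to a point in time from periodic snapshots can be sketched in a few lines of Python. This is only an illustration of the general technique, not Nasuni's code: the `SnapshotStore` class, its method names and the use of plain integer timestamps are assumptions, and each snapshot here is a full copy rather than anything space-efficient.

```python
import bisect


class SnapshotStore:
    """Hypothetical store of periodic, full-copy filer snapshots."""

    def __init__(self):
        self._times = []      # snapshot timestamps, kept in increasing order
        self._snapshots = []  # parallel list of {path: content} copies

    def take_snapshot(self, timestamp, files):
        """Record a copy of the filer's contents at the given time."""
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("snapshots must be taken in increasing time order")
        self._times.append(timestamp)
        self._snapshots.append(dict(files))

    def restore(self, timestamp):
        """Return the contents of the latest snapshot at or before timestamp."""
        position = bisect.bisect_right(self._times, timestamp)
        if position == 0:
            raise LookupError("no snapshot at or before the requested time")
        return dict(self._snapshots[position - 1])
```

With snapshots taken at times 100 and 200, a restore request for time 150 returns the filer as it stood at 100. Anything written between snapshots is not recoverable at a finer grain, which is the trade-off of a periodic approach.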
Data is deduplicated and encrypted before being sent cloud-wards, and all metadata associated with files is sent to the cloud as well. If the host running the software filer breaks, you set up a new one and reconnect. Any high-availability needs are handled by setting facilities up in VMware.
Cloud storage pioneer or guinea pig?
Will this work? We don't know. The logic seems good but everyone trying this out is a pioneer, and there's risk in being the first over the top. Prospective cloud storage customers have to have CAPEX restrictions, a preference for OPEX over CAPEX, and a very strong wish to get rid of their in-house NAS infrastructure or avoid extending it.
To aspire to cloud storage and to risk the uncertain journey, you have to be coming from an actual or feared local-storage hell. There has to be a really good reason to be a cloud pioneer, especially as it will take time before the benefits become real.
Would you sell your office and have all your staff work from home in a kind of virtual office? It's that kind of decision. You're putting your data through a network pipe somebody else owns and, with a hierarchy of providers between you and your target cloud service, sending it into a remote cloud service that buys NAS storage arrays from vendors you don't know and operates them in the way it thinks best behind an SLA wall.
You'll be remote from the cloud provider's hardware and software, and you'll need to trust your network providers, too. Doubts must exist over whether you'll have a one-throat-to-choke service.
Time is needed for trust in cloud storage to develop. Only pioneers with an urgent need or big players that can afford to experiment should try these things out for now. Let somebody else get the arrows in the back if you can afford to play it safe. If your storage mess is stable and acceptable, then stick with it and learn about cloud storage pitfalls from somebody else's pain.
This was first published in February 2010