Cloud storage appliances have evolved to make cloud storage a more practical proposition in work and office contexts. They act as a translator and accelerator, allowing business systems to access public or private cloud storage as if it were local storage.
Why are they needed? While cloud storage has a lot going for it – including less hardware to buy and manage, usage-based pricing, and easy access from anywhere – what works well when storing smartphone photos and email doesn’t always work so well in an enterprise context.
It is one thing to use a web-based app that backs onto cloud storage, but quite another to use cloud storage with enterprise applications, even ones as apparently simple as file-sharing. That is because most cloud storage is object-based and stateless, accessed via web-friendly APIs, whereas enterprise software typically expects file- or block-based storage (although with the appification of the enterprise, this is changing).
What’s more, what works well at home or in a café, where each individual probably uses different apps and services, may work less well when you have a group of collaborating workers all on the same applications and datasets, and all on the same office internet connection.
Unlike legacy enterprise applications, web apps are (usually) designed to cope gracefully with the latency and bandwidth issues associated with connection over a wide area network such as the internet.
A hardware gateway can help by including local storage as a cache or buffer. This is especially useful in common use cases such as cloud backup and archiving, where local caches can accelerate backup operations and access to online data.
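To make the caching idea concrete, here is a minimal sketch of a gateway that keeps recently used objects in a small local cache and only goes out to the cloud on a miss. It is an illustration under stated assumptions, not how any particular product works: the `CachingGateway` class, its method names and the use of a plain dictionary as the "cloud" are all hypothetical, and a least-recently-used eviction policy with write-through is just one possible design.

```python
from collections import OrderedDict


class CachingGateway:
    """Toy gateway: a small local LRU cache in front of a slow 'cloud' store."""

    def __init__(self, cloud, capacity=2):
        self.cloud = cloud              # stand-in for an object store (a dict)
        self.capacity = capacity        # maximum number of objects held locally
        self.cache = OrderedDict()      # key -> bytes, least recently used first
        self.cloud_reads = 0            # counts round trips to the cloud

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            return self.cache[key]
        self.cloud_reads += 1               # cache miss: fetch over the WAN
        data = self.cloud[key]
        self._remember(key, data)
        return data

    def write(self, key, data):
        self.cloud[key] = data              # write-through keeps the cloud copy current
        self._remember(key, data)

    def _remember(self, key, data):
        self.cache[key] = data
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used object
```

Repeated reads of the same object are served locally, so `cloud_reads` only grows when the data is not already in the cache. A real appliance might instead buffer writes locally and upload them later, which speeds up backups at the cost of a window in which the cloud copy is stale.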
Cloud storage gateways have evolved in something of a continuum, but we can see distinctions emerge, as well as step-changes in their capabilities and intended uses.
The basic cloud storage gateway is the most usual model. An appliance (physical or virtual) sits on the premises, connected on one side to the LAN and on the other to the cloud. It might take cloud storage and present it to your servers as iSCSI block LUNs, say, or as CIFS file-server volumes. These devices can also include local storage, either to cache hot data locally or to serve as the primary storage tier for certain data – for performance, security or other reasons.
Cloud storage controllers go further. As well as gateway capabilities, these devices aim to provide services similar to those offered by traditional enterprise storage arrays, except that the data is stored in the cloud. They add features such as data deduplication, compression and encryption, and cloud-based clones and snapshots.
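As a rough illustration of how deduplication and compression reduce what has to be sent to and stored in the cloud, the sketch below splits data into fixed-size chunks, stores each unique chunk once in compressed form, and keeps a "recipe" of chunk digests for rebuilding the original. The function names, the dictionary used as a chunk store and the tiny chunk size are assumptions made for the example; real products typically use much larger, often variable-sized chunks, and encryption is omitted here.

```python
import hashlib
import zlib


def store_deduplicated(data, chunk_store, chunk_size=4):
    """Store each unique chunk once (compressed) and return the list of
    chunk digests needed to rebuild the data."""
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()   # content-derived chunk ID
        if digest not in chunk_store:                # only new chunks are stored
            chunk_store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return recipe


def restore(recipe, chunk_store):
    """Rebuild the original data from its recipe of chunk digests."""
    return b"".join(zlib.decompress(chunk_store[d]) for d in recipe)
```

With the input `b"ABCDABCDABCDWXYZ"` the recipe lists four chunks, but only two distinct chunks end up in the store, because the repeated `ABCD` block is kept once.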
Cloud-integrated storage is a step up from the controllers, providing a higher degree of integration between cloud and local storage. In effect, these systems assume you will have both, and they treat the cloud storage as one of several tiers, dynamically moving data to the most appropriate tier based on policies. Related to this, we are also seeing the evolution of hybrid hardware/cloud storage arrays. These have built-in cloud integration, so they can add and utilise a storage tier that is actually located in the cloud.
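Policy-based tiering can be pictured as a rule that maps each piece of data to a tier. The sketch below uses a simple age-based policy; the tier names, the `StoredFile` type and the 7-day and 90-day thresholds are invented for illustration, and real systems may weigh other factors such as file size, access frequency or compliance rules.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    days_since_access: int


def choose_tier(f, hot_days=7, warm_days=90):
    """Return the tier a file should live on under a simple age-based policy."""
    if f.days_since_access <= hot_days:
        return "local-flash"    # recently used data stays on the fastest tier
    if f.days_since_access <= warm_days:
        return "local-disk"     # cooler data moves to cheaper local capacity
    return "cloud"              # cold data is pushed out to the cloud tier
```

A background job could run such a rule periodically and migrate any file whose current tier differs from the one the policy selects.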
Cloud-resident gateways follow the same idea as cloud-integrated storage but reside in the cloud as a virtual appliance, serving applications that have been migrated to the cloud. For example, Avere’s CloudFusion gateway takes the different tiers of cloud storage available to it (eg, Amazon offers EC2 RAM and solid-state disk as well as its bulk S3 storage) and builds them into a virtual tiered NAS filer.
Appliances in this general category come from a variety of sources, among them specialist developers, WAN optimisation companies and storage developers. There are even some from the cloud storage providers themselves, with both Amazon and Microsoft offering gateways that make their cloud storage more attractive to businesses.
Ten cloud-resident gateways
Here are 10 examples of companies that participate in this space. The list is representative but by no means exhaustive:
- Amazon’s AWS Storage Gateway sends only changed data to save bandwidth, and allows primary data to stay on-premise via gateway-stored volumes.
- Avere’s hybrid-cloud NAS gateway can, for example, add a capacity tier to on-premise performance NAS or mirror to the cloud. It also offers CloudFusion, mentioned above.
- Barracuda Backup acts as an on-premise backup target, before deduplicating data and sending it to cloud storage.
- CTERA’s cloud storage gateways provide NAS and backup services, blending local storage for speed and local sharing with cloud storage for backup, remote office synchronisation and so on.
- EMC CloudArray (formerly TwinStrata) provides iSCSI or CIFS/NFS access to a cloud-based storage tier, with dynamic local caching for performance.
- F5 ARX Cloud Extender provides CIFS file-based access to cloud storage. It is part of a wider file virtualisation scheme, which integrates storage capacity from a variety of sources within a global namespace.
- Microsoft StorSimple is a hybrid local storage device with cloud connectivity. It is designed to work as primary on-premise storage, while using Azure for cloud-based archiving, backup and DR.
- Nasuni Filers blend local disk and flash storage with cloud storage, creating a cloud-integrated unified storage system able to serve block and file workloads.
- NetApp bought Riverbed’s SteelStore cloud backup gateway, renaming it AltaVault. It provides a local backup target with caching disk, and applies deduplication and compression.
- Panzura controllers can be virtual or physical, and operate as cloud gateways. They merge cloud storage and local solid-state disk within Panzura’s Global File System, caching active data locally.
Which cloud storage gateway you should choose will depend on a number of variables. For instance, what type of data will you be accessing? How big are the files? Will you use streaming? How random are your data access patterns and how responsive will they be to automated tiering? Will you be using cloud storage mainly for backup, say, in which case the data flow is mostly one-way, or as tier-three storage where data is frequently retrieved from the cloud? Which is more important, data access or minimising cost? And, of course, are there regulatory and auditing issues that might mandate encryption or require a verifiably non-rewritable scheme?