
Hybrid cloud solutions address cloud data storage’s three key challenges

Hybrid cloud storage addresses the three key challenges of cloud storage (latency, security and reliability) by building in local cache storage and hardware security.

Cloud data storage offers unparalleled flexibility and capacity on demand. But cloud storage also poses some difficult challenges that have limited its uptake so far. These include latency, as well as the security and reliability of data transport, all of which arise when storage resources are located remotely. Consequently, hybrid cloud solutions for storage have emerged, seeking to overcome these issues by locating some storage resources locally and providing reliable and secure transport to the cloud.

In this article, we look at what hybrid cloud storage products offer and at some of the leading vendors in the hybrid cloud storage market today.

What is hybrid cloud storage?

Hybrid cloud storage is a method of deploying storage that uses local and cloud-based storage resources. These hybrid cloud solutions can be contrasted with purely local storage, where all hardware sits within the customer data centre, or a completely cloud-based solution, where all resources sit in the cloud and are accessed across the Internet.

Hybrid cloud storage consists of an appliance provided by the vendor in conjunction with a connection to remote storage resources. The implementation appears to the user as a single entity presenting disk storage. The appliance may be supplied as physical hardware or as a virtual machine, and hybrid cloud storage implementations can offer both file and block protocols.

Why go hybrid?

The most obvious benefits of cloud data storage are capacity and the ability to scale. There are, however, a number of challenges in delivering storage resources remotely across the Internet between a cloud service provider and the customer.

These include:

Latency. Latency in this context is the time taken for data to be transmitted over the Internet between the provider and the customer. Higher latency values mean longer response times and lower throughput of data. In local SAN environments, buffers in the array and host enable multiple blocks of data to be “on the wire” at any one time, and response times are typically 10 milliseconds (ms) or less, with only a small part of that being the fabric transport time (whether Ethernet or Fibre Channel). As latency increases, for example in transport over long distances to and from the cloud provider, throughput drops dramatically: the amount of data that can be in transit awaiting acknowledgement is limited, so each longer round trip delivers the same amount of data in more time.

Security. Security means the use of secure transfer protocols and authentication of user requests. Cloud storage is widely perceived as being less secure than retaining data in-house.

Reliability. Data travelling between a cloud storage provider and the host needs to be moved reliably. Reliable delivery means ensuring data transfers are completed successfully and acknowledgements received in the correct order.

In local storage, the protocols used ensure reliable delivery of data, but these cannot be used for transport to and from the cloud. Within the data centre, Fibre Channel, iSCSI and NFS/CIFS are dominant, but they are not suited to long-distance operation. Fibre Channel is a non-routable protocol, so it must be carried over IP to cross disparate networks, or else it requires dark fibre or other expensive dedicated connections. CIFS and NFS are both “chatty” protocols in that they exchange a large number of request and acknowledgement messages; CIFS, for example, requires confirmation of each block of data transferred before transferring the next. Where a network has high latency, performance with CIFS can be particularly slow.
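The effect of latency on a protocol that waits for confirmation before sending more data can be illustrated with a rough calculation. The sketch below assumes throughput is bounded by the amount of unacknowledged data in flight divided by the round-trip time; the 64 KiB block size and the 1 ms and 50 ms round-trip times are hypothetical figures chosen for illustration, not measurements of any product.

```python
def max_throughput_mb_s(bytes_in_flight: int, round_trip_ms: float) -> float:
    """Upper bound on throughput when at most `bytes_in_flight` bytes may be
    unacknowledged at once: bytes in flight divided by round-trip time."""
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    bytes_per_second = bytes_in_flight / (round_trip_ms / 1000.0)
    return bytes_per_second / 1_000_000  # decimal megabytes per second


BLOCK_SIZE = 64 * 1024  # hypothetical block size (64 KiB)

# One block confirmed at a time, as in a "chatty" exchange.
lan_mb_s = max_throughput_mb_s(BLOCK_SIZE, round_trip_ms=1.0)   # assumed LAN round trip
wan_mb_s = max_throughput_mb_s(BLOCK_SIZE, round_trip_ms=50.0)  # assumed WAN round trip

# Allowing 32 blocks in flight at once recovers throughput over the same WAN link.
windowed_wan_mb_s = max_throughput_mb_s(32 * BLOCK_SIZE, round_trip_ms=50.0)

print(f"LAN, one block at a time:  {lan_mb_s:.2f} MB/s")
print(f"WAN, one block at a time:  {wan_mb_s:.2f} MB/s")
print(f"WAN, 32 blocks in flight:  {windowed_wan_mb_s:.2f} MB/s")
```

Under these assumptions the same one-block-at-a-time exchange is 50 times slower over the longer link, which is why keeping more data in flight, or avoiding the round trip altogether with a local cache, matters so much.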

How hybrid cloud solutions resolve some of the challenges

Hybrid cloud storage addresses issues related to cloud data storage in the following ways.

Latency. Hybrid cloud solutions overcome latency issues by caching data locally and using WAN optimisation techniques to reduce data traffic. For read requests, this often means use of a least recently used, or LRU, algorithm, where recently accessed data stays in cache and the data that has gone longest without being accessed is evicted to make room for newer data. For write requests, the appliance may choose to store data locally, then write in bursts to improve traffic flow. When evaluating hybrid cloud products, it is important to understand the write caching process used, since data that has been written to the local cache but not yet to the cloud could be lost if the appliance or Internet connection fails.
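The read-caching behaviour can be sketched in a few lines. This is a minimal illustration of LRU eviction, not any vendor's implementation; the class name `LRUReadCache`, the `fetch_from_cloud` callback and the block keys are all hypothetical, and a plain dictionary stands in for the remote cloud store.

```python
from collections import OrderedDict
from typing import Callable


class LRUReadCache:
    """Keeps up to `capacity` blocks locally; evicts the least recently used."""

    def __init__(self, capacity: int, fetch_from_cloud: Callable[[str], bytes]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._fetch = fetch_from_cloud  # hypothetical slow, remote read
        self._blocks = OrderedDict()    # oldest access first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._blocks:
            # Cache hit: mark the block as most recently used.
            self._blocks.move_to_end(key)
            self.hits += 1
            return self._blocks[key]
        # Cache miss: fetch from the cloud and keep a local copy.
        self.misses += 1
        data = self._fetch(key)
        self._blocks[key] = data
        if len(self._blocks) > self.capacity:
            # Evict the block that has gone longest without being accessed.
            self._blocks.popitem(last=False)
        return data

    def cached_keys(self) -> list:
        return list(self._blocks)


# A dictionary stands in for the remote cloud store in this sketch.
cloud = {"a": b"block A", "b": b"block B", "c": b"block C"}
cache = LRUReadCache(capacity=2, fetch_from_cloud=cloud.__getitem__)

for key in ["a", "b", "a", "c", "b"]:
    cache.read(key)

print(cache.hits, cache.misses, cache.cached_keys())
```

In this run the second read of "a" is served locally, while "b" has to be fetched again because it was the least recently used block when "c" arrived. A real appliance would also need to handle writes, which is where the data-loss risk mentioned above arises.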

Security. A hybrid cloud storage appliance provides security at a number of levels. Firstly, it provides secure access to the cloud storage provider, based on the provider’s authentication mechanism. Secondly, it encrypts data in transit across the network (using protocols such as SSL/TLS). Thirdly, it encrypts data at rest within the cloud provider’s storage environment.

Reliability. Appliances provide standard protocol support within the client data centre, including NAS protocols (CIFS and NFS) and block protocols such as iSCSI. But these are difficult or impossible to use for data travelling to and from the cloud. So appliances convert local protocol instructions into calls to Web-based APIs, such as those built on Representational State Transfer (REST), which use simplified I/O commands to read, write and delete data stored as objects. In addition, data integrity is maintained by storing metadata containing write time stamps with the content itself. The time stamps allow data to be reconstructed in the correct order in the event of a failure.
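As a rough sketch of the idea, the example below stores each write as an object carrying its own time stamp, then rebuilds a volume by replaying the objects oldest first so that the newest write to each block wins. Everything here is an assumption made for illustration: the `StoredObject` fields, the in-memory `InMemoryObjectStore` (whose `put`, `get` and `delete` methods merely mimic the write, read and delete commands of an object API) and the replay rule itself are hypothetical, and real appliances may work quite differently.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class StoredObject:
    key: str         # object name in the store
    block_no: int    # which local block this write belongs to
    write_ts: float  # write time stamp kept as metadata with the content
    payload: bytes


class InMemoryObjectStore:
    """Hypothetical stand-in for a cloud object store with three simple commands."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, obj: StoredObject) -> None:   # analogous to a write
        self._objects[obj.key] = obj

    def get(self, key: str) -> StoredObject:    # analogous to a read
        return self._objects[key]

    def delete(self, key: str) -> None:         # analogous to a delete
        self._objects.pop(key, None)

    def list_objects(self) -> List[StoredObject]:
        return list(self._objects.values())


def rebuild_volume(objects: Iterable[StoredObject]) -> Dict[int, bytes]:
    """Replay writes in time-stamp order so the latest write to a block wins."""
    volume: Dict[int, bytes] = {}
    for obj in sorted(objects, key=lambda o: o.write_ts):
        volume[obj.block_no] = obj.payload
    return volume


store = InMemoryObjectStore()
# Objects arrive out of order; the time stamps record the true write order.
store.put(StoredObject("w3", block_no=0, write_ts=30.0, payload=b"newest block 0"))
store.put(StoredObject("w1", block_no=0, write_ts=10.0, payload=b"oldest block 0"))
store.put(StoredObject("w2", block_no=1, write_ts=20.0, payload=b"block 1"))
store.put(StoredObject("tmp", block_no=2, write_ts=25.0, payload=b"scratch"))
store.delete("tmp")

volume = rebuild_volume(store.list_objects())
print(volume)
```

Because the ordering information travels with each object, the rebuild does not depend on the order in which objects were uploaded or listed.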

Hybrid cloud products

A number of vendors offer hybrid cloud storage products. Some of the most popular are discussed here.

Nasuni, for one, offers physical and virtual appliances that appear to the local user as a NAS filer. Data written to the appliance is encrypted with customer keys and stored with one of Nasuni’s recommended cloud services, such as Amazon S3. The appliance provides features such as snapshots and can replicate data to other filer nodes by sharing encryption keys. Should a filer be lost, it can be re-created within minutes through the Nasuni website since all of the configuration data for each filer is also stored with the cloud storage provider. Nasuni claims unlimited capacity for its solution. Clearly, performance will be limited by the size of the local cache (which is user-configurable) and by the amount of active data.

Nirvanix also offers a hybrid cloud gateway appliance, known as a Hybrid Node. This provides a NAS interface that stores data across Nirvanix’s seven geographically dispersed data centres. Besides standard NFS/CIFS interfaces, the Nirvanix appliance also integrates with software from a number of major backup vendors, including CommVault and Symantec. Nirvanix nodes start with a 200 TB capacity and can scale to multiple petabytes.

StorSimple, for its part, offers a range of hybrid cloud storage devices that provide iSCSI LUN access into cloud storage. The local appliance uses solid-state storage (both single-level cell, SLC, and multi-level cell, MLC) as a cache, while providing features such as thin provisioning, clones and snapshots. StorSimple solutions scale from 10 TB to 200 TB per appliance.

Finally, Panzura’s Alto 6000 Series Cloud Controllers can provide access to local and public cloud infrastructures, making them useful as part of a tiered solution. The 1U servers offer both CIFS and NFS support and can scale to 24 TB of local storage, with effectively unlimited capacity through cloud providers.
