Unstructured content is growing rapidly and leads all other data types in new data creation.
Typically, information created in unstructured formats is stored as objects in object storage or as files on network attached storage (NAS).
And so the ecosystem around NAS is evolving, with new products and solutions emerging that address the requirements of a hybrid infrastructure, such as cloud NAS.
Block devices are essentially “raw” storage and require a file system in front of them.
Object storage is great for large-scale storage of “binary” data, but generally groups files into large pools or buckets without any hierarchy. Object stores also have a basic store/retrieve mechanism, where the entire object is either written (PUT) or read (GET).
NAS offers more, such as file-based locking, hierarchical directory structures and the ability to partially read and write file content.
As a result, file storage is a great solution to share data geographically or between public and private cloud in the form of cloud NAS.
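The difference between whole-object PUT/GET and partial file updates can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `ToyObjectStore` that follows the simplified model described above (real object APIs may also offer ranged reads) and compares the bytes moved when five bytes of a 1MB item are changed. All names here are illustrative and do not come from any product.

```python
import os
import tempfile


class ToyObjectStore:
    """Hypothetical in-memory object store: whole-object PUT and GET only."""

    def __init__(self):
        self._objects = {}
        self.bytes_transferred = 0

    def put(self, key, data):
        # The entire object is written in one operation.
        self._objects[key] = bytes(data)
        self.bytes_transferred += len(data)

    def get(self, key):
        # The entire object is read in one operation.
        data = self._objects[key]
        self.bytes_transferred += len(data)
        return data


def patch_object(store, key, offset, new_bytes):
    """Changing a few bytes means GET everything, modify, PUT everything."""
    data = bytearray(store.get(key))
    data[offset:offset + len(new_bytes)] = new_bytes
    store.put(key, data)


def patch_file(path, offset, new_bytes):
    """A file can be updated in place: seek, then write only the changed bytes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
    return len(new_bytes)


payload = b"A" * 1_000_000

store = ToyObjectStore()
store.put("report.bin", payload)
store.bytes_transferred = 0  # count only the cost of the update itself
patch_object(store, "report.bin", 10, b"HELLO")
object_cost = store.bytes_transferred

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.bin")
    with open(path, "wb") as f:
        f.write(payload)
    file_cost = patch_file(path, 10, b"HELLO")
    with open(path, "rb") as f:
        f.seek(10)
        patched = f.read(5)

print(f"object update moved {object_cost} bytes, file update wrote {file_cost} bytes")
```

In this toy model the object update moves the full payload twice, while the file update writes only the five changed bytes.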
Cloud NAS benefits
Depending on the product chosen, cloud NAS can offer significant benefits over traditional NAS.
For example, IT organisations can offload administration and management functions by choosing SaaS-based solutions.
At the same time, platforms that implement a single namespace can offer virtually unlimited scaling, especially when using public cloud as the backing storage.
And, with cloud NAS, businesses can move to opex-based models that charge for utilisation, rather than incurring capital expense to deploy hardware that may be poorly utilised when globally distributed.
Cloud NAS use cases
As the NAS market has matured from on-premises filers to cloud-enabled solutions, we see four distinct categories emerge.
Public cloud SaaS: These solutions are based in the public cloud, with some offering accessibility and integration with on-premises storage. Examples here include Nasuni, NetApp Cloud Volumes and Zadara. The public cloud is the backing store for these solutions, with data accessed in public cloud or through caching appliances on-premises.
Cloud IaaS: These solutions implement NAS as an infrastructure service, either within a private on-premises cloud, through a managed service provider or in hybrid mode using some public cloud infrastructure. The difference between this and SaaS-based offerings is that the customer manages the infrastructure directly. Examples are CTERA and Panzura.
Cloud marketplace: This category covers solutions that can be deployed through public cloud marketplaces. The vendor packages the solution to run on a virtual instance, either charging by the hour or with capacity licensing schemes. In some cases, the solution may integrate with on-premises infrastructure, as is the case with Avere Systems vFXT.
Software-defined hybrid: These solutions are hybrid offerings that work on-premises or on public cloud. In many cases, these products can integrate on and off-premises data. Solutions here include Elastifile, WekaIO Matrix and Qumulo QF2.
Replacing filer-based storage
The most obvious use case for these solutions is in replacing traditional filer-based storage for home directories and shared data. In global organisations, sharing data can be time-consuming and result in many distributed, inconsistent copies of the same information. With a global namespace, cloud NAS can make the deployment and management of file content much easier.
Another use case is to provide the ability to perform analytics against the data stored in the NAS platform. Where hybrid solutions extend access to data into the public cloud, analytics services offered by the likes of Google (Cloud Platform) and Amazon Web Services can perform analysis of content using in-cloud tools.
A third use case is to build out cloud NAS as a backup solution. Most backup products offer NAS as a storage target. This allows backups from branch offices to be centralised and/or restored to alternative locations, giving some form of disaster recovery capability.
Cloud NAS suppliers and products
NetApp has recently announced Cloud Volumes in private preview. On Microsoft Azure and Google Cloud Platform (GCP), file storage is natively integrated as a SaaS solution. On Amazon Web Services (AWS), Cloud Volumes run inside a virtual instance that can be purchased from the AWS Marketplace. The underlying technology for Cloud Volumes is a cloud-based implementation of the NetApp ONTAP storage operating system. This means Cloud Volumes can offer established ONTAP data services such as snapshots, as well as higher performance than existing public cloud NAS.
CTERA Networks offers a solution called CTERA Enterprise File Services Platform that implements a global, distributed platform for storing and sharing file content. Endpoint access is via either desktop client software (CTERA Drive) or an edge gateway/filer. The backing store for CTERA is an object store, which can be deployed on customer premises or be based in the public cloud. Typical use cases for CTERA are to replace existing file shares and home directories and as a global backup target.
Avere Systems vFXT is a virtual edge filer that can be run on either AWS or GCP, with Azure a planned addition following the company's acquisition by Microsoft. The solution runs as a virtual instance and extends on-premises data into the public cloud, allowing content to be exposed to native cloud services such as analytics. The main benefit of using vFXT is the ability to make data visible to public cloud without having to ship the entire contents of a data set. Applications gain the benefit of low-latency local cached data, with minimal data transfer costs.
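The general idea behind an edge filer of this kind, serving repeat reads from a local cache and fetching only the blocks an application touches, can be sketched as a read-through cache with least-recently-used eviction. This is a generic illustration and not a description of how vFXT is implemented; the `EdgeCache` class, the block-keyed `origin` dictionary and the capacity figure are all assumptions made for the example.

```python
from collections import OrderedDict


class EdgeCache:
    """Hypothetical read-through block cache with LRU eviction."""

    def __init__(self, origin, capacity_blocks):
        self.origin = origin              # stand-in for the remote data set
        self.capacity = capacity_blocks   # number of blocks held locally
        self._cache = OrderedDict()       # (path, block_no) -> bytes
        self.origin_reads = 0             # blocks fetched across the WAN

    def read_block(self, path, block_no):
        key = (path, block_no)
        if key in self._cache:
            # Cache hit: serve locally and mark as most recently used.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: fetch just this block from the origin.
        data = self.origin[key]
        self.origin_reads += 1
        self._cache[key] = data
        if len(self._cache) > self.capacity:
            # Evict the least recently used block.
            self._cache.popitem(last=False)
        return data


# A 100-block data set held at the origin; only touched blocks are transferred.
origin = {("/data/set.bin", n): bytes([n]) * 4096 for n in range(100)}
cache = EdgeCache(origin, capacity_blocks=2)

for block in (0, 1, 0, 1, 2):
    cache.read_block("/data/set.bin", block)

reads_after_first_pass = cache.origin_reads
print(f"{reads_after_first_pass} of {len(origin)} blocks fetched from the origin")
```

Of the five reads, two are served from the cache, and 97 of the 100 blocks never leave the origin.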
Nasuni Enterprise File Services (NEFS) is a global NAS platform based in the public cloud. The solution can be used for primary storage, archive or backup. The standard backing store for NEFS is public cloud object storage, which is managed directly by Nasuni. The customer has no visibility of, or access to, the cloud account. Alternatively, NEFS can use private object storage. Endpoint access is through either physical or virtual edge appliances on the customer’s premises. NEFS offers global file sharing and locking, with unlimited edge filers, depending on the licensing model chosen.
WekaIO Matrix is a scale-out distributed file system, designed to run on NVMe storage and deliver high performance at very low latency. Matrix is delivered as a software-defined storage solution that can also run on virtual instances in the public cloud. Inactive data can also be tiered to an object store (supporting AWS S3 or Swift protocols). Typical use cases for Matrix are those that require low latency, especially with small files, such as AI/machine learning analytics, or high throughput workloads such as media and entertainment.
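Tiering inactive data to an object store implies some policy for deciding what counts as inactive. A minimal sketch of one such policy, based purely on last-access age, follows. It is not drawn from Matrix itself: the `FileInfo` record, the `select_for_tiering` function and the 30-day threshold are hypothetical choices made for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def select_for_tiering(files, now, max_idle_days):
    """Return files not accessed within max_idle_days (candidates to move)."""
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    return [f for f in files if f.last_access < cutoff]


now = 1_700_000_000.0  # fixed timestamp so the example is repeatable
files = [
    FileInfo("/train/batch-001.rec", 4_000, now - 2 * SECONDS_PER_DAY),
    FileInfo("/train/batch-000.rec", 6_000, now - 45 * SECONDS_PER_DAY),
    FileInfo("/archive/2016.tar", 90_000, now - 400 * SECONDS_PER_DAY),
]

cold = select_for_tiering(files, now, max_idle_days=30)
cold_paths = [f.path for f in cold]
freed_bytes = sum(f.size_bytes for f in cold)
print(cold_paths, freed_bytes)
```

A production policy would likely weigh more than access age, such as file size or the cost of recalling data from the object tier.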
Elastifile Cloud File System (ECFS) is a scale-out software-defined storage file system product that can run on-premises, in public cloud or as a hybrid of the two. Typical deployment models include dedicated storage (where appliances act as a storage array) or hyper-converged, mixing compute with storage on the same nodes. ECFS is designed to work on heterogeneous configurations and was recently updated to be easily deployed on Google Cloud Platform.
Qumulo QF2 (Qumulo File Fabric) is a software-defined scale-out file system that can be deployed as virtual instances, on bare-metal hardware or in public cloud. QF2 is designed to be highly scalable in a single node configuration, with B-tree indexing used to manage file and data structures. Multiple clusters can replicate file systems between many locations, to enable data to be moved in and out of the public cloud. QF2 is also available from the AWS marketplace.
Panzura CloudFS is a scale-out NAS solution that uses either public or private cloud as the centralised backing store. Edge filers can be physical appliances, virtual machines or cloud instances and provide local access to globally-available content. CloudFS is implemented as a single namespace, with global locking down to the byte-range on individual files. This allows concurrent access, even over wide geographic distances.
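Byte-range locking allows concurrent access because two writers conflict only when their ranges overlap. The sketch below models that rule with a toy in-memory lock table. It is an illustration of the concept only and says nothing about how CloudFS implements or distributes its locks; `ByteRangeLockTable` and the site names are hypothetical.

```python
class ByteRangeLockTable:
    """Toy lock table: exclusive locks on [start, start + length) per file."""

    def __init__(self):
        self._locks = {}  # path -> list of (start, end, owner)

    def acquire(self, path, start, length, owner):
        end = start + length  # end is exclusive
        for held_start, held_end, held_owner in self._locks.get(path, []):
            overlaps = start < held_end and held_start < end
            if overlaps and held_owner != owner:
                return False  # another owner holds part of this range
        self._locks.setdefault(path, []).append((start, end, owner))
        return True

    def release(self, path, start, length, owner):
        entry = (start, start + length, owner)
        held = self._locks.get(path, [])
        if entry in held:
            held.remove(entry)
            return True
        return False


table = ByteRangeLockTable()
path = "/projects/model.dat"

# Two sites lock different 4KB regions of the same file: both succeed.
london = table.acquire(path, 0, 4096, "london")
sydney = table.acquire(path, 4096, 4096, "sydney")

# A range that overlaps London's region is refused until London releases it.
clash = table.acquire(path, 2048, 4096, "sydney")
table.release(path, 0, 4096, "london")
retry = table.acquire(path, 2048, 4096, "sydney")

print(london, sydney, clash, retry)
```

With whole-file locking, the second site would have been blocked on its first request even though the two regions never overlapped.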
One last thing
Finally, we shouldn’t forget that object storage suppliers also enable their platforms with file support. So far, we’ve seen implementations from SwiftStack, Ceph, Cloudian, Scality and Caringo. This brings scale-out capability, and in some cases the ability to access content from either protocol. The future may well be a merging of the two data types, where the benefits of both access methods are table stakes for unstructured data solutions.