Clustered NAS has its roots in the worlds of media and high-performance computing; these two areas have dealt with the problems of operating massively scalable storage solutions for longer than most.
Traditional NAS solutions still hark back to the earliest days of Auspex Systems and NetApp, where a NAS solution at the very basic level was a server with some disk attached to it. You could add more disk and a more powerful server, but scalability was limited in terms of performance and capacity.
Traditional NAS solutions essentially comprise a single storage device; more than one of them may be configured in a failover cluster, but scalability is limited by the amount of CPU/memory and disk that a single NAS device can make use of. In failover environments, best practice limits each server to 50% of its individual capacity so that the surviving server has the headroom to take on its partner's load.
By contrast, clustered NAS allows horizontal scaling across a number of devices with all of them being active and able to see all files in the cluster. This has a number of advantages:
- If your storage servers become CPU/memory-bound, you can add a device to gain processing power without adding disk.
- If you run out of storage, you can add disk that all devices can see, but you don't have to purchase additional devices.
- A device failure is non-disruptive, and the load of the failed unit can be spread across the whole cluster.
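The headroom argument behind the 50% rule, and the benefit of spreading a failed unit's load across a whole cluster, can be put into numbers. The following is a minimal sketch, assuming identical nodes and an even redistribution of load after a failure, which simplifies how real clusters rebalance. The function name `max_safe_utilisation` is illustrative and does not come from any vendor tool.

```python
def max_safe_utilisation(nodes: int, failures: int = 1) -> float:
    """Highest per-node utilisation (0..1) that still fits after node losses.

    Assumption: identical nodes, with the failed nodes' load shared
    evenly among the survivors.
    """
    if nodes <= failures:
        raise ValueError("need more nodes than tolerated failures")
    # The total load must still fit on the surviving nodes.
    return (nodes - failures) / nodes


for n in (2, 4, 8):
    print(f"{n} nodes: run each at up to {max_safe_utilisation(n):.0%}")
```

For a two-node failover pair this gives the 50% ceiling described above; under the same assumptions, a four-node cluster could run each node at up to 75%.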
The special sauce in the leading clustered NAS products is a distributed file system. This enables all the nodes in a cluster to see all the files in the environment; examples are OneFS from EMC-Isilon, General Parallel File System (GPFS) from IBM and Ibrix Fusion from Hewlett-Packard.
The ability to scale performance and capacity independently of each other is an important feature of most clustered NAS solutions. It allows more effective use of resources than traditional NAS, as it is no longer necessary to purchase new NAS devices just to add capacity, or to purchase storage when all that is required is more throughput at the storage server level.
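To illustrate why independent scaling saves on purchases, here is a rough sizing sketch. All figures and names (`Requirement`, `size_clustered`, `size_traditional`, the per-node throughput and per-shelf capacity) are hypothetical and chosen only for the example; real products have their own sizing rules.

```python
import math
from dataclasses import dataclass


@dataclass
class Requirement:
    capacity_tb: float      # usable capacity needed, in TB
    throughput_mb_s: float  # aggregate throughput needed, in MB/s


def size_clustered(req: Requirement, node_mb_s: float, shelf_tb: float):
    """Clustered NAS: nodes and disk shelves are sized separately."""
    nodes = math.ceil(req.throughput_mb_s / node_mb_s)
    shelves = math.ceil(req.capacity_tb / shelf_tb)
    return nodes, shelves


def size_traditional(req: Requirement, filer_mb_s: float, filer_tb: float) -> int:
    """Traditional NAS: each filer is a fixed bundle of throughput and
    capacity, so the larger of the two demands sets the filer count."""
    by_throughput = math.ceil(req.throughput_mb_s / filer_mb_s)
    by_capacity = math.ceil(req.capacity_tb / filer_tb)
    return max(by_throughput, by_capacity)


req = Requirement(capacity_tb=400, throughput_mb_s=2000)
print(size_clustered(req, node_mb_s=1000, shelf_tb=100))     # nodes, shelves
print(size_traditional(req, filer_mb_s=1000, filer_tb=100))  # filers
```

With these made-up numbers the cluster needs two nodes and four shelves, whereas the traditional approach needs four complete filers, two of which are bought purely for their disk.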
Clustered NAS can carry out all traditional NAS file serving requirements in a more scalable manner. For example, SONAS from IBM starts at 27 TB and could be configured with just a couple of nodes. A configuration of that size would compare very reasonably with a traditional NAS solution.
But NAS clustering really comes into its own when you have a NAS estate that is growing rapidly to many terabytes of storage and needs to grow non-disruptively and with minimal migration effort.
In the past, if growth required more NAS, you often needed to migrate existing data to the new, larger-capacity device. With clustered NAS, adding capacity and performance does not require a data migration exercise, since all storage servers can see all the data.
There's far less effort involved in managing clustered NAS compared with multiple traditional NAS devices, and I have found in discussions with colleagues in the industry that with clustered NAS we can manage in excess of 1 petabyte (PB) per full-time equivalent (FTE) employee.
Clustered NAS is beginning to make a big impact in large virtualised environments where many thousands of server images along with their data can be stored in a multi-node NAS cluster. EMC's acquisition of Isilon will certainly drive the use of NAS clustering in VMware environments.
Clustered NAS vendors
With EMC's acquisition of Isilon at the end of 2010, HP's acquisition of Ibrix in 2009 and NetApp's acquisition of Spinnaker in 2003, there are now a number of mature vendors in this space. And Symantec has even waded into the market. Here's a breakdown of these vendors' products.
EMC. The core of Isilon's product is the OneFS operating system, which scales performance in a near-linear fashion as more nodes are added, up to a maximum of 144 nodes, and can provide a capacity of more than 10 PB in a single file system.
An Isilon cluster can be made up of nodes optimised for IOPS, sequential throughput or capacity, which allows for a great deal of flexibility in configuration.
Isilon offers automated storage tiering through a policy engine known as SmartPools. SmartPools also allows additional nodes to be added and data to be restriped across these nodes non-disruptively.
IBM. IBM has built its own clustered NAS solution based on its mature GPFS clustered file system and standard Lintel servers; these have been combined to produce SONAS (Scale Out Network Attached Storage).
SONAS supports billions of files and more than 14 PB of storage in a single file system with up to 30 interface nodes and 30 storage pods able to be configured in a single SONAS cluster.
Different types of disk can be put into different pools with a policy engine used to determine file placement and file migration. The policy engine can restripe data when new nodes are added. Tape can also be fully integrated as an additional pool with Tivoli Storage Manager, providing transparent hierarchical storage management (HSM) capabilities.
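As a rough illustration of how a policy engine can map files to pools, the sketch below applies an ordered list of rules, with the first match deciding placement. It is not the GPFS policy language or any vendor's syntax; the pool names, thresholds and the `place` function are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int


# Each rule pairs a predicate with a target pool; the first match wins.
Rule = Tuple[Callable[[FileInfo], bool], str]

# Hypothetical rules: cold data goes to cheaper pools, VM images to fast disk.
RULES: List[Rule] = [
    (lambda f: f.days_since_access > 365, "tape"),
    (lambda f: f.days_since_access > 30, "sata"),
    (lambda f: f.path.endswith(".vmdk"), "ssd"),
]
DEFAULT_POOL = "sas"


def place(f: FileInfo) -> str:
    """Return the pool for a file by evaluating the rules in order."""
    for matches, pool in RULES:
        if matches(f):
            return pool
    return DEFAULT_POOL


print(place(FileInfo("/vm/web01.vmdk", 40_000_000_000, 0)))
print(place(FileInfo("/archive/2008/report.doc", 120_000, 400)))
```

Running the same function periodically over existing files, rather than only at creation time, would give a simple form of the file migration described above.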
HP. HP has bundled the Ibrix software it acquired with HP server technology to build the X9000 Network Storage System. This comes in a number of models, including gateways that allow customers to provide their own disk and fully integrated appliances that contain servers and storage. All models in the X9000 range can be combined into a single file system to provide up to 16 PB of file space. The X9000 supports data tiering that can move data seamlessly and without disruption onto appropriate tiers of storage.
NetApp. The results of NetApp's Spinnaker acquisition were realised in the form of the OnTap 8 operating system. OnTap 8 provides a traditional NAS environment but can also be configured in cluster mode to provide a scale-out environment. The choice of mode must be made when the NetApp appliance is installed, and currently there is no way of migrating between the two modes. OnTap 8 cluster mode allows up to 24 NetApp filers (12 pairs) to be combined into a single cluster. Architecturally, OnTap 8 probably has the least in common with its competitors. Cluster mode only really allows each of the filers to serve one another's file systems via a single service name; there is no single global file system.
So although the global namespace could cover the full capacity of all of the filers combined -- which is currently in excess of 40 PB -- the individual file systems are limited to 100 TB. This is a serious limitation in the NetApp implementation of clustered NAS as it lacks the elegance of one large file system and will require more work, including an increased data migration workload, to scale environments.
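The practical effect of a per-file-system ceiling can be shown with simple arithmetic. This is a minimal sketch that takes the 100 TB limit quoted above as its default; the function name `file_systems_needed` is hypothetical, and the calculation ignores overheads such as snapshots and free-space reserves.

```python
import math


def file_systems_needed(dataset_tb: float, max_fs_tb: float = 100.0) -> int:
    """Minimum number of file systems needed to hold a dataset when each
    file system is capped at max_fs_tb."""
    if dataset_tb < 0 or max_fs_tb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(dataset_tb / max_fs_tb))


for tb in (80, 250, 1000):
    print(f"{tb} TB -> {file_systems_needed(tb)} file system(s)")
```

A 1,000 TB dataset therefore has to be split across at least 10 file systems, and data has to be rebalanced between them as it grows, whereas a single large file system avoids that bookkeeping.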
Symantec. Symantec has recently launched clustered NAS in the form of the Filestore N8300 in a partnership with Huawei. Although Symantec's N8300 offering only scales to 1.4 PB, Huawei has a high-end offering in the form of OceanSpace N8500, which scales to more than 15 PB of storage. Depending on Symantec's success with the N8300, we may see a Symantec-badged N8500 in future.