Simon Johnson, data recovery practice lead at GlassHouse Technologies (UK), discusses the basics of network-attached storage (NAS), how it differs from other forms of storage by providing storage and file system capabilities, what it can and can't be used for, as well as how you can scale NAS systems for small office/home office and the enterprise by using clustered NAS and NAS gateways. Johnson's answers are also available below as an MP3 download.
Q. What is NAS and how does it differ from other forms of storage?
A. NAS is essentially storage that's accessed over an IP network. A NAS appliance is a combination of components we'd typically find in servers or in laptops -- CPU, memory, disk storage and, most importantly for NAS, an operating system which has been optimised for file sharing.
The OS will typically run multiple protocols to share storage over IP -- NFS for Unix, CIFS for Windows and AFP for Apple Mac. NAS units are typically made up of one or more processing units or heads, and a number of disk enclosures in the back end, which are either provided with the NAS or leveraged from your existing storage estate.
While NAS typically provides storage at the file level, implementations can also leverage block-level presentation over iSCSI, and in future will be able to do so with Fibre Channel over Ethernet [FCoE]. NAS rarely limits clients to a single protocol. There's a lot of flexibility to serve up storage in different formats and across different operating systems.
Regarding how NAS differs from other forms of storage, there are three main forms of storage: SAN, i.e., block-level Fibre Channel storage-area networks; NAS, i.e., storage presented across an IP network; and storage directly attached to servers. At first glance, NAS and SAN might seem almost identical and, in many cases, either will work. Both use RAID at the back end and serve data up to systems on the network. However, there are some key differences.
NAS provides storage and file system capabilities. This is often contrasted with SAN, which provides block-based storage and leaves file system concerns to the client side. SAN systems operate over a dedicated Fibre Channel network which is typically used only for storage, whereas NAS is used on an IP network that's shared with other application traffic.
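The file-versus-block distinction above can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: `BlockDevice`, `ToyFileSystem`, the 512-byte block size and the file name are all assumptions made for illustration. The point is only where the file system logic lives: on the appliance for NAS, on the client for SAN.

```python
BLOCK_SIZE = 512  # bytes per block; an illustrative value


class BlockDevice:
    """SAN-style view: numbered fixed-size blocks, no notion of files."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, lba, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad to the full block size, since the device stores whole blocks.
        self.blocks[lba] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, lba):
        return self.blocks[lba]


class ToyFileSystem:
    """Maps file names to blocks. With NAS this logic runs on the appliance;
    with SAN it runs on the client."""

    def __init__(self, device):
        self.device = device
        self.table = {}      # file name -> (list of block numbers, length)
        self.next_free = 0   # next unused block number

    def write_file(self, name, data):
        lbas = []
        for offset in range(0, len(data), BLOCK_SIZE):
            chunk = data[offset:offset + BLOCK_SIZE]
            self.device.write_block(self.next_free, chunk)
            lbas.append(self.next_free)
            self.next_free += 1
        self.table[name] = (lbas, len(data))

    def read_file(self, name):
        lbas, length = self.table[name]
        raw = b"".join(self.device.read_block(lba) for lba in lbas)
        return raw[:length]  # drop the padding in the last block


device = BlockDevice(num_blocks=16)
fs = ToyFileSystem(device)
fs.write_file("report.txt", b"x" * 700)

nas_view = fs.read_file("report.txt")  # file-level request, by name
san_view = device.read_block(0)        # block-level request, by address
```

A NAS client only ever issues the first kind of request; a SAN client issues the second and has to maintain something like `table` itself.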
Q. What can NAS be used for and what is it not suitable for?
A. Traditionally, NAS was used for applications at the file level such as file sharing and document management systems, as a target for archive solutions, and for the presentation of unstructured data or data that doesn't have to be presented at block level. NAS has always been good for situations where access is required by both Windows and Unix OSes as it supports NFS, CIFS and Apple Mac network file systems so it can serve up files across the board.
NAS hasn't typically been used for intensive block-based storage applications such as large database or email systems. Where we saw aggressive requirements or SLAs [service-level agreements] with regard to data being served up, those were met with SAN rather than NAS.
As the size and performance of NAS have increased, however, these limitations have been lifted. Mid-tier and high-end NAS solutions can today be used as a repository for those systems we typically saw in the SAN arena: Exchange, Oracle, SQL, SAP. These structured systems can now be delivered through NAS.
Protocols such as iSCSI bridge the gap between file-based and block-based solutions. This is also being taken forward by FCoE, which we'll come to in a minute, and by other protocols that wrap Fibre Channel so it can be carried over an Ethernet or IP network. Where a SAN solution encapsulates SCSI commands inside Fibre Channel frames, the iSCSI protocol encapsulates SCSI commands inside TCP/IP.
Applications which traditionally demanded block-based access to their data can therefore use a network-based solution that runs over Ethernet rather than Fibre Channel. Moving forward, FCoE enables the mapping of Fibre Channel frames over full-duplex Ethernet networks. This enables Fibre Channel to leverage 10 GigE networks while preserving the Fibre Channel protocol.
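The encapsulation idea in this answer can be illustrated with a small Python sketch. It builds a standard 10-byte SCSI READ(10) command and wraps it in a made-up header before it would be written to a TCP connection. The `TOY1` header and the function names are hypothetical and deliberately do not reproduce the real iSCSI PDU layout; the sketch only shows that the SCSI command travels unchanged inside another protocol's payload.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba, num_blocks):
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # Fields: opcode, flags, 4-byte LBA, group number, 2-byte length, control.
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)


def wrap_in_toy_pdu(cdb):
    """Prefix the command with a made-up header (NOT the real iSCSI layout)."""
    return struct.pack(">4sH", b"TOY1", len(cdb)) + cdb


def unwrap_toy_pdu(pdu):
    """Recover the original SCSI command on the storage side."""
    magic, length = struct.unpack(">4sH", pdu[:6])
    if magic != b"TOY1":
        raise ValueError("not a toy PDU")
    return pdu[6:6 + length]


cdb = build_read10_cdb(lba=2048, num_blocks=8)
pdu = wrap_in_toy_pdu(cdb)        # this byte string would travel over TCP/IP
recovered = unwrap_toy_pdu(pdu)   # the target sees the same SCSI command
```

A Fibre Channel SAN does the same thing with a different wrapper, which is why the application on top sees block storage in both cases.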
Q. In what forms can I buy NAS products?
A. NAS products come in all shapes and sizes from inexpensive single disk solutions aimed at the home or small office environment to large-scale enterprise systems providing petabytes of capacity. Enterprise systems can be deployed as single units, clustered systems or as NAS gateways.
A single unit consists of a disk enclosure and a controller or head unit. This is the lowest cost solution but provides no protection against the failure of the head unit. This is the type of solution offered for the home and small office.
A clustered [NAS] solution provides high availability with two or more heads connected to a shared set of disk enclosures. Different vendors implement head units in clusters in different ways. Some use an active-active model in which all units are in operation and service requests from the hosts; in the event of a failure, IO activity moves from the failed unit to the surviving one. Other vendors use an active-passive model, so one head services IO requests and, if that fails, the passive node takes over and becomes active.
A NAS gateway is a head which is used to connect third-party storage, which is usually provided from the SAN. If you already have a significant investment in SAN storage, and you want to leverage NAS as has been discussed, you can purchase a gateway rather than having to buy a NAS appliance with storage already built in. All the data is stored on the SAN, so you're getting the performance benefits, the HA [high availability] and the RAID redundancy of the SAN, with the NAS gateway enabling optimised file-based access.
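The two clustering models can be compared with a minimal Python simulation. The `Head` and `Cluster` classes, the head names and the round-robin routing are assumptions made for this sketch; real products use their own failover and load-balancing mechanisms.

```python
class Head:
    """One NAS head (controller) in the cluster."""

    def __init__(self, name):
        self.name = name
        self.healthy = True


class Cluster:
    def __init__(self, heads, mode):
        if mode not in ("active-active", "active-passive"):
            raise ValueError("unknown mode")
        self.heads = heads
        self.mode = mode

    def serving_heads(self):
        healthy = [h for h in self.heads if h.healthy]
        if self.mode == "active-active":
            return healthy       # every healthy head services IO
        return healthy[:1]       # only the first healthy head services IO

    def route(self, request_id):
        serving = self.serving_heads()
        if not serving:
            raise RuntimeError("no healthy head available")
        # Simple round-robin over whichever heads are currently serving.
        return serving[request_id % len(serving)].name


aa = Cluster([Head("head-a"), Head("head-b")], "active-active")
aa_before = [aa.route(i) for i in range(4)]  # load spread over both heads
aa.heads[0].healthy = False                  # head-a fails
aa_after = [aa.route(i) for i in range(4)]   # head-b takes all IO

ap = Cluster([Head("head-a"), Head("head-b")], "active-passive")
ap_before = ap.route(0)                      # only head-a is active
ap.heads[0].healthy = False                  # head-a fails
ap_after = ap.route(0)                       # passive head-b takes over
```

In both models clients keep being served after a failure; the difference is whether the second head does useful work beforehand.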
Q. How do I scale NAS?
A. The smallest systems discussed, aimed at home and small offices, often cannot be scaled easily, if at all. They're designed as a standalone purchase with a fixed IO capability and fixed storage capacity on the back end. Departmental solutions can be scaled by adding additional storage and by adding additional front-end IO capacity -- network interface controllers, Fibre Channel HBAs [host bus adapters], CPU, memory -- so that the NAS controllers can service more clients and drive more traffic to the back-end storage.
NAS gateway systems allow an even greater level of scalability as they allow access to the large, enterprise-level storage systems. Those gateways sit in front of investments you already have in SAN. They service a certain amount of storage and IO transactions, and as you want to move up from that you can purchase additional gateways and continue to leverage that back-end storage while addressing the transactional operations demanded by the servers they're servicing.
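As a rough illustration of scaling out with gateways, the sizing arithmetic might look like the Python sketch below. The function name and the IOPS figures are invented for the example and are not vendor numbers; a real sizing exercise would also account for throughput, protocol mix and headroom for failover.

```python
def gateways_needed(required_iops, iops_per_gateway):
    """Return how many gateways cover the required front-end IO load."""
    if required_iops < 0 or iops_per_gateway <= 0:
        raise ValueError("IOPS figures must be positive")
    # Integer ceiling division: round up to a whole number of gateways.
    return -(-required_iops // iops_per_gateway)


# Hypothetical figures: each gateway is assumed to sustain 20,000 IOPS.
today = gateways_needed(required_iops=35_000, iops_per_gateway=20_000)
after_growth = gateways_needed(required_iops=90_000, iops_per_gateway=20_000)
```

With these assumed figures the estimate grows from two gateways to five, while the back-end SAN storage stays in place.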