Intelligent, or automated, tiered storage is set to become a major trend in 2010, with vendors launching or updating products to meet the need for improved storage performance via smarter use of storage resources.
The Storage Networking Industry Association (SNIA) standards group defines tiered storage as "storage that is physically partitioned into multiple distinct classes based on price, performance or other attributes. Data may be dynamically moved among classes in a tiered storage implementation based on access activity or other considerations."
Essentially, tiered storage helps to reduce storage costs by matching the cost of storage media to the value of the data and how frequently it is accessed. Less-valuable data is generally stored on lower-cost storage such as SATA drives. Data value is often, but not always, related to the age of the data.
The most efficient tiered storage systems work at the block level, deciding whether a block should be moved from one storage tier to another using a combination of criteria, including storage performance, cost of storage and age of data. The price-performance calculation made when deciding where data is most appropriately stored is often not price per TB but available IOPS per TB.
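The difference between the two measures is easiest to see with numbers. The sketch below is illustrative only: the tier names, prices and IOPS densities are placeholder assumptions rather than vendor figures, and `Tier` and `cheapest_tier_for` are hypothetical names introduced for this example. It picks the cheapest tier that can still deliver the IOPS per TB a workload needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_tb: float   # placeholder price per TB, not a real quote
    iops_per_tb: float   # placeholder IOPS available per TB

    @property
    def cost_per_iops(self) -> float:
        # Price of one unit of performance rather than one unit of capacity.
        return self.cost_per_tb / self.iops_per_tb


def cheapest_tier_for(required_iops_per_tb, tiers):
    """Return the lowest cost-per-TB tier that meets the IOPS density, or None."""
    candidates = [t for t in tiers if t.iops_per_tb >= required_iops_per_tb]
    return min(candidates, key=lambda t: t.cost_per_tb, default=None)


# Assumed figures, chosen only to show the shape of the trade-off.
TIERS = [
    Tier("ssd", cost_per_tb=20000.0, iops_per_tb=50000.0),
    Tier("fc", cost_per_tb=4000.0, iops_per_tb=600.0),
    Tier("sata", cost_per_tb=1000.0, iops_per_tb=80.0),
]
```

With these assumed figures, SATA is the cheapest tier per TB but the most expensive per IOPS, while SSD is the reverse. That is why a block with a high I/O demand may be cheaper to serve from a tier that looks costly on capacity alone.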
A fully tiered system might consist of four tiers, with the lowest access times delivered by tier 1 and the highest by tier 4. Traditionally, tier 1 consists of Fibre Channel disks, which spin fastest and are more reliable than other types of disk, and this is where primary data with the quickest required access times is held. This is being replaced in some instances by serial-attached SCSI (SAS), which comes close to Fibre Channel's performance at a lower price, and in other cases by solid-state drive (SSD) technology. An SSD tier -- which some call tier 0 because it supplants Fibre Channel in performance and cost -- would contain only hot data to which access is required in a few milliseconds at most (for example, financial data, database table indices and virtual disks). Exploiting the performance of an SSD is expensive, but that expense might be justifiable for some mission-critical applications. Fibre Channel would probably be retained as a tier at first, but many SANs may eventually consist of SSD technology for performance and low-cost SATA for higher-capacity storage.
The second tier is likely to consist of production data used on a daily basis, such as emails and end users' files: data that's not so time-sensitive that expenditure on SSD is warranted. This might be stored in a NAS system on relatively low-cost block- or file-based storage, usually on SAS disks but perhaps on SATA.
The third tier is for data that is no longer in daily use, such as emails and files more than six to 12 months old, and initial backups. You could expect these to be stored on SATA disks, perhaps using a virtual tape library (VTL) or a MAID configuration to spin down drives when not in use. This tier could also consist of physical tape, which offers cheap storage and can be accessed reasonably quickly when using an automated tape library. Data retention in this tier might be measured in weeks or months, possibly years.
The final tier consists of deep data archiving. This is for data that requires long-term retention for compliance or business purposes. For example, pension and insurance companies often need to retain data for decades to meet compliance regulations, but most of it won't be accessed for years and some of it may never be accessed again. Tape is most frequently used for this purpose. Here, access times might be measured in days, as these tapes are likely to be stored off-site.
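The four tiers described above can be summarised as a simple placement rule. The following is a minimal sketch, not any vendor's algorithm: the function name, the two flags and the 180-day cutoff are assumptions made for illustration, with the cutoff taken from the lower end of the six-to-12-month range mentioned for tier 3.

```python
from datetime import date, timedelta

# Assumed cutoff: items untouched for more than about six months
# are treated as no longer in daily use.
INACTIVE_AFTER = timedelta(days=180)


def choose_tier(last_access, today, latency_critical=False, retention_only=False):
    """Map a data item to tier 1-4 using a hypothetical rule set."""
    if latency_critical:
        return 1  # hot data: fastest disks, or SSD where it is justified
    if retention_only:
        return 4  # kept for compliance or business reasons, rarely read
    if today - last_access > INACTIVE_AFTER:
        return 3  # no longer in daily use
    return 2      # everyday production data
```

For example, `choose_tier(date(2009, 6, 1), date(2010, 4, 1))` returns 3, because the item has gone roughly ten months without access. A real policy would weigh more criteria, such as cost and measured I/O activity.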
Tiered storage vendors
Vendors differ in the degree of automation they apply to tiered storage. "All the major storage vendors offer tiered storage with greater or lesser capabilities, but not all recognise it's the software that drives everything. But that's coming as they try to automate the process," said Tony Lock, programme director at analyst firm Freeform Dynamics.
Compellent Technologies made the running in this area in 2006 with the Data Progression feature in its Storage Center SANs, which consists of policy-driven, block-level automation of data movement between tiers.
Last year, EMC launched its LUN-level FAST (Fully Automated Storage Tiering) technology, which automates tiering as a paid-for option across all of its tier 1 appliances, including Symmetrix V-Max, Clariion CX4 networked storage and Celerra NS unified storage. EMC is expected to launch a more granular version of the technology later in 2010.
SAN vendor 3PAR launched a sub-LUN-level automated tiered storage implementation -- Adaptive Optimization -- early this year at the same time it added support for SSDs.
Hewlett-Packard and Hitachi Data Systems don't yet offer the level of automation that EMC and 3PAR do. They provide data migration based on user-identified criteria and choice of storage system, but do not automatically identify which storage system offers the best location for data based on a range of policies and criteria. Both, however, use the same underlying hardware and have promised delivery of more intelligent tiered storage later in 2010.
Even when a storage vendor does not offer a tiered storage option, a tiering regimen can be set up using external controllers such as IBM's System Storage SAN Volume Controller and FalconStor Software's Network Storage Server (NSS). Such devices abstract the underlying storage, which can consist of multiple storage types from multiple vendors, and simplify management by allowing the combined subsystems to be managed as one.
Such controllers allow storage pools to be deployed as different levels in a tiered system, and so help cut costs by adding tiering to a heterogeneous storage environment. However, they primarily provide data migration rather than automated tiering: they allow for a degree of automation but are not policy-driven.
F5 Networks offers automated tiering with its ARX series of file virtualisation systems. The F5 products allow storage administrators to set policies for file migration according to data age and criticality, so that files can be moved to appropriate-cost storage tiers.
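To give a feel for what an age-and-criticality policy does at the file level, here is a generic sketch. It is not the ARX policy language or any F5 interface. The function `plan_migrations`, its parameters and the use of file suffixes to mark critical data are all assumptions made for illustration. It only lists candidate files and moves nothing.

```python
import time
from pathlib import Path


def plan_migrations(root, max_age_days, critical_suffixes=(), now=None):
    """List files under root that are older than max_age_days and not critical."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    plan = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix in critical_suffixes:
            continue  # assumed rule: critical files stay on the faster tier
        if path.stat().st_mtime < cutoff:
            plan.append(path)  # stale and non-critical: candidate for a cheaper tier
    return plan
```

A call such as `plan_migrations("/srv/share", 180, critical_suffixes=(".db",))`, where the path is a placeholder, would return the non-database files that have not been modified for more than 180 days. Modification time stands in for data age here; a production policy might use access time or application metadata instead.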
Ease of use is automated tiered storage's big advantage. For most implementations, automation is the way to achieve savings and to make life easier for storage administrators, according to storage consultant Marc Staimer, president at Dragon Slayer Consulting.
"Data migration is a very stressful, manually intensive task, so tiering is only practical when it's policy based, and only a few vendors do that," he said.
The alternative is to migrate data using manual intervention, which is both difficult and time-consuming and therefore costly.
Automated tiered storage is a complex technology that needs careful setup by storage administrators to ensure data is correctly categorised according to the specific requirements and business policies of the organisation. Once that task is performed, and assuming that appropriate storage tiers are in place, tiering can ensure that data is in the right place at the right time while saving huge amounts of administrative time and effort.
This was first published in April 2010