Automated storage tiering has become the must-have feature for all new storage arrays over the last two years.
Storage tiering is the movement of data to different types of disk according to performance and capacity requirements. Tiering allows the majority of your data to reside on slower, larger, cheaper disks -- normally SATA -- whilst your most active data resides on the more performant and expensive drives, such as Fibre Channel and flash SSD.
Traditionally, tiering was a labour-intensive manual process and a management headache in all but the smallest environments. Vendors recognised this: Compellent introduced automated tiering in 2005, and other vendors have been playing catch-up since.
Manual tiering is also something of a blunt instrument, as it requires the whole LUN to be migrated between tiers. Within the LUN, however, only a small proportion of the data may actually be “hot.” Automated tiered storage addresses this issue by working at the sub-LUN level to move only hot data to the highest tiers.
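To put rough numbers on that difference, here is a minimal sketch of the arithmetic. The LUN size and the 5% hot fraction are illustrative assumptions rather than figures from any vendor, and the function name is made up for this example.

```python
def fast_tier_needed_gb(lun_gb: float, hot_fraction: float, sub_lun: bool) -> float:
    """Capacity that must sit on the fastest tier to hold a LUN's hot data."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    # Whole-LUN tiering promotes everything; sub-LUN tiering promotes only the hot part.
    return lun_gb * hot_fraction if sub_lun else lun_gb


lun_gb = 1000.0       # assumed LUN size, for illustration only
hot_fraction = 0.05   # assumed share of the LUN that is actually hot

whole = fast_tier_needed_gb(lun_gb, hot_fraction, sub_lun=False)
sub = fast_tier_needed_gb(lun_gb, hot_fraction, sub_lun=True)
print(f"whole-LUN migration: {whole:.0f} GB on the top tier")
print(f"sub-LUN migration:   {sub:.0f} GB on the top tier")
```

Under these assumptions, the sub-LUN approach needs a twentieth of the top-tier capacity. In practice the saving depends on how the hot data is scattered across the array's blocks.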
Here’s how the vendors in this space are implementing the technology:
Compellent pioneered automated storage tiering with the Data Progression feature of its Storage Center SAN arrays.
In Data Progression, usage characteristics for each block are captured during SAN operations, and this information is used to move data at user-set or default scheduled times. The block size is 2 MB by default but can be set as small as 512 KB or as large as 4 MB.
Data movement can be left to Data Progression’s default policy engine, leaving the array to balance itself, or users can set their own policies. Policy-based automation allows high-performance volumes to be “pinned” to the highest tier if required.
It is worth noting that Data Progression’s automated tiering is not done in real time, so if a workload profile changes, migration to new tiers may take some time.
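The combination of continuous monitoring, scheduled movement and pinning can be sketched as a toy model. This is not Compellent’s implementation: the class, its method names and the fixed I/O threshold are all hypothetical, chosen only to show why a block that turns hot stays on a slow tier until the next scheduled run.

```python
from collections import Counter


class ScheduledTiering:
    """Toy model: I/O is counted continuously, but data moves only on a schedule."""

    def __init__(self, hot_threshold: int):
        self.hot_threshold = hot_threshold  # hypothetical promotion threshold
        self.io_counts = Counter()          # block id -> I/Os since the last run
        self.tier = {}                      # block id -> "fast" or "slow"
        self.pinned = set()                 # blocks held on the fast tier by policy

    def add_block(self, block: str, pinned: bool = False) -> None:
        self.tier[block] = "fast" if pinned else "slow"
        if pinned:
            self.pinned.add(block)

    def record_io(self, block: str) -> None:
        # Captured during normal operation; placement is not changed here.
        self.io_counts[block] += 1

    def run_scheduled_migration(self) -> None:
        # Placement is re-evaluated only when the schedule fires.
        for block in self.tier:
            if block in self.pinned:
                continue  # pinned blocks never leave the fast tier
            hot = self.io_counts[block] >= self.hot_threshold
            self.tier[block] = "fast" if hot else "slow"
        self.io_counts.clear()


array = ScheduledTiering(hot_threshold=100)
array.add_block("db-index", pinned=True)
array.add_block("reports")
for _ in range(500):
    array.record_io("reports")

before = array.tier["reports"]   # still on the slow tier despite 500 I/Os
array.run_scheduled_migration()
after = array.tier["reports"]    # promoted once the schedule has run
print(before, "->", after)
```

A second run with no intervening I/O would demote "reports" again, while the pinned "db-index" block stays put whatever its activity.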
EMC has a number of automated storage tiering features that come under the banner of FAST (Fully Automated Storage Tiering). FAST was announced in spring 2009, with Clariion and Celerra getting the technology first.
Symmetrix gained a fully featured FAST implementation with sub-LUN capability in December 2010, and this is the company’s flagship version. FAST VP, as this variant is known, is part of the Symmetrix Virtual Provisioning set of features. If your data is currently organised using the Symmetrix Hypervolume and Metavolume construct, you’ll need to migrate to Symmetrix Virtual Provisioning before using FAST VP.
EMC recommends that you use a tool called Tier Advisor to assist in defining tiering policies for an existing environment. Tier Advisor monitors your I/O and recommends tiering policy settings.
EMC claims that once you have set your policies, FAST VP will start moving data around after a couple of hours and continue on a user-scheduled basis or via manual initiation. It is worth noting that migration of blocks -- which can be as small as 768 KB but are typically 7.5 MB -- is done in real time, enabling Symmetrix to react extremely quickly to changing workloads and access patterns.
HP/3PAR’s automated tiering feature is called Adaptive Optimization. It provides a policy-based system that allows 256 MB “chunklets” (that is, blocks) to be automatically migrated to various disk tiers at user-set frequencies.
Adaptive Optimization also includes QoS (quality of service) Gradients, which allow an administrator to bias data movement depending on performance or cost objectives. QoS Gradients cater to applications that are dormant for long periods, and would ordinarily be migrated to lower-performing drives, but that have periodic spikes in activity and so need to be retained on a high tier. An accounting database that sees high levels of activity only at month- or year-end would be an example of this.
IBM’s automated storage tiering technology, Easy Tier, is supported on the company’s SAN Volume Controller (SVC) storage virtualisation device as well as on the v7000 and DS8700 SAN arrays. Easy Tier, which can be run in either fully automatic or manual mode, is not supported on the DS8800 at present but will be later this year.
Easy Tier supports a two-tier storage hierarchy where one tier contains SSDs and the other tier contains HDDs of the same type; it is not possible to mix Fibre Channel and SATA in the same tier. Easy Tier also doesn’t support space-efficient (that is, thin-provisioned) volumes, and it has a fairly chunky minimum block size of 1 GB.
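To see why granularity matters, it helps to count how many independently movable blocks a volume breaks into at each of the sizes quoted so far. The sketch below assumes a 1 TB volume and binary units; both are illustrative choices, and the helper name is made up.

```python
KB, MB, GB, TB = 1024, 1024**2, 1024**3, 1024**4

# Granularities quoted in the article up to this point.
granularity = {
    "Compellent Data Progression (default)": 2 * MB,
    "EMC FAST VP (typical)": int(7.5 * MB),
    "HP/3PAR Adaptive Optimization": 256 * MB,
    "IBM Easy Tier": 1 * GB,
}


def extents_per_volume(volume_bytes: int, extent_bytes: int) -> int:
    """Number of extents needed to cover the volume (ceiling division)."""
    return -(-volume_bytes // extent_bytes)


volume = 1 * TB  # assumed volume size, for illustration only
counts = {name: extents_per_volume(volume, size) for name, size in granularity.items()}
for name, n in counts.items():
    print(f"{name:40s} {n:>8d} extents")
```

The finer the granularity, the more precisely the array can isolate hot data. With a 1 GB block, a single small hot spot pulls a full gigabyte onto the SSD tier with it.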
Easy Tier monitors array activity over a 24-hour period and generates a “heat map” that guides data migration to the most appropriate tier every 24 hours. When running in automatic mode, it is a fully automated process without a policy engine, which means the user must trust the array software to make decisions.
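The general heat-map idea can be sketched in a few lines: count I/Os per extent over the monitoring window, then fill the SSD tier with the hottest extents. This is a simplified illustration, not IBM’s algorithm. The function names, the use of a raw I/O count as “heat” and the fixed number of SSD slots are all assumptions.

```python
from collections import Counter
from typing import Iterable


def build_heat_map(io_log: Iterable[int]) -> Counter:
    """Count I/Os per extent id over the monitoring window."""
    return Counter(io_log)


def plan_placement(heat: Counter, all_extents: Iterable[int], ssd_slots: int) -> dict:
    """Place the hottest extents on SSD until it is full; the rest go to HDD."""
    # Sort by descending heat, then by extent id for a deterministic tie-break.
    ranked = sorted(all_extents, key=lambda e: (-heat[e], e))
    # Extents with no recorded I/O are never promoted, even if SSD space remains.
    hot = {e for e in ranked[:ssd_slots] if heat[e] > 0}
    return {e: ("ssd" if e in hot else "hdd") for e in ranked}


io_log = [3, 3, 3, 7, 7, 1, 3, 7, 9]   # extent ids touched during the window
placement = plan_placement(build_heat_map(io_log), range(10), ssd_slots=2)
on_ssd = sorted(e for e, t in placement.items() if t == "ssd")
print("extents planned for SSD:", on_ssd)
```

With no policy engine in the loop, a ranking of this kind is the only input to the migration plan, which is why the user has to trust the array’s judgment.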
Because Easy Tier is supported by SVC and the v7000, it can be used to incorporate tiers in external virtualised arrays, which could be an advantage for users with a legacy estate.
Hitachi Data Systems’ automated tiering feature, Dynamic Tiering, works with a page size of 42 MB. Pages can reside on any of the tiers in a volume, and a heat-map-type algorithm moves data up and down the tiers according to frequency of access. In addition to periodic automatic migration set by user policy, a storage administrator can also manually initiate data movement. However, there is no policy engine, so individual applications cannot easily be pinned to a particular tier; this may impact performance of periodically hot applications.
Dynamic Tiering supports thinly provisioned and statically provisioned volumes, as well as externally virtualised arrays.
NetApp’s approach to automated storage tiering is not to do it. CEO Tom Georgens claimed in February 2010 that “the entire concept of tiering is dying.” What NetApp advocates instead is its flash-based Flash Cache (formerly Performance Acceleration Module, or PAM), which the company says extends the buffer cache in an array and allows reads from the array to be accelerated while writes go straight to disk as usual.
NetApp claims this is a more cost-efficient use of flash than as primary storage. NetApp’s second-generation Flash Cache (PAM-II) cards support up to 512 GB per module, and using A-SIS data deduplication alongside them effectively increases the amount of flash available to the system, since a deduplicated block needs to be held in cache only once. This can be especially effective in virtual desktop infrastructure (VDI) and other heavily virtualised environments.
Critics of this approach point out that the Flash Cache card is treated as part of the volatile memory of the array. This means time is always required to re-prime the cache in the event of an array restart.
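The trade-off is easy to see in a toy read cache. The sketch below is not NetApp’s design: it assumes a simple least-recently-used eviction policy, a Python dict standing in for disk and a hypothetical `restart` method that empties the cache. It shows reads being served from cache, writes going straight to the backing store and every first read after a restart missing.

```python
from collections import OrderedDict


class ReadCache:
    """Toy LRU read cache in front of a backing store (a dict standing in for disk)."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)       # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]             # slow path: fetch from disk
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used entry
        return value

    def write(self, key, value) -> None:
        # Writes go to disk as usual; any stale cached copy is dropped.
        self.backing[key] = value
        self.cache.pop(key, None)

    def restart(self) -> None:
        # The cache is volatile, so its contents do not survive a restart.
        self.cache.clear()


disk = {"a": 1, "b": 2}
cache = ReadCache(disk, capacity=2)
cache.read("a")          # miss: the cache starts cold
cache.read("a")          # hit
cache.restart()
cache.read("a")          # miss again: the cache must be re-primed
print(f"hits={cache.hits} misses={cache.misses}")
```

Data placed on a flash tier, by contrast, is still on flash after a restart, which is the critics’ point.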
Automated storage tiering is becoming a standard offering across nearly every storage array vendor’s product range. At present, EMC’s Symmetrix has possibly the most fully featured automated storage tiering, but it is also the most complex to set up. For many enterprise customers, an out-of-the-box policy-driven approach is probably more attractive, and more vendors will likely offer sophisticated policy engines as part of their products in the future.
NetApp’s approach is gaining traction, but as an adjunct to automated storage tiering rather than a replacement for it. Whether NetApp embraces automated storage tiering remains to be seen, but I would expect its rivals to offer cache-based products alongside automated storage tiering. EMC’s FAST Cache, for example, is very similar.
Automated storage tiering is here to stay.