Storage tiering has been around for most of the last decade. But only three or four years ago it was by no means common to all storage array products. Now it is ubiquitous and has been given a massive boost in usefulness by the rise of flash and the possibility of an SSD tier in storage infrastructures.
In this podcast ComputerWeekly.com datacentre editor Archana Venkatraman talks to storage editor Antony Adshead about storage tiering, why you need an SSD tier, how to build it into your storage systems and how vendors differ in their implementation of tiering.
Archana Venkatraman: What is storage tiering and why do I need to think about an SSD tier?
Antony Adshead: Storage tiering is the assignment of data to different classes of media, often automatically via features in the array, so that data resides on the most suitable drive type. "Most suitable" means the data is matched to the drive type according to the performance and cost characteristics it requires.
So, for example hot data would live on a flash SSD tier, while relatively unused data sets that might need to be pulled through for production work could reside on cheap and relatively power-efficient SATA drives.
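That hot/warm/cold placement logic can be sketched in a few lines. This is a toy illustration only: real arrays track access statistics per block or extent in firmware, and the thresholds and tier names below are assumptions for demonstration, not any vendor's actual policy.

```python
# Toy sketch of heat-based tier assignment. Thresholds and tier names
# are illustrative assumptions, not any vendor's implementation.

def assign_tier(accesses_per_day: int) -> str:
    """Map a data extent's observed access rate to a storage tier."""
    if accesses_per_day > 1000:   # hot data: low-latency flash SSD
        return "ssd"
    elif accesses_per_day > 50:   # warm data: fast spinning disk
        return "sas_15k"
    else:                         # cold data: cheap, power-efficient disk
        return "sata"

# Example: classify a few extents by daily access count
extents = {"db_index": 25000, "monthly_report": 120, "archive_2009": 3}
placement = {name: assign_tier(rate) for name, rate in extents.items()}
print(placement)
```

Here the database index lands on SSD, the occasionally read report on fast disk, and the dormant archive on SATA, which is exactly the matching of data to drive type described above.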
In fact, the possibility of a flash SSD tier has really given storage tiering a kick up the backside. Previously, when all that was possible was tiering between different classes of spinning disk, the benefit was often marginal; the performance gains and cost savings just weren't enough to justify the spend.
Flash changes that. Now the SSD tier provides such a different level of performance that tiering has really become a worthwhile operation.
In our survey, of the three key choices of where to put flash, the most popular was in the array, chosen by 67% of respondents. The second most common was in the server (33%) and the least popular was a dedicated all-flash array (20%).
Most respondents to the survey (54%) use automated features on their arrays to move data between tiers and we’ll see how those work in a moment.
Venkatraman: How can I include storage tiering into my infrastructure?
Adshead: The two main storage tiering options are to do it manually or to let the array or other device do it automatically. In our survey, we found that 16% move data manually, a figure that was down considerably on the same survey from recent years. In 2011, for example, 31% still moved data manually between tiers.
The decline in manual movement of data is probably a result of automated storage tiering becoming mainstream in array products. Having said that, there are issues with automated tiering that mean manual migration is still relatively prevalent.
For example, in some cases automated tiering lags behind real-time usage patterns and may not be responsive enough to get hot data where it needs to be in time. That's one reason some users still move data manually.
Related to that, there's something known as the Monday morning syndrome, in which an array that has been quiet over the weekend demotes operational data to slow disk. That's another reason why you might move data manually.
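The Monday morning syndrome comes down to how far back the tiering engine looks. A quick sketch shows the effect: with a heat score that decays quickly, two idle weekend days wipe out a busy week's history, so Monday's workload starts from slow disk. All the numbers and the decay policy here are illustrative assumptions, not any vendor's algorithm.

```python
# Illustrative sketch of why a short observation window causes the
# "Monday morning syndrome". Figures and the decay policy are assumed
# for demonstration only.

def heat(daily_accesses, half_life_days):
    """Exponentially decayed heat score; recent days weigh more.
    daily_accesses[-1] is the most recent day."""
    score = 0.0
    for age, count in enumerate(reversed(daily_accesses)):
        score += count * 0.5 ** (age / half_life_days)
    return score

# Mon-Fri busy (5,000 accesses/day), then an idle Sat-Sun
week = [5000, 5000, 5000, 5000, 5000, 0, 0]

# A short memory forgets the busy week over the weekend...
short_view = heat(week, half_life_days=1.0)
# ...while a longer memory still sees the data as hot on Monday.
long_view = heat(week, half_life_days=7.0)
print(short_view < long_view)
```

A longer half-life keeps weekday-hot data on fast media through a quiet weekend; the trade-off is that genuinely cooled data is demoted more slowly.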
So, you can choose to use a combination of automated tiering and manual movement, and that was an option indicated by 18% of respondents to the survey.
Also, in our survey 12% said they don’t move data between tiers at all, which is something I’ve come across among people who have implemented all-flash arrays.
But, by far the biggest set of respondents (54%) used automated tiering functionality to move data between tiers.
So those are the key options for how to do tiering in the broadest sense: manually, automatically, or a combination of the two. And if you want automated data movement, what are the key options?
Well, mainly it’s about switching on automated tiering functions in your array and most vendors have this now.
Another class of device that can enable tiering is storage virtualisation hardware that allows multi-vendor storage hardware to be part of the same pool of capacity. These are products such as IBM's SVC (SAN Volume Controller), NetApp's V-Series with Virtual Storage Tier and Hitachi Data Systems' VSP. There is also file virtualisation, such as from F5, which has automated tiering.
Venkatraman: How do different vendors implement storage tiering?
Adshead: The key storage vendors differ in their approaches to tiering, with wide variations in block size moved, whether data movement happens in real time or not, and the degree of customisation of policies.
Dell's Compellent arrays, for example, allow for policy-based identification of data to move, and scheduling can be set by the user. The vendor also allows identification and movement of block sizes from 512KB to 4MB. Tiering isn't real time, however, so changes may take some time to take effect.
EMC's Fully Automated Storage Tiering (FAST) moves data daily by default, though this can be set to more frequent intervals, with movement carried out in real time.
EMC and NetApp aim for the movement of small chunks of data (EMC's range from 768KB up to 1GB; NetApp's are 4KB blocks), moved often. EMC also allows movement to be restricted to specific time periods so that I/O isn't hit at busy times. And that illustrates a potential trade-off you'll need to consider: the more real-time your tiering mechanism, the greater the potential processing hit.
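Those block sizes embody another trade-off: fine granularity means the array must track far more movable chunks, while coarse granularity means one hot block drags a whole large chunk onto flash. A back-of-envelope calculation makes the point; the 10TiB pool size and the "one hot 4KB block per chunk" scenario are assumptions for illustration, not vendor figures.

```python
# Back-of-envelope illustration of the movement-granularity trade-off.
# Pool size and the hot-block scenario are assumed, not vendor data.

TIB = 1024 ** 4

def tradeoff(chunk_bytes, pool_bytes=10 * TIB):
    """For a given movement granularity, return (chunks the array must
    track, flash consumed when one hot 4KB block is promoted)."""
    chunks_to_track = pool_bytes // chunk_bytes
    flash_per_hot_block = chunk_bytes  # the whole chunk moves with it
    return chunks_to_track, flash_per_hot_block

for label, size in [("4KB blocks", 4 * 1024),
                    ("768KB chunks", 768 * 1024),
                    ("1GB chunks", 1024 ** 3)]:
    tracked, flash = tradeoff(size)
    print(f"{label:14s} chunks to track: {tracked:>13,}  "
          f"flash per hot block: {flash:>13,} bytes")
```

At 4KB granularity a 10TiB pool means billions of chunks to track (hence metadata and processing overhead); at 1GB granularity the bookkeeping is tiny but a single hot block consumes a full gigabyte of flash.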
HP's Ibrix clustered NAS systems have a default daily movement scan, but this can be set to hourly. HP's 3PAR SAN arrays can sample data for "heat" every 30 minutes.
Finally, IBM's Easy Tier, on its V7000 arrays and the SVC storage virtualisation device, monitors data usage over a 24-hour period, moves data once a day and has a minimum block size of 1MB.
So, to sum up, tiering has become a mainstream, must-have technique. The key choices are whether to do it manually, automatically or with a combination of the two, and whether it's something you can switch on in your existing array or need to add via another product.