Implementing a successful storage solution is always a balance between capacity, performance and cost. Given unlimited resources, all data would be deployed on the fastest media possible, which today means either NVDIMM or NAND flash.
But that luxury exists only for the most cash-rich organisations. Instead, businesses have to strike a balance between their capacity and performance needs, using methods such as storage tiering.
Tiering ensures that data sits on the most appropriate type of storage possible, based on performance requirements that include latency and throughput. For example, infrequently accessed data may be placed on large-capacity SATA drives, whereas a high-performance transactional database may sit on NAND flash.
1 Tiering vs caching – they’re not the same thing
Before diving further into tiering, we should explain the distinction between caching and tiering. This differentiation is important when looking at supplier implementations.
Tiering is the movement of one persistent copy of data between different types of storage, whereas caching places a temporary copy of data into better-performing storage media. Both improve performance, but caching results in no net additional capacity in the array. The cache is a cost overhead.
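The distinction can be made concrete with a small sketch. The dictionaries, page names and payloads below are entirely illustrative and do not represent any real array's internals: the point is simply that a tier move relocates the single persistent copy, while a cache fill creates a second, temporary copy.

```python
# Illustrative sketch contrasting tiering (move) with caching (copy).
# All names and data here are invented for the example.

flash = {}                      # fast media: the cache or upper tier
hdd = {"blk1": "payload"}       # capacity tier holding the persistent copy

def tier_up(key):
    """Tiering: the single persistent copy moves; net capacity is unchanged."""
    flash[key] = hdd.pop(key)

def cache_fill(key):
    """Caching: a temporary second copy is made; the cache adds no net capacity."""
    flash[key] = hdd[key]

cache_fill("blk1")
assert "blk1" in flash and "blk1" in hdd    # two copies exist while cached
del flash["blk1"]                           # discard the cached copy

tier_up("blk1")
assert "blk1" in flash and "blk1" not in hdd  # still one copy, now relocated
```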
The process of tiering has evolved and matured over a number of years. Initial implementations were based on placing an entire LUN or volume on a separate tier and were typically built into systems with multiple levels of HDD, rather than flash.
This rather static approach provided some ability to reduce costs, but created mobility issues when the activity level of data changed. The entire volume had to be moved to another tier of storage, which needed free space to accommodate the move.
Moving data around an array is expensive. It steals I/O that could be used for host requests and could prevent data being accessed while the move takes place.
The next phase of tiering saw a more granular approach at the sub-volume level. With this method, each volume or LUN is broken into smaller units (variously called blocks or pages by suppliers) and assigned to multiple tiers of storage, including flash.
The block-level approach provides much more flexibility to target more expensive storage at the data that can best exploit it. Only the active parts of a database need to be moved up to flash, where previously the entire volume would have had to move along with its LUN.
Automated storage tiering takes the data placement process one step further and manages decision-making about which blocks of data are placed in each storage tier. The automation process is a natural progression for large storage arrays because the amount of effort involved to monitor all active data in a single system is too much for human system administrators.
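A minimal sketch of the placement decision an automated tiering engine makes might look like the following. The per-page I/O counts and the flash capacity are invented for the example; real arrays track far richer statistics, but the core idea is ranking pages by activity and filling the fast tier with the hottest ones.

```python
# Hypothetical sketch of automated tier placement: rank pages by recent
# I/O activity and assign the hottest to the flash tier, the rest to HDD.

def place_pages(io_counts, flash_pages):
    """Given per-page I/O counts and the number of pages the flash tier
    can hold, return (flash_set, hdd_set)."""
    ranked = sorted(io_counts, key=io_counts.get, reverse=True)
    return set(ranked[:flash_pages]), set(ranked[flash_pages:])

# Invented workload: four pages with very different activity levels
io = {"p0": 900, "p1": 15, "p2": 450, "p3": 2}
flash_set, hdd_set = place_pages(io, flash_pages=2)
assert flash_set == {"p0", "p2"}   # the two busiest pages land on flash
```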
2 Flash storage and tiering – a perfect match
Flash storage is particularly well suited to use in tiering solutions. This is because of the way I/O activity is typically distributed across a volume of data. In general, a small amount of data is responsible for most of the I/O within an application and this is reflected at the volume level.
This effect, known as the Pareto Principle or, more colloquially, the 80/20 rule, allows a small amount of expensive resource (for example, flash) to be assigned to data that caters for most of the I/O workload. Of course, the exact ratios of flash and traditional storage are determined by the profile of the application data.
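The sizing implication of the 80/20 rule can be shown with a back-of-envelope calculation. The I/O distribution below is invented, but the function illustrates the question an architect asks: what fraction of capacity must sit on flash to absorb a target share of the I/O?

```python
# Back-of-envelope flash sizing sketch. The page I/O figures are invented;
# the skew is what matters, not the absolute numbers.

def flash_fraction(page_ios, target=0.8):
    """Smallest fraction of pages whose combined I/O meets `target`
    (assuming the hottest pages are placed on flash first)."""
    total = sum(page_ios)
    covered = 0.0
    for i, ios in enumerate(sorted(page_ios, reverse=True), start=1):
        covered += ios
        if covered / total >= target:
            return i / len(page_ios)
    return 1.0

# A skewed, Pareto-like workload: a few hot pages dominate the I/O
ios = [500, 300, 100, 40, 30, 10, 10, 5, 3, 2]
print(flash_fraction(ios))  # 0.2 -> 20% of pages serve 80% of the I/O
```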
There is no reason to assume that tiering has to be restricted just to flash and hard disk devices. The NAND flash device market has started to diverge into a range of products that meet endurance, capacity, performance and cost requirements.
It will not be long before array suppliers offer solutions that use multiple tiers of flash in the same appliance. For example, write-active data could be placed on high-endurance flash, with the rest of the data left on low-cost 3D-NAND or TLC flash.
3 Supplier tiering implementations differ
As we look in more detail at supplier-specific implementations of tiering, there are two aspects to consider: how tiering is applied to the application and how the supplier deploys tiering at a technical level.
All suppliers look to apply tiering to application workloads through the use of policies. The aim of using policy definitions is to abstract the workload requirements from the underlying hardware as much as possible.
This abstraction process is important because it allows additional resources to be added to an array when policies are not being met, without having to reconfigure all the existing data on the system.
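A policy definition of this kind can be sketched as a mapping from workload to service level rather than to hardware. The policy names and latency targets below are hypothetical, not drawn from any supplier's product; the point is that a policy breach is answered by adding resources, not by reconfiguring data.

```python
# Hypothetical policy definitions that abstract workloads from hardware:
# each policy names a service level, not a device. All values are invented.

policies = {
    "oltp-db": {"tier_preference": "highest", "max_latency_ms": 1.0},
    "archive": {"tier_preference": "lowest",  "max_latency_ms": 50.0},
}

def policy_met(workload, observed_latency_ms):
    """True if the workload's measured latency is within its policy."""
    return observed_latency_ms <= policies[workload]["max_latency_ms"]

assert policy_met("oltp-db", 0.6)
assert not policy_met("oltp-db", 3.2)  # breach: add flash to the array,
                                       # rather than relaying out the data
```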
Some early auto-tiering solutions, such as FAST VP for EMC VMAX and VNX, still have a hardware-centric approach to policy management. Data is placed into pools built from a combination of flash and traditional HDD storage, with policies such as “highest available” and “lowest available” applied to those pools.
The result is tiering based on a mechanism where one workload competes against another based on the prioritisation assigned to them. Data is moved between tiers using scheduled or manual processes that make recommendations about data movement based on historical activity trends.
As already discussed, moving data between tiers is expensive and should be kept to a minimum. Static data pooling tiering solutions are not as agile as required when it comes to today’s more virtualised workloads, where active I/O data could change daily or hourly. Moving data on a weekly or monthly basis means these solutions are always playing catch-up. It is much better to either capture the increased I/O activity when it starts, or to sample and move more often.
Dell’s Compellent storage architecture ensures that all write I/O hits tier 1 SSD flash by default, using a feature called Automated Tiered Storage. If a page stays active, it can remain on tier 1 fast media. If the page becomes inactive, it is aged down to a lower tier of storage, typically within 24 hours.
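This write-to-flash-first pattern with age-based demotion can be sketched as follows. This is not Dell's implementation, just a toy model of the behaviour described: every write lands on tier 1, and a background pass demotes pages untouched for longer than the age threshold.

```python
# Toy model (not Dell's code) of write-to-flash-first with age-based
# demotion. Timestamps are plain integers for clarity.

MAX_AGE = 24 * 3600            # demote pages idle for more than 24 hours
tier1, tier2 = {}, {}          # page -> (data, last_access_time)

def write(page, data, now):
    """All writes land on tier 1 flash by default."""
    tier1[page] = (data, now)

def demote_cold(now):
    """Background pass: age inactive pages down to the capacity tier."""
    for page, (data, last) in list(tier1.items()):
        if now - last > MAX_AGE:
            tier2[page] = tier1.pop(page)

write("p1", b"hot", now=0)
write("p2", b"soon-cold", now=0)
tier1["p1"] = (b"hot", 90000)  # p1 is touched again; p2 stays idle
demote_cold(now=100000)        # > 24h later
assert "p1" in tier1 and "p2" in tier2
```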
Tintri uses a similar methodology, called Flash First, in its hybrid flash arrays. Data is always written to flash and only evicted to disk once the data becomes cold, or inactive.
DotHill (recently acquired by Seagate) has used a more proactive approach in its tiering implementation, known as RealTier. The RealTier algorithms use three processes to determine data placement across tiers: scoring, to maintain an I/O ranking for every block or page of data; scanning, to identify candidate pages to move between tiers; and sorting, the process of actually moving pages.
Scanning and sorting occur every five seconds, with only a minimal amount of data moved in each cycle.
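A simplified sketch of such a score/scan/sort loop is shown below. All the details (capacities, batch sizes, eviction choice) are invented and are not DotHill's algorithms; the sketch only illustrates the three-phase structure the article describes, with a small movement batch per cycle.

```python
# Simplified, invented sketch of a score/scan/sort tiering cycle.

scores = {}                    # page -> running I/O score
flash = set()                  # pages currently resident on the fast tier
FLASH_CAP, MOVE_BATCH = 2, 1   # tiny illustrative limits

def score(page):
    """Scoring: count I/O against each page."""
    scores[page] = scores.get(page, 0) + 1

def scan():
    """Scanning: hot pages not yet on flash, hottest first."""
    hot = sorted(scores, key=scores.get, reverse=True)[:FLASH_CAP]
    return [p for p in hot if p not in flash]

def sort_cycle():
    """Sorting: move at most MOVE_BATCH candidates up per cycle."""
    for page in scan()[:MOVE_BATCH]:
        if len(flash) >= FLASH_CAP:
            flash.pop()        # evict an arbitrary resident page
        flash.add(page)

for _ in range(5): score("a")
for _ in range(3): score("b")
score("c")
sort_cycle(); sort_cycle()     # two cycles, one small move each
assert flash == {"a", "b"}     # the hottest pages migrate up gradually
```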
4 Issues with storage tiering
Allowing the storage array to make all the decisions on data movement makes sense from a general point of view, but there are times when this approach may have its problems.
There may be justifiable reasons to pin some application data permanently to one tier or another. For example, a critical application may need guaranteed response times, while an archive application may never need to write to flash storage. Exceptions such as these must be catered for within tiering policy definitions, rather than assuming the active/inactive or “hot/cold” status of data should always dictate its location.
There are also other issues that may be experienced with automated tiering that must be carefully considered. For example, some workloads are time-specific, in that they become active at a particular time of day, week or month. A slow-responding tiering algorithm may cause problems for these applications when data is not moved to the performance tier quickly enough.
There is also the issue of managing contention between applications. Tiering effectively introduces competition between applications for the faster storage tiers. Without adequate telemetry, it may be hard to spot a shortage of resources in one tier that results in application performance issues across the storage infrastructure.
5 A holistic view is required
We should conclude by stating that tiering is only one feature of modern storage arrays, which also deliver performance improvements through caching and cost savings through data optimisation techniques such as thin provisioning, compression and data deduplication.
All these features are intrinsically linked within the architecture, making it difficult to isolate tiering as the only cost-saving and performance-enhancing solution.