Top five things to know about flash and storage tiering

Automated storage tiering boosts storage efficiency and saves money, but why is tiering different from caching, why is it suited to use with flash and how do suppliers differ?

Chris Evans

Published: 26 Nov 2015

Implementing a successful storage solution is always a balance between capacity, performance and cost. Given unlimited resources, all data would be deployed on the fastest media possible, which today means either NVDIMM or NAND flash.

But that luxury exists only for the most cash-rich organisations. Instead, businesses have to strike a balance between their capacity and performance needs, using methods such as storage tiering.

Tiering ensures that data sits on the most appropriate type of storage, based on performance requirements that include latency and throughput. For example, infrequently accessed data may be placed on large-capacity SATA drives, whereas a high-performance transactional database may sit on NAND flash.

1 Tiering vs caching – they’re not the same thing

Before diving further into tiering, we should explain the distinction between caching and tiering. This differentiation is important when looking at supplier implementations.

Tiering is the movement of one persistent copy of data between different types of storage, whereas caching places a temporary copy of data into better-performing storage media. Both improve performance, but caching adds no usable capacity to the array, so the cache is purely a cost overhead.
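The capacity difference can be shown with simple arithmetic. The sketch below is illustrative only: the function name and the 100TB/10TB figures are assumptions, not taken from any supplier's product.

```python
def usable_capacity_tb(hdd_tb: int, flash_tb: int, mode: str) -> int:
    """Usable capacity of a hybrid array (hypothetical helper)."""
    if mode == "tiering":
        # One persistent copy lives on exactly one tier, so capacities add up.
        return hdd_tb + flash_tb
    if mode == "caching":
        # Flash only holds temporary copies of data that also lives on HDD.
        return hdd_tb
    raise ValueError(f"unknown mode: {mode}")


print(usable_capacity_tb(100, 10, "tiering"))  # 110
print(usable_capacity_tb(100, 10, "caching"))  # 100
```

In this simplified model the same 10TB of flash adds capacity when used as a tier, but none when used as a cache.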

The process of tiering has evolved and matured over a number of years. Initial implementations were based on placing an entire LUN or volume on a separate tier and were typically built into systems with multiple levels of HDD, rather than flash.

This rather static approach provided some ability to reduce costs, but created mobility issues when the activity level of data changed. The entire volume had to be moved to another tier of storage, which needed free space to accommodate the move.

Moving data around an array is expensive. It steals I/O that could be used for host requests and could prevent data being accessed while the move takes place.

The next phase of tiering saw a more granular approach at the sub-volume level. With this method, each volume or LUN is broken into smaller units (variously called blocks or pages by suppliers) and assigned to multiple tiers of storage, including flash.

The block-level approach provides much more flexibility to target more expensive storage at the data that can best exploit it. Only the active parts of a database need to be moved up to flash, whereas previously the whole database would have been moved along with its LUN.

Automated storage tiering takes the data placement process one step further and manages decision-making about which blocks of data are placed in each storage tier. Automation is a natural progression for large storage arrays because the effort involved in monitoring all active data in a single system is too great for human system administrators.
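A minimal sketch of the kind of decision an automated, sub-volume tiering engine makes is shown below. It assumes the array keeps a per-page I/O counter and simply promotes the busiest pages; real supplier algorithms are proprietary and more sophisticated, and the names `plan_placement` and `flash_pages` are hypothetical.

```python
def plan_placement(io_counts: dict[int, int], flash_pages: int) -> dict[int, str]:
    """Assign the most active pages to flash and the rest to HDD.

    io_counts maps a page number to the I/O operations seen in the last
    monitoring window (an assumed metric); flash_pages is how many pages
    the flash tier can hold.
    """
    # Rank pages by activity, busiest first; ties are broken by page number.
    ranked = sorted(io_counts, key=lambda page: (-io_counts[page], page))
    hot = set(ranked[:flash_pages])
    return {page: ("flash" if page in hot else "hdd") for page in io_counts}


window = {0: 5, 1: 900, 2: 12, 3: 450}
print(plan_placement(window, flash_pages=2))
# {0: 'hdd', 1: 'flash', 2: 'hdd', 3: 'flash'}
```

Only pages 1 and 3 move to flash; the rest of the volume stays on cheaper disk.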

2 Flash storage and tiering – a perfect match

Flash storage is particularly well suited to use in tiering solutions. This is because of the way I/O activity is typically distributed across a volume of data. In general, a small amount of data is responsible for most of the I/O within an application and this is reflected at the volume level.

This effect, known as the Pareto principle or, more colloquially, the 80/20 rule, allows a small amount of expensive resource (for example, flash) to be assigned to the data that accounts for most of the I/O workload. Of course, the exact ratios of flash and traditional storage are determined by the profile of the application data.
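To see how an I/O profile translates into a flash ratio, the sketch below works out what fraction of pages must sit on flash to serve a target share of I/O. The per-page counts are invented for illustration and the function name is hypothetical.

```python
def flash_fraction_for(io_counts: list[int], target_share: float = 0.8) -> float:
    """Fraction of pages needed on flash to serve target_share of all I/O."""
    total = sum(io_counts)
    if total == 0:
        return 0.0
    served = 0
    # Take the busiest pages first until the target share is covered.
    for n, count in enumerate(sorted(io_counts, reverse=True), start=1):
        served += count
        if served >= target_share * total:
            return n / len(io_counts)
    return 1.0


# Ten pages with a skewed, made-up I/O profile totalling 1,000 operations.
profile = [500, 350, 50, 30, 20, 20, 10, 10, 5, 5]
print(flash_fraction_for(profile))  # 0.2
```

Here two pages out of 10 carry 85% of the I/O, so 20% of the capacity on flash is enough to meet an 80% target. A flatter profile would need a larger flash tier.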

There is no reason to assume that tiering has to be restricted just to flash and hard disk devices. The NAND flash device market has started to diverge into a range of products that meet endurance, capacity, performance and cost requirements.

It will not be long before array suppliers offer solutions that use multiple tiers of flash in the same appliance. For example, write-active data could be placed on high-endurance flash, with the rest of the data left on low-cost 3D-NAND or TLC flash.

3 Supplier tiering implementations differ

As we look in more detail at supplier-specific implementations of tiering, there are two aspects to consider: how tiering is applied to the application and how the supplier deploys tiering at a technical level.

All suppliers look to apply tiering to application workloads through the use of policies. The aim of using policy definitions is to abstract the workload requirements from the underlying hardware as much as possible.

This abstraction process is important because it allows additional resources to be added to an array when policies are not being met, without having to reconfigure all the existing data on the system.
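As a rough illustration of policy-based abstraction, the sketch below defines policies purely in terms of a latency target and reports which ones are not being met. The `TieringPolicy` class, the policy names and the latency figures are all assumptions made for the example; suppliers express policies in their own ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieringPolicy:
    """A workload requirement that makes no reference to hardware."""
    name: str
    max_latency_ms: float


def unmet_policies(policies: list[TieringPolicy],
                   observed_latency_ms: dict[str, float]) -> list[str]:
    """Names of policies whose observed latency exceeds the target."""
    return [p.name for p in policies
            if observed_latency_ms[p.name] > p.max_latency_ms]


policies = [TieringPolicy("gold", 1.0), TieringPolicy("bronze", 20.0)]
observed = {"gold": 2.5, "bronze": 12.0}
print(unmet_policies(policies, observed))  # ['gold']
```

Because the policy states only the requirement, an unmet "gold" policy can be addressed by adding flash to the array without redefining the policy or manually relocating existing data.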

4 Issues with storage tiering

Allowing the storage array to make all the decisions on data movement makes sense from a general point of view, but there are times when this approach may have its problems.

There may be justifiable reasons to pin some application data permanently to one tier or another. For example, a critical application may need a guaranteed response time at all times, or an archive application may never need to write to flash storage. Exceptions must be catered for within tiering policy definitions, rather than assuming that the active/inactive or “hot/cold” status of data should dictate location.
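One way to picture such exceptions is a placement rule in which an explicit pin always overrides the hot/cold decision. This is a sketch under assumed names (`place`, `pins`, and the volume names), not a description of any supplier's policy engine.

```python
def place(volume: str, is_hot: bool, pins: dict[str, str]) -> str:
    """Choose a tier for a volume, honouring pins before activity level."""
    if volume in pins:
        # A pinned volume stays on its tier regardless of how active it is.
        return pins[volume]
    return "flash" if is_hot else "hdd"


pins = {"trading-db": "flash", "archive": "hdd"}
print(place("trading-db", is_hot=False, pins=pins))  # flash
print(place("archive", is_hot=True, pins=pins))      # hdd
print(place("web-logs", is_hot=True, pins=pins))     # flash
```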

There are also other issues that may be experienced with automated tiering that must be carefully considered. For example, some workloads are time-specific, in that they become active at a particular time of day, week or month. A slow-responding tiering algorithm may cause problems for these applications when data is not moved to the performance tier quickly enough.
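The effect of a slow-responding algorithm can be sketched by assuming the array tracks a smoothed "heat" score per page (an exponentially weighted moving average) and promotes a page once the score crosses a threshold. The smoothing factor, threshold and function name are illustrative assumptions.

```python
from typing import Optional


def intervals_until_promoted(alpha: float, burst_iops: float,
                             threshold: float,
                             max_intervals: int = 1000) -> Optional[int]:
    """Monitoring intervals before a newly busy page crosses the threshold.

    alpha is the smoothing factor of the moving average: a low value
    reacts slowly to change, a high value reacts quickly.
    """
    score = 0.0  # the page was idle before the burst started
    for interval in range(1, max_intervals + 1):
        score = alpha * burst_iops + (1 - alpha) * score
        if score >= threshold:
            return interval
    return None  # never promoted within max_intervals


print(intervals_until_promoted(alpha=0.1, burst_iops=1000, threshold=500))  # 7
print(intervals_until_promoted(alpha=0.5, burst_iops=1000, threshold=500))  # 1
```

With hourly monitoring intervals, the slower setting would leave a month-end batch job on disk for seven hours before its data reached the performance tier.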

There is also the issue of managing contention between applications. Tiering effectively introduces competition between applications for the faster storage tiers. Without adequate telemetry, it may be hard to spot a shortage of resources in one tier that results in application performance issues across the storage infrastructure.
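A simple form of the telemetry needed here is a comparison between the hot data each application would like on flash and the flash actually available. The sketch below uses invented application names and sizes, and `flash_shortfall_gb` is a hypothetical helper.

```python
def flash_shortfall_gb(hot_gb_by_app: dict[str, float],
                       flash_capacity_gb: float) -> float:
    """Hot data (GB) that cannot fit in the flash tier; 0.0 if it all fits."""
    demand = sum(hot_gb_by_app.values())
    return max(0.0, demand - flash_capacity_gb)


hot = {"erp": 600.0, "vdi": 500.0, "analytics": 300.0}
print(flash_shortfall_gb(hot, flash_capacity_gb=1000.0))  # 400.0
```

A non-zero shortfall means some application's active data is being served from a slower tier, which is the kind of shortage that is hard to spot without per-tier monitoring.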

5 A holistic view is required

We should conclude by stating that tiering is only one feature of modern storage arrays, which also deliver performance improvements through caching, and data optimisation through thin provisioning, compression and data deduplication.

All these features are intrinsically linked within the architecture, making it difficult to isolate tiering as the only cost-saving and performance-enhancing solution.
