SSD RAID essentials: What you need to know about flash and Raid

With the advent of flash and high-capacity HDDs, what do today’s storage professionals need to know about Raid and its new variants?

Chris Evans

Published: 16 May 2013

As a fundamental of data protection, Raid (redundant array of independent disks) has been around since the mid-1980s. The idea is quite simple: use multiple disk drives to enable data protection (via mirroring or parity) and spread data across all the drives so that any failing unit can be rebuilt from the others.
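As an illustration of how parity allows a failed unit to be rebuilt from the others, here is a minimal Python sketch using XOR parity, the scheme commonly associated with Raid 5. The block contents and the helper name `xor_blocks` are assumptions made for the example; a real array works on much larger blocks and rotates parity across drives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, one per drive, plus a parity block held on a fourth drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the missing block.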

But what about SSD Raid? How does the advent of solid-state media affect the use of Raid? Let us recap on the basics of Raid, then explore the ramifications of the use of flash for Raid.

Raid is implemented in many variants. These include Raid 1/10 for mirroring, which provides good read and write performance; Raid 5 for capacity, which has good read performance but delivers less well on write input/output (I/O); and Raid 6, which provides for a higher degree of availability than Raid 5, due to the extra parity data it stores.
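The capacity trade-off between these levels can be sketched in a few lines of Python. The function name `usable_drives` is hypothetical, and the figures assume two-way mirroring for Raid 1/10, one drive's worth of parity for Raid 5 and two for Raid 6; hot spares and vendor-specific overheads are ignored.

```python
def usable_drives(level, drives):
    """Return how many drives' worth of capacity is left for data."""
    if level in ("1", "10"):
        # Assumes two-way mirrors: every drive has exactly one copy.
        if drives < 2 or drives % 2:
            raise ValueError("mirroring needs an even number of drives (2 or more)")
        return drives // 2
    if level == "5":
        if drives < 3:
            raise ValueError("Raid 5 needs at least 3 drives")
        return drives - 1  # one drive's worth of parity
    if level == "6":
        if drives < 4:
            raise ValueError("Raid 6 needs at least 4 drives")
        return drives - 2  # two drives' worth of parity
    raise ValueError(f"unsupported Raid level: {level}")

for level in ("10", "5", "6"):
    print(level, usable_drives(level, 8))
```

With eight drives, mirroring leaves half the capacity for data, while Raid 5 and Raid 6 give up one and two drives' worth respectively.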

As well as Raid level, storage administrators must consider other factors that have a bearing on performance.

Stripe set size is the number of disks across which data is written. As the stripe set increases, data is written across more drives, which can result in greater I/O capability.

However, large stripe sets with high-capacity drives can result in failure during a data rebuild, due to unrecoverable read errors. This is where the drive rebuild fails because a block of data needed to complete the process cannot be read.
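A rough way to see why large stripe sets and big drives raise this risk is to estimate the chance of hitting at least one unrecoverable read error while reading every surviving drive in full. The Python sketch below is a simplified model, not a vendor calculation: the error rate of one per 10^14 bits is an assumed figure of the kind quoted on drive data sheets, errors are treated as independent, and the function name is hypothetical.

```python
import math

def rebuild_ure_probability(surviving_drives, drive_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a full rebuild.

    Assumes every surviving drive is read end to end and that bit errors
    are independent, which real drives do not strictly obey.
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Eight-drive Raid 5 set of 4TB drives: seven survivors must be read in full.
p = rebuild_ure_probability(surviving_drives=7, drive_bytes=4e12)
print(f"{p:.2f}")
```

Under these assumptions the estimate comes out at about 0.89, and it grows with both the number of drives in the set and the capacity of each drive. Real-world failure rates depend heavily on the drives and on how the controller handles a bad block.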

Raid rebuild times also increase significantly as drive capacities grow. Rebuilds can now take days to complete, depending on the continuing background workload, while drives of capacities predicted for the future will incur rebuild times running into months. Even a rebuild that takes a few days leaves production data unprotected for an unacceptable length of time.
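The arithmetic behind these rebuild times is straightforward: the replacement drive has to be written in full, at whatever share of its throughput the background workload leaves free. The Python sketch below uses assumed figures (a 4TB drive, 100MB/s sustained throughput) and a hypothetical function name purely to show the scaling.

```python
def rebuild_hours(drive_bytes, rebuild_bytes_per_sec, rebuild_share=1.0):
    """Hours needed to rewrite one drive at a given share of its throughput."""
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be in the range (0, 1]")
    seconds = drive_bytes / (rebuild_bytes_per_sec * rebuild_share)
    return seconds / 3600

# Assumed figures: 4TB drive, 100MB/s sustained throughput.
idle = rebuild_hours(4e12, 100e6)                     # array otherwise idle
busy = rebuild_hours(4e12, 100e6, rebuild_share=0.1)  # 10% left for the rebuild
print(f"{idle:.1f} hours idle, {busy / 24:.1f} days under load")
```

Rebuild time scales linearly with capacity, so doubling drive size doubles the window during which data is unprotected unless throughput rises to match.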

The evolution of Raid

Raid has continued to evolve and we have seen new protection methods that use the essential components of Raid, but distribute data and parity information in new ways.

For example, the idea of building resilient storage from block-level Raid has been implemented in many systems, including HP’s 3Par platform and IBM’s XIV. The XIV array divides physical disks into 1MB partitions and mirrors them across all devices in the array. For any single disk failure, all disks in the system are involved in the rebuild, making recovery time significantly faster than with traditional Raid.

The idea of Raid has also been challenged in other ways: hyperscale computing has moved the unit of redundancy up to the server level. Here, the costs of implementing Raid (controller cards and/or software and additional disk capacity) have been replaced by redundant groups of servers.

Some suppliers, such as X-IO, have built black-box sealed-unit disk arrays into which Raid resilience and additional capacity have been built, but which cannot be upgraded or repaired during the lifetime of the device.

Using Raid and SSD

So how relevant is Raid to the new world of flash drives? Unlike spinning hard disk drives (HDDs), flash drives have no moving parts and are not subject to mechanical failure such as disk head crashes. To improve the life of the device, solid-state devices implement wear leveling and other algorithms to distribute write I/O evenly across the flash, because writes concentrated on the same cells would over time cause these devices to fail prematurely.
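The principle behind wear leveling can be shown with a toy Python model that always directs the next write to the least-erased block. This is a sketch of the idea only: the class name `WearLeveler` is hypothetical, and real SSD controllers also handle logical-to-physical mapping, garbage collection and the relocation of rarely changed data.

```python
class WearLeveler:
    """Toy model: send each write to the block with the fewest erases."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks

    def write(self):
        # Pick the least-worn block; ties go to the lowest block number.
        block = min(range(len(self.erase_counts)),
                    key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1
        return block

leveler = WearLeveler(num_blocks=4)
for _ in range(10):
    leveler.write()
print(leveler.erase_counts)
```

However many writes arrive, no block in this model is ever more than one erase ahead of any other, which is the property that extends device life.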

But, despite their differences, flash drives do fail. There is always the risk of component failure (for example, issues with the device controller) and eventually an SSD will fail because it has limited write I/O capacity. This means some protection is required to cater for failure scenarios.

The question is how this protection should be achieved. Suppliers offering new all-flash arrays have typically implemented system-wide redundancy to gain the benefits of using all devices for I/O and to evenly distribute write I/O to gain maximum lifetime from all solid-state components.

Violin Memory, for example, implements a proprietary Raid technology called vRaid. This distributes I/O load across all components and ensures the normal erase cycle encountered when writing to SSD does not affect the performance of other I/O host traffic.

The erase cycle can also delay read I/O from SSD devices, a performance problem that can be mitigated using Raid. Pure Storage’s FlashArray uses a proprietary Raid known as Raid 3D. This treats read I/O delays on a single flash drive as a device failure and reads the data by rebuilding the read request from other devices in the same parity group. This is only possible because of the high performance and consistent response times of solid-state devices.
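The general technique of treating a slow device as a failed one can be sketched in Python as follows. This is not Pure Storage's implementation, whose details are proprietary; the function names, the single-parity XOR layout and the 5ms threshold are all assumptions made for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def read_block(index, drives, parity, latency_ms, threshold_ms=5.0):
    """Read a block directly, or rebuild it from its peers if the drive is slow.

    latency_ms holds a hypothetical current response time for each drive.
    """
    if latency_ms[index] <= threshold_ms:
        return drives[index], "direct"
    # Treat the slow drive as failed: rebuild its block from the other
    # members of the parity group instead of waiting for it.
    peers = [block for i, block in enumerate(drives) if i != index]
    return xor_blocks(peers + [parity]), "reconstructed"

drives = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(drives)
latency_ms = [0.2, 40.0, 0.3]  # drive 1 is assumed to be busy with an erase

print(read_block(0, drives, parity, latency_ms))
print(read_block(1, drives, parity, latency_ms))
```

The reconstructed read costs extra I/O on the other drives, which is only worthwhile when those drives respond quickly and predictably.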

SSD Raid in products

Raid has limitations and these are being experienced as individual disk capacities scale into many terabytes. Building arrays from the ground up – especially using SSDs and flash components – offers the opportunity to be creative with new models of data protection that extend the Raid paradigm.

Of the established storage suppliers, only Hitachi Data Systems (HDS) has developed a bespoke flash module, with all other suppliers treating SSDs as traditional hard drives in their Raid implementations. However, as we move forward with new array designs, the traditional view of Raid will become a thing of the past.
