RAID chunk size: the key to RAID striping performance

RAID chunk size in striped RAID levels should be set to give the best possible performance for your data profile. Read this tip to learn why chunk size is important and what's best for you.

By Antony Adshead, UK Bureau Chief

RAID chunk size is an important concept to be familiar with if you're setting up a RAID level that stripes data across drives, such as RAID 0, RAID 0+1, RAID 3, RAID 4, RAID 5 and RAID 6.

But, before we talk about chunks, it's important to set the context in which RAID chunk size is important. That context is striping.

RAID striping

RAID systems take a number of disks and present them as one drive to the user. Data is written to the drives using mirroring (replicating the same data across two or more drives), striping (distributing chunks of data from the same file across the drives that form the RAID array), or a combination of the two. Often, RAID striping is used in conjunction with parity data, which is redundant information calculated from the data chunks in each stripe. If a drive fails, its contents can be recalculated from the parity and the data on the surviving drives when the drive is rebuilt.
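As a rough illustration of how parity lets a lost chunk be rebuilt, here is a minimal Python sketch. It assumes simple XOR parity over one stripe, as commonly used in RAID 5; RAID 6 adds a second, differently calculated parity that is not shown. The function name and the sample chunk contents are invented for the example.

```python
from functools import reduce

def xor_parity(chunks):
    """Return the byte-wise XOR of a list of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

# Three data chunks belonging to one stripe (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Simulate losing the second drive, then rebuild its chunk
# from the surviving data chunks plus the parity chunk.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])
```

XOR works here because applying it twice with the same value cancels out, so combining the survivors with the parity leaves only the missing chunk.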

Striping brings big gains in performance. By writing data in small chunks across several drives, the performance of those drives can be aggregated. For example, where a single drive is limited by its own I/O performance, which depends on factors such as its rotational speed (RPM), an array made up of several drives can combine the I/O of all of them. So, an array of four drives, each capable of 50 IOPS, could deliver up to 200 IOPS in total.

RAID chunk size

When striping data across drives, you need to be sure that data is being spread across them evenly and that the amount of data written to each disk suits the type of file you're working with. The piece of a stripe that's written to each drive is called a chunk; you can usually set the chunk size in the storage subsystem's management software.
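To make the relationship between chunks, stripes and drives concrete, the following Python sketch maps a logical byte offset to a position in the array. It assumes a plain RAID 0 layout with no parity, in which chunks are handed out to the drives in round-robin order; real controllers may lay data out differently, and the function name and figures are illustrative only.

```python
def locate(offset, chunk_size, num_drives):
    """Map a logical byte offset to (drive, stripe, offset_in_chunk).

    Assumes a simple RAID 0 layout: chunk 0 goes to drive 0,
    chunk 1 to drive 1, and so on, wrapping around after the last drive.
    """
    chunk_index = offset // chunk_size
    drive = chunk_index % num_drives      # which drive holds the chunk
    stripe = chunk_index // num_drives    # which stripe (row) it sits in
    return drive, stripe, offset % chunk_size

chunk_size = 64 * 1024  # 64 KB chunks (assumed)
for offset in (0, 64 * 1024, 200 * 1024, 256 * 1024):
    print(offset, locate(offset, chunk_size, num_drives=4))
```

With four drives and 64 KB chunks, one full stripe holds 256 KB, so the offset at 256 KB wraps back to the first drive in the next stripe.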

The RAID chunk size should suit the I/O characteristics of the data you're working with. The key here is the size of the average I/O request you're going to place on the RAID array. As a rule of thumb, if I/O requests will be large, you should opt for smaller RAID chunk sizes, and if I/O requests will be small, you should go for larger chunks.

For example, if your business works with lots of video or large image files, you will want to ensure maximum throughput. That means you will need to spread each request across as many drives as possible. For this use case, smaller RAID chunk sizes (for example, from 512 bytes, or one block, to 8 KB) fit the bill because you want to take data from one drive while the others seek the next chunks to be read.

At the other extreme in terms of use cases would be running a database in which the amount of data read on each operation is small, say, up to 4 KB. Here you want a single I/O to be dealt with by one drive with one seek action rather than be split between more than one drive and multiple seeks. So, for use cases such as databases and email servers, you should go for a bigger RAID chunk size, say, 64 KB or larger.
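The two use cases above can be compared with a small Python sketch that counts how many drives a single request touches. It assumes a four-drive RAID 0 array with round-robin chunk placement; the function name, offsets and request sizes are hypothetical and chosen only to illustrate the rule of thumb.

```python
def drives_touched(offset, length, chunk_size, num_drives):
    """Count the drives involved in one I/O request (simple RAID 0 model)."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    chunks = last_chunk - first_chunk + 1
    # Chunks are placed round-robin, so a request can never
    # involve more drives than the array contains.
    return min(chunks, num_drives)

KB = 1024
MB = 1024 * KB

# Small database-style read: 4 KB starting at a 6 KB offset.
for chunk in (8 * KB, 64 * KB):
    print("4 KB read,", chunk // KB, "KB chunks:",
          drives_touched(6 * KB, 4 * KB, chunk, num_drives=4))

# Large sequential read: 1 MB starting at offset 0.
for chunk in (8 * KB, 1 * MB):
    print("1 MB read,", chunk // KB, "KB chunks:",
          drives_touched(0, 1 * MB, chunk, num_drives=4))
```

Under these assumptions, the 4 KB read straddles two drives with 8 KB chunks but stays on one drive with 64 KB chunks, while the 1 MB read uses all four drives with 8 KB chunks and only one with 1 MB chunks.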
