Before we talk about chunks, it helps to set the context in which RAID chunk size matters. That context is striping.
RAID systems take a number of disks and present them as one drive to the user. Data is written to the drives using mirroring (replicating the same data across two or more drives), striping (distributing chunks of data from the same file across the drives that form the RAID array), or a combination of the two. Striping is often used in conjunction with parity data, which is redundancy information calculated from the data written to the drives. If a drive fails, the parity and the data on the surviving drives are used to rebuild the contents of the lost drive.
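To make the rebuild idea concrete, here is a minimal sketch that assumes parity is a byte-wise XOR of the data chunks in a stripe, as in RAID 5-style layouts. The function names xor_parity and rebuild_missing are hypothetical, chosen for illustration; a real controller does this in firmware or in the kernel, not in Python.

```python
from functools import reduce


def xor_parity(chunks):
    """Return the byte-wise XOR of a sequence of equally sized chunks."""
    chunks = list(chunks)
    if not chunks:
        raise ValueError("need at least one chunk")
    size = len(chunks[0])
    if any(len(c) != size for c in chunks):
        raise ValueError("all chunks must be the same size")
    # zip(*chunks) walks the chunks column by column, one byte from each.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(surviving_chunks, parity):
    """Recover the one missing chunk of a stripe from the survivors plus parity."""
    return xor_parity(list(surviving_chunks) + [parity])


# One stripe across three data drives, with parity stored on a fourth drive.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(stripe)

# Pretend the second drive failed and rebuild its chunk.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
```

Because XOR is its own inverse, XORing the parity with every surviving chunk leaves exactly the chunk that was lost, which is why a single failed drive can be rebuilt.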
Striping brings big gains in performance. By writing data in small chunks across several drives, the performance of those drives can be aggregated. For example, where a single drive is limited by its own I/O performance and rotational speed, a write to an array made up of several drives combines the I/O of all those drives. So, an array made up of four drives, each capable of 50 IOPS, would together deliver a total I/O performance of up to 200 IOPS.
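The arithmetic behind that figure is simple enough to sketch. The helper below, with the hypothetical name aggregate_iops, assumes the ideal case in which every drive is kept busy in parallel, so it gives an upper bound and not a prediction of real-world performance.

```python
def aggregate_iops(per_drive_iops, drive_count):
    """Ideal upper bound on array IOPS, assuming perfectly parallel drives."""
    if per_drive_iops < 0 or drive_count < 1:
        raise ValueError("need a non-negative IOPS figure and at least one drive")
    return per_drive_iops * drive_count


# The four-drive example from the text.
total = aggregate_iops(50, 4)
```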
RAID chunk size
When striping data across drives, you need to be sure that data is spread across them evenly and that the amount of data written to each disk suits the type of file you're working with. The piece of a stripe that's written to each drive is called a chunk; you can control chunk size in storage subsystem management software.
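As an illustration of how chunks land on drives, here is a minimal sketch that assumes plain round-robin striping with no parity (a RAID 0-style layout). The function name locate is hypothetical; actual arrays may rotate parity or use other layouts.

```python
def locate(offset, chunk_size, drive_count):
    """Map a logical byte offset to (drive index, stripe number, offset in chunk).

    Assumes simple round-robin striping: chunk 0 goes to drive 0,
    chunk 1 to drive 1, and so on, wrapping around after the last drive.
    """
    if offset < 0 or chunk_size < 1 or drive_count < 1:
        raise ValueError("invalid offset, chunk size or drive count")
    chunk_index, offset_in_chunk = divmod(offset, chunk_size)
    stripe, drive = divmod(chunk_index, drive_count)
    return drive, stripe, offset_in_chunk


# With 64 KB chunks on four drives, the sixth chunk (index 5)
# sits on the second drive, in the second stripe.
position = locate(5 * 65536 + 10, 65536, 4)
```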
The RAID chunk size should suit the I/O characteristics of the data you're working with. The key here is the size of the average I/O request you're going to place on the RAID array. As a rule of thumb, if I/O requests will be large, you should opt for smaller RAID chunk sizes so that each request is spread across many drives, and if I/O requests will be small, you should go for larger chunks so that each request can be served by a single drive.
For example, if your business works with lots of video or large image files, you will want to ensure maximum throughput. That means you will need to spread data across individual drives as much as possible. For this use case smaller RAID chunk sizes (for example, 512 bytes -- one block -- to 8 KB) fit the bill because you want to take data from one drive while the others seek the next chunks to be read.
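To see why chunk size changes how many drives a single request keeps busy, consider this rough sketch. It again assumes simple round-robin striping without parity, and the function name drives_touched is hypothetical.

```python
def drives_touched(offset, length, chunk_size, drive_count):
    """Count the distinct drives one contiguous request reads from or writes to.

    Assumes round-robin striping, so consecutive chunks sit on consecutive drives.
    """
    if offset < 0 or length < 1 or chunk_size < 1 or drive_count < 1:
        raise ValueError("invalid request or array geometry")
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    chunk_count = last_chunk - first_chunk + 1
    return min(chunk_count, drive_count)


KB = 1024
# A large 1 MB read with small 8 KB chunks keeps all four drives busy.
large_request = drives_touched(0, 1024 * KB, 8 * KB, 4)
# A small 4 KB read with large 64 KB chunks stays on a single drive.
small_request = drives_touched(0, 4 * KB, 64 * KB, 4)
```

In this model the large request touches all four drives, while the small request touches one and leaves the other three free to serve other requests.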
This was first published in November 2010