When should hard drive capacity be part of the purchasing equation?

When it comes to purchasing disc drives for consolidated storage, a key question is whether the additional benefits of the best currently available products justify their cost, or whether they are an unnecessary expense.

A storage manager purchasing disc storage needs to consider two factors: required capacity and required performance.

Required capacity is reasonably easy to determine. This involves looking at the current capacity, estimating the data growth rate over a number of years, and arriving at a figure. For example:

Current capacity = 10 TB
Estimated growth per annum = 50%
Number of years' growth to purchase = 3
Additional storage required = 5 TB + 7.5 TB + 11.25 TB = 23.75 TB
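
The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the compound-growth estimate above; the function name and parameters are made up for the example, and it assumes growth is applied once per year to the running total.

```python
def growth_to_purchase(current_tb, annual_growth, years):
    """Return the capacity added in each year and the total to purchase."""
    added_per_year = []
    capacity = current_tb
    for _ in range(years):
        added = capacity * annual_growth  # growth compounds on the running total
        added_per_year.append(added)
        capacity += added
    return added_per_year, sum(added_per_year)


added, total = growth_to_purchase(current_tb=10, annual_growth=0.5, years=3)
print(added)  # [5.0, 7.5, 11.25]
print(total)  # 23.75
```

Changing the growth rate or the number of years shows how quickly the figure moves: the estimate is far more sensitive to the growth assumption than to the starting capacity.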

If the SAN-attached host count for a company with the capacity profile above were currently 50, you would expect it to grow in a similar fashion. After the same three-year period, the host count would be approximately 170. Using 105 drives of 300 GB each in a 6+1 RAID5 configuration would give approximately 24 TB of capacity over 15 RAID groups, equating to just over 15 hosts per RAID group.
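
As a rough check on the host projection and the RAID layout, the following sketch applies the same 50% growth to the host count and counts the RAID groups. The function names are hypothetical, and the capacity is calculated in decimal units before any formatting overhead, so it comes out higher than the approximately 24 TB quoted above; attributing the difference to overhead or binary units is an assumption.

```python
def projected_hosts(current_hosts, annual_growth, years):
    """Host count after compounding growth for the given number of years."""
    return current_hosts * (1 + annual_growth) ** years


def raid5_layout(total_drives, data_drives, parity_drives, drive_gb):
    """Return (number of RAID groups, data capacity in decimal TB)."""
    group_size = data_drives + parity_drives
    groups = total_drives // group_size
    # Only the data drives' worth of space in each group holds user data.
    data_tb = groups * data_drives * drive_gb / 1000
    return groups, data_tb


hosts = projected_hosts(current_hosts=50, annual_growth=0.5, years=3)
groups, data_tb = raid5_layout(total_drives=105, data_drives=6,
                               parity_drives=1, drive_gb=300)
print(hosts)            # 168.75, i.e. approximately 170
print(groups, data_tb)  # 15 27.0
```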

The performance of a typical disc drive or RAID group can be assessed either by the number of I/O operations per second (IOPS), or by the data transfer rate in megabytes per second (MB/sec).

The sustained data transfer rate for a particular disc drive is fairly easy to measure and is readily available from vendor specification documents. However, IOPS performance can differ significantly, depending on the data being accessed, the block size on the drive, and the percentage of reads and writes that occur on the data.

In general, a workload that achieves high IOPS delivers low MB/sec, and vice versa. High IOPS performance is generally required by applications that perform a large number of random reads, such as databases, whereas high throughput is generally required by applications that move large volumes of data, such as streaming video or backup.

The newest 300 GB Fibre Channel drives have a maximum sustained transfer rate of more than 100 MB/sec, so a 6+1 RAID group should be able to achieve around 600 MB/sec across its six data drives. In our example there would be 15 hosts connected to the RAID group, meaning an average throughput per host of around 40 MB/sec, which may be sufficient.
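
The per-host figure is a simple division, sketched below. The function name is hypothetical, and the calculation is a best-case estimate: it assumes every drive streams at its full sustained rate and that all hosts share the group's bandwidth evenly.

```python
def per_host_throughput(drive_mb_s, data_drives, hosts_per_group):
    """Return (RAID group MB/sec, average MB/sec per host), best case."""
    # Counts six drives' worth of bandwidth, matching the 600 MB/sec estimate.
    group_mb_s = drive_mb_s * data_drives
    return group_mb_s, group_mb_s / hosts_per_group


group_mb_s, host_mb_s = per_host_throughput(drive_mb_s=100, data_drives=6,
                                            hosts_per_group=15)
print(group_mb_s, host_mb_s)  # 600 40.0
```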

However, if the block size on the drives were small in order to optimise performance for databases, the average throughput available per host would be reduced substantially, which may lead to performance degradation.
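
To see why small blocks reduce throughput, note that throughput is simply IOPS multiplied by block size. The sketch below uses purely illustrative inputs: 150 random IOPS per drive and the block sizes shown are assumptions, not measured or vendor figures, and the function name is hypothetical. The 600 MB/sec ceiling and the 15 hosts per group come from the example above.

```python
def per_host_mb_s(iops_per_drive, data_drives, block_kb, hosts, ceiling_mb_s):
    """Average MB/sec per host when the RAID group is limited by random IOPS."""
    # Throughput = IOPS x block size (decimal units: 1000 KB per MB).
    iops_limited_mb_s = iops_per_drive * data_drives * block_kb / 1000
    # The group can never exceed its sequential transfer ceiling.
    group_mb_s = min(iops_limited_mb_s, ceiling_mb_s)
    return group_mb_s / hosts


ASSUMED_IOPS_PER_DRIVE = 150  # hypothetical figure, for illustration only

for block_kb in (8, 64, 256):  # illustrative block sizes
    mb_s = per_host_mb_s(ASSUMED_IOPS_PER_DRIVE, data_drives=6,
                         block_kb=block_kb, hosts=15, ceiling_mb_s=600)
    print(f"{block_kb:>3} KB blocks: {mb_s:.2f} MB/sec per host")
```

Under these assumptions, an 8 KB block size leaves each host with well under 1 MB/sec, against the 40 MB/sec suggested by the sustained transfer rate alone.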

It is imperative to take into account the differing requirements of applications and servers, both when purchasing data storage capacity and when deciding where to place hosts on that capacity. There is no benefit in purchasing high-capacity disc drives if those drives do not have the performance to keep up with the hosts attached to them. Buying a greater number of lower-cost drives to create the same capacity may seem like a waste of money, but this additional cost may be insignificant when judged against the cost of a poorly performing storage infrastructure.

About the author: Steve Pinder is a principal consultant at GlassHouse Technologies (UK) Ltd. He has more than 11 years' experience in backup and storage technologies and has been involved in many deployments for companies of varying sizes, with responsibilities ranging throughout the sales and deployment lifecycle. Prior to working for GlassHouse, Steve was an IT contractor concentrating on backup and network management roles. He has a BSc Hons in Computer and Communication Systems.
