A quick check on the website of any online retailer shows that storage is cheap. A consumer can get a 4TB capacity drive for less than £100. That’s an incredible £0.025 per GB.
But while raw drive prices are measured in pennies per gigabyte, the actual cost of storage for the enterprise is much higher, and more difficult to calculate. There are many factors that need to be considered, including media type, platform hardware and software costs.
So how can we work out a fair $/GB price? Is this still the right metric, and how do we use it to compare products from multiple suppliers?
Over the past 15 years of shared storage in the enterprise, the $/GB metric has been the de facto standard in product price comparison.
In the early days of SAN storage, things were pretty simple: there were 10,000rpm drives and RAID 1 systems. Things quickly became more complex, with other protection systems (RAID 5, RAID 6) and an increase in performance (to 15,000rpm drives) followed by a scaling back (to 7,200rpm SATA drives), all supported within the same array.
From 2009, we saw the introduction into enterprise platforms of flash storage with hot data dynamically assigned to flash as a tier or cache, making it even more difficult to calculate a simple $/GB figure for capacity.
So what is the best strategy to price storage? Here are some options to consider:
Start with raw $/GB. In any calculation, it makes sense to begin with raw capacity numbers before any reduction for data protection (for example, RAID) or any multipliers that result from data deduplication or compression. Regardless of platform features, raw capacity allows a fair measure of one supplier’s platform against another.
Compare like for like. It’s not fair to compare SATA drives with flash drives in terms of performance or capacity, so don’t try to. Instead, look at either performance or capacity requirements (whichever is the most appropriate for the media) and base your calculation on that.
Check for binary or decimal GB. Yes, there is a difference. Storage suppliers quote capacities based on multipliers of 1,000, so 1GB = 10^9 bytes. But, traditionally, storage has been sized using binary values, where 1GB = 1,073,741,824 bytes = 1,024^3 = 2^30 bytes (technically known as a gibibyte, or GiB). The difference between these two figures is about 7%, which can be significant in sizing calculations.
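As a rough illustration of how the two units diverge, the short Python sketch below converts a supplier-quoted decimal capacity into binary units. The function names are invented for this example, and the 4TB input is simply the consumer drive size mentioned earlier.

```python
# Decimal vs binary capacity units (illustrative sketch).
DECIMAL_GB = 10**9   # 1 GB as quoted by storage suppliers
BINARY_GIB = 2**30   # 1 GiB = 1,024^3 = 1,073,741,824 bytes


def gb_to_gib(decimal_gb: float) -> float:
    """Convert a capacity in decimal gigabytes to binary gibibytes."""
    return decimal_gb * DECIMAL_GB / BINARY_GIB


def unit_gap_percent() -> float:
    """How much larger a GiB is than a GB, as a percentage."""
    return (BINARY_GIB / DECIMAL_GB - 1) * 100


# A drive sold as 4TB (4,000 decimal GB) holds noticeably fewer binary GiB.
print(f"4,000 GB = {gb_to_gib(4000):.2f} GiB")
print(f"A GiB is {unit_gap_percent():.1f}% larger than a GB")
```

When comparing quotes, it helps to convert every figure to the same unit before dividing cost by capacity.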
Build a standard model. Enterprise storage consists of more than just disk hardware. There are controllers, back-end directors, front-end ports, backplanes, management nodes and other pieces required to run a shared storage solution. Suppliers sell either as a package or price each component individually. On top of that, there is software. Some suppliers bundle software features for “free”, and some charge for each component. Either way, you’re paying for software in the cost of the product, explicitly or otherwise, and it should form part of your $/GB calculation.
Supplier comparisons cannot be made without having a similar configuration to compare against. This is why establishing design requirements is so important and why new storage deployments can be so complex. In legacy platforms from the last 15 years, a lot of time and design effort was invested in scaling the back-end and front-end connectivity of storage arrays. Today this is less of an issue because of the over-performance of flash systems (but it will become an issue again).
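One way to put a standard model into practice is to add up every hardware and software line item in each quote and divide by raw capacity. The Python sketch below illustrates the idea; the `ArrayQuote` class, the supplier names and all of the prices are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass, field


@dataclass
class ArrayQuote:
    """A supplier quote for a comparable configuration (hypothetical model)."""
    supplier: str
    raw_capacity_gb: float
    hardware_costs: dict[str, float] = field(default_factory=dict)
    software_costs: dict[str, float] = field(default_factory=dict)

    def total_cost(self) -> float:
        # Software counts whether it is bundled or itemised.
        return sum(self.hardware_costs.values()) + sum(self.software_costs.values())

    def raw_cost_per_gb(self) -> float:
        if self.raw_capacity_gb <= 0:
            raise ValueError("raw capacity must be positive")
        return self.total_cost() / self.raw_capacity_gb


# Supplier A sells a single package with software "free"; B itemises everything.
# All figures are invented for illustration.
quote_a = ArrayQuote("Supplier A", 100_000, hardware_costs={"package": 160_000})
quote_b = ArrayQuote(
    "Supplier B",
    100_000,
    hardware_costs={"drives": 60_000, "controllers": 40_000, "ports": 10_000},
    software_costs={"snapshots": 15_000, "replication": 25_000},
)

for quote in (quote_a, quote_b):
    print(f"{quote.supplier}: ${quote.raw_cost_per_gb():.2f}/GB raw")
```

Keeping hardware and software in separate buckets makes it easier to see where a bundled price hides the software cost.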
Be careful of supplier optimisation claims. Raw storage capacity is a good starting point for cost calculations, but many suppliers push the concept of “effective capacity”. This metric is the usable capacity of a system after data protection overhead and space optimisation techniques are taken into consideration. Features such as data deduplication and compression can result in very high savings ratios of 10:1 or more.
The actual space savings achieved may be somewhat less than suppliers advertise, as they are entirely dependent on the mix of data to which the techniques are applied. Virtual machines and desktops deduplicate well. Meanwhile, OLTP databases compress well but are not great for dedupe, and very mixed datasets may not deduplicate very much at all. Values will also change over time and be affected by things like the roll-out of new operating systems (OS).
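To see how sensitive an effective-capacity price is to the savings ratio, consider the Python sketch below. It assumes a simple model in which usable capacity is raw capacity less a protection overhead, and effective capacity is usable capacity multiplied by a data reduction ratio. The function names, the system price and the ratios are all assumptions made for illustration, not supplier figures.

```python
def usable_capacity_gb(raw_gb: float, protection_overhead: float) -> float:
    """Raw capacity less the fraction consumed by data protection (0 to <1)."""
    if not 0 <= protection_overhead < 1:
        raise ValueError("protection_overhead must be in [0, 1)")
    return raw_gb * (1 - protection_overhead)


def effective_capacity_gb(raw_gb: float, protection_overhead: float,
                          reduction_ratio: float) -> float:
    """Usable capacity scaled by a dedupe/compression ratio (10.0 means 10:1)."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return usable_capacity_gb(raw_gb, protection_overhead) * reduction_ratio


def cost_per_effective_gb(total_cost: float, raw_gb: float,
                          protection_overhead: float,
                          reduction_ratio: float) -> float:
    return total_cost / effective_capacity_gb(
        raw_gb, protection_overhead, reduction_ratio)


# Hypothetical $200,000 system with 100,000 GB raw and 20% protection overhead
# (for example, RAID 5 in a 4+1 layout). The ratios are illustrative only.
for label, ratio in [("advertised 10:1", 10.0),
                     ("moderate 2:1", 2.0),
                     ("no reduction 1:1", 1.0)]:
    price = cost_per_effective_gb(200_000, 100_000, 0.2, ratio)
    print(f"{label}: ${price:.2f} per effective GB")
```

In this model the same system costs ten times more per effective gigabyte if the data does not reduce at all, which is why it is worth testing a supplier's ratio against a sample of your own data.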
Apart from the ability to validate one supplier calculation against another, why would we want to boil storage down to a simple $/GB number? After all, suppliers offer a range of features that cannot be compared by a simple metric alone. The answer is that in the move to service-based or private cloud infrastructure deployments, IT departments want to ensure internal customers are correctly charged for their storage. For that reason, it is essential to understand the cost base. Also, as organisations look to the cloud as an option, there needs to be some way to measure whether moving data to the public cloud is cost-effective.
So, even if $/GB is not used as a method of comparing one supplier against another, it may be just as important in delivering storage as a service.
$/GB or $/IOPS?
There is a saying in the storage industry that capacity is free and performance costs. This is true to an extent because, for example, in raw capacity terms, the price (per TB) of a 15,000rpm high-performance drive can be 20x to 30x more than a SATA 7,200rpm device.
The multiplier also applies to flash drives, which have a similar price differential. It is interesting to see that $/GB parity between high-performance HDD and flash drives is being reached, making the 15,000rpm HDD pretty much obsolete.
Although HDD and SSD prices are nearing parity, prices charged by all-flash array makers are still at a premium compared to disk-based systems. Here, many flash suppliers highlight the $/IOPS measurement and the performance benefits of their products.
This approach was acceptable a couple of years ago, but as flash prices have dropped and larger capacity drives have become commonplace, some all-flash arrays have hit a price point around $2/GB and we are seeing a swing back to the use of $/GB as a comparison point.
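The choice of metric can change which product looks cheapest. The Python sketch below compares two hypothetical systems at the same purchase price; the `System` type, the capacities and the IOPS figures are invented for illustration and do not describe any real product.

```python
from typing import NamedTuple


class System(NamedTuple):
    name: str
    price: float        # purchase price in dollars
    capacity_gb: float  # capacity in GB
    iops: float         # rated I/O operations per second


def cost_per_gb(system: System) -> float:
    return system.price / system.capacity_gb


def cost_per_iops(system: System) -> float:
    return system.price / system.iops


# Two invented systems at the same price point.
disk_array = System("Disk-based array", 100_000, 200_000, 20_000)
flash_array = System("All-flash array", 100_000, 50_000, 250_000)

for system in (disk_array, flash_array):
    print(f"{system.name}: ${cost_per_gb(system):.2f}/GB, "
          f"${cost_per_iops(system):.2f}/IOPS")

systems = [disk_array, flash_array]
print("Cheapest by $/GB:", min(systems, key=cost_per_gb).name)
print("Cheapest by $/IOPS:", min(systems, key=cost_per_iops).name)
```

With these assumed numbers the disk-based array wins on $/GB and the all-flash array wins on $/IOPS, so the metric should follow the requirement (capacity or performance) rather than the supplier's preference.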
Hardware suppliers are notoriously cagey about releasing any information on pricing unless it gives them some advantage. One notable example is the coverage of pricing in all-flash solutions. Here are some numbers that are advertised: Kaminario $2/GB, Tegile $1.10/GB, HP 3PAR $2/GB.
Dig a little further and you can find other interesting numbers online.
PEPPM, the technology bidding and purchasing program, provides line item pricing for a range of suppliers in the US public sector (schools, libraries, and so on). These figures are only indicative (and are based on single-item purchases), but they do give some idea of pricing, including: Pure Storage FA450 at $20/GB (line item FA-450-58TB-FC8-SH5), EMC XtremIO starter 5TB X-brick at $24.50/GB (line item X02-012-400-E-U) and disks for NetApp EF series at $10/GB not including controllers, shelves and other components (line item EF-X4059A-12-AD-R6-C).
It is always worth taking a little time to research public lists of storage prices as they help in the negotiation phase. More than ever, the internet makes it much more difficult for suppliers to hide behind secret pricing strategies and a little work can deliver big rewards.