Data is a valuable commodity that sits at the heart of our IT systems, whether on-premise or in the public cloud. The need to retain that data on persistent storage media has been with us since the invention of tape and disk in the 1950s.
The benefit of persistence comes at a cost, because storage devices – disk, tape or flash – are nowhere near as fast as modern processors and system memory. Good storage performance is therefore a constant effort to narrow the gap between persistent media and the rest of the system.
We keep data on external media for two reasons: persistence and protection.
Persistence means our data is retained after the application is shut down and/or the application server is switched off. System memory (DRAM) isn’t persistent, so data must be periodically written to the media that retains the contents in the event of a server or application failure.
Protection is also essential. Servers fail and disasters can happen, so without the ability to duplicate and keep data across multiple media, we would have a problem. Protection mechanisms such as RAID, erasure coding and snapshots are also used to guard data against physical and logical corruption, as well as common “user errors”.
Storage performance metrics
There are three main metrics to measure storage performance.
- Latency: Latency is a measure of the response time of a device. System DRAM latency speeds are measured in nanoseconds (ns), flash in microseconds (µs) and hard drives in milliseconds (ms).
- Bandwidth: The maximum rate at which a device or interface can transfer data, typically quoted in megabits per second (Mbps) or gigabits per second (Gbps).
- Throughput: The rate of data transfer a device actually achieves in practice, usually measured in megabytes per second (MBps) or gigabytes per second (GBps).
Although they appear similar, bandwidth and throughput are subtly different. Hard-disk drives (HDDs) and solid-state drives (SSDs), for example, will have a maximum bandwidth for their interface, but different practical throughput figures depending on the input/output (I/O) profile – sequential or random, read or write.
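The distinction can be made concrete with some simple arithmetic. The sketch below uses illustrative figures, not vendor specifications, to convert a quoted interface bandwidth into megabytes per second and to work out the throughput a drive actually delivers at a given I/O rate and block size:

```python
def interface_bandwidth_mb_s(gbps: float) -> float:
    """Convert a quoted interface bandwidth (Gbps) to MB/s."""
    # 1 Gbps = 1e9 bits/s; 8 bits per byte; 1e6 bytes per MB
    return gbps * 1e9 / 8 / 1e6

def practical_throughput_mb_s(iops: float, block_bytes: int) -> float:
    """Throughput actually achieved at a given I/O rate and block size."""
    return iops * block_bytes / 1e6

# A SATA SSD on a 6 Gbps link has a bandwidth ceiling of 750 MB/s...
print(interface_bandwidth_mb_s(6))              # 750.0
# ...but at 90,000 random 4 KiB reads per second it delivers far less
print(practical_throughput_mb_s(90_000, 4096))  # 368.64
```

The same drive doing large sequential reads would get much closer to the interface ceiling, which is why throughput figures only make sense alongside the I/O profile they were measured with.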
In an ideal world, all data would reside in memory and be accessed at the fastest performance possible, but system DRAM is volatile and expensive, and servers have a limited capacity. Most applications don’t need to access all their data all the time, so cost becomes a factor in determining the best location to store data, based on how quickly and frequently we need to access it.
So, how can we make best use of storage resources? What can be tuned and how can we get the best performance for the lowest cost?
Storage media performance comparison
As we dig deeper, we need to look at the hierarchy of storage media available to the enterprise.
- DRAM: The fastest storage media in terms of performance and latency, but contents are volatile. DRAM capacity doesn’t scale well and cannot be easily shared between servers. DRAM is byte-addressable.
- NVDIMM: Very fast persistent DRAM-like media using the same DIMM form factor and employing either flash or other techniques to retain contents when powered off. Not as fast as DRAM and has the same issues of accessibility and scalability. NVDIMM is usually byte-addressable.
- Flash: Very fast persistent storage media, with good scalability, either in a single server or as part of a storage array. Much cheaper than DRAM and block-addressable. Flash comes in a range of price/performance options, such as MLC and TLC cell types.
- Hard drives: A relatively slow persistent storage media that is being pushed towards archive and backup. Hard drives are also block-addressable.
Each of these media can be used individually or in combination to provide a range of storage performance options.
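To put the hierarchy in perspective, the short sketch below uses rough order-of-magnitude access latencies – illustrative figures only, as real devices vary widely – to show how far apart the tiers sit:

```python
# Approximate access latency per tier, in nanoseconds (illustrative only)
LATENCY_NS = {
    "DRAM": 100,               # ~100 ns
    "NVDIMM": 300,             # slower than DRAM, still sub-microsecond
    "Flash (NVMe)": 100_000,   # ~100 microseconds
    "Hard drive": 10_000_000,  # ~10 ms of seek and rotational delay
}

for media, ns in LATENCY_NS.items():
    print(f"{media:<14} {ns:>12,} ns  ({ns // LATENCY_NS['DRAM']:,}x DRAM)")
```

On these rough figures a hard drive is some 100,000 times slower than DRAM, which is why where data sits in the hierarchy dominates every other tuning decision.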
Storage media server options
Storage can be deployed directly in the server, where getting data as close as possible to the central processing unit reduces the I/O path and so reduces latency.
Flash provides higher levels of performance than hard drives. A cost/performance balance can be struck by using varying amounts of flash and hard disk drives in combination, depending on the performance needs of the data. Flash and DRAM can be used as a cache to store active data, while retaining inactive data on HDD.
Depending on the hit ratio – the proportion of requests served from cache – I/O performance will be generally predictable, but read requests for data not in the cache take a performance hit because the data must be read from hard drive. One solution is to use a mix of cheap and expensive flash to get the right ratio of price/performance.
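The effect of the hit ratio on average latency can be modelled as a simple weighted sum. The figures below are hypothetical, chosen to match the flash-cache-over-HDD arrangement described above:

```python
def effective_latency_us(hit_ratio: float, cache_us: float,
                         backing_us: float) -> float:
    """Average read latency for a simple two-tier cache/backing-store model."""
    return hit_ratio * cache_us + (1 - hit_ratio) * backing_us

# 95% of reads hit a flash cache (~100 us); misses fall through to HDD (~10 ms)
print(effective_latency_us(0.95, 100, 10_000))  # ~595 us
# A drop to an 80% hit ratio more than triples the average
print(effective_latency_us(0.80, 100, 10_000))  # ~2,080 us
```

Even a modest fall in hit ratio has an outsized effect on average latency, which is why sizing the cache for the working set matters so much.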
Another consideration with storage in the server is the need to provide protection against device failure. If a drive fails, the server may need to be shut down for replacement, unless the device is hot swappable. If the server fails, data on the internal devices will be inaccessible, so replication of data is required. This adds extra latency to I/O, based on the speed of the network that connects a group of servers together.
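That extra latency can be roughed out for a synchronously replicated write, which cannot complete until the remote copy is acknowledged. The figures below are hypothetical:

```python
def replicated_write_us(local_us: float, rtt_us: float,
                        remote_us: float) -> float:
    """Synchronous replication: local write + network round trip + remote write."""
    return local_us + rtt_us + remote_us

# 20 us local NVMe write, 100 us round trip between servers, 20 us remote write
print(replicated_write_us(20, 100, 20))  # 140.0 us -- 7x the local-only write
```

The network round trip dominates, so the interconnect between servers sets the floor for replicated write latency however fast the devices are.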
Storage array performance considerations
Shared storage arrays offer the benefits of higher availability and accessibility for data, but come with a latency penalty due to the need to traverse a storage network that could be Ethernet or Fibre Channel.
But new products are coming to market that offer solutions based on connectivity options such as RDMA, RoCE and NVMe. These newer technologies aren’t as scalable as traditional storage networks, but offer significant latency reductions.
Storage arrays can benefit from a range of media aimed at improving performance.
DRAM caching offers read and write improvements, while hybrid systems mix flash and spinning disk drives to get the best price/performance mix.
Meanwhile, all-flash systems offer guaranteed I/O performance and latency for all data, compared to the risks of “cache miss” issues already discussed above.
The use of flash has resulted in a range of techniques beyond simple caching that aim to improve throughput and latency issues. In general, though, the more flash and DRAM used relative to spinning disk, the better the performance will be.
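Most of these caching techniques rest on a policy for deciding which data stays on the fast tier. Below is a minimal sketch of one common policy, least-recently-used (LRU) eviction, assuming a block-keyed read cache – an illustration, not any vendor’s actual implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used read cache, the kind of policy a DRAM or
    flash caching layer might use to keep active data on the faster tier."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict[int, bytes] = OrderedDict()

    def get(self, block: int):
        if block not in self._data:
            return None  # cache miss: caller reads the slower backing tier
        self._data.move_to_end(block)  # mark block as recently used
        return self._data[block]

    def put(self, block: int, value: bytes):
        self._data[block] = value
        self._data.move_to_end(block)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least-recently-used block
```

Real caching layers refine this with admission policies, write-back queues and tiering heuristics, but the principle – keep the recently active working set on fast media – is the same.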
Storage network performance
Where external storage arrays are used, the storage network can be tuned to improve performance.
The latest host bus adapters (HBAs) offer very high bandwidth (32Gbps Fibre Channel, 40Gbps and 100Gbps Ethernet) with network switches delivering very low latency. In general, the faster devices provide better performance, although the cost of continually upgrading switches may be prohibitive. Throughput to storage arrays can be improved by adding more front-end connectivity (extra HBAs) and spreading the work across more connections.
There are other techniques that can be used to improve performance:
- Load balancing: This balances the use of resources across as many devices as possible. This means spreading data across multiple disks (flash or HDD) and using all available connectivity. Wide striping can be done in the server or storage array.
- Workload placement: Place more active workloads on faster devices. Flash is expensive, so use it sparingly and in the right place. NVDIMMs offer the potential to vastly improve performance for some workloads, if server availability issues can be overcome.
- Caching: We’ve already discussed caching in the array or server, but caching can also be done across both devices. DRAM and flash can be used in the server to cache I/O from external storage, either for I/O reads (no resiliency required) or writes (with resiliency required to protect against data loss).
- Storage network tuning: In more complex networks, bottlenecks can occur on shared storage ports and inter-switch links. This means looking at settings such as queue depth and buffer credits (Fibre Channel) and the topology of the network itself.
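Wide striping, mentioned under load balancing above, can be sketched as a simple mapping from logical block to device. This is a hypothetical layout; real arrays add redundancy such as RAID or erasure coding on top:

```python
def device_for_block(block: int, n_devices: int,
                     stripe_blocks: int = 8) -> int:
    """Map a logical block to a device: consecutive stripes rotate
    round-robin, so sequential and random I/O both spread across
    every device in the set."""
    return (block // stripe_blocks) % n_devices

# Blocks 0-7 land on device 0, 8-15 on device 1, and so on around the set
print([device_for_block(b, 4) for b in (0, 8, 16, 24, 32)])  # [0, 1, 2, 3, 0]
```

Because every device carries a share of every volume, no single disk becomes a hotspot and aggregate throughput scales with the number of devices.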
In all of these discussions, one thing becomes clear: to improve performance we need to measure it. Without suitable measuring tools, there is no way to quantify performance issues and whether problems have been solved.
All storage suppliers offer tools that show the performance of their products. There are also end-to-end performance tools available that show the impact of storage performance on overall application performance. These can show real-time and historical values – both essential in the continued quest to improve the performance of our storage systems.
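As a final illustration of why measurement matters, the sketch below – with made-up latency samples – shows how an average can hide the tail latency that a percentile report exposes:

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

samples_ms = [1, 1, 1, 2, 2, 2, 3, 3, 4, 50]  # one slow outlier

print(sum(samples_ms) / len(samples_ms))  # 6.9 -- mean skewed by the outlier
print(percentile(samples_ms, 50))         # 2  -- the typical request
print(percentile(samples_ms, 99))         # 50 -- the tail users actually feel
```

This is why monitoring tools report percentiles (p95, p99) alongside averages: a healthy-looking mean can coexist with a tail that makes the application feel slow.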