Storage performance is never good enough. DRAM access times are measured in nanoseconds, but DRAM is volatile. Meanwhile, even the fastest non-volatile storage – such as Intel Optane/3D XPoint – is orders of magnitude slower.
But we need persistent storage, so how can we minimise the performance cost of storing data on it while ensuring that data is delivered to the processor as efficiently as possible?
In terms of storage media, there’s an embarrassment of riches in the industry today. NAND flash storage has proliferated, with devices to meet every performance requirement at the right cost.
Today’s NAND flash storage devices are built using one of three media types: MLC, TLC or QLC. In that order, these deliver increasing capacity with slightly lower endurance and performance. Across the range, access times are in the tens of microseconds.
Compared to spinning disk hard drives, NAND is many tens or hundreds of times faster and comes at increasingly competitive prices. Simply moving to a flash array or using local flash storage will deliver huge performance improvements.
Flash is good, but drives that use SAS or SATA – most of them currently – are hampered by inherent inefficiencies in those protocols.
They were designed with hard drives in mind – SATA from the PC ATA interface and SAS from SCSI. With both, limited queuing and a lack of parallelism mean NAND performance isn’t fully exploited.
The answer is a new protocol called non-volatile memory express (NVMe). The NVMe standard reduces protocol overhead through a number of software improvements and connects devices directly to the PCIe bus. Parallel input/output (I/O) is improved, with support for up to 64K queues and 64K elements per queue.
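To put the queue figures in perspective, here is a minimal sketch of the arithmetic. The NVMe limits are the ones quoted above, with "64K" rounded to 65,536 as an assumption. The SATA figure (a single queue of 32 commands with native command queuing) is not from this article and is added only for comparison.

```python
# Upper bound on commands a host can have outstanding to a single device.
# NVMe limits are the "64K queues x 64K elements" quoted in the article,
# with 64K taken as 65,536 (a rounding assumption).
# The SATA NCQ figure (1 queue, 32 commands) is added here for comparison.

def max_outstanding(queues: int, depth: int) -> int:
    """Return the maximum number of commands in flight at once."""
    return queues * depth

NVME_QUEUES = 64 * 1024
NVME_DEPTH = 64 * 1024
SATA_QUEUES = 1
SATA_NCQ_DEPTH = 32

nvme = max_outstanding(NVME_QUEUES, NVME_DEPTH)
sata = max_outstanding(SATA_QUEUES, SATA_NCQ_DEPTH)

print(f"NVMe: {nvme:,} commands in flight (theoretical maximum)")
print(f"SATA: {sata:,} commands in flight")
print(f"Ratio: {nvme // sata:,}x")
```

These are protocol ceilings rather than figures any real device or driver is likely to reach, but they show why queuing stops being the limiting factor with NVMe.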
NVMe flash devices deliver latency figures similar to those of SAS/SATA devices, but around a 10x improvement in raw throughput (IOPS). Servers and storage arrays need to support NVMe, but where they do, the gains are impressive.
NVMe is an enabler for a new range of technologies known as storage-class memory. Traditional HDD and flash devices provide block-level access to persistent storage. That is, data is read or written a block at a time.
But, storage-class memory or SCM provides persistent media that can be accessed at the byte level. In the market today, the main SCM product is 3D XPoint, sold by Intel under the brand name Optane. The technology was originally developed in conjunction with Micron, although Micron is yet to release any products.
Initial claims for Optane were ambitious, but the reality has been much more muted. Having said that, Optane does offer much greater performance and lower latency compared to NVMe NAND flash devices. Real world figures are around 10x better IOPS than flash with around 10µs of latency (compared to 20-25µs with flash).
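The latency figures above can be turned into a rough IOPS ceiling using Little's Law (throughput equals outstanding requests divided by time per request). The sketch below is a simplified model that assumes every I/O takes exactly the quoted latency. The function name and the queue depth of 1 are illustrative choices, not measurements.

```python
# Little's Law applied to storage: IOPS = outstanding I/Os / latency per I/O.
# Latency values are the ones quoted in the article; everything else is
# an illustrative assumption.

def iops_at_queue_depth(latency_us: float, queue_depth: int = 1) -> float:
    """IOPS a device could sustain if each I/O takes latency_us microseconds."""
    return queue_depth * 1_000_000 / latency_us

OPTANE_LATENCY_US = 10.0
FLASH_LATENCY_US = (20.0, 25.0)  # best and worst figures quoted for flash

optane = iops_at_queue_depth(OPTANE_LATENCY_US)
flash_best = iops_at_queue_depth(FLASH_LATENCY_US[0])
flash_worst = iops_at_queue_depth(FLASH_LATENCY_US[1])

print(f"Optane at queue depth 1: {optane:,.0f} IOPS")
print(f"Flash at queue depth 1:  {flash_worst:,.0f} to {flash_best:,.0f} IOPS")
print(f"Latency-only advantage:  {optane / flash_best:.1f}x to {optane / flash_worst:.1f}x")
```

Under this model, the latency gap alone gives Optane roughly a 2x to 2.5x advantage at equal queue depth, so the larger IOPS difference quoted above is not explained by per-I/O latency alone.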
SCM products are marketed either as NVMe devices in a range of form factors, or as persistent memory. This leads us to another category of products that deliver even greater levels of performance.
Persistent memory products include a range of hardware devices that plug directly into the memory bus of a server. By putting persistent storage on the memory bus, latency figures are even better than NVMe, and fall into the sub-10µs level.
To use persistent memory, the server BIOS and operating system must be capable of detecting it. Otherwise, it gets treated like normal volatile memory. As a result, using persistent memory products requires the replacement of existing server technology.
Persistent memory products such as non-volatile DIMM (NVDIMM) use either flash or battery-backed memory to maintain persistence. There are also other technologies in development or in limited availability that use newer techniques like resistive RAM or MRAM.
We won’t dig into the specifics of these technologies here, but the result is the same – ultra low latency persistent storage on the memory bus.
Exploit the hierarchy
More than ever before there is a wide diversity of storage media covering a hierarchy of performance capabilities.
To quickly summarise, we now have (from fastest to slowest):
- NVMe SCM
- NVMe Flash
- SAS/SATA Flash
The key to fixing issues with storage performance is in how we use this technology effectively.
Causes of storage bottlenecks
Most storage performance issues come down to one of two scenarios: either latency is too high and needs to be reduced, or throughput is too low and needs to be increased. Raising throughput means increasing either IOPS or the amount of data transferred per second (MBps).
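The two throughput measures are linked by I/O size: data rate is IOPS multiplied by the size of each I/O. A minimal sketch of that relationship follows. The function names, block sizes and workload numbers are hypothetical, and MB is taken as 1,000,000 bytes.

```python
# Data rate and IOPS are two views of the same throughput, linked by I/O size.
# All workload numbers below are hypothetical examples.

BYTES_PER_MB = 1_000_000  # decimal megabytes

def throughput_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Data rate achieved by a given IOPS figure at a given I/O size."""
    return iops * block_size_bytes / BYTES_PER_MB

def iops_needed(target_mb_per_s: float, block_size_bytes: int) -> float:
    """IOPS required to reach a target data rate at a given I/O size."""
    return target_mb_per_s * BYTES_PER_MB / block_size_bytes

small = throughput_mb_per_s(100_000, 4 * 1024)    # small random I/O
large = throughput_mb_per_s(100_000, 64 * 1024)   # larger sequential I/O

print(f"100,000 IOPS at 4KiB:  {small:,.1f} MB/s")
print(f"100,000 IOPS at 64KiB: {large:,.1f} MB/s")
print(f"IOPS needed for 1,000 MB/s at 4KiB: {iops_needed(1_000, 4 * 1024):,.0f}")
```

The same IOPS figure can therefore describe very different data rates, which is why it helps to know the application's typical I/O size before deciding which of the two numbers is the real bottleneck.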
Eliminating these bottlenecks is achieved by targeting active data (the working set) somewhere between the processor and the data at rest. This means using fast media for caching or replacing the storage itself.
Cache has been around since computers were invented. Typically, not all data is active in a dataset at any one time and the active or working set is usually much smaller, depending on the application.
Caching aims to make fast media available to the working set, with optimisation between cost and performance a key determinant of the media used.
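The working-set idea can be illustrated with a small simulation. The sketch below uses a least-recently-used (LRU) cache, which is one common policy and not necessarily what any given product uses, and a synthetic access pattern in which 90% of requests go to a small working set. All sizes and the access pattern are hypothetical. The latency values reuse the 10µs and 25µs figures quoted earlier.

```python
from collections import OrderedDict
import random

class LRUCache:
    """Tracks which block IDs are held on fast media, evicting the least recently used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, block_id: int) -> None:
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)    # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.blocks[block_id] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

def effective_latency_us(hit_ratio: float, cache_us: float, backend_us: float) -> float:
    """Average latency when hits are served from cache and misses from the backend."""
    return hit_ratio * cache_us + (1.0 - hit_ratio) * backend_us

def run_demo(seed: int = 0) -> float:
    """Simulate a skewed workload and return the cache hit ratio."""
    rng = random.Random(seed)
    dataset_blocks = 10_000   # hypothetical total dataset
    working_set = 500         # hypothetical active subset
    cache = LRUCache(capacity=1_000)
    for _ in range(50_000):
        if rng.random() < 0.9:
            block = rng.randrange(working_set)      # hot data
        else:
            block = rng.randrange(dataset_blocks)   # occasional cold access
        cache.access(block)
    return cache.hit_ratio()

ratio = run_demo()
print(f"Hit ratio: {ratio:.2%}")
print(f"Effective latency: {effective_latency_us(ratio, 10.0, 25.0):.1f} microseconds")
```

In this model, a cache holding a tenth of the dataset serves most requests because it is larger than the working set. The trade-off between cost and performance is then a question of how much fast media is needed to cover the working set, not the whole dataset.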
Storage-class memory and persistent memory hardware are good candidates for use as cache because they deliver low-latency performance, although both require hardware or operating system support (or both). It’s also possible to use NVMe and flash drives for caching, with fewer compatibility issues.
Windows has supported storage-class memory devices since Windows Server 2016, using a technology called DAX (Direct Access). The Linux kernel has supported NVDIMM devices since release 4.2.
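From an application's point of view, direct access is typically used through memory-mapped files, where data is updated with ordinary loads and stores instead of block reads and writes. The sketch below shows that programming model only. It uses a temporary file on ordinary storage as a stand-in, so the operating system's page cache is still involved. On a filesystem actually mounted with DAX on persistent memory, the same mapping would be expected to reach the media directly. The function names are illustrative.

```python
import mmap
import os
import tempfile

def update_bytes_in_place(path: str, offset: int, payload: bytes) -> None:
    """Overwrite a few bytes of a file through a memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:   # length 0 maps the whole file
            region[offset:offset + len(payload)] = payload  # byte-level store
            region.flush()                         # ask the OS to make it durable

def demo():
    """Create a 4KiB stand-in file, change 5 bytes, and read them back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * 4096)
        update_bytes_in_place(path, 100, b"hello")
        with open(path, "rb") as f:
            data = f.read()
        return data[100:105], len(data)
    finally:
        os.remove(path)

changed, size = demo()
print(changed, size)
```

The point of the example is that only five bytes are touched by the application. With block storage underneath, the system still has to move at least a whole block to persist them, which is the overhead byte-addressable media is intended to avoid.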
Hypervisors can support flash and NVMe products to improve performance of virtual machines. VMware vSphere Flash Read Cache improves the performance of individual virtual machines. VMware Virtual SAN has supported NVMe and NVDIMM devices for some time, and allows them to be used for the caching part of persistent storage.
Persistent local storage
Storage-class memory and persistent memory products can also be used for local storage.
Where resiliency is managed by the application, using shared storage may not be necessary, so simply adding fast, persistent storage to the host could deliver great improvements to performance.
Storage suppliers have had all-flash platforms for some time. These are now evolving to deliver even better performance by using all-SCM and NVMe protocols end-to-end.
Storage-class memory-enabled arrays such as HPE 3PAR can provide better internal caching using SCM, which delivers improved latency to the connected host/application.
HPE (Nimble), NetApp and Dell EMC all have NVMe enabled arrays that use NVMe devices at the back-end, reducing the bottleneck of SAS platforms.
The result will be measurable performance improvements. However, these may not match the capability of the media, meaning some IOPS capability will go unused. This is due to the architecture of controller-based platforms, where all I/O passes through two or more centralised controllers.
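The controller bottleneck can be expressed as a simple ceiling: an array delivers the lower of what its media can supply and what its controllers can process. The sketch below is a deliberately simplified model, and every number in it is hypothetical and not taken from any supplier's specification.

```python
# Simplified model of a controller-based array: all I/O passes through the
# controllers, so delivered IOPS is capped by whichever is lower, media or
# controllers. All figures are hypothetical.

def array_iops(device_count: int, iops_per_device: int,
               controller_count: int, iops_per_controller: int):
    """Return (delivered IOPS, media IOPS left unused)."""
    media_capability = device_count * iops_per_device
    controller_ceiling = controller_count * iops_per_controller
    delivered = min(media_capability, controller_ceiling)
    unused = media_capability - delivered
    return delivered, unused

delivered, unused = array_iops(
    device_count=24, iops_per_device=500_000,          # hypothetical NVMe drives
    controller_count=2, iops_per_controller=1_000_000  # hypothetical controllers
)
print(f"Delivered: {delivered:,} IOPS")
print(f"Unused media capability: {unused:,} IOPS")
```

In this example the drives could supply six times what the controllers can pass through, which is the gap that disaggregated designs aim to close.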
One solution to the controller bottleneck is to disaggregate storage components.
We’ve already seen products in the market from Datrium, E8 Storage and Apeiron, while NetApp has demonstrated a disaggregated solution based on its Plexistor acquisition. These products can fully exploit NVMe and SCM/PM, but require a new server/storage architecture.
Part of what these solutions have done is to revise or improve storage protocols. NVMe-over-fabrics (NVMf), for example, using Fibre Channel or Ethernet, will bring the benefits of NVMe to storage networks. This will reduce some of the latency seen when using shared storage arrays.
We can expect to see all shared array suppliers move to back-end NVMe support. NVMe at the front end will come, both for IP and Fibre Channel. The use of storage-class memory as a mainstream medium will probably take longer, simply due to the increased cost per GB compared to flash.
Thinking of cloud
In this article we’ve not addressed public cloud.
Getting data in/out of public cloud environments has challenges due to latency and throughput. Within the cloud itself, access to resources is obfuscated and is determined by what the cloud provider offers. This is perhaps a topic of discussion on its own, but one that needs to be considered in the move to hybrid IT.