NVMe is a new flash storage protocol set to revolutionise the performance of storage in servers and storage arrays, but incorporating NVMe and its benefits into products is not a straightforward matter for storage suppliers. The challenges include dealing with new I/O bottlenecks at the hardware level and the implementation of a new protocol across Ethernet and Fibre Channel.
So, where have the big five storage suppliers got to in adoption of NVMe?
Traditional SAS and SATA storage interfaces were designed in a time of spinning media and have limitations on I/O performance within their design that reflect this.
This performance overhead didn’t really matter with spinning hard disk drives, because access times to the physical platter were so long. However, with the move to NAND flash, the overhead inherent in the use of SAS or SATA becomes more prominent.
NVMe addresses these and other shortcomings by introducing greater parallelism, an optimised software stack and deployment on the PCIe bus.
These features serve to significantly reduce I/O latency compared to SAS and SATA, while optimising data throughput. That performance improvement is directly experienced by applications that run NVMe locally within a server.
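The parallelism gap can be put in rough numbers. The sketch below assumes the commonly quoted interface limits (one queue of 32 commands for SATA under AHCI, and roughly 64K queues of 64K commands each as the NVMe specification ceiling) and applies Little's law to estimate the IOPS ceiling that queueing alone imposes at a given latency. The class and function names are illustrative, the 100µs service time is hypothetical, and real NVMe devices expose far fewer queues than the specification maximum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueModel:
    """Command queueing limits of a storage interface (illustrative)."""
    name: str
    queues: int
    depth: int

    @property
    def max_outstanding(self) -> int:
        # Upper bound on commands that can be in flight at once.
        return self.queues * self.depth


def iops_ceiling(outstanding: int, latency_us: float) -> float:
    """Little's law: throughput = commands in flight / time per command."""
    return outstanding * 1_000_000 / latency_us


# Assumed limits: AHCI gives SATA one queue of 32 commands; the NVMe
# specification allows roughly 64K queues of 64K commands each.
SATA_AHCI = QueueModel("SATA (AHCI)", queues=1, depth=32)
NVME_SPEC = QueueModel("NVMe (spec maximum)", queues=64 * 1024, depth=64 * 1024)

if __name__ == "__main__":
    for model in (SATA_AHCI, NVME_SPEC):
        print(f"{model.name}: up to {model.max_outstanding:,} commands in flight")
    # Hypothetical 100µs service time: with only 32 commands in flight,
    # queueing alone limits the device to 320,000 IOPS.
    ceiling = iops_ceiling(SATA_AHCI.max_outstanding, 100)
    print(f"SATA queueing ceiling at 100µs: {ceiling:,.0f} IOPS")
```

The point of the model is that a single shallow queue becomes the limit once media latency drops to flash levels, which is the bottleneck NVMe's multi-queue design removes.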
NVMe in storage arrays
In the past year, storage suppliers have started to adopt NVMe within their platforms.
At the back end, SAS is being replaced by NVMe as a means to connect to flash drives and to provide much greater system throughput and lower latency.
At the front end of storage systems, vendors have started to support NVMe over fabrics (NVMf) across multiple transports that include InfiniBand, Ethernet and Fibre Channel.
As a rule of thumb, existing Gen6 and some Gen5 Fibre Channel hardware can support NVMe running over a Fibre Channel fabric. Of course, vendors also need to add FC-NVMe support to their products to make this happen.
NVMf is also being adopted with Ethernet as a carrier via a range of transport protocols, including RoCEv2, iWARP and TCP. The last of these, TCP, allows generic Ethernet cards to be used, rather than the RDMA-capable RNICs that are needed for the other options.
NVMe hardware refresh
NVMe back end support requires upgraded hardware that replaces SAS controllers with PCIe drive bays.
Vendors currently tend to use the U.2 solid-state drive form factor, which resembles a traditional 2.5” drive. Meanwhile, U.3 is being developed to enable NVMe, SAS and SATA drives to be intermixed on the same storage interface.
Front-end support needs suitable HBAs, either Gen5/6 Fibre Channel or RDMA-capable Ethernet NICs. Vendors typically support 25GbE and 40GbE speeds.
NVMe: Surveying the big five
Dell EMC puts NVMe in PowerMax
Dell EMC has upgraded its existing VMAX line of products to be fully NVMe-enabled at the back end and the platform was renamed PowerMax in the process. PowerMax will be the long-term successor to VMAX as the company transitions its high-end platforms to solid-state media.
The PowerMax platform currently supports 1.92TB, 3.84TB and 7.6TB drives for a maximum raw capacity on PowerMax 2000 of 737TB and 2211TB on PowerMax 8000.
Dell EMC claims PowerMax 2000 systems can reach 1.7 million IOPS, with 10 million IOPS from a fully configured PowerMax 8000. Maximum performance figures are 150GBps of throughput at a latency of 300µs.
NetApp puts NVMe in arrays, plus a server solution
NetApp has added NVMe support to its AFF series of ONTAP storage arrays and EF series of high-performance block storage.
The AFF A800 supports up to 48 NVMe SSDs per 4U controller pair with 24 drives in each controller. Any additional drives per controller pair must continue to use SAS connectivity.
A single A800 system with 12 HA (high availability) pairs (NAS only) can support 1,152 drives, or 576 drives in a six-HA-pair SAN configuration. With 15.36TB NVMe drives, the AFF A800 is highly scalable. NetApp claims 1.1 million IOPS and 25GBps at 200µs latency per HA pair.
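The quoted A800 drive counts can be cross-checked with simple arithmetic. The sketch below uses only the figures given above (48 NVMe SSDs per HA pair, 1,152 drives across 12 pairs, 576 drives across six pairs) and assumes that every drive beyond the 48 internal NVMe slots per pair is SAS-attached, as the text implies. The constant and function names are illustrative.

```python
# Internal NVMe SSDs per HA (controller) pair, as quoted in the text.
NVME_DRIVES_PER_HA_PAIR = 48

# Quoted maximum configurations: label -> (HA pairs, total drives).
CONFIGS = {"NAS": (12, 1_152), "SAN": (6, 576)}


def drives_per_pair(ha_pairs: int, total_drives: int) -> int:
    """Total drives supported per HA pair in a given configuration."""
    return total_drives // ha_pairs


def split_nvme_sas(ha_pairs: int, total_drives: int) -> tuple[int, int]:
    """Split a drive total into NVMe and SAS counts.

    Assumption: anything beyond the internal NVMe slots is SAS-attached.
    """
    nvme = ha_pairs * NVME_DRIVES_PER_HA_PAIR
    return nvme, total_drives - nvme


if __name__ == "__main__":
    for label, (pairs, total) in CONFIGS.items():
        nvme, sas = split_nvme_sas(pairs, total)
        print(f"{label}: {drives_per_pair(pairs, total)} drives per HA pair, "
              f"{nvme} NVMe + {sas} SAS")
```

On these figures, both configurations work out to 96 drives per HA pair, of which at most half can be NVMe-attached.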
At the front end, the AFF A800 supports FC-NVMe using 32Gbps (Gen6) Fibre Channel to enable NetApp to claim full end-to-end NVMe support.
The EF570 array supports NVMf through 100Gbps InfiniBand EDR. Performance figures are quoted as 1 million IOPS and 21GBps of bandwidth at 100µs, which is effectively the speed of the underlying NAND flash media.
NetApp also offers a third tier of NVMe, using technology from its 2017 acquisition of Plexistor. MAX Data is a software solution that implements a tier of storage in a host server, backed by an AFF A800 array. Data is periodically written to the backing array through snapshots. With locally attached NVMe, NetApp claims latency in the single-digit microseconds, i.e. less than 10µs.
HPE puts NVMe to use as cache
HPE has chosen to hold off adding NVMe-enabled SSDs to its storage platforms as a replacement for SAS-connected devices. Instead, NVMe Storage Class Memory (SCM) has been added to the 3PAR platform (and is now GA), with SCM-enabled Nimble Storage platforms in product preview.
HPE claims that use of SCM as a read cache can deliver as good a performance boost as replacing drives with their NVMe equivalents.
This means being able to achieve an average of less than 110µs, with 99% of all IOPS guaranteed to be below 300µs.
Remember that in this implementation, NVMe SCM is a cache, so consistent I/O performance depends on effective cache algorithms.
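The dependence on cache effectiveness can be illustrated with a weighted-average model of read latency. This is a minimal sketch, not HPE's method: the 20µs cache-hit and 500µs cache-miss latencies are hypothetical values chosen for illustration, and only the 110µs target comes from the figures above. Function names are illustrative.

```python
def effective_latency_us(hit_ratio: float, cache_us: float, backend_us: float) -> float:
    """Average read latency for a given cache hit ratio (simple weighted mean)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_us + (1.0 - hit_ratio) * backend_us


def hit_ratio_for_target(target_us: float, cache_us: float, backend_us: float) -> float:
    """Hit ratio needed for the average latency to reach target_us."""
    return (backend_us - target_us) / (backend_us - cache_us)


if __name__ == "__main__":
    # Hypothetical latencies: 20µs for an SCM cache hit, 500µs for a miss
    # served from SAS-attached flash. Neither value comes from HPE.
    cache_us, backend_us = 20.0, 500.0
    needed = hit_ratio_for_target(110.0, cache_us, backend_us)
    print(f"Hit ratio needed for a 110µs average: {needed:.1%}")
    for ratio in (0.5, needed, 0.95):
        avg = effective_latency_us(ratio, cache_us, backend_us)
        print(f"hit ratio {ratio:.1%} -> average {avg:.1f}µs")
```

Under these assumed latencies, the average degrades quickly as the hit ratio falls, which is why the quality of the caching algorithm matters more here than in an all-NVMe back end.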
Hyper-converged the site for Hitachi Vantara NVMe
Hitachi Vantara has not currently implemented any NVMe features within its existing storage platforms.
However, the company has introduced NVMe storage into its hyper-converged systems. The HC V124N hyper-converged platform is based on VMware vSphere and uses vSAN as the storage layer.
The vSAN cache is implemented with Intel Optane (375GB drives), while the vSAN capacity layer is NVMe NAND SSD (Intel P4510 1TB drives). This configuration enables Hitachi to achieve a claimed doubling in performance compared to the previous flash-based HC solutions.
IBM puts NVMe at the back end
Initially, IBM claimed NVMe wasn’t fast enough to be used at the back end of its storage arrays.
However, with the release of FlashSystem 9100, IBM has adopted NVMe as the standard connection for internal drives, either as commodity SSDs or IBM’s custom NVMe FlashCore modules.
The FlashSystem 9110 and 9150 models both support up to 24 NVMe drives in a 2U chassis, while expansion shelves continue to be SAS-connected.
Front-end NVMe support is currently a statement of direction and is expected in 2019.
IBM performance figures claim 2.5 million IOPS, although this is based purely on 4K read I/O. A more reasonable 1.1 million 4K IOPS for read misses with 34GBps throughput and latencies “as low as” 100µs is also quoted by the company.
IBM has also demonstrated NVMe over Fabrics on FlashSystem and Power9 servers. This uses QDR (40Gbps) InfiniBand to deliver performance figures of 600,000 random write IOPS and 4.5GBps random write throughput. Read/write latencies are quoted at 95µs and 155µs respectively.
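IOPS and throughput claims such as IBM's are linked by the I/O size, which is worth checking when comparing vendor figures. The sketch below assumes that 4K means 4,096 bytes and that GBps means 10^9 bytes per second; the function names are illustrative. It suggests that the 34GBps figure was measured with larger I/O sizes than 4K, and that 600,000 IOPS at 4.5GBps would imply I/Os of about 7.5KB if both numbers came from the same test, which the quoted figures do not state.

```python
BYTES_PER_4K = 4_096  # assumption: "4K" means 4,096 bytes


def throughput_gbps(iops: float, block_bytes: int) -> float:
    """Throughput in GBps (10^9 bytes/s) for a given IOPS rate and I/O size."""
    return iops * block_bytes / 1e9


def implied_block_bytes(iops: float, gbps: float) -> float:
    """I/O size implied if an IOPS figure and a throughput figure coincide."""
    return gbps * 1e9 / iops


if __name__ == "__main__":
    # FlashSystem 9100 claims at 4K.
    for label, iops in (("4K read", 2_500_000), ("4K read miss", 1_100_000)):
        print(f"{label}: {throughput_gbps(iops, BYTES_PER_4K):.2f} GBps")
    # NVMe over Fabrics demonstration: random write figures.
    size = implied_block_bytes(600_000, 4.5)
    print(f"Implied random write I/O size: {size:,.0f} bytes")
```

At 4K, 1.1 million IOPS amounts to roughly 4.5GBps, well short of 34GBps, so the two headline numbers describe different workloads.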
Big five cautious compared to NVMe startups
It’s fair to say that NVMe enablement for the big five vendors looks more like a gradual process than a radical overhaul. The transition to NVMe will take time, probably dictated by customers moving through a refresh cycle for their solutions.
Outside the big five, Pure Storage already has NVMe built into its platform, so customers do not need to replace the chassis to adopt NVMe but can simply field replace drives and controllers.
The NVMe startups are a lot more aggressive, and have implemented new architectures and designs that disaggregate the traditional components of a storage array. NetApp is also moving this way with MAX Data.
For now, though, NVMe adoption will be incremental within storage arrays. NVMe-over-Fabrics will likely take a little longer to be adopted, simply because many end users have not made the transition to the latest Gen5 and Gen6 hardware.