In a previous article we discussed the concept of server SSD: PCIe-based flash cards that bring storage closer to compute, reducing latency and improving application response times.
In this article, we discuss vendors’ server SSD roadmaps and how new products are being integrated into the application layer to overcome some of the early shortcomings of one of the newest storage product categories.
Here’s the story so far.
Centralised storage, in the form of today’s SAN and NAS platforms, did a great job of improving scalability, increasing resiliency and reducing costs by making storage a shared resource. But as server processing power has increased, so have the I/O needs of servers and hypervisors. Array throughput can be increased by using faster disks and SSDs, but those technologies don’t always address the issue of latency to individual servers.
Enter PCIe server SSD, card-format hardware that provides low-latency, non-volatile storage by placing flash directly on the server bus. There’s no need to traverse the SAN or incur the latency penalty of using a shared storage appliance, which could add milliseconds to every I/O operation. But the benefits of server flash are also its Achilles' heel: having storage resources hardwired to the server isolates them from use elsewhere and reduces resiliency, especially in case of a server failure.
As we will see, the storage array vendors are pursuing two main approaches to address current technology shortcomings. These are to move flash SSD out of the server and turn it back into a shared resource, or to interconnect servers and provide cache replication and coherency between them. With either system, there is a need to deploy operating system filter drivers to correctly identify cache devices and manage the flow of data from main memory to cache or external storage.
Server SSD vendor products
EMC released VFCache, its server SSD offering, in February 2012. Codenamed “Project Lightning”, VFCache uses a filter driver on the host operating system to identify and cache data for subsequent reads. The device operates in “write-through” mode, so write I/O is not acknowledged from cache; this preserves data integrity but potentially limits the performance of VFCache on write requests. VFCache is also currently limited to one device per server.
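The write-through behaviour described above can be illustrated with a minimal sketch. This is not EMC's implementation: the class name, the dictionaries standing in for the flash card and the array, and the hit/miss counters are all assumptions made for illustration.

```python
class WriteThroughCache:
    """Hypothetical read cache in front of a slower backing store."""

    def __init__(self, backing_store):
        self.backing = backing_store  # dict standing in for the shared array
        self.cache = {}               # dict standing in for the server flash card
        self.hits = 0
        self.misses = 0

    def read(self, block):
        # Repeat reads are served from the local cache.
        if block in self.cache:
            self.hits += 1
            return self.cache[block]
        # First read goes to the backing store and populates the cache.
        self.misses += 1
        data = self.backing[block]
        self.cache[block] = data
        return data

    def write(self, block, data):
        # Write-through: the backing store is updated before the write is
        # acknowledged, so the cache never holds the only copy of the data.
        self.backing[block] = data
        self.cache[block] = data
        return True


array = {0: b"old"}
cache = WriteThroughCache(array)
cache.read(0)            # miss: fetched from the array
cache.read(0)            # hit: served from cache
cache.write(0, b"new")   # acknowledged only after the array is updated
```

In this sketch every write pays the full cost of the backing store, which is why write-through protects data at the expense of write performance.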
EMC’s “Project Thunder” aims to address the issues of VFCache by placing multiple VFCache devices into an appliance closely coupled with the server in what EMC calls a “Server Area Network” built on Ethernet or InfiniBand. Thunder will scale to terabytes of flash storage and millions of IOPS, according to EMC, and will run a light operating system stack. Placing storage outside the server removes the risks associated with server failure but does bring back the issue of latency, especially for read requests, which need no cache synchronisation (or coherency) and could otherwise be served locally. EMC has said VFCache and Project Thunder will be integrated into future releases of FAST (fully automated storage tiering), which today exists on its major storage platforms.
Dell has taken a slightly different approach with its Hermes project. This uses technology from Dell’s 2011 acquisition of RNA Networks. Rather than have storage on an appliance external to the server, Hermes will connect multiple server SSD instances (or DRAM for that matter), using a fast network and remote direct memory access (RDMA) to enable cache card coherency across multiple servers. Dell believes maintaining a coherent cache will provide lower latency than externalising server SSD.
The Dell solution enables read and write caching, as writes can be synchronised to other servers in a cluster, so replicating and protecting data. Read requests need no coherency and so can be satisfied (in parallel) from each server in the cluster. Dell has already talked about integrating Hermes with its “Fluid Data Architecture”. This means expanding platforms like the Compellent storage array and the Data Progression feature into the server itself.
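As a rough illustration of the idea that writes are synchronised to peers while reads are served locally, consider the sketch below. It is not Dell's design: the `CacheNode` class and `make_cluster` helper are hypothetical, and a plain in-process assignment stands in for RDMA over a fast network.

```python
class CacheNode:
    """Hypothetical server with a local cache and links to its peers."""

    def __init__(self, name):
        self.name = name
        self.cache = {}   # dict standing in for the local SSD (or DRAM) cache
        self.peers = []   # other nodes in the cluster

    def write(self, block, data):
        # Copy the write to every peer before acknowledging it, so the data
        # survives the loss of this server. A real system would do this
        # over the network; here it is a direct assignment.
        for peer in self.peers:
            peer.cache[block] = data
        self.cache[block] = data
        return True

    def read(self, block):
        # Reads need no coordination: each node answers from its own copy.
        return self.cache.get(block)


def make_cluster(names):
    """Build nodes and connect each one to all the others."""
    nodes = [CacheNode(n) for n in names]
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
    return nodes


a, b, c = make_cluster(["a", "b", "c"])
a.write(7, b"payload")     # replicated to b and c before returning
copy_on_b = b.read(7)      # served locally on b, no call to a
```

The cost of this approach is that each write waits on every peer, whereas reads scale with the number of servers.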
HP debuted its Smart Cache feature as part of the HP ProLiant Gen8 server platform release. Smart Cache currently accelerates read and write workloads on direct-attached storage (DAS), but HP plans to expand support to the 3PAR platform, presumably incorporating the recently released all-flash P10000 systems. As yet, it’s not clear how external array acceleration will be implemented, with HP expected to release details in the near future.
NetApp has been quietly working on a project codenamed “Mercury”. This is a software system for Linux that can be deployed in a number of locations: as a hypervisor filter driver, as an OS filter driver, in the application or as a local cache for NAS protocols such as NFS. I/O requests are accelerated by using the Mercury driver cache, which manages requests destined for traditional storage. Although details aren’t fully clear, the benefit for NetApp could be in integrating the Mercury driver with NFS and so providing similar server/array data-tiering capabilities to those planned by EMC with FAST, and Dell with Data Progression.
Aside from the main array vendors there is also Fusion-IO, a pioneer in the server SSD market. Fusion’s existing products, including ioTurbine and ioCache, already provide application-aware I/O acceleration. Recently the company announced the ION Data Accelerator, a software product that takes Fusion-IO PCIe SSD devices and integrates them into a server, turning it into a storage array. Fusion-IO quotes read and write response times of 73 and 56 microseconds respectively on a reference architecture HP DL-370 server with 20 TB of SSD capacity.
So, where the array vendors are pushing into the host, Fusion-IO is pushing towards the storage array. It’s not hard to imagine a next-generation release that integrates host server SSD cards with an ION array to deliver a shared accelerated infrastructure.
This was first published in October 2012