
Pioneers take flash storage to analytics and big data use cases

Flash storage is now being applied to big data and analytics workloads, as array makers exploit flash performance to provide rapid access and high throughput for large datasets.

The flash storage market began by addressing application performance issues in the enterprise. Because of the cost profile of the first flash systems, their use was targeted only at applications that could truly benefit from increased throughput and low latency.

However, the market has matured and flash platforms have become mainstream, with a new range of products now emerging to tackle specific requirements. One of these is the use of flash for big data and analytics workloads.

While the mainstream market has focused on complementing performance with features and functionality, the new wave of flash products has divided into two areas.

At the high end, suppliers such as EMC (with DSSD) and Mangstor (with its NX-Series) deliver ultra-high performance and low latency in boxes that come with no frills.

At the other end of the market we see “cheap and deep” flash products that take advantage of recent increases in Nand capacity.

TLC and 3D-Nand technology have allowed suppliers such as SanDisk and Pure Storage to bring products to market that are less focused on flash endurance but deliver on capacity and performance requirements.

These new platforms, although capable, aren’t targeted at traditional workloads such as server virtualisation. Instead, their low latency and high throughput make them a perfect fit for big data and analytics use cases.

If we look at the characteristics of big data workloads, there are a number of features that make flash suitable:

Low latency/high IOPS – Analytics tasks are typically input/output (I/O) intensive, in many cases reading and re-reading the same data many times.

There is little benefit to gain from caching when whole datasets are being processed, so the ability to deliver fast analytics responses comes from storage that runs as fast as possible.

The products discussed in the round-up later in this article offer latency figures of 100μs or less (depending on read/write activity), comparable with PCIe SSD devices deployed directly in servers.
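The link between latency and IOPS can be illustrated with Little’s Law: sustained IOPS is roughly the number of outstanding I/Os divided by average latency. The sketch below uses hypothetical queue depths, not any supplier’s figures:

```python
# Rough illustration of Little's Law for storage queues:
# IOPS ~= outstanding I/Os (queue depth) / average latency.
# All numbers here are hypothetical, not vendor figures.

def iops(queue_depth: int, latency_s: float) -> float:
    """Estimate sustained IOPS for a given queue depth and average latency."""
    return queue_depth / latency_s

# A single outstanding I/O at 100 microseconds caps out near 10,000 IOPS...
print(iops(1, 100e-6))    # ~10,000 IOPS
# ...so reaching millions of IOPS needs low latency AND deep parallelism.
print(iops(256, 100e-6))  # ~2,560,000 IOPS
```

This is why the million-IOPS figures quoted for these arrays assume many concurrent requests in flight, not a single stream.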

Scalability – Big data is all about volume. Information is produced at a vast rate and analytics looks to use as much of this data as possible when deriving insight and value. Flash-based analytics systems need to offer rack-level scalability, into the petabyte range of capacity.

Parallelism – Platforms such as Hadoop have been designed around the idea of splitting up query workloads and running many in parallel. This was done because, at the time Hadoop was invented, the only way to get I/O throughput was with many hard drives, spread across many physical servers.

Flash potentially consolidates the workload of many servers into a single system, so these systems need to be capable of parallel I/O to deliver performance. With the introduction of technologies such as NVMe, many more concurrent I/O tasks can run than with traditional storage.
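The fan-out pattern described above can be sketched in a few lines: instead of one sequential pass, an analytics engine issues many chunk reads concurrently so the storage can service them in parallel. This is a minimal illustration using Python threads; the chunk size and worker count are arbitrary choices, not tuned values:

```python
# Sketch: issuing many concurrent reads against one file, the way an
# analytics engine might fan out I/O to exploit a flash array's
# parallelism. Chunk size and worker count are illustrative only.
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1MiB per read request

def read_chunk(path: str, offset: int) -> bytes:
    """Read one chunk at the given offset (each worker opens its own handle)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(CHUNK)

def parallel_scan(path: str, workers: int = 32) -> int:
    """Scan the whole file as concurrent chunk requests; return bytes read."""
    size = os.path.getsize(path)
    offsets = range(0, size, CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(len(b) for b in pool.map(lambda o: read_chunk(path, o), offsets))
```

On spinning disk this pattern causes seek thrashing, which is why Hadoop spread the chunks across many servers instead; flash serves the same concurrent requests from one box.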

Randomness – Most analytics processing is random in nature, making it difficult to predict what part of the data will be requested next. This suits flash, which is capable of delivering consistently to random I/O requests.

As previously mentioned, caching isn’t practical where large volumes of the dataset are read quickly, as the cache simply becomes a staging area for I/O. Consistent I/O response across all data is therefore critical.
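The caching point reduces to simple arithmetic: for a workload that scans a dataset far larger than the cache, the best-case hit ratio is just cache size divided by dataset size. The sizes below are hypothetical:

```python
# Hypothetical sizes: a 1TB cache in front of a 100TB dataset that is
# scanned end-to-end gives, at best, a ~1% hit ratio -- the cache is
# effectively just a staging area for I/O passing through it.
def best_case_hit_ratio(cache_tb: float, dataset_tb: float) -> float:
    """Upper bound on cache hit ratio for a full-dataset scan workload."""
    return min(cache_tb / dataset_tb, 1.0)

print(best_case_hit_ratio(1, 100))  # 0.01
```

With 99% of reads missing the cache anyway, overall response time is dominated by the backing store, which is why consistently fast flash matters more than a larger cache here.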

One other consideration with most analytics environments is that they are read-intensive. Data is typically added (rather than continuously updated) in big data systems, with the majority of I/O consisting of reading data for processing.

Scale-out flash systems using newer technology, such as 3D-Nand and TLC, can deliver high-density flash with lower endurance than SLC or MLC-based devices. Lower endurance isn’t as critical in a mainly read-based environment.

Flash for HPC

High-performance computing (HPC) systems are typically focused on performing high-volume, high-intensity applications with scale-out computing nodes and parallel I/O workloads.

Data is fed into HPC systems using parallel file systems or scale-out NAS products, such as Isilon.

The nature of HPC designs will rule out some suppliers’ products. For example, high-density products with direct host connections are likely to see fewer benefits, because only a limited number of nodes can connect to a single storage platform, compared with systems that present traditional NAS protocols.

By comparison, products that connect to a small number of hosts (while not being highly dense) map better to the scale-out nature of HPC. At that point, though, cost becomes an issue, and traditional storage with a large cache may be more appropriate.

Flash analytics product round-up

Mangstor’s NX-Series flash fabric arrays deliver I/O at latencies of 110μs (read) and 30μs (write), with up to 5 million IOPS. Throughput is up to 20GBps (gigabytes per second), with the highest-specification NX6340 model offering 27.54TB of capacity.

NX-Series models are connected using Ethernet or Infiniband at 40, 56 or 100Gbps (gigabits per second) and use NVMe over RDMA as the storage protocol. Note that NX appliances don’t provide Raid protection, which is expected to be implemented on the host.

EMC’s DSSD does offer data protection (a proprietary brand called Cubic Raid) with similar levels of performance to Mangstor.

EMC DSSD D5 claims around 10 million IOPS at around 100μs latency and 100GBps of throughput. A 5U enclosure scales to a maximum of 144TB using 36 proprietary 4TB flash modules. The D5 supports a maximum of 96 PCIe connections (48 dual connected hosts) using PCIe Gen3 x4.

E8 Storage recently debuted its NVMe-based D8-D24 storage array, which uses 24 2.5in Intel NVMe SSDs to deliver total throughput of 20GBps (write) and 40GBps (read), at 2 million IOPS (write) and 10 million IOPS (read). Latency figures are quoted at 100μs (read) and 40μs (write). Connectivity is provided by 40, 50 or 100Gbps Ethernet.

SanDisk’s InfiniFlash delivers a highly scalable and dense flash platform, with 512TB of capacity in just 3U. Performance is rated at 2 million IOPS and up to 12GBps throughput. Suppliers such as Tegile have used InfiniFlash as the basis of a scale-out storage platform – Tegile provides the features, functionality and external connectivity, while InfiniFlash provides the raw capacity.

Pure Storage recently announced FlashBlade, a scale-out storage platform that delivers up to 1.6PB of usable capacity (792TB raw) in 4U. FlashBlade can be scaled to 16PB in a single rack, and is capable of supporting hundreds of blades in a single logical cluster. The product is currently in the early adopter phase of release.

IBM’s offering, called DeepFlash, comprises 64 custom flash cards in a 3U chassis. The platform scales to 170TB per rack unit, or around 6PB for a full rack of flash.


As with the other hardware products discussed, suppliers use direct server connectivity such as SAS, PCIe, Infiniband and Ethernet. Server-to-storage protocols include RoCE (RDMA over Converged Ethernet), NVMe over PCIe and iWARP.

The use of fast interconnects and new storage protocols is bypassing traditional Fibre Channel and NAS connectivity.

At this stage in development, the physical connection restrictions are reminiscent of the early days of direct-attached storage with SCSI. Issues such as distance and host counts were resolved as Fibre Channel took hold.

As NVMe over fabrics develops (including possibly NVMe over Fibre Channel), we may see these high-end products scale further and become better adapted to the general storage market.
