
How to deploy NVMe flash storage for artificial intelligence

We run the rule over NVMe flash as a storage choice for AI applications and the key decision points in form factor and hardware specification


Artificial intelligence (AI) applications are inherently data-intensive, with multiple reads and writes to the file system. And, at the outset, the AI algorithm absorbs tremendous amounts of training data as it learns the parameters of its job.

Once that is done, your AI system then diligently performs its task, but it has to output the results somewhere. And, as AI applications scale, they can encounter storage-related bottlenecks that can harm performance.

So, at every stage in the deployment, training and operation of AI systems, storage is a big consideration. In this article, we look at AI/machine learning and the storage needed to support it, which increasingly means NVMe flash.

AI and NVMe

NVMe feels like a logical evolution in the history of storage technologies. For much of the history of personal computing, users were stuck using mechanical hard drives, which had myriad flaws. They were slow and often prone to failure. 

Then solid-state drives (SSDs) arrived on the scene. Although initially far more expensive than mechanical drives, they were much faster. And, as they contained no moving parts, they were far more reliable and energy-efficient.

But there was still more work to be done. Early SSDs connected over the same Serial ATA bus interfaces as their mechanical brethren, thereby introducing a throughput bottleneck that could not be avoided by nifty on-drive engineering. 

NVMe storage avoids that bottleneck entirely by connecting via PCIe (Peripheral Component Interconnect Express) buses directly to the computer’s CPU. This is a logical move – PCIe was designed to handle components where speed is of the essence, such as graphics cards or modems. It is only fairly recently that it has been used as a medium to connect storage devices. 

It is difficult to overstate how much of a quantum leap NVMe represents over previous flash technologies. Compared to an old-school SATA SSD, an NVMe-based drive can write data up to 4x faster. Also, seek times (the time it takes for a drive to locate the area in which a file is stored) are up to 10x faster.

For the sake of completeness, it is worth noting that NVMe is not fast merely because it connects via PCIe interfaces. There is also a lot of clever engineering on the drives themselves, particularly in how they organise read/write requests.

SATA drives supported only a solitary I/O queue, with just 32 entries. This meant that much of the heavy computation got passed to the host computer, which had to determine the priority and order in which reads and writes took place. 

NVMe-based storage, on the other hand, supports multiple I/O queues, with a theoretical maximum of about 64,000 queues, each permitting about 64,000 entries, for a grand total of more than 4 billion entries. Also, the drive’s controller software is designed to create and manage I/O queues. These are shaped by the system’s characteristics and predicted workload, rather than by some kind of hard-coded, one-size-fits-all solution.
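To put the queue figures side by side, here is a minimal arithmetic sketch in Python. It uses the rounded 64,000 figure for NVMe rather than the exact limits in the specification, and the constant and function names are made up for illustration.

```python
# Queue limits, using the rounded figures for NVMe (assumption: 64,000
# is a round number, not the exact limit in the specification).
SATA_QUEUES = 1
SATA_ENTRIES_PER_QUEUE = 32
NVME_QUEUES = 64_000
NVME_ENTRIES_PER_QUEUE = 64_000


def total_outstanding(queues: int, entries_per_queue: int) -> int:
    """Maximum number of commands that can be queued at once."""
    return queues * entries_per_queue


sata_total = total_outstanding(SATA_QUEUES, SATA_ENTRIES_PER_QUEUE)
nvme_total = total_outstanding(NVME_QUEUES, NVME_ENTRIES_PER_QUEUE)

print(f"SATA: {sata_total:,} queued commands")
print(f"NVMe: {nvme_total:,} queued commands")
print(f"Ratio: {nvme_total // sata_total:,}x")
```

These are theoretical ceilings. Real drives and drivers typically expose far fewer queues, so treat the ratio as an indication of headroom, not of measured speed.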

What does this mean for AI developers?  

Although the speed benefits of NVMe show up as better performance across the board, you will feel the advantage of NVMe drives most keenly when dealing with larger files.

For AI engineers, this advantage will show up during the training phase, when the model is constantly reading and learning from files most likely stored on the local file system.

NVMe is a must for anyone working in the computer vision niche of the AI field, which inherently involves training a model on photographs and videos. By reducing read times, engineers can shorten the time it takes for a model to develop, while simultaneously improving day-to-day performance. 
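One rough way to check whether read times are holding back training is to time a sequential read of a data file. The Python sketch below is an assumption-laden illustration: it writes a small throwaway file as a stand-in for real training data, and the function name `read_throughput_mb_s` is invented for this example.

```python
import os
import tempfile
import time

BLOCK_SIZE = 1024 * 1024      # read in 1 MiB blocks
SAMPLE_SIZE = 8 * BLOCK_SIZE  # small stand-in for a training file


def read_throughput_mb_s(path: str, block_size: int = BLOCK_SIZE) -> float:
    """Read a file from start to end and return the rate in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            total_bytes += len(block)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes / (1024 * 1024) / elapsed


# Create a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(SAMPLE_SIZE))
    sample_path = tmp.name

try:
    rate = read_throughput_mb_s(sample_path)
    print(f"Sequential read: {rate:.0f} MiB/s")
finally:
    os.remove(sample_path)
```

A freshly written file is likely to be served from the operating system's cache, so the number printed here flatters the drive. For a meaningful comparison, point the function at a real dataset file that is much larger than system memory.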

Bypassing the SATA bottleneck also presents new opportunities, particularly for AI engineers. The switch to PCIe permits other components, such as graphics cards, to directly access storage volumes, and vice versa.


One technology that supports this – albeit with some third-party spit and polish – is Nvidia’s GPUDirect, which came out in 2010. The main advantage of this is that it drastically reduces I/O latency, as well as the demand on the CPU. Given that AI engineers almost universally rely on GPUs to accelerate their workflows, this is a huge bonus.  

The best part of all is that NVMe was designed with concurrency and scalability in mind. You will see these characteristics present themselves more keenly on multicore systems – which is pretty much every computer on the market these days. 

That is because the NVMe specification permits individual CPU cores to influence the queue in which I/O operations are processed, as well as their priority. This sounds cool, but what it boils down to is lower data latencies, as well as a more intelligent and context-sensitive approach to file system operations. 
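From an application's point of view, the way to benefit from this is to keep several reads in flight at once. The Python sketch below is a hypothetical illustration, not a description of how any driver works: it reads chunks of a throwaway file from a pool of threads using `os.pread` (available on Unix-like systems). How those requests map onto NVMe queues is decided by the operating system, not by this code.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 256 * 1024  # bytes per read request
N_CHUNKS = 16       # number of requests to issue

# Throwaway file standing in for a training data file.
data = os.urandom(CHUNK * N_CHUNKS)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name


def read_chunk(fd: int, index: int) -> bytes:
    """Read one chunk at its own offset, independent of other readers."""
    return os.pread(fd, CHUNK, index * CHUNK)


fd = os.open(path, os.O_RDONLY)
try:
    # One worker per CPU core keeps several reads outstanding at once.
    with ThreadPoolExecutor(max_workers=os.cpu_count() or 4) as pool:
        chunks = list(pool.map(lambda i: read_chunk(fd, i), range(N_CHUNKS)))
finally:
    os.close(fd)
    os.remove(path)

reassembled = b"".join(chunks)
print("Data intact:", reassembled == data)
```

Because `os.pread` takes an explicit offset, the threads share one file descriptor without fighting over a file position, which is what makes the reads safe to issue concurrently.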

AI engineers will experience the benefits of this at all stages of their application’s lifecycle, from training the model to applying it to a task. 

NVMe form factors

The most common implementation of NVMe storage is the M.2 specification – previously known as the Next Generation Form Factor (NGFF). 

Physically, these look very different from previous SATA-based drives. They are thinner and narrower, which is crucial given the propensity for contemporary computer manufacturers to offer smaller, lighter machines. 

If you have invested in mid-range to high-end hardware in recent years, there is a decent chance that you have already got M.2 slots on board. Upgrading to the latest storage format is therefore a matter of buying compatible drives.

If not, you can always buy adaptors that connect to the PCIe slot.

But if you are looking at adaptors, it may be time to upgrade to new systems altogether. That is because to get the most out of NVMe, you’ve got to boot your operating system from it, which requires a BIOS that supports the storage format. And if the motherboard needs an M.2 adaptor, it’s not very likely the manufacturer has released a firmware update to support it. 

Why? For a couple of reasons.

Firstly, those upgrading to NVMe storage are a somewhat small niche, with most home users content to settle for the speeds offered by ordinary SATA-based SSDs. Secondly, there is no incentive for manufacturers to do so, as they can use the promise of M.2 to sell you brand-new hardware.

Also, NVMe storage is still comparatively expensive. There is no point in upgrading to it just to get some of the benefits. 

You can also buy external M.2 storage caddies that allow you to add extra NVMe drives to computers without having to crack open the case. That is particularly helpful with hardware, such as laptops, that is not particularly geared to expansion. 

These external caddies require a USB-C port, ideally using Intel’s Thunderbolt 3 connection. Although many enclosures support USB-A, using the old USB format will introduce a major I/O bottleneck that will make the investment somewhat pointless. 

They should also support PCIe, rather than the antiquated SATA bus technology. These cost more and they are a bit harder to find, but they are also totally worth it. 
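To see why the connection matters, a back-of-the-envelope calculation helps. The Python sketch below uses nominal link rates (USB 2.0 at 0.48Gbps, USB 3.0 at 5Gbps, SATA III at 6Gbps, Thunderbolt 3 at 40Gbps) and ignores encoding and protocol overhead, so real transfers will be slower. The 100GB dataset size and the function name are assumptions chosen for illustration.

```python
# Nominal link rates in gigabits per second (overhead ignored).
LINK_RATES_GBIT_S = {
    "USB 2.0": 0.48,
    "USB 3.0": 5.0,
    "SATA III": 6.0,
    "Thunderbolt 3": 40.0,
}

DATASET_GB = 100  # hypothetical training dataset size, in gigabytes


def transfer_seconds(dataset_gb: float, link_gbit_s: float) -> float:
    """Best-case time to move a dataset over a link, in seconds."""
    return dataset_gb * 8 / link_gbit_s


for name, rate in LINK_RATES_GBIT_S.items():
    seconds = transfer_seconds(DATASET_GB, rate)
    print(f"{name:<14} {seconds:8.1f} s")
```

Even in this best case, the slower links spend minutes on a transfer that Thunderbolt 3 could finish in seconds, and the drive inside the enclosure cannot make up the difference.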

Welcome to the storage revolution 

It’s easy to get a bit evangelical about NVMe. Although the technology is still in its early stages, it already offers genuine workflow and performance advantages to those working in demanding fields, such as AI.

Of course, AI engineers have long fussed over their setups. They invest ungodly sums of money in high-powered CUDA-enabled graphics cards. On Reddit, they bicker about who makes the best power-guzzling multi-core processors – AMD or Intel? 

Storage, now, is just another part of that conversation.
