Computational storage: What is it and what are its key use cases?
Computational storage brings processing to the storage device, boosting system performance by handling tasks close to the data, for example at the edge or in AI and machine learning workloads
Computational storage brings processing power to storage. It’s a response to the idea that conventional storage architecture hasn’t kept up with today’s data storage needs.
Moving data between storage and compute resources is inefficient. And as data volumes increase, it becomes more and more of a bottleneck. As the Storage Networking Industry Association (SNIA) has said: “Storage architecture has remained mostly unchanged dating back to pre-tape and floppy.”
That might be a slight exaggeration, but the principle that storage is separate from processing remains at the core of most enterprise IT systems. With advanced analytics, big data, AI, machine learning and streaming, this is a problem.
Some solutions are available. In-memory databases such as SAP’s Hana cut the need to move data to and from storage. And server flash can bypass the conventional SAS and SATA interfaces between drive and CPU by connecting the controller and flash storage directly to the host’s PCI bus.
Computational storage goes further, however. The technology puts processing onto storage media. Solid-state storage is sufficiently fast that moving data processing closer to storage brings a big jump in performance. Applications such as Hadoop have already moved in this direction, through distributed processing.
Placing processing on the drive offloads work from the host CPU and reduces the storage-to-CPU bottleneck. Research by the University of California, Irvine and NGD Systems suggests that eight- or nine-fold performance gains and energy savings are possible, with most systems offering at least a 2.2x improvement.
What is computational storage?
Computational storage is a storage subsystem that includes a number of processors, or CPUs, located on the storage media or their controllers. Drives built this way are known as computational storage drives (CSDs), and they collectively provide computational storage services. The idea is to move processing to the data, not data to the processor.
The CSDs take on some of the workload themselves, so that less data is passed to the main CPU. In some cases, the main CPU also has fewer tasks to carry out.
An example is an artificial intelligence (AI)-based surveillance system. A CSD at the edge, perhaps even in the camera itself, can carry out basic tasks such as analysing the image for intruders. Only “positives” are then fed to the main CPU and application, perhaps to run facial recognition.
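The surveillance example can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the `EdgeCSD` class, the `motion_score` field and the 0.8 threshold are hypothetical and do not correspond to any supplier's product or API. The point it shows is that screening happens next to the stored frames, and only the "positives" are forwarded to the host.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    camera_id: str
    motion_score: float  # hypothetical 0..1 score from a cheap on-drive detector
    payload: bytes       # stand-in for the encoded image


class EdgeCSD:
    """Toy model of a drive that stores frames and screens them locally."""

    def __init__(self, is_positive: Callable[[Frame], bool]) -> None:
        self._frames: List[Frame] = []
        self._is_positive = is_positive

    def write(self, frame: Frame) -> None:
        self._frames.append(frame)

    def stored_bytes(self) -> int:
        return sum(len(f.payload) for f in self._frames)

    def positives(self) -> List[Frame]:
        # Screening runs "on the drive"; only matching frames leave it.
        return [f for f in self._frames if self._is_positive(f)]


csd = EdgeCSD(is_positive=lambda f: f.motion_score >= 0.8)
for i in range(100):
    # Every 25th frame gets a high score to simulate a possible intruder.
    score = 0.9 if i % 25 == 0 else 0.1
    csd.write(Frame("cam-1", score, payload=bytes(1024)))

forwarded = csd.positives()
forwarded_bytes = sum(len(f.payload) for f in forwarded)
print(f"stored {csd.stored_bytes()} bytes, forwarded {forwarded_bytes} bytes")
```

In this simulation the host application, which might run facial recognition, receives four frames rather than 100.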
“When a computer needs to do calculations on a data set, the data needs to be read from storage into memory and then processed,” says Andrew Larssen, an IT transformation expert at PA Consulting.
“As storage sizes normally vastly exceed memory, the data has to be read in chunks. This slows down analytics and makes real-time analytics impossible for most data sets. By having processing capabilities directly in the storage layer, computational storage lets you avoid this.”
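The data-movement point in that description can be illustrated with a small simulation. It is a sketch, not a benchmark: the dataset is an in-memory byte string standing in for a drive, and the function names `host_side_sum` and `in_storage_sum` are invented for this example. It compares how many bytes cross the storage interface when the host reads the data in chunks, against a drive that computes the answer locally and returns only the result.

```python
import struct

RECORD = struct.Struct("<d")  # one little-endian float64 per record


def make_dataset(n: int) -> bytes:
    """Build n records holding the values 0.0, 1.0, ... n-1."""
    return b"".join(RECORD.pack(float(i)) for i in range(n))


def host_side_sum(storage: bytes, chunk_bytes: int):
    """Conventional path: pull every chunk into host memory, then add it up."""
    total, moved = 0.0, 0
    for offset in range(0, len(storage), chunk_bytes):
        chunk = storage[offset:offset + chunk_bytes]
        moved += len(chunk)  # every byte crosses the storage interface
        total += sum(v for (v,) in RECORD.iter_unpack(chunk))
    return total, moved


def in_storage_sum(storage: bytes):
    """Computational path: the drive adds up locally and returns one record."""
    total = sum(v for (v,) in RECORD.iter_unpack(storage))
    result = RECORD.pack(total)  # only this record is sent to the host
    return RECORD.unpack(result)[0], len(result)


data = make_dataset(10_000)  # 80,000 bytes "on the drive"
host_total, host_moved = host_side_sum(data, chunk_bytes=4096)
csd_total, csd_moved = in_storage_sum(data)
print(host_total, host_moved)
print(csd_total, csd_moved)
```

Both paths give the same sum, but the conventional path moves the whole dataset while the in-storage path moves a single eight-byte result. Real gains depend on the workload and the drive's processor, which this model ignores.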
Computational storage architecture
Computational storage is typically based around an ARM Cortex or similar processor, located in front of the storage controller, which is usually NVMe-based. Some CSDs, though, use an integrated compute module and controller.
The computational storage system can also include ASICs or FPGA accelerators, depending on the intended application.
SNIA breaks down current computational storage systems into two broad categories – fixed computational storage services (FCSS) and programmable computational storage services (PCSS).
FCSS are optimised for a specific and compute-intensive task, such as compression or encryption. PCSS can run a host operating system, typically Linux. Both approaches have pros and cons: FCSS should provide the best ratio of performance to cost; PCSS is more flexible. The architecture will also determine whether drivers or application programming interfaces (APIs) are needed, or whether an application could, potentially, run natively on the CSDs. The proposed “Catalina” system, for example, would allow CSDs running Linux to act as data nodes in a Hadoop cluster.
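The difference between the two categories can be sketched as two interfaces. This is an illustrative model only: the class names are hypothetical, it is not taken from the SNIA specification, and `zlib` compression stands in for whatever fixed function a real drive might implement in hardware.

```python
import zlib
from typing import Callable, Dict


class FixedCompressionService:
    """FCSS-style: one built-in function, here compression via zlib."""

    def run(self, data: bytes) -> bytes:
        return zlib.compress(data)


class ProgrammableService:
    """PCSS-style: the host supplies the function to run next to the data."""

    def __init__(self) -> None:
        self._programs: Dict[str, Callable[[bytes], bytes]] = {}

    def load(self, name: str, program: Callable[[bytes], bytes]) -> None:
        self._programs[name] = program

    def run(self, name: str, data: bytes) -> bytes:
        return self._programs[name](data)


block = b"sensor-reading;" * 1000

# Fixed service: nothing to configure, but it can only compress.
fixed = FixedCompressionService()
compressed = fixed.run(block)

# Programmable service: the host decides what runs, here a record counter.
pcss = ProgrammableService()
pcss.load("count_records", lambda d: str(d.count(b";")).encode())
record_count = pcss.run("count_records", block)
print(len(block), len(compressed), record_count)
```

The fixed service needs no host-side program, which is where its cost and performance advantage would come from. The programmable service needs an extra step to load code, which mirrors the need for drivers or APIs on the host.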
And a system might use just CSDs, or a mix of CSDs and conventional storage, although at present a mix is more likely.
Early applications for computational storage are areas where even a single processor can ease bottlenecks, and include data compression, encryption, and RAID management.
But the technology has evolved to create a wider range of use cases. In part, this is being driven by improvements in software and APIs that allow distributed workloads across a number of CSDs. This brings the greatest performance increases.
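A rough sketch of what distributing a workload across several CSDs might look like is given below. The `SimulatedCSD` class and `distributed_count` function are hypothetical, and Python threads stand in for drives working independently, so the example shows the structure of the approach rather than any real speed-up. Each drive scans only its own share of the data and returns a single number, which the host then combines.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


class SimulatedCSD:
    """Toy drive holding a shard of log lines."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = lines

    def count_matches(self, needle: str) -> int:
        # Runs "on the drive": scans local data, returns one integer.
        return sum(1 for line in self._lines if needle in line)


def distributed_count(drives: List[SimulatedCSD], needle: str) -> int:
    # The host fans the request out, then adds up the partial results.
    with ThreadPoolExecutor(max_workers=len(drives)) as pool:
        partials = pool.map(lambda d: d.count_matches(needle), drives)
        return sum(partials)


# 1,000 log lines, every tenth one an error, striped across four drives.
log = [f"event {i} {'ERROR' if i % 10 == 0 else 'ok'}" for i in range(1000)]
drives = [SimulatedCSD(log[i::4]) for i in range(4)]
total_errors = distributed_count(drives, "ERROR")
print(total_errors)
```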
“Use cases include computational edge, machine learning processing, real-time data analytics and HPC [high-performance computing],” says Julia Palmer, a research vice-president at Gartner.
“While this technology is nascent, it has potential to grow substantially. Gartner predicts that by 2024, more than 50% of enterprise-generated data will be created and processed outside the datacentre or cloud. That’s up from less than 10% in 2020.”
Data streaming is another application where CSDs offer benefits.
Computational storage pros and cons
The main advantage of computational storage is the performance increase, which can be significant. Applications that are data-intensive, rather than computationally intensive, stand to benefit most by removing the storage-to-processor bottleneck.
Applications that lend themselves to distributed processing will also perform better, as will those that rely on low latency to function well.
Carefully designed CSD systems also offer significant power savings.
Downsides include increasing the complexity of IT architecture, the need for APIs or for the host to be aware of computational storage services, and the additional costs of adding CPUs to storage devices or storage controllers.
Nor is computational storage a cure for all performance ills. A single-instance CSD provides only limited performance benefits. Applications that work across multiple nodes or can be reconfigured to work that way will perform best.
According to Tim Stammers of 451 Research, computational storage is set to become commonplace – not least because growing data volumes have all but eaten up the performance advantages gained from the move to flash.
Computational storage suppliers
The computational storage market is still very much in development, but these are some of the key suppliers.
Canadian company Eideticom’s NoLoad CSD is claimed to run in peer-to-peer mode without any processing from the host CPU. The supplier uses NVMe over PCIe, powered by FPGAs. Its focus is on storage services, including data compression and deduplication.
Of the mainstream storage suppliers, only Samsung has a product. Its SmartSSD was announced in 2018. It uses a Xilinx FPGA chip. Initial applications include compression, data deduplication and encryption.
Nyriad has an unusual background. Its products were developed initially for the Square Kilometre Array radio telescope. Nyriad developed a CSD that was driven by Nvidia GPUs and could handle data processing at 160TBps.
ScaleFlux was founded as a startup in 2014. Its CSDs can process workloads “in situ”, and its market is the hyperscalers and cloud operators.
NGD makes CSDs powered by ASICs containing ARM cores. Previously, it used FPGAs. Its drives have been used in edge computing projects, and can also be used in non-computational mode as regular storage.