Software runs on data and data is often regarded as the new oil. So it makes sense to put data as close to where it is being processed as possible, in order to reduce latency for performance-hungry processing tasks.
Some architectures call for big chunks of memory-like storage located near the compute function, while, conversely, in some cases, it makes more sense to move the compute nearer to the bulk storage.
In this series of articles we explore the architectural decisions driving modern data processing… and, specifically, we look at computational storage.
The Storage Networking Industry Association (SNIA) defines computational storage as follows:
“Computational storage is defined as architectures that provide Computational Storage Functions (CSF) coupled to storage, offloading host processing or reducing data movement. These architectures enable improvements in application performance and/or infrastructure efficiency through the integration of compute resources (outside of the traditional compute & memory architecture) either directly with storage or between the host and the storage. The goal of these architectures is to enable parallel computation and/or to alleviate constraints on existing compute, memory, storage and I/O.”
Fern writes as follows:
We have already broached this subject in our first analysis here, so let’s now ask whether it is only real-time apps and data services that stand to benefit from computational storage, or are there other definite beneficiaries of this lack of latency?
Everything benefits from reduced latency.
In IT, no matter how big you build it, it will fill up, but you can tweak elements at each level of the architectural stack to improve performance and make the user experience more pleasurable.
Today, data is stored in certain ways purely because of how CPU architecture has evolved as we have built computers, but that approach is not fit for purpose when it comes to accessing the volumes of data available now, or the exponential growth we’ll experience as we approach the quantum age.
Spanning the conventional-quantum mix
As we move into that age, the real latency reduction benefits are likely to be found at the points where conventional and quantum computing mix. Examples include picture and video storage, or the data created by companies such as Google ‘reading’ and storing all the books in the world.
Not all of this data will be stored in quantum computers but, for quantum computers to generate the massive scale performance improvements they promise to, data will still need to be stored in an appropriately accessible form for them to consume.
We need to make the ability to get stuff out quickly and beautifully formed much more prevalent so quantum computers are not sat effectively ‘twiddling their thumbs’. Along with real-time apps and data services, that integration point is the point to get really excited about.
We asked about On-Drive Linux in the brief to this series because to really drive the adoption of Computational Storage Devices (CSDs), some technologists (ARM for one) say that On-Drive Linux will be key.
We further noted that if standard drives rely upon NVMe (non-volatile memory express) protocols to dispatch (or retrieve) chunks of data, then although that process works fine, the solid state drive (SSD) itself will remain blissfully ignorant of what the data it holds actually is, does or relates to: it could be an image file, a video or voice file, a text document, a spreadsheet or something else. Linux, on the other hand, has the power to mount the file system relating to the data that the SSD stores and so achieve awareness of what the blocks of data actually are.
Linux certainly seems to have the most gravitational pull but, as an industry, we need to be more open to looking at alternative ways of driving the adoption of CSDs.
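The difference between a block-level view and a content-aware view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "drive" is a hypothetical in-memory byte string, and the names BLOCK_SIZE, read_block, pad and sniff_type are invented for this example rather than taken from NVMe or any real driver. A real file system maps blocks to files through its own metadata; checking well-known file signatures here simply stands in for that awareness.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch

# A few well-known file signatures ("magic numbers").
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png image",
    b"\xff\xd8\xff": "jpeg image",
    b"%PDF-": "pdf document",
    b"PK\x03\x04": "zip container (e.g. docx, xlsx)",
}


def read_block(drive: bytes, lba: int) -> bytes:
    """Block-level view: return the raw bytes at a logical block address."""
    start = lba * BLOCK_SIZE
    return drive[start:start + BLOCK_SIZE]


def sniff_type(first_block: bytes) -> str:
    """Content-aware view: guess what the data is from its leading bytes."""
    for signature, label in MAGIC.items():
        if first_block.startswith(signature):
            return label
    return "unknown"


def pad(data: bytes) -> bytes:
    """Pad data with zero bytes to a whole number of blocks."""
    remainder = len(data) % BLOCK_SIZE
    if remainder == 0:
        return data
    return data + b"\x00" * (BLOCK_SIZE - remainder)


# Hypothetical drive contents: a PNG header in block 0, a PDF header in block 1.
drive = pad(b"\x89PNG\r\n\x1a\n" + b"pixels") + pad(b"%PDF-1.7 text")

# The block view only knows it handed back a fixed number of bytes.
print(len(read_block(drive, 0)))

# The content-aware view can say what each block's data appears to be.
print(sniff_type(read_block(drive, 0)))
print(sniff_type(read_block(drive, 1)))
```

In this sketch the read path never changes; only the layer that interprets the bytes does, which is the role the brief ascribes to a mounted file system on the drive.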
A lot of this is about mathematics and signal processing. There are lots of different ways of doing this that people overlook, simply because they’re used to doing it with Linux and therefore assume it’s the right tool for the job. In some cases it may be, but in others it may not, and there will be other ways of doing things that are better.
Every time we add a new layer of abstraction, there’s an opportunity to do things differently. That means taking the time to have a look at what’s going on ‘under the bonnet’, to identify the optimal storage approach that supports requirements initially but will also provide the storage and processing architecture needed to support requirements we may not even be aware of as yet.
Standardisation can be stifling
So will standardisation milestones be the next key requirement for this technology?
In fact, there’s more up for debate here than you might think. For example, just because you’ve been told you need cloud storage, how you use it is still open to discussion.
Yes, we can do things quickly and efficiently but too often these capabilities fall into a big heap when we put governance and standards around them. Over-standardising bakes in old thinking and assumptions, meaning alternative opportunities and solutions just aren’t even considered.
To use a political analogy, our current system of parliament was created with local MPs to represent people who didn’t have a horse so couldn’t get to London. If you were to create a political system for the modern age, you would tear up the current system and start again.
Sometimes legacy systems create more problems than they solve but it’s easy to overlook simple solutions because they’re not part of the current canon. Standardisation works well when talking about compatibility of computer sockets and so on, but it can be stifling when it comes to dealing with what’s going on under the covers.
As the IT world focuses on joining capabilities together across networks, understanding how the platforms they are built on operate – and why – is crucial. We shouldn’t think of what goes on under the bonnet as being finished – it’s a moving picture that needs to evolve all the time if we’re to realise its potential while protecting and processing data.
Over-standardising computational storage could actually hamper this progress, rather than supporting it.