Is the commodity server bandwagon running out of steam?

Freeform Dynamics

The term “paradigm shift” gets flung about far too much in IT, but we may be at one of those few times that genuinely deserves it. For decades now, specialised hardware devices have been losing ground to software running on cheaper, general-purpose hardware.

More recently, virtualisation and containerisation have added even more flexibility and abstraction. In many ways IT has become software-defined.

But now we are rediscovering the limitations of that general-purpose hardware. Even virtualised appliances need to connect to the physical world at some point, whether it’s to store and retrieve data or to receive work and communicate the results.

And as networking becomes more demanding, and as more applications incorporate compute-heavy AI techniques such as machine learning, general-purpose hardware on its own is no longer adequate.

Specialist hardware is coming back into fashion

So specialist hardware is back in mainstream fashion, and a new model is emerging for IT. Whether we call it “composable” or “offload”, it’s a model where general-purpose servers are not the dominant force. Instead, they become the hosts or hubs for a range of specialised capabilities. Whether you see them as “merely hosts” or as “essential hubs” is a matter of perspective.

This specialist hardware can take several forms and names. The GPU is the best-known and most widely adopted, whether for AI or for its original purpose of graphics processing. But we’re now seeing a lot of attention paid to the DPU, or data processing unit, and its close cousins, the infrastructure and services processing units (IPU and SPU).

The DPU name is a bit confusing – don’t all computers process data? But what suppliers such as Broadcom, Intel, Marvell, Nebulon and Nvidia mean is that these are devices specially designed to handle all the functions involved in moving and storing data. So that’s the likes of TCP and RDMA network acceleration, data compression, network virtualisation and data encryption.

Again, this might look familiar – it’s what SmartNICs have been doing for several years now, while mainframes and minicomputers have had I/O offload processors for rather longer than that. Indeed, some will argue that it’s long been possible to reconfigure or expand commodity servers to better fit specific workloads.

The need for speed is spreading

I see two major differences now, though. One is that it’s no longer just specific workloads – it’s a broader and more general need. The other is that the DPU expands considerably on earlier options like the SmartNIC, adding more compute power, memory and programmability, and enabling higher degrees of composability. For instance, where a SmartNIC merely extends the host server, the DPU can act as an infrastructure endpoint in its own right, delivering network and storage services to its host.

But how long will we need these specialist devices – won’t these capabilities eventually be integrated into the processor? After all, we are already seeing AI features built into standard processor chips from the likes of Apple, ARM, IBM and Intel.

Well, yes and no. AI features such as the custom matrix multiply unit in Apple’s M1 processor and the on-chip acceleration in IBM’s Telum are mainly aimed at inferencing – that is, the application of a pre-built AI model to new data. They’re also going to be useful for other work, such as analytics, but less so for the job of building and training those AI models in the first place.

Going close to the wire

So it is with DPUs – they can actually be used to run applications, as they include processor elements (typically ARM-based), but fundamentally they are all about being in or very close to the network or data plane. So you might run a firewall on one (this is especially relevant as more and more inter-server traffic runs east-west within your network), but perhaps not a database or ERP system.

And that need for composability won’t go away. Of course, not everyone or every application will need a GPU, DPU, IPU or SPU, anyway. But as workloads grow in depth and complexity it’s going to be important to know where the bottlenecks and opportunities are, so composable computing will inevitably grow in relevance.