When IBM unveiled the world’s first 2nm chipset in May 2021, capable of fitting up to 50 billion transistors onto a chip the size of your fingernail, a brave new world seemed on the horizon. However, it could be some years yet before 2nm benefits cascade down to datacentre power densities, efficiencies and sustainability.
Rumour has it that a 3nm Intel chip will not appear until Lunar Lake in 2024. A 17th-generation chip for client devices, Nova Lake, is expected to follow, with a hoped-for 50% boost to CPU performance and the largest change to the architecture since Core in 2006. That would make it roughly contemporaneous with Diamond Rapids for the server, in 2025 or thereabouts.
The dominant chipmaking giant continues to play its cards slowly, and close to its chest.
Intel declined to speak to ComputerWeekly for this feature, but Uptime research director Daniel Bizo also suggests that performance improvements will be gradual rather than revolutionary.
“Really, what it comes down to is the very nature of semiconductor physics,” says Bizo. “So this is not a new phenomenon. It improves at a slower pace than the physical density of it, across the span of a decade.”
From 2022, Intel’s 10nm Sapphire Rapids Xeon chips are expected, with a higher power envelope and a multi-tile chiplet design. They will incorporate integrated, dynamic, high-bandwidth memory for enhanced capacity and data storage closer to the processor, which will be helpful for certain workloads, such as those that require lower latencies, says Bizo.
The Intel roadmap unofficially posted on Reddit in mid-2021 suggested a 10% CPU performance boost in 2022 with Raptor Lake followed by a “true chiplet or tile design” in Meteor Lake, more or less keeping pace with AMD and Apple.
Xeon's Sapphire Rapids has been tipped to offer 56 cores, 112 threads and thermal design power (TDP) of up to 350W. Key rival AMD is expected to offer up to 96 cores and 192 threads at up to 400W TDP with its EPYC Genoa processors as well as improved cache, IO, PCIe lanes and DDR5 capabilities.
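As a rough illustration of what those tipped figures imply, dividing TDP by core count gives a crude watts-per-core number. The short Python sketch below uses only the core counts and TDPs quoted above. The `watts_per_core` helper is a hypothetical name introduced for this example. TDP is a thermal design ceiling rather than measured consumption, and these are pre-release figures, so the result is indicative only.

```python
# Tipped figures from the article; pre-release, so treat as approximate.
chips = {
    "Intel Sapphire Rapids": {"cores": 56, "tdp_watts": 350},
    "AMD EPYC Genoa": {"cores": 96, "tdp_watts": 400},
}


def watts_per_core(cores: int, tdp_watts: float) -> float:
    """Crude thermal budget per core: TDP divided by core count."""
    return tdp_watts / cores


for name, spec in chips.items():
    print(f"{name}: {watts_per_core(**spec):.2f} W per core")
```

On these numbers, the higher core count spreads a larger total thermal budget more thinly per core, while the heat each socket must shed still rises.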
So it might be some time before chipset innovations on their own can assist with datacentre pressures.
Bizo adds: “When you keep integrating more cores, it’s difficult to keep pace with memory. You can add memory channels, which we’ve done, but it becomes costly after a certain point. You need logic boards with much more wiring, or you’ll run out of pins.”
Revising the software stack
Feeding processors with data without adding even more memory channels and modules should prove a more effective way of serving extreme-bandwidth applications. However, Uptime also considers that to drive performance and efficiency via the latest chips, examining and revising the software stack is becoming crucial.
“Towards the mid-2020s, upcoming chips are not really going to offer that much excitement if people are not willing really to shake up the application stack and the way they run the infrastructure,” says Bizo.
“You can gain efficiencies if you change your practices around things like workload consolidation and software virtualisation – with many more virtual machines on the same server or perhaps consider software containers.”
That can be the bottom line, says Bizo, noting that the Skylake generation of scalable server chips, which emerged from 2017 at 14nm, consumed less power when idling than the latest chips do today.
Anthony Milovantsev, partner at tech consultancy Altman Solon, says the reality is that we will firmly be in the standard paradigm of silicon substrate, CMOS transistors and Von Neumann architecture for the foreseeable future.
He adds that while quantum computing is generating activity, use cases are a small subset of what is required – although datacentres to house a quantum machine will eventually look very different, perhaps with cryogenic cooling, for example.
“If they do need quantum capacity at all, normal enterprises will almost surely consume it as a service, rather than own their own,” says Milovantsev.
“More near-term, compound semiconductors have interesting properties allowing for higher clock speed operations, but they have been around a while and there are significant drawbacks versus silicon dioxide. So this will continue to be niche.”
So Milovantsev agrees with Bizo that chip innovation is likely to rely on continued incremental improvements in transistor process nodes like 3nm, as well as innovations such as gate-all-around RibbonFETs, or usage of innovative die packaging, such as 2.5D with silicon interposers or true 3D die stacking.
However, he points to Arm/RISC for datacentre chip developments for improved price performance or niche HPC workloads. Examples include hyperscalers such as Amazon Web Services (AWS) moving into Arm/RISC with Graviton or Nvidia’s announced Grace CPU for high-performance computing (HPC).
“The net result of all this, though, is only marginal like-for-like power reduction at the chip level,” says Milovantsev. “In fact, the main result is rather higher power densities as you cram more transistors into small form factors to serve the ever-growing need for compute power. The power density – and therefore the datacentre cooling – problem will only become more important over time.”
Once, unless you were a hyperscaler, a datacentre hosting infrastructure-as-a-service (IaaS) companies, or a cryptomining operation, you probably didn’t need high power densities or the robust cooling to support them. Of course, things are changing as enterprises more broadly make use of analytics, big data and machine learning.
“High-end datacentre CPUs from Intel and AMD have historically had TDPs in the 100-200W range,” says Milovantsev. “Current top-end AMD EPYC or Intel Ice Lake are already above 250W, and Intel Sapphire Rapids in late 2022 will be 350W.”
Milovantsev advises explicitly linking the right applications to the right hardware with the right cooling and power systems in the right type of hall or facility, although business units will increasingly be asking for, and buying, chips with higher TDP envelopes.
Datacentres should devise a menu of vetted cooling options to work with, and work out how to route higher-amperage power using modern busbars, in addition to using the right server monitoring tools, says Milovantsev.
Nigel Gore, global high density and liquid cooling lead at Vertiv, points out that, historically, datacentres have been designed to support a rack power density of 3-5kW, but today’s high-performance systems support 10-20 times the power density.
“The chip vendors are always talking about what is the performance per watt consumed, with every single advancement and roadmap looking for an incremental performance gain over their previous generation,” says Gore. “So when you look at cooling these chipsets, you need airflow and heatsink to be able to dissipate that amount of heat and you need to watch humidity.”
Often today, equipment is running nearer the top end of its operating parameters, which is why liquid cooling solutions have been getting more traction, particularly at the higher end, with Intel now also considering liquid cooling as important for newer chipset designs.
As we have seen, those incremental gains for a number of years ahead look quite modest.
But Gore also suggests keeping an eye on news about the GPU-like accelerator module being developed by members of the Open Compute Project.
“It will have a number of combinations depending on how they package the performance system,” he says. “But it’ll include ASICs and high-speed interconnects for memory and it’s really designed around high density and performance to support automation and machine learning.
“You can put eight of these devices into one server. Multiply them by their TDP measurement – that’s eight times 700W. In one server, you’ve got 5.6kW of thermal density.”
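Gore’s arithmetic can be checked in a few lines, and set against the 3-5kW rack design range he cites for older datacentres. The Python sketch below is illustrative only: the `accelerator_load_kw` function name is hypothetical, and the calculation counts accelerator TDP alone, leaving out CPUs, memory, fans and power-supply losses, so a real server would draw more.

```python
def accelerator_load_kw(count: int, tdp_watts: float) -> float:
    """Total accelerator TDP for one server, in kilowatts."""
    return count * tdp_watts / 1000.0


# Figures quoted by Gore: eight modules at 700W each.
server_kw = accelerator_load_kw(count=8, tdp_watts=700)

# Historical rack design range quoted earlier in the article (kW).
legacy_rack_kw = (3.0, 5.0)

print(f"Accelerator load per server: {server_kw:.1f} kW")
print(f"Exceeds legacy rack upper bound: {server_kw > legacy_rack_kw[1]}")
```

On these figures, a single fully populated server already exceeds the whole-rack budget that many older halls were designed around.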
Datacentres that are not yet supporting machine learning, artificial intelligence and these high-end, rich HPC applications may be running lower power densities. While they don’t have an immediate need to deploy the latest chipsets, more organisations will soon be looking at bringing in advanced applications – and then they will have that performance need, says Gore.
“In mid-2020, we were seeing high-density racks 30-35kW,” he adds. “Very quickly, after six months, it went to 45kW, and this year we started to see design consultants talking about supporting densities of 60kW.”
Fausto Vaninetti, technical solutions architect for cloud infrastructure and software at Cisco EMEA and Russia (EMEAR), notes that while we wait for new chipset designs, a focus on standalone or modular servers with enough motherboard footprint for airflow and to accommodate heatsinks – paying attention to fans and power supply efficiencies – can be useful.
After all, CPU technology is evolving, but so are power requirements.
Acceleration and special-purpose devices are also becoming more and more common and they require specific attention. GPU cards or persistent-memory modules, for example, have high power consumption and cooling needs, says Vaninetti.
“Intel Xeon Platinum 8380 scalable processors have TDP of 270W, the AMD EPYC 7763 a TDP of 280W, with next-gen CPUs expected to push the envelope to 350W,” he adds. “Support of higher-performance CPUs is critical to enable a balanced configuration of six to eight GPUs or pools of persistent memory to be attached to a server.”
AMD’s “very high” PCIe lane count is most useful in rack servers that can attach resources such as many NVMe drives or PCIe cards, and in evolving form factors like Cisco UCS X-Series, says Vaninetti. Intel has some CPUs capable of high clock speeds that are more appropriate for certain workloads, he adds.
Cisco's UCS X-Series was focused on airflow and power efficiency and can still support top-end configs without “an unusable amount of power input” being required, says Vaninetti. However, over the next decade, liquid cooling will be one technology that becomes essential to support higher power densities – dependent on individual datacentre constraints, he points out.
“Chassis-level or perhaps rack-level options may allow the customer to maintain their existing datacentre-level air cooling,” he adds. “The other major consideration when it comes to server farms and related equipment is the way you manage and operate it.”