
Broadcom unveils Jericho4 for distributed AI networking across datacentres

Global semiconductor and infrastructure software solutions provider scales clusters of more than one million specialised processing units beyond the limits of a single facility

Even as artificial intelligence (AI) becomes indispensable for network operations and enabling tomorrow’s businesses, AI models are growing in size and complexity, with infrastructure requirements exceeding the power and physical limits of datacentres. To address this, Broadcom has introduced the Jericho4 Ethernet fabric router, a purpose-built platform for the next generation of distributed AI infrastructure.

Explaining the tasks that the new platform is designed for, Broadcom said distributing specialised processing units (XPUs) across multiple facilities – each provisioned with tens to hundreds of megawatts of power – requires a new class of router, optimised for very high-bandwidth, secure and lossless transport across regional distances.

Designed to interconnect more than a million XPUs across multiple datacentres, Jericho4 is claimed by Broadcom to break through traditional scaling limits with “unmatched” bandwidth, security and “lossless” performance. With Jericho4 joining the existing Tomahawk 6 and Tomahawk Ultra lines, Broadcom believes it can offer a complete networking portfolio for high-performance computing (HPC) and AI.

“The Jericho4 family is engineered to extend AI-scale Ethernet fabrics beyond individual datacentres, supporting congestion-free RoCE and 3.2Tbps HyperPort for unprecedented interconnect efficiency,” said Ram Velaga, senior vice-president and general manager of Broadcom’s Core Switching Group. “Scale Up Ethernet (SUE), Tomahawk Ultra, Tomahawk 6 and Jericho4 all play a very important role in enabling large-scale distributed computing systems within a rack, across racks and across datacentres in an open and interoperable way.”

A single Jericho4 system scales to 36,000 HyperPorts, each operating at 3.2Tbps. With deep buffering, line-rate MACsec and Remote Direct Memory Access over Converged Ethernet (RoCE) transport over distances of more than 100km, Broadcom says Jericho4 enables a truly distributed AI infrastructure unconstrained by the power and space limitations of a single location. Broadcom’s 3.2T HyperPort technology consolidates four 800GE links into a single logical port, eliminating load-balancing inefficiencies, boosting utilisation by up to 70% and streamlining traffic flow across large fabrics.
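As a rough back-of-the-envelope check of the quoted figures, the HyperPort and system-level numbers line up as follows (an illustrative arithmetic sketch based on the publicly stated claims, not Broadcom code; the constant names are the author's own):

```python
# Illustrative arithmetic only: figures are Broadcom's public Jericho4
# claims; the variable names here are invented for this sketch.
GBPS_PER_800GE_LINK = 800      # one 800 Gigabit Ethernet link
LINKS_PER_HYPERPORT = 4        # HyperPort bonds four 800GE links

hyperport_gbps = GBPS_PER_800GE_LINK * LINKS_PER_HYPERPORT
print(f"HyperPort speed: {hyperport_gbps / 1000} Tbps")  # 3.2 Tbps

MAX_HYPERPORTS = 36_000        # quoted single-system scale
total_pbps = MAX_HYPERPORTS * hyperport_gbps / 1_000_000
print(f"Aggregate fabric bandwidth: {total_pbps} Pbps")  # 115.2 Pbps
```

Presenting the four bonded links as one logical port means the fabric's load-balancer no longer has to spread flows across four separate physical interfaces, which is where the claimed utilisation gain comes from.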

Jericho4 also supports MACsec encryption on every port at full speed to protect data moving between datacentres. Broadcom claims this delivers strong security without compromising performance, even at the highest traffic loads.

Manufactured on a 3nm process, Jericho4 features Broadcom’s 200G PAM4 SerDes with industry-leading reach. This is said to eliminate the need for extra components such as retimers, resulting in lower power usage, reduced cost and higher system reliability.

A number of manufacturers have already had access to the new technology. Commenting on its potential, Nokia’s vice-president of hardware, Jeff Jakab, said: “Broadcom’s Jericho4 family of silicon delivers the scale, performance and efficiency we need to push AI infrastructure to the next level. As AI workloads stretch across datacentres and regions, Nokia’s 7250 IXR routers – powered by Jericho4 – ensure high-throughput, lossless connectivity for the most demanding distributed AI systems. Broadcom continues to be our trusted partner in helping Nokia meet the evolving needs of the AI era.”

Michael KT Lee, senior vice-president for research and development at Accton, said his company has successfully delivered systems to customers using Broadcom’s distributed disaggregated chassis (DDC) scheduled fabric solutions. “With the availability of Jericho4, Accton is looking forward to collaborating with Broadcom to design new platforms that scale out the AI network further, incorporating features such as MACsec, long-reach 200G SerDes and UEC as the building blocks for evolving demands of scale-out AI clusters, while improving the energy efficiency and the modular flexibility needed for mega-scale GPU clusters,” he said.

