Nvidia prepares for exponential growth in AI inference
The company, famous for its datacentre AI acceleration, is focused on delivering better performance per watt to fuel the AI boom
Chipmaker Nvidia has reported revenue of $57bn for the third quarter of its fiscal 2026, with its datacentre business contributing the most to the company’s bottom line, posting revenue of $51bn – a 66% year-over-year (YoY) increase.
CEO Jensen Huang said the company continued to see growth in artificial intelligence (AI) workloads, which require the high-performance graphics processing units (GPUs) that Nvidia specialises in.
According to Huang, AI inference is scaling exponentially due to advancements in pre-training, post-training and reasoning capabilities. He said that inference is becoming increasingly complex as AI systems now “read, think and reason” before generating answers. Huang claimed this exponential growth in computation requirements is driving demand for Nvidia’s platforms.
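To see why reasoning inflates inference compute, consider that a model which emits a chain of intermediate “thinking” tokens does proportionally more generation work than one that answers directly. The sketch below is a minimal back-of-the-envelope model only; the 70-billion-parameter size, the token counts and the roughly 2N-FLOPs-per-token approximation are all illustrative assumptions, not figures from Nvidia.

```python
# Back-of-the-envelope: inference compute for a direct answer vs. a
# "read, think and reason" answer. All numbers are illustrative.

PARAMS = 70e9                  # assumed model size: 70 billion parameters
FLOPS_PER_TOKEN = 2 * PARAMS   # common ~2N FLOPs-per-token approximation

def generation_flops(reasoning_tokens: int, answer_tokens: int) -> float:
    """Approximate decode-phase FLOPs to generate one response."""
    return (reasoning_tokens + answer_tokens) * FLOPS_PER_TOKEN

direct = generation_flops(reasoning_tokens=0, answer_tokens=300)
reasoned = generation_flops(reasoning_tokens=4_000, answer_tokens=300)

print(f"direct answer:   {direct:.2e} FLOPs")
print(f"reasoned answer: {reasoned:.2e} FLOPs")
print(f"reasoning compute multiplier: {reasoned / direct:.1f}x")
```

With these assumed token counts, the reasoned response costs roughly 14 times the compute of the direct one, which is the kind of multiplier behind Huang’s claim that reasoning drives demand for more acceleration.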
Beyond its datacentre AI accelerator GPUs, the company’s NVLink AI networking infrastructure business grew 162%, with revenue of $8.2bn.
Huang said: “Customer interest in NVLink Fusion continues to grow. We announced a strategic collaboration with Fujitsu in October where we will integrate Fujitsu’s CPUs and Nvidia GPUs via NVLink Fusion, connecting our large ecosystems. We also announced a collaboration with Intel to develop multiple generations of custom datacentre and PC products connecting Nvidia and Intel’s ecosystems using NVLink.”
Among the areas Nvidia sees as a big differentiator is performance per watt, a metric directly linked to the running costs of high-performance compute in datacentres. Discussing the breakthroughs in GPUs, Huang said: “In each generation, from Ampere to Hopper, from Hopper to Blackwell, Blackwell to Rubin, our part of the datacentre increases.”
He said that each generation of GPU sees a major increase in performance, but that performance needs to be delivered within the power limits of the datacentre. “You still only have one gigawatt of power in a one-gigawatt datacentre,” he said. “Your performance per watt translates directly to your revenues, which is the reason why choosing the right architecture matters so much now.”
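Huang’s arithmetic here is straightforward to reproduce: at a fixed power budget, token throughput – and therefore serving revenue – scales linearly with performance per watt. The sketch below illustrates this with hypothetical efficiency and pricing figures; none of the numbers come from Nvidia.

```python
# Revenue at a fixed power budget scales directly with performance per watt.
# All figures below are hypothetical, for illustration only.

POWER_BUDGET_W = 1e9        # "one gigawatt of power in a one-gigawatt datacentre"
PRICE_PER_MTOKEN = 2.00     # assumed price: $2 per million tokens served
SECONDS_PER_DAY = 86_400

# Assumed whole-system efficiency (tokens generated per second per watt)
# for two hypothetical accelerator generations.
perf_per_watt = {
    "current generation": 0.05,
    "next generation": 0.08,
}

for name, tokens_per_sec_per_watt in perf_per_watt.items():
    tokens_per_day = tokens_per_sec_per_watt * POWER_BUDGET_W * SECONDS_PER_DAY
    revenue_per_day = tokens_per_day / 1e6 * PRICE_PER_MTOKEN
    print(f"{name}: {tokens_per_day:.2e} tokens/day -> ${revenue_per_day:,.0f}/day")
```

Because the power budget is fixed, the 60% efficiency gain assumed for the newer generation translates into exactly 60% more daily revenue, which is the point Huang is making about architecture choice.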
When asked about the single biggest bottleneck that could constrain Nvidia’s growth, Huang said: “What Nvidia is doing obviously has never been done before. We’ve created a whole new industry. On the one hand, we are transitioning computing from general-purpose and classical or traditional computing to accelerated computing and AI. On the other hand, we created a whole new industry called AI factories. The idea that for software to run, you need these factories to generate it, generate every single token instead of retrieving information that was pre-created.”
He said the transition requires “extraordinary” skill, adding: “The most important thing that we have to do is a good job planning. We plan up the supply chain, down the supply chain. We’ve established a whole lot of partners, and so we have a lot of routes to market.”
However, one market that is now effectively closed off is China. Huang said: “Sizable purchase orders never materialised in the quarter due to geopolitical issues and the increasingly competitive market in China. While we were disappointed in the current state that prevents us from shipping more competitive datacentre compute products to China, we are committed to continued engagement with the US and China governments and will continue to advocate for America’s ability to compete around the world.”
In what could be regarded as a carefully crafted message to the White House, he added: “To establish a sustainable leadership and position in AI computing, America must win the support of every developer and be the platform of choice for every commercial business, including those in China.”
Overall, Huang wants Nvidia to deliver the best value to customers, saying: “At this point, I’m very confident that Nvidia’s architecture is the best performance per TCO [total cost of ownership]. It is the best performance per watt, and therefore for any amount of energy that is delivered, our architecture will drive the most revenues.”
Read more about AI inference
- What are the storage requirements for AI training and inference? Storage for AI must cope with huge volumes of data that can multiply rapidly as vector data is created, plus lightning-fast I/O requirements and the needs of agentic AI.
- Forget training, find your killer apps during AI inference: Pure Storage executives talk about why most artificial intelligence projects are about inference during production, and why that means storage must respond to capacity needs and help optimise data management.
