As AI costs spiral, Dell pitches return to on-premise datacentres

With agentic AI driving up public cloud consumption, Dell Technologies is pitching local and hybrid infrastructure to shield enterprises from soaring token costs

Enterprises that have been rushing to adopt agentic artificial intelligence (AI) systems are consuming tokens at an unprecedented rate, resulting in exorbitant monthly bills from the major public cloud suppliers.

Dell Technologies is looking to capitalise on the ensuing bill shock with new hardware and software offerings unveiled at its customer conference this week, betting that the future of enterprise AI is local, secure and shielded from variable cloud pricing.

“What we’re starting to see with our customers is that the amount of tokens generated is increasing faster than token costs are coming down, which means that the overall bill for customers is going up very high,” said Varun Chhabra, senior vice-president of infrastructure solutions group at Dell, during a media briefing ahead of the conference.

To illustrate the point, Jon Siegal, senior vice-president for Dell’s client solutions group, noted that a single developer within Dell recently burned through one billion tokens in 24 hours, racking up a $3,400 cloud bill in a single day.
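For a sense of scale, the two figures Siegal quoted can be turned into a blended per-token rate. The short sketch below only divides one quoted number by the other; the function name is illustrative, and the result says nothing about which model or pricing tier was involved.

```python
def cost_per_million_tokens(total_cost_usd: float, tokens: int) -> float:
    """Return the blended cost in USD per one million tokens."""
    return total_cost_usd / tokens * 1_000_000


# Figures quoted in the briefing: one billion tokens, $3,400 in a day.
rate = cost_per_million_tokens(3_400, 1_000_000_000)
print(f"${rate:.2f} per million tokens")  # prints "$3.40 per million tokens"
```

The per-token rate looks small on its own; the bill comes from the volume of tokens that an agentic workload generates.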

In response, Dell is introducing Dell Deskside Agentic AI, an on-premise sandbox for building, testing and running AI agents locally. Powered by Nvidia NemoClaw and running on high-performance Dell workstations capable of supporting models from 30 billion up to a trillion parameters, the offering ensures sensitive data never leaves the corporate environment.

Siegal noted that running agentic AI entirely on-premise with open models can reduce enterprise spend by up to 87% over a two-year horizon compared to public cloud APIs, with a break-even point in as little as three months.
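The relationship between a break-even point and a savings percentage can be sketched with a simple cost model. This is not Dell's methodology, which the briefing did not detail: the dollar amounts below are hypothetical placeholders, and the model assumes a flat monthly cloud bill, a one-off hardware purchase and negligible on-premise running costs.

```python
import math
from typing import Optional


def break_even_month(upfront_usd: float, cloud_monthly_usd: float,
                     onprem_monthly_usd: float = 0.0) -> Optional[int]:
    """First month in which cumulative on-premise cost no longer exceeds cloud cost."""
    monthly_saving = cloud_monthly_usd - onprem_monthly_usd
    if monthly_saving <= 0:
        return None  # on-premise never catches up under these inputs
    return math.ceil(upfront_usd / monthly_saving)


def savings_fraction(upfront_usd: float, cloud_monthly_usd: float,
                     onprem_monthly_usd: float = 0.0, months: int = 24) -> float:
    """Fraction of cloud spend avoided over the horizon by running on-premise."""
    cloud_total = cloud_monthly_usd * months
    onprem_total = upfront_usd + onprem_monthly_usd * months
    return 1 - onprem_total / cloud_total


# Hypothetical inputs: a $10,000 monthly cloud bill and $30,000 of hardware.
print(break_even_month(30_000, 10_000))           # prints 3
print(f"{savings_fraction(30_000, 10_000):.1%}")  # prints "87.5%"
```

Under those simplifying assumptions, hardware that pays for itself in three months avoids 87.5% of the cloud spend over 24 months, which is in line with the "up to 87%" figure. Power, cooling and staffing costs would lower the saving and push the break-even point out.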

“The workstation is really becoming that free token generator for the right use cases,” Siegal explained. “Agentic AI, more than anything else, is most cost-effective when it's near the data.”

Bringing frontier models to the datacentre

Historically, the most powerful frontier models have been locked behind public cloud walls. But Dell is looking to tear down those walls through a series of high-profile partnerships, bringing advanced models on-premise or into hybrid settings for data sovereignty and performance.

Dell announced that Google Gemini models will now run on-premise via Google Distributed Cloud on Dell PowerEdge servers. Additionally, the company is collaborating with Palantir to bring its Foundry and AI platforms on-premise, and teaming up with SpaceX AI to bring Grok’s advanced reasoning and multimodal capabilities to on-premise or hybrid environments for customers.

“I cannot stress how big of a deal this is,” Chhabra said. “These are some of the world’s most powerful frontier models that have so far only been available in the cloud...giving customers more choice, flexibility on where they want to run these models, and bringing all of these models closer to their data and their enterprise workloads.”

During the conference keynote, executives from industrial and pharmaceutical giants also took the stage to detail how they are using on-premise AI infrastructure.

Diogo Rau, executive vice-president and chief information and digital officer at Eli Lilly, explained how the company relies on a Dell supercomputer equipped with more than 1,000 Nvidia GPUs to simulate complex protein interactions for drug discovery and digitally inspect manufacturing lines in milliseconds. Meanwhile, Suresh Venkatarayalu, chief technology officer of Honeywell, described deploying physical AI servers directly at industrial sites to drive autonomous operations where real-time decision-making is critical.

To support intensive AI models side by side with traditional workloads, Dell also announced a total hardware and software refresh for its flagship storage array: PowerStore Elite.

Boasting up to three times the input/output operations per second (IOPS), density and throughput of previous generations, PowerStore Elite uses new E3 drives, removes the NVRAM cache to maximise usable capacity, and pushes Dell’s data reduction guarantee to an industry-leading 6:1 ratio.

“The question isn’t just what can this platform do today? It is, will this decision still make sense a year from now? What happens when my workloads change? What happens when my costs shift?” Chhabra noted. “This is exactly why PowerStore Elite matters.”

On the compute side, Dell unveiled the 18th-generation PowerEdge servers, touted as the broadest single-socket lineup the company has ever shipped. Delivering up to a 70% performance improvement over the previous generation and a 13:1 server consolidation ratio, the new servers will all ship with quantum-safe firmware in preparation for 2027 post-quantum cryptography mandates.

For organisations grappling with the physical deployment of AI fabrics, Dell also introduced the Dell PowerRack, where AI compute, network and storage are engineered as a scalable unit, and the Dell PowerCool CDU-C7000, a compact cooling distribution unit delivering over 220 kW of liquid cooling capacity for high-density GPUs such as Nvidia’s Rubin.

Meanwhile, Dell is also streamlining its security portfolio. The company introduced PowerProtect One, a unified cyber resilience platform that merges the capabilities of PowerProtect Data Manager and Data Domain into a single control plane, reducing deployment time by up to 75%.

To help organisations improve resilience against cyber attacks, Dell also unveiled CyberDetect, an AI-powered analytics tool that deeply inspects data at the byte level to identify ransomware corruption. Boasting 99.99% accuracy, it allows IT teams to definitively know which data is clean after an attack, turning ransomware recovery “from uncertainty into AI-powered, evidence-based assurance”.

As Dell brings these major updates to market, its message to enterprise IT leaders is clear: the infrastructure to support scalable, cost-predictable AI is ready today and the financial models of the past no longer apply.

“AI is not just changing technology, it’s changing the economics of technology in favour of enterprise infrastructure,” Dell Technologies’ CEO Michael Dell said in his keynote address. “Now is the time to decide how you can most cost-effectively generate the tokens that you’re going to need for the long term.”
