Agentic AI: Storage and ‘the biggest tech refresh in IT history’
We talk to Jeff Denworth of Vast Data about a future where employees are outnumbered by artificial intelligence agents and even smaller enterprises may need supercomputing levels of resources
With agentic artificial intelligence (AI), we could be facing the biggest tech refresh event in history, where every organisation might deploy up to 2,000 agents per employee.
And to meet that need, the entire IT infrastructure – and storage in particular – will be affected.
Those are the views of Jeff Denworth, co-founder of Vast Data, who talks in this podcast about the infrastructure challenges agentic AI poses for IT departments, the particular demands it places on storage, and how customers can begin to meet those challenges across their datacentres and the cloud.
This includes being very careful to clearly specify and provision infrastructure while not over-buying, as well as ensuring storage and compute work hand in hand with application architectures and database teams.
What extra challenges does agentic AI pose for the IT infrastructure?
It’s a very broad question. But, to start, I think it’s important to point out that this is in some respects an entirely new form of business logic and a new form of computing.
And so, to start: agentic systems are reasoning models coupled with agents that perform tasks by leveraging those models, as well as different tools that have been allocated to them to help them accomplish their tasks ... and these models need to run on very high-performance machinery.
Today’s AI workloads often run best on GPUs [graphics processing units] and other types of AI accelerators. And so, the first question becomes, how do you prepare the compute infrastructure for this new form of computing?
And here, customers talk about deploying AI factories and RAG [retrieval augmented generation], and AI agent deployment tends to be the initial use case people think about as they start to deploy these AI factories.
These are tightly coupled systems that require fast networks that interconnect very, very fast AI processors and GPUs, and then connect them to different data repositories and storage resources that you might want to go and feed these agents with.
The interesting thing about agentic infrastructure is that agents can ultimately work across a number of different datasets, and even in different domains. You have essentially two types of agents – workers, and supervisory agents that coordinate them.
So, maybe I want to do something simple like develop a sales forecast for my product while reviewing all the customer conversations and the different databases or datasets that could inform my forecast.
Well, that would take me to having agents that work on and process a number of different independent datasets that may not even be in my datacentre. A great example: if you want something to go and process data in Salesforce, the supervisory agent may use an agent that has been deployed within Salesforce.com to handle the part of the business system it wants to process data from.
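As a rough illustration of that supervisor/worker pattern – the class names, data sources and task below are hypothetical, not any particular vendor’s framework – a minimal Python sketch might look like this:

```python
# Illustrative sketch of the supervisor/worker agent pattern described above.
# All class names and data sources are hypothetical; real agentic frameworks
# (and SaaS-hosted agents, such as one running inside Salesforce) expose
# their own APIs.

from dataclasses import dataclass


@dataclass
class WorkerAgent:
    """Processes one independent dataset, possibly outside our datacentre."""
    name: str
    data_source: str  # e.g. a CRM system or a sales data warehouse

    def run(self, task: str) -> dict:
        # In practice this would call a reasoning model plus the tools
        # allocated to the agent; here we return a placeholder result.
        return {
            "agent": self.name,
            "source": self.data_source,
            "result": f"summary of {self.data_source} for '{task}'",
        }


class SupervisorAgent:
    """Fans a task out to worker agents and aggregates their results."""

    def __init__(self, workers: list[WorkerAgent]):
        self.workers = workers

    def run(self, task: str) -> list[dict]:
        # A real supervisor would also plan, retry and reconcile conflicts.
        return [worker.run(task) for worker in self.workers]


if __name__ == "__main__":
    supervisor = SupervisorAgent([
        WorkerAgent("crm_agent", "salesforce_opportunities"),    # hosted in the SaaS platform
        WorkerAgent("warehouse_agent", "sales_data_warehouse"),  # on-premise warehouse
    ])
    for result in supervisor.run("develop a quarterly sales forecast"):
        print(result)
```

In this sketch, the worker running against Salesforce data would in practice be the agent hosted inside the SaaS platform itself, with the supervisor only orchestrating and aggregating results.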
So, the first question becomes, how do you define this pipeline? How do you scope out all of the various data sources that you may want to process on? How do you size for what you would think is kind of a nominal operational workload, so that you’ve got enough compute resources for the steady state?
There are so many different facets of decision-making that come into play when people think they want to start deploying agentic workloads
Jeff Denworth, Vast Data
And then, the compute discussion takes you down the path of datacentre and power infrastructure readiness, which is a whole different kettle of fish, because some of these new systems – for example, the GB200 NVL72 systems from Nvidia – are very tightly coupled racks of GPUs with very fast networks between them. These require something like 120kW per datacentre rack, which most customers don’t have.
And then you start working through the considerations: what are my GPU requirements, and where can I deploy them? In a colo? In a datacentre I already have? Is it potentially hosted in some cloud or neo-cloud environment? Neo-clouds are new AI clouds born in the era of AI. There are so many different facets of decision-making that come into play when people think they want to start deploying agentic workloads.
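As a back-of-the-envelope illustration of that power problem – the 120kW-per-rack figure is the one quoted above, while the rack count and cooling overhead are assumptions made purely for the arithmetic:

```python
# Back-of-the-envelope power sizing for tightly coupled GPU racks.
# The ~120kW/rack figure comes from the discussion above; the rack count
# and power usage effectiveness (PUE) are illustrative assumptions.

RACK_POWER_KW = 120   # e.g. a GB200 NVL72-class rack
NUM_RACKS = 4         # hypothetical small AI factory
PUE = 1.3             # assumed facility overhead for cooling and distribution

it_load_kw = RACK_POWER_KW * NUM_RACKS
facility_load_kw = it_load_kw * PUE

print(f"IT load:       {it_load_kw} kW")            # 480 kW
print(f"Facility load: {facility_load_kw:.0f} kW")  # ~624 kW
```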
What are the key challenges for storage infrastructure, in particular, in agentic AI?
Well, just as with the first question, it’s really multidimensional.
I think the first thing to size up is: what is storage in agentic AI? This is something that has radically changed since people started training AI models. Most people generally worked under the assumption that if you have a good and fast file system, that’s good enough. The difference here is that when people are training in the AI sense, and even fine-tuning, these are often very well-curated datasets that get fed into AI machinery, and you wait a few hours or a few days, and out pops a new model.
And that’s the level of interaction you have with underlying storage systems, other than that storage system also needing to capture periodic checkpoints to make sure that, if the cluster fails, you can recover from some point in time in a job and restart from there.
If you think about agents, a user gets on a system and issues a prompt, and that prompt sends the agent off to do some almost unpredictable amount of computing, where the AI model will then go and work with different auxiliary datasets.
And it’s not just conventional storage, like file systems and object storage, that customers need. They also need databases. If you saw some of the announcements from Databricks, they’re talking about how AI systems are now creating more databases than humans are. And data warehouses are particularly important as AI agents look to reason across large-scale data warehouses.
So, anything that requires analytics requires a data warehouse. Anything that requires an understanding of unstructured data not only requires a file system or an object storage system, but also a vector database to help AI agents understand what’s in those file systems, through a process called retrieval augmented generation.
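A minimal sketch of that retrieval augmented generation flow might look like the following – the toy embedding function and in-memory index stand in for a real embedding model and vector database, and the example documents are invented:

```python
# Minimal sketch of retrieval augmented generation (RAG) over unstructured data.
# The embed() function and the in-memory index are stand-ins for a real
# embedding model and vector database; a production system would also chunk
# documents and pass the retrieved context to an actual reasoning model.

import math


def embed(text: str) -> list[float]:
    # Toy embedding: normalised character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# "Vector database": document text indexed by its embedding.
documents = [
    "Q2 customer escalations and churn analysis",
    "Sales pipeline and forecast assumptions for EMEA",
    "Datacentre power and cooling upgrade plan",
]
index = [(doc, embed(doc)) for doc in documents]

# Retrieval step: rank documents by relevance to the agent's prompt; the top
# results would then be handed to the reasoning model as context.
query = "build a sales forecast"
q_vec = embed(query)
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)
print(ranked[0][0])  # most relevant document to feed the model
```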
The first thing that needs to be wrestled down is a reconciliation of this idea that there’s all sorts of different data sources, and all of them need to be modernised or ready for the AI computing that is about to hit these data sources.
I like to look at what’s changed and what hasn’t changed in the market. It’s true that there are all sorts of new applications being deployed that use reasoning agents, and they use reasoning models as part of their business logic. But there are also a lot of legacy applications that are now being up-levelled to support this new type of AI computing.
And so, our general conclusion is that every single business application in the future will have some component of AI embedded in it. And there will be a whole bunch of new applications, also AI-centric, that we haven’t planned for or that don’t exist yet.
The common thread is that there’s a new style of computing happening at the application level, on a new type of processor that historically was not popular within the enterprise – a GPU or an AI processor. But I think the thing people don’t realise is that the datasets they’ll be processing are largely historic data.
So, whereas the opportunity to modernise a datacentre is greenfield at the application level and at the processor or compute level, there is a brownfield opportunity to modernise the legacy data infrastructure that today holds the value and the information these AI agents and reasoning models will look to process.
We may be embarking on what could be the world’s largest technology refresh event in history
Jeff Denworth, Vast Data
Then the question becomes, why would I modernise, and why is this important to me? That’s where scale comes back into the equation.
I think it’s important to checkpoint where we’re at with respect to agentic workflows and how that will impact the enterprise. It’s fair to say that pretty much anything routine or process-bound in how business gets done will be automated as much as possible.
There are now examples of many organisations that are not thinking about a few agents across the enterprise, but hundreds of thousands, and in certain cases, hundreds of millions of agents.
Nvidia, for example, made a very public statement that they’re going to be deploying 100 million agents over the next few years. And that would be at a time when their organisation will be 50,000 employees. Now, if I put these two statements together, what you have is roughly a 2,000 to one AI agent-to-employee ratio that you might think about planning for.
If this is true, a company of 10,000 employees would require large-scale supercomputing infrastructure just to process this level of agency. So, I think about it in terms of what the drivers are to modernise infrastructure. If even a fraction of this level of AI agent scale starts to hit a standard business, then every single legacy system holding its data will be incapable of supporting the computational intensity that comes from this level of machinery.
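Making that arithmetic concrete – the 2,000:1 ratio follows from the Nvidia figures quoted above, while the per-agent request rate is purely an assumed number for illustration:

```python
# Worked version of the scaling arithmetic above. The 2,000:1 ratio is derived
# from the public Nvidia figures quoted (100 million agents, ~50,000 employees);
# the requests-per-agent figure is an illustrative assumption only.

AGENTS_PER_EMPLOYEE = 100_000_000 / 50_000   # = 2,000
EMPLOYEES = 10_000                           # the hypothetical company above
REQUESTS_PER_AGENT_PER_DAY = 100             # assumed workload per agent

total_agents = int(AGENTS_PER_EMPLOYEE * EMPLOYEES)
daily_requests = total_agents * REQUESTS_PER_AGENT_PER_DAY

print(f"Agents to support:    {total_agents:,}")    # 20,000,000
print(f"Daily agent requests: {daily_requests:,}")  # 2,000,000,000
```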
And this is the thing that has us thinking we may be embarking on what could be the world’s largest technology refresh event in history. Probably the most recent one up until AI hit the market was virtualisation, which created new demands at the storage and database level. That same thing appears to be true for AI, as different customers we work with start to rethink data and storage infrastructure for large-scale agentic deployment.
How can customers ensure their infrastructure is up to the job for agentic AI?
It definitely requires some level of focus and an understanding of the customer workload.
But one of the things I see happening across the market is also over-rotation, where infrastructure practitioners will not necessarily understand the needs that come from either new business logic or AI research.
And so, they tend to overcompensate for the unknown. And that’s also pretty dangerous, because that creates a bad taste in the mouth for organisations that are starting to ramp into different AI initiatives when they realise, OK, we overbought here, we bought the wrong stuff here.
The first thing I would say is that there are best practices out in the market that should definitely be adhered to. Nvidia, for example, has done a really terrific job of helping articulate what customers need and sizing according to different GPU definitions, such that they can build infrastructure that’s general-purpose and optimised, but not necessarily over-architected.
The second thing that I would say is that hybrid cloud strategies definitely need to be reconciled, not just for infrastructure-as-a-service – do I deploy stuff in my datacentre? do I deploy some stuff in different AI clouds or public clouds? – but also for different SaaS [software-as-a-service] services.
The reason is that a lot of agentic work will happen there. You now have, for example, Slack, which has its own AI services in it. Pretty much any major SaaS offering also has an AI sub-component that includes some number of agents. The best thing to do is sit down with the application architects team, which a lot of our storage customers don’t necessarily have a close connection to.
The next thing is to sit down with the database teams. Why? Because enterprise data warehouses need to be rethought and reimagined in this world of agentic computing, but also because new types of databases are required, in the form of vector databases. These have different requirements at the infrastructure and compute level, as well as at the storage level.
Finally, there needs to be some harmonisation around what will happen in the datacentre and across different clouds. You need to talk to the different vendors you work with – helping people through this has become a whole practice in itself.
We’re powering roughly 1.2 million GPUs around the world, and there are all sorts of interesting approaches not only to sizing, but also to future-proofing data systems by understanding how to continue to scale if different AI projects stick and prove to be successful.
Read more about storage and AI
Storage technology explained – AI and data storage: In this guide, we examine the data storage needs of artificial intelligence, the demands it places on data storage, the suitability of cloud and object storage for AI, and key AI storage products.
AI storage – NAS vs SAN vs object for training and inference: Artificial intelligence operations can place different demands on storage during training, inference, and so on. We look at NAS, SAN and object storage for AI and how to balance them for AI projects.