
How AI is driving a rethink of storage architecture
Pure Storage’s head of AI infrastructure, Par Botes, argues that the growing use of artificial intelligence will require storage systems with audit trails and versioning to ensure trust and traceability
A common refrain in the tech industry is that artificial intelligence (AI) needs fast storage. While true, speed is not the only consideration when supporting the growing number of AI workloads, according to Par Botes, vice-president of AI infrastructure at Pure Storage.
The real challenge, he contends, is redesigning storage for a world of autonomous AI agents, where trust and data lineage are essential. This requires a rethink of storage architecture, going beyond performance to build new auditing capabilities directly into the storage layer.
“No storage system has been designed that way,” Botes told Computer Weekly in a recent interview. Current systems treat files and objects as mutable items to be read from and written to. This is not enough for what Botes calls “systems of consequence”, in which AI interactions have real-world impact and must be verifiable for compliance and quality assurance purposes.
To address this, Botes is advocating for storage systems that record every interaction an AI agent has with data. “The second you move away from simple chatbots into systems of consequence, where data has an impact, you’d want to have an auditor,” he said. “Every interaction involving agents should be recorded for auditing reasons, and you’d want a copy of the conversation.”
Pure Storage is looking to build versioning and full data lineage capabilities directly into its products, with every update creating a new version of a data object rather than overwriting the previous one. “For an object that you’ve written to six times, you’d want to be able to retrieve the copy that was four versions ago,” Botes explained. “You can see the whole lineage and what was used by which AI query at what time.”
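To make the idea concrete, here is a minimal sketch, in Python and purely illustrative rather than Pure Storage's implementation, of an object that keeps every write as a new immutable version and logs which query read which version:

```python
import time
from dataclasses import dataclass, field

@dataclass
class VersionedObject:
    """Toy illustration: every write appends an immutable version,
    and every read is logged against the version it returned."""
    key: str
    versions: list = field(default_factory=list)   # payloads, oldest first
    audit_log: list = field(default_factory=list)  # (timestamp, query_id, version)

    def write(self, payload):
        self.versions.append(payload)
        return len(self.versions) - 1               # new version number

    def read(self, query_id, version=None):
        if version is None:
            version = len(self.versions) - 1        # default to the latest copy
        self.audit_log.append((time.time(), query_id, version))
        return self.versions[version]

# An object written to six times can still serve the copy from four
# versions ago, and the log records which AI query used it and when.
obj = VersionedObject("training-doc")
for i in range(6):
    obj.write(f"revision {i}")
print(obj.read(query_id="agent-42", version=len(obj.versions) - 1 - 4))
```

The design point is that nothing is overwritten: because writes are append-only, the lineage of every object and every query that touched it falls out of the data structure itself.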
This capability, while not ready yet, is key to fostering trustworthy AI. “The version of the data is just as important as the version of the model to build the outcome,” said Botes. “So, you’d want to have that full lineage and history.”
While the new storage architecture is about more than just speed, performance remains critical, as AI workloads demand high throughput per unit of capacity. In addition, AI algorithms can scan vast datasets – including archived data in long-term storage – at high speed, effectively blurring the distinction between hot and cold data.
That could very well speed up the transition to the all-flash datacentre. “I think in the world of AI, flash is super hard not to use because you don’t have cold datasets the way you used to, as AI looks at all the data, all the time,” said Botes.
However, flash memory has a finite lifespan. Each memory cell degrades every time it undergoes a program-erase cycle. Over time, this makes the cell slower until it’s eventually marked as failed. Traditionally, small, low-power controllers on each storage device use basic statistical models to manage wear and tear, applying broad-stroke adjustments to voltages and timings across thousands of cells at a time.
Botes and his team, however, have developed sophisticated AI models that can manage flash media at a granular, per-cell level. By analysing the precise response time and behaviour of every single cell over its lifetime, the models can make micro-adjustments that improve performance and durability beyond what traditional controllers can achieve.
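A toy sketch of the per-cell idea, with entirely hypothetical names and adjustment formulas rather than Pure Storage's actual models, might track each cell's program-erase count and observed latency and derive an individual correction from them, instead of applying one setting across a whole block:

```python
from collections import defaultdict

class CellWearTracker:
    """Illustrative only: per-cell wear tracking and a per-cell adjustment."""
    def __init__(self):
        self.pe_cycles = defaultdict(int)    # program-erase count per cell
        self.latencies = defaultdict(list)   # observed response times per cell

    def record(self, cell_id, latency_us):
        self.pe_cycles[cell_id] += 1
        self.latencies[cell_id].append(latency_us)

    def voltage_offset(self, cell_id):
        """Toy rule: slower, more-worn cells get a larger correction."""
        history = self.latencies[cell_id]
        if not history:
            return 0.0
        avg_latency = sum(history) / len(history)
        return 0.001 * self.pe_cycles[cell_id] + 0.01 * avg_latency

tracker = CellWearTracker()
tracker.record(cell_id=1024, latency_us=55.0)
print(tracker.voltage_offset(1024))
```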
The data used to train and improve these models comes from telemetry sent back by Pure Storage devices deployed in the field. “Every customer sends back telemetry to us every 20 seconds, so we have billions of data points every day on how the hardware and chips are used,” he said.
In fact, the insights are so valuable that Pure Storage licenses the knowledge back to memory chip manufacturers, helping them design better silicon based on real-world usage rather than laboratory assumptions. “We get this virtuous cycle where we learn more about the chips and feed that back to manufacturers to make the chips better,” said Botes.
Meanwhile, Pure has also been helping organisations manage the complexity of AI data. It has developed a distributed control plane designed to make an enterprise’s entire storage footprint, both on-premises and in the cloud, operate as a single unified system, allowing administrators to define service levels, performance characteristics and protection policies for different classes of data.
“It makes many small storage components look like a singular large storage plane,” said Botes. “You stop worrying about how it’s run and just focus on the outcomes you want.”
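As a rough sketch, assuming a hypothetical policy interface rather than Pure Storage's actual control plane, declaring storage classes once and letting the system place data against them could look something like this:

```python
from dataclasses import dataclass

@dataclass
class StorageClass:
    """Hypothetical service-level definition for a class of data."""
    name: str
    min_throughput_mbps: int
    replication: int
    snapshot_interval_min: int

CLASSES = {
    "training-hot":    StorageClass("training-hot",    5000, 2, 15),
    "archive-lineage": StorageClass("archive-lineage",  200, 3, 1440),
}

def place(dataset, class_name, backends):
    """Pick any backend, on-premises or cloud, that meets the class's service level."""
    policy = CLASSES[class_name]
    for backend in backends:
        if backend["throughput_mbps"] >= policy.min_throughput_mbps:
            return {"dataset": dataset, "backend": backend["name"], "policy": policy}
    raise RuntimeError("no backend satisfies the requested class")

backends = [{"name": "on-prem-flash", "throughput_mbps": 8000},
            {"name": "cloud-object",  "throughput_mbps": 400}]
print(place("embeddings-v3", "training-hot", backends))
```

The administrator describes the outcome once, and the placement logic, not the administrator, decides which of the many small storage components actually holds the data.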
Read more about storage and AI
- Storage technology explained – AI and data storage: In this guide, we examine the data storage needs of artificial intelligence, the demands it places on data storage, the suitability of cloud and object storage for AI, and key AI storage products.
- Storage technology explained – vector databases at the core of AI: We look at the use of vector data in AI and how vector databases work, plus vector embedding, the challenges for storage of vector data and the key suppliers of vector database products.
- We talk to Rainer Kaese of Toshiba about the right temperature at which to run hard disk drives. Getting it wrong risks higher failure rates than would normally be expected.
- AI is currently a key project for enterprise IT, but the difference between success and failure rests on having the right infrastructure, especially data storage.