AI drives storage array makers to embrace data management
We look at the efforts of storage suppliers to move to data management, often driven by AI – but analysts highlight the contradiction between supplier and customer needs
One way of viewing efforts by storage suppliers to move into data management over the past couple of years is that storage technology is emerging from the backroom and wants to be at the centre of efforts to gain value from data for artificial intelligence (AI) – and more widely across the business.
Nearly all the storage array makers now have a data management play, going almost as far as subsuming their entire offer beneath it. We’ve all grown to appreciate the value inherent in data – notably for AI – but also the value of being able to move it easily to where it’s needed.
So, why are array suppliers making such an effort to widen their offerings into data management? Responses range from those who put data and fleet management at the core to those who emphasise utility for AI workloads. Often, however, there is an underlying contradiction between what suppliers and customers need.
What data management offerings are available from storage suppliers?
Dell’s data management push is based around Dell AI Factory, which comprises a portfolio of infrastructure and services for AI. Core to this, and built on Dell storage, is the Data Lakehouse, which can combine metadata from data held on Dell storage with additional data ingested from other sources for use in AI.
Dell also has Project Lightning, a parallel file system. It is at an early project stage and aims to be a next-generation file system for Dell’s scale-out unstructured data storage. Meanwhile, Dell’s Apex as-a-service cloud console brings unified access to all its pay-as-you-go offerings.
Hitachi Vantara’s approach has been less explicitly AI-centric. Its Virtual Storage Platform One (VSP One) aims to deliver a software-defined storage architecture with a common data plane across block, file and object storage, and across hybrid and multicloud environments. VSP One appeared in late 2023 and addresses the issue that customers often have a data environment that spans multiple datacentres and public cloud providers.
HPE has undergone a major revamp of its storage product line, with the all-NVMe Alletra family and a cloud-based data services console, the Data Services Cloud Console (DSCC). DSCC also works with HPE Nimble and Primera storage to provide a common software-as-a-service (SaaS) control plane.
Customers pick the type of application workload, service level and capacity they need, and HPE’s AI-driven InfoSight provides recommendations on the best way to optimise the system, with some ability to move data between HPE storage subsystems.
IBM appears to have a strategy based around a wide range of technologies and products – databases, the lakehouse concept and so on – but no single, defined data management platform.
NetApp’s marketing focus has been more narrowly on storage for AI – specifically, on knowing, understanding and being able to use data more effectively via AI tools. Here, NetApp has built around the idea of its NetApp Data Platform, which aims to create a “metadata fabric” that gives AI workloads access to data with strong levels of timeliness and integrity, and simplifies the AI data pipeline. Its MetaData Engine – part of AI Data Engine – allows customers to extract data from ONTAP and manage it via the BlueXP control plane console across on-premises and cloud environments.
Pure Storage has been vocal and prominent in this space. It offers a range of platforms through which customers can provision, manage and upgrade their Pure fleet. Some of these have existed for some time, such as Pure1, which provides operational management as well as the ability to trigger upgrades.
More recently, Pure added the Fusion control plane, which brought the ability to provision storage by performance profile across the customer storage estate, including in the cloud, and to shift data between storage instances. Pure capped all this in 2025 with the introduction of Enterprise Data Cloud, which adopts a cloud operating model that abstracts storage provisioning away from the individual array.
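The point of such a control plane is that capacity is requested by intent – a performance profile and a size – rather than against a named array, with placement decided at fleet level. As a purely illustrative sketch, that kind of request might look like the following in Python; the endpoint, payload fields and profile names are hypothetical and do not represent Pure’s actual Fusion API.

```python
# Generic illustration of intent-based provisioning against a fleet-level
# storage control plane. The endpoint, payload fields and profile names are
# hypothetical examples, not Pure's actual Fusion API.
import requests

request_body = {
    "name": "analytics-scratch",
    "capacity_gb": 2048,
    "performance_profile": "high-iops",   # the control plane decides placement
    "protection": "snapshots-hourly",
}

resp = requests.post(
    "https://storage-control-plane.example.internal/v1/volumes",  # hypothetical endpoint
    json=request_body,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # for example, which array or cloud instance the volume landed on
```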
Another supplier throwing its weight behind managing data for AI pipelines is Vast Data. It has the stated aim of providing an “AI operating system” that spans the storage, data and application layers. It also promotes its Vast Data Platform as an “AI data repository” with data warehouse capabilities that include Vast Event Broker – with Kafka API event streaming – to connect data at ingestion with archived data. More recently – and planned for general availability in late 2025 – Vast has announced its AgentEngine, which will allow customers to deploy and manage AI agents.
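Because the event broker presents a Kafka-compatible API, standard Kafka client libraries can, in principle, be pointed at it. The snippet below is a minimal sketch using the open source kafka-python client to publish an ingestion event; the broker address, topic name and event fields are hypothetical illustrations rather than Vast-specific details.

```python
# Minimal sketch: publishing a data-ingestion event to a Kafka-compatible
# event broker using the open source kafka-python client.
# Broker address, topic name and event payload are hypothetical.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="event-broker.example.internal:9092",  # hypothetical endpoint
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# An application (or the storage layer itself) could emit an event like this
# each time new data lands, so downstream AI pipeline stages can react to it.
event = {
    "path": "/datasets/raw/images/batch-0042.tar",
    "action": "ingested",
    "size_bytes": 734003200,
}
producer.send("data-ingestion-events", value=event)
producer.flush()
```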
Huawei has developed its own data lake software and aims at a full-stack approach to data storage and pipelining for AI workloads. Its Data Management Engine (DME) is core to this and provides a central management interface to Huawei storage, third-party storage, switches and hosts via APIs.
Functionality in DME includes Huawei’s data warehouse, a vector database, a data catalogue, data lineage, version management and access control.
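To illustrate the role a vector database plays in such a stack: stored items are represented as embedding vectors and queries are answered by nearest-neighbour search over them. The snippet below is a conceptual sketch of that lookup using NumPy cosine similarity – it is not Huawei’s DME interface, and the identifiers and embedding values are placeholders.

```python
# Conceptual sketch of the nearest-neighbour lookup a vector database performs.
# Not Huawei's DME interface; identifiers and embeddings are placeholder values.
import numpy as np

# A tiny "index": each stored item has an identifier and an embedding vector.
ids = ["report-q1.pdf", "sensor-log.csv", "design-spec.docx"]
index = np.array([
    [0.12, 0.88, 0.05],
    [0.91, 0.03, 0.10],
    [0.15, 0.80, 0.20],
])

def top_k(query: np.ndarray, k: int = 2):
    # Cosine similarity between the query vector and every stored vector.
    sims = index @ query / (np.linalg.norm(index, axis=1) * np.linalg.norm(query))
    best = np.argsort(sims)[::-1][:k]
    return [(ids[i], float(sims[i])) for i in best]

# A query embedding (in practice produced by the same model that embedded the data).
print(top_k(np.array([0.10, 0.85, 0.10])))
```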
What do analysts think of these initiatives?
Industry analysts recognise that the storage makers are responding to a genuine shift in how data is valued, but also that supplier and customer needs can conflict.
Tony Lock, an analyst with Freeform Dynamics, sees these moves by storage suppliers towards wider data management as part of that evolution. “The core matter is that when commercial computing took off, almost all of the focus was on the compute side – data was simply the raw input from which output was created by the systems,” he says. “Over the course of the past decade, this has changed dramatically as the business value of data has been recognised and it can be turned into information on which actions can be taken.”
But at the same time, analysts also recognise that these initiatives tend towards lock-in for customers. Marc Staimer of Dragon Slayer Consulting puts it this way: “You could say this storage supplier evolution is defensive. Not that it doesn’t have value – it does. There is definitely high value, but it locks those customers in for the long term. The open multi-supplier data management approach provides equivalent and better value – just not to the storage suppliers.”
Staimer points out that suppliers such as Hammerspace, Arcitecta and Komprise, among others, take a multi-supplier approach, with the ability to manage data across storage platforms from different vendors.
Roy Illsley, chief analyst at Omdia, also points to this contradiction between the motives of suppliers and the needs of customers. “The most obvious pitfall is around the question, ‘Is this data platform able to work with any data held in any storage?’,” he says. “While the storage suppliers push this approach, they have a simple issue – they build storage, but it needs to be more a data-centric view.”
Read more about data management
- Podcast: Data management and storage strategy in the AI era. We talk to Pure Storage EMEA field chief technology officer Patrick Smith about the challenges of data management in an era of AI and data proliferation, and how storage functionality can help.
- Data retention in the UK: How long should you keep data? We look at data retention periods, what the key laws and regulations say, how long they recommend to keep different kinds of data, and the software tools that can help.
- AI pushes data storage need but UK firms struggle to manage it. The rise of AI means potentially almost any corporate data could be useful, but has led to ballooning data volumes and organisations spending more on storage and energy.
