
Podcast: Data management and storage strategy in the AI era
We talk to Pure Storage EMEA field chief technology officer Patrick Smith about the challenges of data management in an era of AI and data proliferation, and how storage functionality can help
In this podcast, we talk to Pure Storage EMEA field chief technology officer Patrick Smith about the issues of data management in an era where the upsurge of artificial intelligence (AI) has created ballooning volumes of data, often unstructured.
Here, Smith talks about the challenges presented by data silos and data sprawl, and the need to know where data is and what it contains, and to establish governance and control over it. He particularly focuses on how all of that has been made more challenging by AI workloads.
Core to the discussion is how storage product functionality can support a data strategy by automating the discovery, management and cyber resilience of data at fleet level.
What do you think is the core challenge for organisations and their data?
I think this is a really interesting one, because for the last couple of years, all we’ve really been talking about is the world of AI. And that’s been dominating not just the news, but the technology landscape.
But what we’ve realised is, whilst everyone’s talking about training large models and inferencing at scale, actually, for a lot of enterprises, the big challenges around AI have exposed a lot of foundational weaknesses in their data landscapes.
So, all the big challenges we’ve been talking about for years, if not decades, such as data silos and the sprawl of data across organisations, AI has just made even worse, because now we have more data that we have to try to understand and bring into the AI environment.
But also, who owns the data? How can you get access to that data? What’s the governance and control on the data? And so, all of those things are bringing to a head the fact that we’ve never really had a good, holistic data management platform. And then if you align that with the growth in data caused by AI, you’re driving potentially even more manual operations and reactive management.
We know that organisations have problems predicting how quickly an AI project will grow and how successful or unsuccessful it might be. So, you have a lack of certainty around investment at the data layer. All of that is, I think, giving organisations some really big challenges as they look beyond AI to the broader technology landscape.
What do you think are the key components of a solution to making data more manageable and gaining value from it?
Our view is that as part of understanding that data environment and data landscape, you need to be able to effectively create a virtualised cloud of data and understand where the data sits. [You need to make] sure the data is on the right platform, providing the right level of reliability, availability and performance, and simplify the management of that landscape so you can actually focus more on managing the data and less on managing the storage.
Our view is that that is driven through automated data management, ultimately autonomous data management, where you actually don’t have to worry about where the data sits. It’s taken care of for you based on policy, process and automation across the estate.
And then in today’s climate, one of the really big areas of concern around data is cyber resilience. And by automating [it], what we see is the ability to take out the risk of misconfiguration, which may mean that you’re not protecting the data properly: maybe you’re not applying snapshots, or maybe data replication isn’t being applied properly.
Being able to automate all that takes the risk of human error out of the equation and really provides a much more consistent data and storage environment, one that is better able to serve the needs of the business and lower the risk.
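To make the misconfiguration point concrete, here is a minimal sketch of an automated protection audit: one policy is checked against every volume, flagging missing or infrequent snapshots and missing replication. All names here (`ProtectionPolicy`, `Volume`, `audit`) and the example values are hypothetical and chosen for illustration; they do not represent any Pure Storage API or the approach Smith's company actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtectionPolicy:
    """Hypothetical fleet-wide protection requirements."""
    max_snapshot_interval_hours: int
    replication_required: bool


@dataclass(frozen=True)
class Volume:
    """Hypothetical view of one volume's current protection settings."""
    name: str
    snapshot_interval_hours: Optional[int]  # None means no snapshot schedule
    replicated: bool


def audit(volumes, policy):
    """Return a dict mapping volume name to the policy violations found."""
    findings = {}
    for vol in volumes:
        problems = []
        if vol.snapshot_interval_hours is None:
            problems.append("no snapshot schedule")
        elif vol.snapshot_interval_hours > policy.max_snapshot_interval_hours:
            problems.append("snapshots too infrequent")
        if policy.replication_required and not vol.replicated:
            problems.append("replication not enabled")
        if problems:
            findings[vol.name] = problems
    return findings


if __name__ == "__main__":
    # Assumed example policy: snapshot at least every 4 hours, always replicate.
    policy = ProtectionPolicy(max_snapshot_interval_hours=4, replication_required=True)
    fleet = [
        Volume("finance-db", snapshot_interval_hours=1, replicated=True),
        Volume("ai-training-set", snapshot_interval_hours=None, replicated=False),
        Volume("hr-share", snapshot_interval_hours=24, replicated=True),
    ]
    for name, problems in audit(fleet, policy).items():
        print(f"{name}: {', '.join(problems)}")
```

Because the check runs against a single policy definition rather than per-device settings, a volume created without snapshots or replication shows up as a finding instead of going unnoticed.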
How does all this translate into storage strategy?
In terms of the storage strategy, storage is obviously the enabler for today’s data landscapes.
There’s the risk that the complexity and scale of the storage environment means that organisations are overwhelmed by managing that storage landscape and don’t have the time to properly manage the data landscape.
We need to flip that on its head and remove the overhead of managing storage, so that you move away from managing storage as individual devices and instead manage at fleet level, across the enterprise, based on policy, process and automation.
That frees up organisations to focus on the data, managing the data, ensuring that they’re serving the business’s consumers in the best possible way and freeing up the organisation from the mundane tasks that we can now do through technology and applying that technology broadly across the environment.
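As an illustration of managing at fleet level rather than per device, the sketch below resolves the desired settings for every array from a short list of policies matched by tag. The `Array`, `Policy` and `plan` names, the tag scheme and the setting keys are all assumptions made for this example, not features of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Array:
    """Hypothetical storage array identified by name and descriptive tags."""
    name: str
    tags: frozenset


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy applied to every array carrying all selector tags."""
    name: str
    selector: frozenset  # an empty selector matches every array
    settings: dict


def plan(fleet, policies):
    """Compute desired settings per array; later policies override earlier ones."""
    desired = {}
    for array in fleet:
        settings = {}
        for policy in policies:
            if policy.selector <= array.tags:  # selector is a subset of the tags
                settings.update(policy.settings)
        desired[array.name] = settings
    return desired


if __name__ == "__main__":
    policies = [
        Policy("baseline", frozenset(), {"snapshot_hours": 24}),
        Policy("production", frozenset({"prod"}), {"snapshot_hours": 1, "replicate": True}),
    ]
    fleet = [
        Array("lon-01", frozenset({"prod", "emea"})),
        Array("lab-01", frozenset({"dev"})),
    ]
    for name, settings in plan(fleet, policies).items():
        print(name, settings)
```

The design choice is that administrators edit the policies, never the individual arrays, so adding a new array with the right tags is enough for it to inherit the correct settings.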