
How ANZ organisations can address challenges in AI adoption

Pure Storage's global CTO Alex McMullan discusses the data and sustainability challenges in artificial intelligence adoption, which can be addressed by centralising datasets and focusing on data quality and management

This article can also be found in the Premium Editorial Download: CW Asia-Pacific: CW APAC – Trend Watch: AI infrastructure

Major industries such as mining, financial services and logistics in Australia and New Zealand (ANZ) are already adopting artificial intelligence (AI), but they need help to manage and curate the growing volume of data they are capturing before they can use it to train AI models.

And while various open source AI models are available for download, the challenge is in managing the associated data pipelines to get results that deliver high business value from them.

Those are some of the key stumbling blocks faced by AI adopters in ANZ, according to Alex McMullan, global chief technology officer (CTO) at Pure Storage, who emphasised the importance of data quality – rather than the number of graphics processing units (GPUs) available – so organisations can create and fine-tune AI models to drive business outcomes.

Organisations will also need to ask whether they can deploy AI efficiently in their own datacentres, run GPUs at a sufficiently high utilisation level, or if they should turn to a cloud or managed service provider, McMullan told Computer Weekly. The answer, he suggested, depends, again, on how good they are at managing data and the industry they operate in.

Areas where Pure customers are applying AI include genomics in healthcare, fraud detection in finance, and the forecasting of earthquakes and volcanic eruptions. “A lot of our big customers are building very large datasets with a lot more historical data,” he said.

McMullan noted that the location of the data can be a significant issue, but it doesn’t necessarily tip the scales in one direction or the other. For example, a subset of a massive dataset can be uploaded to a cloud provider and used to train a model. That model can then be deployed on-premise, where it is close to the entirety of the data and users throughout the organisation. This approach also helps with data governance.


“The trend we observe is that datasets are going to centralise in one place, whether you want to call that a data warehouse or an unstructured data lake,” said McMullan, adding that this includes bringing back data from tape or Amazon S3 Glacier archival storage and adding it to those pools of active and accessible data.

“Broadly, centralisation is part of the story. Nobody can afford to run a distributed GPU cluster across different datacentres, in different clouds. That just isn’t practical at large scale,” he said.

Either way, something that successful deployments have in common is that the necessary effort went into cleaning, labelling and categorising the data drawn from a variety of sources.

With the pressure on power grids and datacentres from electric vehicles and AI clusters alike, organisations will also need to consider the technology’s impact on infrastructure and carbon emissions. “I think the standard value proposition for Pure was always true in terms of energy, efficiency, security and reliability,” he said.

McMullan said Pure has been asked by its customers to run an entire datacentre in a single rack, noting that “we genuinely believe we have a pretty good trajectory in terms of engineering to get to 100PB [petabytes] a rack over the next few years, which is enough for every modern company, with all its historical data.”

Achieving that “will give folks enough power budget to run more GPUs, or faster networks, or more CPU [central processing unit] cores to process, store, analyse and innovate in the whole ecosystem”, he added.

Currently, Pure customers are consuming around 1W per terabyte of stored data. “We want to get to 0.1W per terabyte over the next couple of years. And that’s a great sustainability message, but also a datacentre message for customers trying to do more.”
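To put those figures in context, the arithmetic can be sketched in a few lines of Python. This is an illustration only, not a Pure Storage calculation: it assumes decimal units (1PB = 1,000TB), applies both the current and the target power figures to the hypothetical 100PB rack mentioned above, and uses a function name, `rack_power_kw`, invented for this example.

```python
# Illustrative arithmetic only; the 100PB rack is a stated future goal,
# and combining it with today's 1W/TB figure is a hypothetical comparison.
TB_PER_PB = 1_000        # assumption: decimal units, 1PB = 1,000TB
WATTS_PER_KW = 1_000


def rack_power_kw(capacity_pb: float, watts_per_tb: float) -> float:
    """Return the power draw in kW for a given capacity and W/TB figure."""
    return capacity_pb * TB_PER_PB * watts_per_tb / WATTS_PER_KW


current_kw = rack_power_kw(100, 1.0)   # 100PB at today's ~1W per terabyte
target_kw = rack_power_kw(100, 0.1)    # 100PB at the 0.1W per terabyte goal
freed_kw = current_kw - target_kw      # power budget released for other uses

print(f"At 1W/TB:   {current_kw:.0f}kW")
print(f"At 0.1W/TB: {target_kw:.0f}kW")
print(f"Freed:      {freed_kw:.0f}kW")
```

Under these assumptions, a tenfold improvement in watts per terabyte cuts the storage power draw for the same capacity by 90%, which is the headroom McMullan describes as being available for GPUs, networking or CPU cores.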

Pure customers are also requesting the option of running Pure Storage devices on a power cap rather than a performance cap, said McMullan. The more input/output operations a flash device performs, the more power it consumes, so this can be seen as a quality-of-service measure, and it’s something Pure is looking at implementing over the next couple of years.

More generally, McMullan sees a focus on sustainability and the circular economy, observing that CO2 emissions don’t need a passport to move from one country to the next, and that sequestration is a long-term strategy that may or may not bear fruit.

“We have to take some corrective action on thermal profiles, on energy consumption and the way we use energy, and AI is a great example of that,” he said.

Other data infrastructure issues

Here are some other data infrastructure issues observed by McMullan that are particularly relevant to ANZ:

  • A realisation that high availability and disaster recovery sites probably should be separated from primary sites by hundreds rather than tens of kilometres due to seismic and volcanic activity.
  • Cultural awareness of the need for stewardship of land and other resources. “I think that’s not a lip service thing, it’s genuinely there as a more modern cultural approach,” said McMullan. “I’m not just talking about miners, I’m talking about technology deployment here in Australia, and in New Zealand, too. So, it was something that I hadn’t seen or heard before, but it is very much to the forefront now, in all the customer conversations I’m having.”
  • Localised temperatures are a challenge for cooling datacentres, and the more power that’s used for cooling, the less is available to do the real work. “It was 43°C in Perth when I was there last week. In Auckland, it was blowing a gale and it was almost 30°C at the same time. It’s 34°C in Brisbane today. Those are all temperatures that we’re not designing for.” That said, “we have a number of customers running Pure [Storage arrays with] inlet temps in the high 30s, and we’re fine with that. It doesn’t play for everybody, but clearly, we don’t have any moving parts apart from fans, so we’re good with it as long as there’s enough airflow.”
