
NasuniIQ brings visualisation of massive unstructured datasets

Global file system provider adds visualisation of massive distributed unstructured datasets to allow customers to analyse data usage and prepare clean training data for artificial intelligence

Nasuni NAS file storage customers will now be able to visualise multiple petabytes of massively distributed and heterogeneous unstructured data in Grafana-powered dashboards.

The functionality, called NasuniIQ, which already ships with the Nasuni File Data Platform, will allow customers to interrogate and visualise data (see image below) held in multiple locations and answer questions about the data, its usage, cost, and so on.

Nasuni also touts NasuniIQ functionality as key to curating datasets for artificial intelligence (AI) processing.

NasuniIQ gathers intelligence from multiple Nasuni edge file storage instances, which Nasuni says can run to many hundreds in as many locations, and builds a centralised view of file activity.

NasuniIQ can visualise events such as user access to files, as well as patterns of access across users and departments, by folder and location. Such information can reveal storage consumption patterns and performance issues.
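To make the idea of grouping file activity concrete, here is a minimal Python sketch that counts access events by department, and by folder and location. It is an illustration only: the event records, the field names and the `count_by` helper are all hypothetical, and none of this reflects NasuniIQ's actual schema or interfaces.

```python
from collections import Counter

# Hypothetical access events; the field names are illustrative assumptions,
# not NasuniIQ's schema.
events = [
    {"user": "alice", "department": "design", "folder": "/projects/a", "location": "London"},
    {"user": "bob", "department": "design", "folder": "/projects/a", "location": "Boston"},
    {"user": "carol", "department": "finance", "folder": "/reports", "location": "London"},
    {"user": "alice", "department": "design", "folder": "/projects/b", "location": "London"},
]


def count_by(records, *keys):
    """Count records grouped by the values of the given fields."""
    return Counter(tuple(record[key] for key in keys) for record in records)


# Access counts per department, and per (folder, location) pair.
by_department = count_by(events, "department")
by_folder_and_location = count_by(events, "folder", "location")
```

Counts of this kind are what a dashboard panel would typically plot; a real deployment would read events from the platform rather than from an in-memory list.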

Jim Liddle, chief innovation officer at Nasuni, cited companies that charge storage costs back to their departments as a key use case.

“Some companies have petabytes of data that’s distributed globally and in hundreds of edge filers, and they want a consistent view,” said Liddle.

“Companies accumulate unstructured data,” he added. “Just like we do on our personal laptops, but for an organisation, the scale is 200- or 300-fold, and it’s the job of those that manage data to tame it and understand it.”

NasuniIQ will allow customers to interrogate and visualise data held in multiple locations and answer questions about the data, its usage, cost, and so on

Nasuni is keen to emphasise the applicability of NasuniIQ to preparing AI training datasets.

“As companies get AI-ready, they need to curate datasets that are selected for being clean, for being frequently accessed,” said Liddle. “That’s frequently the case because companies want to use their own data for training sets, especially as those organisations that previously made data available are now locking it down in the wake of copyright cases.”

It’s also the case, said Liddle, that curation is important to ensure AI training data is of high quality.

“Unstructured data, as it grows, gets ‘dirty’,” said Liddle. “So you need to look at the files that are active, figure out what’s old and needs archiving, etc. Eliminating dirty data helps to reduce the possibility of hallucinations in outputs. AI is great, but it only works with good quality data behind it.”
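The triage Liddle describes, separating active files from those old enough to archive, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the file list, the `split_by_activity` helper and the 365-day idle threshold are all hypothetical and are not taken from Nasuni.

```python
from datetime import datetime, timedelta


def split_by_activity(files, now, max_idle_days=365):
    """Partition (path, last_accessed) pairs into active paths and archive candidates.

    The 365-day default is an arbitrary illustrative threshold.
    """
    cutoff = now - timedelta(days=max_idle_days)
    active, archive_candidates = [], []
    for path, last_accessed in files:
        if last_accessed >= cutoff:
            active.append(path)
        else:
            archive_candidates.append(path)
    return active, archive_candidates


# Hypothetical sample data, with a fixed "now" so the result is reproducible.
now = datetime(2024, 6, 1)
files = [
    ("/projects/a/plan.docx", datetime(2024, 5, 20)),
    ("/archive/old.xls", datetime(2019, 3, 1)),
    ("/reports/q1.pdf", datetime(2023, 7, 1)),
]
active, archive_candidates = split_by_activity(files, now)
```

Last-access time is only one possible signal of data quality; a real curation process would likely combine it with others, such as duplication or ownership.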

Nasuni provides a cloud-based global file system that can replace traditional network-attached storage (NAS). Customers interact with it like any other file system on the front end, but the data lives on cloud object storage such as AWS S3.


Nasuni bills itself as a cloud-native storage maker, with its global file system UniFS as the building block that integrates with back-end cloud object storage. It supports NAS and file server consolidation, backup and recovery, disaster recovery (DR) and collaboration tools.

Nasuni supports global file locking, so two users cannot write to a file simultaneously, and provides edge appliances for NFS and SMB services, as well as deduplication. Alternatively, users can run virtual machines on their own virtual or hyper-converged infrastructure in place of NAS hardware. Nasuni claims firms that deploy its global file system no longer need separate backup or DR.

NasuniIQ is already shipping with the Nasuni File Data Platform and does not incur any extra licence fee. Liddle said the company may develop a more advanced, charged-for version in future.

