Northwestern cuts storage costs by more than a third with Komprise

Komprise’s Deep Analytics helps university classify and migrate research data to cheaper disk and cloud as it cuts storage costs from $1.1m to $680,000 per annum

Chicago-based Northwestern University has deployed Komprise storage management, which has allowed it to classify large amounts of research data and migrate it off to less costly media. In the process, it has cut annual spending on its central storage pool from $1.1m to $680,000 by moving data from Isilon scale-out NAS to media that is up to 66% cheaper to buy and run.

Northwestern is very much a research-focussed university, and when faculty members initiate research, data goes first to a central pool. Research data holdings have extended to billions of files and many petabytes. They comprise unstructured data such as medical imaging, along with data from fields such as medicine, the natural sciences and music.

IT manager Kenneth-David Turner said that when he first joined the university, the big issues were storage capacity and classification. “We didn’t know what the data was, who owned it, when it was last used,” he said. “We needed to stay ahead of capacity demands and to be compliant – with HIPAA, for example.”

Turner said Northwestern’s central storage repository was getting bigger. “There was tons on disk, but it was impossible to categorise it,” he said. “That meant data was all held on SAS disks on the university’s EMC Isilon scale-out NAS storage.”

Turner’s team looked at various ways of interrogating the central repository, using things like TreeSize and writing their own scripts, but found no way to also migrate the data discovered.

Eventually, it deployed Komprise, which is a storage analytics tool that can index content and migrate it according to user policy. It is based on the open source Elasticsearch search engine.

Northwestern deployed Komprise “observers” and proxies at its datacentre with a director offsite at the supplier site.

Policies are set per workgroup, and these can differ by, for example, retention time and the exclusion of certain directories. Then, according to policy, data is moved off to cheaper media.
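As a rough illustration of how a per-workgroup policy of this kind might be expressed, the Python sketch below selects files for migration by retention time while skipping excluded directories. It is not Komprise’s implementation or API: the `FileRecord` and `WorkgroupPolicy` names, their fields and the example paths are all hypothetical, chosen only to mirror the two policy attributes mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileRecord:
    """Hypothetical metadata record for one file in the central pool."""
    path: str
    size_bytes: int
    days_since_access: int


@dataclass
class WorkgroupPolicy:
    """Hypothetical per-workgroup policy: a retention time plus excluded directories."""
    name: str
    retention_days: int
    excluded_dirs: List[str] = field(default_factory=list)

    def is_excluded(self, path: str) -> bool:
        # A file is excluded if it sits inside any excluded directory.
        for directory in self.excluded_dirs:
            prefix = directory.rstrip("/") + "/"
            if path.startswith(prefix):
                return True
        return False

    def select_for_migration(self, files: List[FileRecord]) -> List[FileRecord]:
        # Keep only files untouched for longer than the retention time
        # that are not under an excluded directory.
        return [
            f for f in files
            if f.days_since_access > self.retention_days
            and not self.is_excluded(f.path)
        ]


if __name__ == "__main__":
    policy = WorkgroupPolicy(
        name="imaging",
        retention_days=365,
        excluded_dirs=["/research/imaging/active"],
    )
    files = [
        FileRecord("/research/imaging/2015/scan1.dcm", 2_000_000, 900),
        FileRecord("/research/imaging/active/scan2.dcm", 2_000_000, 900),
        FileRecord("/research/imaging/2023/scan3.dcm", 2_000_000, 10),
    ]
    for record in policy.select_for_migration(files):
        print(record.path)
```

In this sketch, each workgroup would carry its own `WorkgroupPolicy` instance, so one group could keep data on primary storage for longer than another or protect a working directory from migration.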

This has seen data move from SAS disk on Isilon to 12TB SAS on EMC ECS – Elastic Cloud Storage, a software-defined object storage product – as well as to AWS Glacier Deep Archive cloud storage.

Overall, storage can be up to two-thirds cheaper than keeping data on Isilon, said Turner.

Komprise’s Deep Analytics functionality allows the customer to query its unstructured data – which could be in the petabytes – and create a distinct virtual data set from disparate physical data storage.

Customers are able to set policies to categorise data by when last accessed, who accessed it, type of data, etc, and to project how much data can be moved off existing storage. Komprise also migrates to other physical storage, and can do so between protocols and to the cloud, with a link left behind for local access.
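The idea of projecting how much data could be moved can be sketched with a short Python script that walks a directory tree and totals bytes by last-access age. This is a simplified, single-machine illustration, not how Komprise works internally. The function name `summarise_by_access_age` and the 365-day threshold are assumptions for the example, and last-access times are only as reliable as the file system’s mount options (for instance, `noatime`) allow.

```python
import os
import time


def summarise_by_access_age(root, cold_after_days=365, now=None):
    """Total file sizes under `root`, split by whether last access is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * 86400  # seconds in a day
    totals = {"hot_bytes": 0, "cold_bytes": 0}

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                info = os.stat(full_path, follow_symlinks=False)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            key = "cold_bytes" if info.st_atime < cutoff else "hot_bytes"
            totals[key] += info.st_size

    total = totals["hot_bytes"] + totals["cold_bytes"]
    # Share of capacity that a policy with this cutoff could move off primary storage.
    totals["cold_fraction"] = totals["cold_bytes"] / total if total else 0.0
    return totals


if __name__ == "__main__":
    summary = summarise_by_access_age(".", cold_after_days=365)
    print(summary)
```

A scan like this only reports; as Turner’s team found with its own scripts, discovering cold data and actually migrating it are separate problems.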

According to chief operating officer Krishna Subramanian, Komprise can start to show results on “petabytes of data in 15 minutes”.

“Komprise is designed for massive scale,” she said. “It interrogates file metadata with very efficient distributed computing techniques to give a very low overhead, and builds its results into a nontraditional database.”
