Data management software company Komprise plans to launch its Deep Analytics functionality “in the next few weeks”, which will allow customers to query entire enterprise unstructured data sets for analytics processing in Hadoop, on the Amazon cloud and other clouds.
The functionality allows the customer to query its unstructured data – which could be in the petabytes – and create a distinct virtual data set from disparate physical data storage.
Deep Analytics is currently in beta with some enterprise customers.
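As a rough illustration of what a "virtual data set" could look like, the sketch below filters a metadata index that spans several storage systems and returns a manifest of references instead of copying any files. It is not Komprise's implementation or API: the `FileRecord` fields, the `build_virtual_dataset` function and the share names are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FileRecord:
    """One metadata entry; every field name here is an assumption."""
    source: str       # hypothetical share or bucket name
    path: str         # location of the file within that source
    size_bytes: int
    extension: str
    owner: str


def build_virtual_dataset(
    index: Iterable[FileRecord],
    predicate: Callable[[FileRecord], bool],
) -> List[str]:
    """Return references to matching files; no data is moved or copied."""
    return [f"{rec.source}:{rec.path}" for rec in index if predicate(rec)]


# Hypothetical index covering two NAS shares and one object store.
index = [
    FileRecord("nas-01", "/projects/a/scan1.dcm", 5_000_000, ".dcm", "alice"),
    FileRecord("s3-archive", "imaging/scan2.dcm", 7_000_000, ".dcm", "bob"),
    FileRecord("nas-02", "/home/bob/notes.txt", 2_000, ".txt", "bob"),
]

# Query: every DICOM image, wherever it physically lives.
dataset = build_virtual_dataset(index, lambda rec: rec.extension == ".dcm")
```

The resulting manifest is what an analytics job would be pointed at, while the underlying files stay on their original storage.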
Komprise’s chief thrust is to provide software that allows customers to interrogate very large, enterprise-scale data sets and categorise them, primarily in terms of usage, allowing them to be migrated to the most suitable storage tier.
It cites, for example, UK-based property company CBRE, which was able to move 70% of its inactive unstructured data from primary storage by using Komprise. CBRE had found itself in the difficult position of being unable to put a new backup product in place because its data on primary storage had grown too large in volume.
Komprise runs on high-availability (HA) pairs of virtual machines (VMs) called “observers”, each of which can handle up to a few petabytes. The observers connect to NAS and object storage via NFS, SMB and S3 to categorise the metadata of potentially billions of files across multiple shares. There is also a “director” management console.
Customers can set policies to categorise data according to when last accessed, who accessed it, type of data, and so on, and then project how much data can be moved off existing storage. Komprise also carries out migration to other physical storage, including between protocols and also to the cloud, with a link left behind for local access.
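A last-accessed policy of this kind, and the projection of how much data it would free up, can be sketched in a few lines. The example below is a simplified, single-machine illustration that walks a directory tree and reads file timestamps; it is not how Komprise works internally. The names `project_cold_data` and `TierReport` and the idle threshold are assumptions, and the sketch relies on the file system recording access times, which some mount options disable.

```python
import os
import time
from dataclasses import dataclass
from typing import Optional

SECONDS_PER_DAY = 86_400


@dataclass
class TierReport:
    """Running totals for one scan (hypothetical structure)."""
    cold_bytes: int = 0
    total_bytes: int = 0

    @property
    def cold_fraction(self) -> float:
        # Share of scanned bytes the policy would move off primary storage.
        return self.cold_bytes / self.total_bytes if self.total_bytes else 0.0


def project_cold_data(root: str, max_idle_days: int,
                      now: Optional[float] = None) -> TierReport:
    """Total up files under `root` not accessed within `max_idle_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    report = TierReport()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            report.total_bytes += st.st_size
            if st.st_atime < cutoff:
                report.cold_bytes += st.st_size
    return report
```

Calling `project_cold_data("/mnt/share", max_idle_days=365)` on a hypothetical mount point would report how many bytes have sat untouched for a year, which is the kind of figure behind a claim such as CBRE's 70%.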
According to COO Krishna Subramanian, Komprise can start to show results on “petabytes of data within 15 minutes”.
“Komprise is designed for massive scale,” she said. “It interrogates file metadata with very efficient distributed computing techniques to give a very low overhead and builds its results into a non-traditional database.”
According to Subramanian, the company has about 150 customers, with some running Komprise on nearly 100PB of data.
Komprise supports migration to all three main cloud providers – AWS, Microsoft Azure and Google Cloud Platform.