IBM has publicised plans for so-called “cognitive storage” that will help IT systems determine which information is most important and which should be assigned a lower priority, particularly in big data scenarios.
It envisages cognitive storage cutting costs by helping to decide automatically what data should reside on which class of media, what levels of data protection should apply and what policies should be set for the retention and lifecycle of different classes of data.
IBM’s Zurich lab publicised the project this month in an article in the IEEE magazine Computer, in which it discusses its work with the Square Kilometre Array, an astronomy project whose radio telescopes will have a total collecting area of around one square kilometre (one million square metres) and which will generate up to 1PB of data a day.
In the project, IBM faces the challenge of building storage systems that can handle such a large volume of data but keep costs low.
“But we thought about treating data as the human brain does – where there is limited storage space and you don’t store all information,” said IBM Zurich researcher Giovanni Cherubini.
Cherubini said cognitive storage aims to assess the value of data and apply that assessment to storage, with class of media, level of data protection, and so on determined by the importance of the data.
Where storage currently comprises the media, data protection, access controls, and so on, cognitive storage will add a “learning system” that interrogates metadata, analyses access patterns and learns from the changing context of data use to help it assign value.
“Administrators would help train the learning system by providing sample files and labelling types of data as having different value,” said IBM researcher Vinodh Venkatesan.
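The training approach Venkatesan describes – administrators labelling sample files, and the system then classifying new files by their metadata and access patterns – can be sketched very roughly as a supervised classifier. Everything below is a hypothetical illustration: the feature names, the sample files and the nearest-centroid method are assumptions for the sketch, not IBM’s actual implementation.

```python
# Hypothetical sketch of a "learning system" that assigns a value class to a
# file from its metadata, trained on admin-labelled samples. Features and
# numbers are invented for illustration.
from dataclasses import dataclass
from math import dist

@dataclass
class FileMeta:
    accesses_per_day: float   # hypothetical access-pattern feature
    age_days: float           # hypothetical metadata feature

def centroid(samples):
    n = len(samples)
    return (sum(s.accesses_per_day for s in samples) / n,
            sum(s.age_days for s in samples) / n)

def train(labelled):
    """labelled: dict mapping a value class to admin-provided sample files."""
    return {label: centroid(samples) for label, samples in labelled.items()}

def classify(model, f):
    """Assign the value class whose training centroid is nearest."""
    point = (f.accesses_per_day, f.age_days)
    return min(model, key=lambda label: dist(model[label], point))

# Admin-labelled training samples (invented values).
model = train({
    "high-value": [FileMeta(50, 10), FileMeta(80, 5)],
    "low-value":  [FileMeta(0.1, 900), FileMeta(0.5, 700)],
})
print(classify(model, FileMeta(60, 8)))     # frequently accessed, recent file
print(classify(model, FileMeta(0.2, 800)))  # rarely touched archive data
```

In a real system the value class would then drive placement – which tier of media the file lives on, what protection and retention policies apply – as described above.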
But would cognitive storage be useful to enterprises? Astronomical data is – despite potentially being of very large volume – relatively limited in terms of type of data handled, whereas enterprise data ranges from business-critical transactional data to emails, machine sensor data, and so on.
“For an enterprise, there are ‘must keep’ classes of data and these could be set to be of permanently high value,” said Venkatesan. “But that is a small proportion in an enterprise. The rest, the majority, which cannot necessarily be manually set, can be handled by cognitive storage – such as big data-type information and sensor information that might have value if analysed.”
IBM’s cognitive storage project is currently at the prototype stage. It will be used on the Square Kilometre Array for predictive caching, in which it will attempt to forecast the sets of data that astronomical researchers will want to download from the overall data set, and save them time by staging that data in advance.
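One way to picture the predictive caching described above is a simple frequency model over past download sequences: learn which data set is typically requested after another, and pre-stage the most likely successor. The data set names and download log below are invented, and this transition-counting approach is an assumption for the sketch, not IBM’s actual algorithm.

```python
# Hypothetical predictive-caching sketch: learn dataset-to-dataset transition
# counts from a download log and predict the most likely next request.
from collections import Counter, defaultdict

def build_model(history):
    """Count dataset -> next-dataset transitions from a download log."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        transitions[prev][nxt] += 1
    return transitions

def predict_next(transitions, current):
    """Return the most frequently observed successor, or None if unseen."""
    followers = transitions.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Invented researcher download log.
log = ["survey-A", "survey-B", "survey-A", "survey-B", "survey-A", "calib-1"]
model = build_model(log)
print(predict_next(model, "survey-A"))  # "survey-B" follows more often than "calib-1"
```

A cache manager could use such a prediction to copy the likely-next data set onto fast media before the researcher asks for it.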
IBM is also looking for partners for beta testing in enterprise and other environments. Functionality will ultimately be provided in hardware/software product form as well as via the cloud, said Venkatesan.