Flash vendors told us that the hierarchy in data would end because the sheer speed of the technology meant that all data could be treated equally. That hasn't quite happened.
In reality, as different types of enterprise all-flash array emerged, it became obvious that some were faster than others, so some form of tiering was still required. This was simple to arrange while the all-flash arrays (AFAs) sat outside the computing process: all the competing types of storage could be virtualised and held as resource pools, exactly as had always been done with spinning disk storage.
Then some vendors started pushing PCIe persistent memory. There was an anomaly here too, according to analyst and IT historian Clive Longbottom, senior researcher at Quocirca.
“The problem is that PCIe is server-side and dependent on the server it is tied to,” says Longbottom. PernixData promised an interesting way of virtualising such storage to create availability, but it was acquired before it could sort out the market.
Next came M.2 memory cards, then NVMe storage, all of which only confused people further. Then came a blizzard of acronyms: NVDIMMs, which put non-volatile storage on the lightning-fast bus of dual inline memory modules (DIMMs). “Even here, we have massive disparities in performance between the various offerings,” says Longbottom. SLC/MLC flash, for example, is a long way behind 3D XPoint/Optane in performance.
The old storage class system had the SAN as Tier 1, NAS as Tier 2 and the archive as Tier 3. Now the hierarchy runs: Tier 1, NVDIMM XPoint; Tier 2, NVDIMM SLC/MLC; Tier 3, M.2/PCIe; Tier 4, AFA virtual SAN; Tier 5, AFA physical SAN; Tier 6, AFA NAS. At the very bottom of the hierarchy sit AWS Glacier and tape systems.
“So many tiers bring with it so many fears,” says Longbottom. “Tiering is not easy. In theory the data that is needed at any time gets stored in the fastest possible storage system. But it’s not that easy once you’re faced with today’s big data issues.”
You cannot afford to hold a petabyte database (or mix of databases) entirely in NVDIMM. It is stupidly expensive, a DIMM-sum game, says Longbottom.
But if you could split the data across multiple tiers in, say, some kind of hierarchical storage system, that would keep the CFO happy. It would work like this, says Longbottom: an analytics engine requests an item of data; the NVDIMM tier acknowledges the request, searches, confirms that it holds the requisite item and releases it. “That way you get the benefits of the fastest possible speed,” says Longbottom.
In a relatively dumb tiering system, however, the request for an item of data would be passed around the storage system several times - from NVDIMM to the PCIe memory system, through the SANs and finally to the archive - until one of the storage tiers actually has the data.
The problem is that this system could end up even slower than the old single, lower-tier storage system once requests have been passed around the entire chain, as 90% of them will be. (Don’t we all know that feeling of calling an enterprise and being passed from pillar to post?)
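The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's real implementation: the tier names, dictionaries and hop counts are all assumptions made up for the example.

```python
def dumb_lookup(key, tiers_in_order):
    """'Dumb' tiering: ask each tier in turn until one has the data.
    The cost (hops) grows with every tier that misses."""
    hops = 0
    for tier_name, store in tiers_in_order:
        hops += 1
        if key in store:
            return store[key], hops
    return None, hops

def indexed_lookup(key, metadata_index, tiers_by_name):
    """'Intelligent' tiering: a metadata index records which tier holds
    each item, so exactly one tier is ever asked."""
    tier_name = metadata_index.get(key)
    if tier_name is None:
        return None, 0
    return tiers_by_name[tier_name][key], 1

# Illustrative four-tier hierarchy with one hot and one cold item.
tiers_in_order = [
    ("NVDIMM", {"hot": "hot-data"}),
    ("PCIe", {}),
    ("SAN", {}),
    ("Archive", {"cold": "cold-data"}),
]
tiers_by_name = dict(tiers_in_order)
metadata_index = {"hot": "NVDIMM", "cold": "Archive"}

# Cold data: the dumb lookup walks all four tiers; the indexed lookup asks one.
print(dumb_lookup("cold", tiers_in_order))               # ('cold-data', 4)
print(indexed_lookup("cold", metadata_index, tiers_by_name))  # ('cold-data', 1)
```

The catch, as the article goes on to note, is that the metadata index itself must live somewhere very fast, which is exactly where the cost returns.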
Intelligent tiering is the answer, says Longbottom. But in order for that to work, it needs to be powered by volatile memory – which is even more expensive.
Symbolic IO says it has the solution: battery-backed volatile DIMM storage and highly intelligent data compression to hold the metadata in place. It concentrates on appliance-based data handling systems and this, says Longbottom, “plays beautifully into the tiering issue. Blazing data speeds can be achieved across multiple different storage tiers.” All because they’ve intelligently compressed the metadata in battery-backed volatile stores.
So in other words, we’re now in the age of Big Meta Data. Even the data about the data is becoming a problem. Where will it all end? Are we generating too much information? Could that be the problem? Couldn’t we dump it?
Maybe GDPR will make us rethink how we generate all this corporate SAN-fill.