
Storage arrays face two-pronged flash/cloud attack

The storage array as we know it could soon be a thing of the past, as hot data heads to flash and cold data goes to the cloud, leaving a slimmer array for bulk data on HDDs.

In this commentary, storage expert Chris Mellor predicts that storage arrays of the future will have a single tier of capacity and simplified controller software. 

Storage arrays are under assault. Hot data is heading towards flash in servers. Cold data is going to the cloud. All of that will result in, hopefully, a slimmer, fitter, leaner, meaner storage array.

In a world without this twin-pronged flash/cloud assault, the typical midrange and high-end storage array is a multi-trick pony, storing data across the full range of access rates: hot (high access), warm (medium access), nearline (low access) and archival (nearly no access), unless that cold data is pumped off to tape or to an array with spin-down disks.

The flash side of the flash/cloud axis comes in a number of variants.

With the arrival of server flash, pioneered by Fusion-io and now being enthusiastically adopted by EMC, Kaminario, LSI, Micron, OCZ, SanDisk, Texas Memory Systems (TMS), Virident and others, it's becoming plain to see that the hottest data with the highest access rates will be cached in servers and then stored in all-flash arrays close to the servers.

All-flash array vendors like Nimbus Data, Pure Storage and Whiptail now sell such arrays. They’re high-performance flash storage silos and are gradually gaining data management and back-end storage array interoperability features.

Hybrid arrays, using SSDs and hard disk drives, offer a halfway house, combining flash speed and HDD capacity, with products from Tegile, Tintri, NexGen and others, and appear to be a great fit for small and medium-sized enterprise needs.

The trouble with existing storage arrays, even after they have had a flash infusion through SSD shelves or controller flash caching, is that they put a flash bandage on a disk drive legacy. We all know a ground-up redesign will offer a better product in the long term, and that's what the startups are counting on.

NetApp is going with a server-to-array flash caching strategy, saying it's simpler and more elegant. EMC duplicates, broadly speaking, NetApp's caching ideas but is adding an all-flash storage array using acquired XtremIO technology. HDS is going the same way with its new flash controller technology, and IBM recently bought TMS. Enthusiastic and excitable rumours are springing forth from Big Blue, mentioning the SAN Volume Controller, Storwize V7000 and TMS flash technology. 

Dell has added flash storage to its iSCSI EqualLogic arrays, and HP has a flashed version of its 3PAR arrays. It seems abundantly clear that hot data will migrate to flash, meaning disk drive arrays will have less and less need for fast Fibre Channel drives, whose main purpose was to deliver data faster than capacity-optimised drives could. Even 10,000 rpm SAS drives may come under threat if flash costs fall far enough.

The other prong of the flash/cloud axis is typified by the likes of Amazon's Glacier, an ultra-cheap cloud archive service. It appears to be built on an object storage technology scheme, layered on top of customised drives that can be spun down to save power and cooling costs.

Other cloud storage providers will follow suit. It's conceivable that automated data tiering schemes in HDD arrays, such as EMC's FAST and IBM's Easy Tier, could be extended to add cloud data archiving to their capabilities.

The net result will be networked storage arrays holding the medium-access data only. In NetApp's view they will need only one class of disk drive: 7,200 rpm, 3.5-inch, bulk capacity SATA drives, which are currently beginning to offer 4 TB of capacity. You won't need to add spindles to get performance because flash storage takes care of that. Nor will you need to add spin-down disks for archiving since the cloud will take care of that.

The classic modular, dual-controller, tiered high-end arrays will evolve into single-tier entities, with software that automatically moves data out to a flash performance tier and a cloud archive tier. That software may well migrate into hypervisors such as VMware's vSphere and Microsoft's Hyper-V, leaving the storage array in the middle ground, with a single tier of capacity storage and much simplified controller software.
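In outline, the tiering policy described here places data by observed access rate. The following sketch illustrates the idea only; the thresholds, tier names and `Extent` type are hypothetical, not any vendor's actual FAST or Easy Tier implementation.

```python
# Illustrative sketch of access-rate-based tiering: hot data to flash,
# medium-access data to the array's bulk SATA tier, cold data to a
# cloud archive. All names and thresholds are assumptions for the
# example, not a real product's policy.

from dataclasses import dataclass

@dataclass
class Extent:
    name: str
    reads_per_day: float   # observed access rate for this extent

# Hypothetical access-rate thresholds (reads/day) separating the tiers.
HOT_THRESHOLD = 100.0    # at or above this: server/array flash
COLD_THRESHOLD = 1.0     # below this: cloud archive (Glacier-style)

def place(extent: Extent) -> str:
    """Return the tier an extent should live on."""
    if extent.reads_per_day >= HOT_THRESHOLD:
        return "flash"
    if extent.reads_per_day < COLD_THRESHOLD:
        return "cloud-archive"
    return "sata-capacity"   # the array's single remaining tier

if __name__ == "__main__":
    workload = [
        Extent("db-index", 5000),
        Extent("home-dirs", 20),
        Extent("2009-backups", 0.01),
    ]
    for e in workload:
        print(f"{e.name}: {place(e)}")
```

A real implementation would measure access rates over a sliding window and migrate extents asynchronously, but the decision logic reduces to a classification like this one.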

All of which raises the question: Will we still need Fibre Channel and still be engaged in debates about pure Fibre Channel and Fibre Channel over Ethernet (FCoE)?

Both networks add some degree of network latency to data access from a shared array. The all-flash arrays, with their emphasis on speed, could start using InfiniBand or 40 Gbps Ethernet and leave pure Fibre Channel behind.

FCoE adoption is hampered by Ethernet standards that are still being developed and implemented, meaning FCoE usage could fail to meet its potential. Also, 10 Gigabit Ethernet can be used for iSCSI at speeds that match or beat Fibre Channel, so a migration of hot data to flash could weaken the case for pure Fibre Channel, leaving iSCSI as the dominant networked flash storage data access protocol.

The classic SAN storage field is being upended as assaults from flash and the cloud bring about a simplification of its infrastructure and role, and it becomes more of a one-trick, medium-data-access-rate pony.

Chris Mellor is storage editor of The Register.
