At EMC World in Las Vegas, the Massachusetts-based storage giant has unveiled a software-defined storage platform that will allow customers to build hyperscale datacentres along the lines of those pioneered by Facebook and Google.
The platform, called ViPR, is based on EMC’s Project Bourne. It will allow customers to build large, single pools of storage with two possible levels of control, and it includes big data capabilities. EMC bills it as a way for customers to build web-scale software-defined storage without needing an army of PhD-level technicians.
On the one hand, customers can choose to use ViPR’s so-called control level, which will unify the management and provisioning of storage across heterogeneous arrays and media from one screen. This could mean arrays from EMC or third-party storage vendors, or even commodity hardware. Where possible, ViPR will allow intelligence native to the storage under this umbrella to carry out management functions.
On the other hand customers can make use of deeper controls in the so-called data plane. Here, ViPR uses its own Object Data Services that can be accessed via Amazon S3 or HDFS (Hadoop Distributed File System) APIs to enable management and analytics of data where it resides on the storage media.
ViPR – which will be available from the second half of 2013 – will integrate via APIs with VMware’s Software Defined Data Center and work with Microsoft and the OpenStack operating environments.
ViPR is EMC’s response to several trends. Firstly, there is the rise to prominence of new hyperscale computing environments. These have been pioneered by the likes of Facebook, Google, Amazon and Apple, and involve vast datacentres built on cheap commodity server hardware, with redundancy at the level of the server/storage instance rather than the components within it, as is the case with enterprise storage and its use of dual controllers, RAID and so on.
ViPR offers the user the ability to virtualise storage beneath it to provide a common pool of capacity manageable from one screen. Storage virtualisation like this is not new, but a key difference is that ViPR can allow underlying arrays to use their own in-built management functions.
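The idea of pooling capacity while letting arrays keep their own management functions can be sketched in a few lines of Python. This is purely illustrative and is not ViPR’s API: the `Array` and `ControlPlane` classes, the array names and the `snapshot` feature flag are all hypothetical, chosen only to show a controller that presents one pool and defers to an array’s native capability where it exists.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Hypothetical stand-in for a storage array registered with the controller."""
    name: str
    free_gb: int
    native_features: set = field(default_factory=set)


class ControlPlane:
    """Toy controller that pools capacity and routes operations (illustrative only)."""

    def __init__(self, arrays):
        self.arrays = list(arrays)

    def pooled_free_gb(self):
        # Present every array's free capacity as a single pool.
        return sum(a.free_gb for a in self.arrays)

    def provision(self, size_gb):
        # Place the volume on whichever array has the most free space.
        target = max(self.arrays, key=lambda a: a.free_gb)
        if target.free_gb < size_gb:
            raise ValueError("no single array can hold the volume")
        target.free_gb -= size_gb
        return target.name

    def snapshot(self, array_name):
        # Prefer the array's built-in snapshot; otherwise the controller does it.
        array = next(a for a in self.arrays if a.name == array_name)
        return "native" if "snapshot" in array.native_features else "controller"


plane = ControlPlane([Array("vendor-a", 500, {"snapshot"}), Array("commodity-1", 200)])
```

In this sketch the commodity box has no snapshot feature of its own, so the controller would have to supply one, while the vendor array is left to use its own.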
Here, EMC is providing what is becoming known as software-defined storage. This sees the unbundling of storage hardware, primarily the media, from the software intelligence that manages storage operations, such as cache, I/O etc.
ViPR is billed as a scale-out architecture that can operate on a distributed basis. This means that data need not be retained in a single enterprise array, that migration between arrays is not strictly necessary to achieve service levels, and that analytics can be run across the entire storage environment.
Then there is big data, in which businesses run real-time or near real-time analytics on huge data sets to glean information such as customer preferences from their web activity. ViPR allows organisations to run analytics “in place” on the storage infrastructure via the HDFS Data Service.
Alongside this, there is the growing tendency of businesses, or their perceived need, to operate on an internal cloud model. In such a scenario, application teams can select service levels from catalogues and provision their compute and storage requirements on a self-service basis. This functionality will form part of ViPR.
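A self-service catalogue of this kind might look something like the following minimal sketch. The tier names, their attributes and the `provision_storage` function are hypothetical and are not taken from EMC; the point is only that a team picks a service level by name and the request is filled in from the catalogue entry.

```python
# Hypothetical service catalogue: tier names and attributes are invented for illustration.
CATALOGUE = {
    "gold": {"media": "flash", "replicas": 3},
    "bronze": {"media": "disk", "replicas": 1},
}


def provision_storage(team, tier, size_gb):
    """Build a provisioning request from a catalogue tier chosen by the team."""
    if tier not in CATALOGUE:
        raise KeyError(f"unknown service level: {tier}")
    # Copy the tier's attributes into the request alongside the team's own choices.
    return {"team": team, "tier": tier, "size_gb": size_gb, **CATALOGUE[tier]}


request = provision_storage("web-analytics", "gold", 250)
```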
ViPR is built on an object storage content addressing system. Object storage does away with the traditional tree-like file system and gives each data object a unique identifier in a flat address space, with objects looked up by that identifier much as a domain name is resolved through the internet’s Domain Name System (DNS). The lack of a hierarchical structure sidesteps a processing overhead that can become onerous in very large storage systems.
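To make the flat, identifier-based model concrete, here is a minimal sketch of a content-addressed object store in Python. It assumes the identifier is a SHA-256 hash of the object’s bytes, which is one common way to do content addressing and not necessarily how ViPR derives its identifiers; the `FlatObjectStore` class is hypothetical.

```python
import hashlib


class FlatObjectStore:
    """Toy content-addressed store: one flat mapping, no directory tree."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption: the object's identifier is the SHA-256 digest of its content.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        # A single lookup by identifier; there is no path to walk.
        return self._objects[object_id]


store = FlatObjectStore()
object_id = store.put(b"customer clickstream batch")
```

Because retrieval is a single lookup by identifier, there are no nested directories to traverse, which is the overhead the flat model avoids.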