Software-defined storage: Making sense of the data storage technology
My last column highlighted three key trends we expect to drive the enterprise storage agenda in 2013. One of these was software-defined storage (SDS), a term that has spread like wildfire in a few short months.
Indeed, such is the interest in – or perhaps exploitation of – SDS that seemingly every storage supplier now offers a software-defined approach. So, to cut through the hype and identify what substance, if any, exists, here is my attempt to explain what we at 451 Research think is going on.
Over the past 20 years, storage systems have largely been defined by hardware, and their features have mainly been delivered through it. So once a system had reached capacity or maxed out on performance, another had to be rolled in alongside – and managed entirely separately – or the customer had to go through a painful upgrade and migration, sometimes both.
Another consequence of this hardware-centric approach has been that there has never been such a thing as a single storage infrastructure; the storage systems landscape has been fragmented from day one. Transaction-centric block storage area network (SAN) systems were entirely separate from file-centric network attached storage (NAS) systems; backup and disaster recovery required another set of systems and processes – and each of these silos was optimised for low-end, mid-range or high-end requirements.
Silos are managed separately, and never the twain shall meet. The result has been that the storage infrastructure – already an island of complexity in its own right – has become a major drain on resources. Managing the storage environment – tasks such as provisioning, capacity planning and migrations – does not deliver any value to the business and, as data volumes grow, eats up an increasing proportion of the IT budget.
In response to this, recent months have seen a surge of activity from start-ups and larger suppliers that advances the notion of software-defined storage.
Clearly, it was no coincidence that the emergence of the term came shortly after VMware splashed out $1.2bn for software-defined networking specialist Nicira, and subsequently rebranded its entire strategy as enabling the software-defined datacentre.
As in so many other parts of the industry, Moore's Law sits at the heart of this shift.
Of course, storage systems are fundamentally made up of many disk drives sourced from a small number of suppliers. Additionally, many storage systems now use standard Intel chips, rather than proprietary ASICs, for much of the core processing required of a storage system, such as Raid, I/O management and data management. This means most storage systems now run on commodity hardware.
In some quarters, this has had a liberating effect on storage architectures, since no longer being hardware-bound has enabled storage designs to become much more flexible.
When servers and storage systems are essentially made up of the same components, the distinctions between the two start to break down. This means a storage system is essentially defined by the software that runs on it. This, in a nutshell, is at the core of the notion of software-defined storage.
In recent years, we have seen something of a return to direct-attached storage (DAS). This is starting to reverberate around the industry, and there are signs that storage buyers are interested.
There are several catalysts for this beyond the use of commodity hardware. Some storage suppliers are able to run their storage platform in the hypervisor as a virtual storage appliance (VSA). This means it can run directly on a host, or even across multiple physical hosts in a cluster, taking advantage of local storage and at the same time providing some enterprise-class features, such as snapshots and thin provisioning, but at a dramatically lower price point.
Additionally, the emergence of flash-based solid-state storage means IT managers can deliver blazing-fast performance by adding a little flash into the server, either as part of a VSA or not.
Besides enabling storage "personalities" to be applied to what, essentially, are servers, the use of commodity hardware is also acting as the basis for scale-out storage architectures.
This facilitates a software-defined approach: storage systems can be designed and configured to meet today's requirements – important in a budget-constrained era – with performance and capacity tiers seamlessly added as requirements grow over time, often independently of one another.
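To make the idea of growing performance and capacity independently a little more concrete, here is a minimal sketch in Python. The class names, node names and figures are all invented for illustration, and it assumes that capacity and throughput simply add up as nodes join, which real scale-out systems only approximate.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One commodity server contributing storage to the cluster (hypothetical model)."""
    name: str
    capacity_tb: float
    iops: int


@dataclass
class ScaleOutCluster:
    """Toy model: cluster totals are the sum of what each node contributes."""
    nodes: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes.append(node)

    @property
    def capacity_tb(self) -> float:
        return sum(n.capacity_tb for n in self.nodes)

    @property
    def iops(self) -> int:
        return sum(n.iops for n in self.nodes)


# Start with what today's requirements justify (illustrative numbers only).
cluster = ScaleOutCluster()
cluster.add_node(Node("base-1", capacity_tb=20, iops=5_000))

# Later, grow capacity without buying much performance...
cluster.add_node(Node("capacity-1", capacity_tb=100, iops=1_000))

# ...or grow performance with a small flash-heavy node.
cluster.add_node(Node("flash-1", capacity_tb=2, iops=100_000))

print(cluster.capacity_tb, cluster.iops)
```

The point of the sketch is only that each purchase moves one dimension at a time, with no migration between separate systems.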
The appeal of scale-out storage is huge when you consider the time and effort spent by storage professionals on migrating data between systems. It is also a key requirement when building a large-scale cloud environment, as the likes of Amazon have demonstrated.
And scale-out capability is at the core of many of the new storage software architectures now entering the market. Whether this software is delivered standalone for buyers and channel partners to integrate themselves, or delivered as a finely-tuned package on commodity hardware, scale-out represents the future of enterprise storage.
Another key emerging theme is open source storage software. Encouraged by the commoditisation of storage hardware, and certainly motivated by the new economic imperative to do more with less, there has been a surge in activity in open source storage in recent months.
While it is unlikely that open source storage is going to transform the storage industry overnight – after all, storage strategy is still dominated by conservative, risk-averse thinking, and for good reason – there is already plenty of momentum in areas where open source may offer adequate performance and functionality at a much better price than traditional approaches.
Another attack point is in the cloud world, where service providers offering storage as a service have turned to open source storage in a bid to reach price points that come close to competing with the economies of scale enjoyed by cloud giants such as Amazon.
This is a particularly active space right now, especially from an object storage perspective. The main area of interest is the OpenStack-based efforts from the likes of Rackspace, HP and Dell. Meanwhile, the likes of Basho Technologies and DreamHost spin-off Inktank (with Ceph) are also lining up open source object-storage stacks that can underpin cost-effective, large-scale cloud storage services, and even potentially enhance or replace the Swift storage element of OpenStack.
Many other object storage suppliers are considering heading down the open source route, so activity here is likely to increase. Nonetheless, while open source storage may have its niche in smaller businesses and service providers, it has yet to penetrate medium-sized and large enterprises in a meaningful way.
451 Research is working closely with the team from TheInfoPro (a service of 451 Research) to ascertain whether there is any appetite among mainstream storage professionals for this direction.
Another important change that lends itself nicely to software-defined storage is unified, or multiprotocol, storage. Systems that support NAS, SAN and even object storage protocols are increasingly becoming the norm, at least in the mid-range.
Ultimately, this means IT managers can design and implement systems that meet their overall storage requirements without having to second-guess how their file and block needs will develop (although the degrees of unification vary quite broadly, depending on the supplier).
At the high end, it is still very much separate worlds for files and blocks, but this is likely to change over the longer term for all but the most mission-critical applications.
It is important that administrators and managers understand that different types of storage configuration will deliver different performance levels, but it is time they were able to choose between them without having to worry about which protocols and other facets their storage systems do or do not support.
The cloud model has taught us that there is a much simpler way of storage provisioning: via APIs. Traditional storage provisioning is a time-intensive and complex chore, though it has become more efficient in recent years through techniques such as thin provisioning.
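For readers less familiar with thin provisioning, the following Python sketch shows the basic idea: a volume advertises a large logical size, but physical space is consumed only when a block is actually written. The `ThinVolume` class and its block size are hypothetical and not modelled on any particular product.

```python
class ThinVolume:
    """Toy thin-provisioned volume: blocks are allocated on first write."""

    def __init__(self, logical_size_blocks: int, block_size: int = 4096):
        self.logical_size_blocks = logical_size_blocks
        self.block_size = block_size
        self._blocks = {}  # block index -> bytes, populated only on write

    def _check(self, index: int) -> None:
        if not 0 <= index < self.logical_size_blocks:
            raise IndexError("block index outside the volume")

    def write(self, index: int, data: bytes) -> None:
        self._check(index)
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        # Pad to a full block so every allocated block has the same size.
        self._blocks[index] = data.ljust(self.block_size, b"\0")

    def read(self, index: int) -> bytes:
        self._check(index)
        # Blocks that were never written read back as zeros and cost nothing.
        return self._blocks.get(index, b"\0" * self.block_size)

    @property
    def logical_bytes(self) -> int:
        return self.logical_size_blocks * self.block_size

    @property
    def allocated_bytes(self) -> int:
        return len(self._blocks) * self.block_size


vol = ThinVolume(logical_size_blocks=1_000_000)  # roughly 4GB promised
vol.write(10, b"hello")
print(vol.logical_bytes, vol.allocated_bytes)
```

The application sees the full logical size from day one, while the space actually consumed tracks what has been written.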
But, in much the same way that the difficulties of server provisioning led to the emergence of the cloud model in the first place, now the principles of cloud storage provisioning are showing the way forward.
Amazon's S3 API is already the de facto standard in the cloud storage market, while the OpenStack community continues to develop a set of APIs to handle a wider range of storage requirements, the latest example being the Cinder API for volume management.
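As a rough illustration of what provisioning through an API looks like, the Python sketch below builds (but does not send) an HTTP request to create a block volume in the style of the OpenStack Block Storage (Cinder) API. The endpoint, project ID and token are placeholders, and the `/v3/` path reflects later versions of the API rather than the one current when this column was written, so check the API reference for the release you run.

```python
import json
from urllib.request import Request


def build_create_volume_request(endpoint, project_id, token, name, size_gb):
    """Return an unsent POST request asking for a new volume.

    Assumption: the service follows the Cinder-style convention of
    POST {endpoint}/v3/{project_id}/volumes with a {"volume": {...}} body.
    """
    body = {"volume": {"name": name, "size": size_gb}}
    return Request(
        url=f"{endpoint.rstrip('/')}/v3/{project_id}/volumes",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


# All values here are placeholders for illustration.
req = build_create_volume_request(
    endpoint="https://storage.example.com",
    project_id="demo-project",
    token="PLACEHOLDER-TOKEN",
    name="app-data",
    size_gb=50,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Compared with carving out LUNs by hand, the whole request is a few lines that an application or orchestration tool can issue itself.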
Meanwhile, down at the systems layer, VMware's storage APIs are being widely adopted by storage suppliers. VMware's upcoming VVols (Virtual Volumes) will provide a layer of storage connectivity abstraction, ultimately making distinctions between NAS and SAN moot and simplifying the planning and acquisition process for storage arrays. VVols are a key part of VMware's software-defined datacentre strategy.
Meanwhile, some VM-oriented storage start-ups are eschewing the traditional provisioning models of logical unit number (LUN) allocation, volume management and even Raid sets altogether, in favour of VM-aware storage.
There is some way to go before this becomes fully mainstream. Storage managers still feel more comfortable provisioning storage in the way they always have, and largely don't want to write to a new API every time they roll out a new application.
But this will likely change as web-based applications become the norm over the longer term. Indeed, the entire cloud/converged model will only work if orchestration engines can easily provision the right type of storage.
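What "provisioning the right type of storage" might mean for an orchestration engine can be sketched as a simple policy lookup. In this hypothetical Python example, the catalogue, class names, performance ceilings and prices are all invented; the engine simply picks the cheapest class that satisfies an application's stated protocol and performance needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    """One offering in a hypothetical storage service catalogue."""
    name: str
    protocol: str       # "block", "file" or "object"
    max_iops: int
    cost_per_gb: float


# Illustrative catalogue; every figure here is made up.
CATALOGUE = [
    StorageClass("archive-object", "object", 500, 0.02),
    StorageClass("general-file", "file", 5_000, 0.10),
    StorageClass("general-block", "block", 10_000, 0.15),
    StorageClass("flash-block", "block", 100_000, 0.90),
]


def choose_storage_class(protocol, min_iops, catalogue=CATALOGUE):
    """Pick the cheapest class that meets the protocol and IOPS requirement."""
    candidates = [
        c for c in catalogue
        if c.protocol == protocol and c.max_iops >= min_iops
    ]
    if not candidates:
        raise LookupError("no storage class satisfies the request")
    return min(candidates, key=lambda c: c.cost_per_gb)


print(choose_storage_class("block", min_iops=2_000).name)
print(choose_storage_class("block", min_iops=20_000).name)
```

The application owner states requirements, and the mapping to a specific tier is left to software rather than to a storage administrator.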
The move to software-defined storage is likely to be a gradual process that may span a decade or more. Very seldom in IT do things change overnight, especially in storage, which for good reason tends to be more risk averse.
But in the era of virtualisation and software-defined everything, storage is going to play a role. Indeed, the effect could turn out to be greatest in the storage arena.
These enablers are not necessarily the be-all-and-end-all of software-defined storage, nor is it likely that all suppliers and end users will embrace each of them equally. But they set out a framework that is likely to continue to reshape the storage landscape over the coming years.
Simon Robinson is research vice-president, storage and information management, at 451 Research.