We live in the Information Age, apparently, but sometimes it's far too easy to form the impression that decent information is at best hard to come by and at worst purposely withheld.
This week I spoke to Veeam, the virtualisation backup software specialist. The briefing was mostly a recap of some of the new features announced in version 7 of the Veeam product earlier this month, including WAN acceleration, array-based replication and tape support.
Also discussed was Veeam support for the latest versions of the VMware vSphere and Microsoft Hyper-V hypervisors; a result, said product strategy specialist Mike Resseler, of the company closely following virtualisation platform trends.
Now, if I'd wanted to check the relative penetration of the various hypervisors and the overall ratio of physical to virtual servers, there was a time when I could have turned straight to Veeam's own V-Index surveys, carried out by Vanson Bourne across more than 500 organisations in the US, UK, France and Germany.
There I could have seen stats for numbers of virtualised servers vs physical, the degree of server consolidation resulting from virtualisation and the relative penetration into datacentres of the various virtualisation hypervisors.
Sadly, however, V-Index was a very short-lived programme, lasting only, it appears, for one iteration of the survey.
Naturally, my suspicious journalist's mind suspected the results were not what Veeam wanted to see; the V-Index, for example, showed only about 35% of servers in the UK were virtualised in the last quarter of 2011. And that might be an argument for not buying Veeam, which only backs up virtual servers, and instead looking at a product from one of the larger incumbents that back up both virtual and physical machines.
Veeam's public relations company reassures me my suspicions are wide of the mark, however, and that the company decided to concentrate its efforts on its annual Data Protection Report and to leave the kind of reporting done by the V-Index to the likes of Gartner and IDC.
But Veeam's Data Protection Report clearly doesn't give the same metrics at all, and just try finding Gartner or IDC figures that give the same information as the V-Index did. Oh, I'm sure they exist, but in nothing like the easily accessible format of Veeam's creditable efforts, which could have provided a great resource, combining regular snapshots into a picture of virtualisation trends over time.
And that was Veeam's aim. In the July 2011 blog post announcing V-Index, Veeam VP for product strategy and chief evangelist Doug Hazelman told us how "very excited" he was about it. Yet by the end of that year the programme appears to have bitten the dust.
So, there's just one more thing, as I hover near the door, Columbo-like: why was V-Index such a great idea in July 2011 but ditched less than half a year later? I can't help thinking I was right to be suspicious.
We live in the Information Age, apparently, but sometimes it's far too easy to form the impression that decent information is at best hard to come by and at worst purposely withheld.
Some vendors use out-and-out distortions of commonly understood technical terms.
Violin Memory, for example, loves to emphasise the second word in its name, ie "memory". A strapline it uses is, "Run business at the speed of memory" and we're asked to think not as storage admins but as "memory architects" using its "persistent memory architecture" etc etc.
But how does all that stack up? Memory, traditionally, is the medium on the motherboard closest to the CPU, where a portion of an application and its data reside during operation. Now, Violin may produce some very fast all-flash arrays, but are we really talking "the speed of memory" here?
Its high-end 6000 Series, for example - billed as "Primary storage at the speed of memory" - has product specsheets that don't distinguish between reads and writes and quote latencies of "under 250 μsec" for SLC variants and "under 500 μsec" for MLC variants.
I asked Violin CTO Mick Bradley how they could call it "memory" when it doesn't appear to conform with the commonly understood meaning of memory either architecturally or in terms of performance. His reply was: "We call it memory because it's silicon and it is addressed in pages of 16K."
Hmmm, such a definition doesn't cut much ice, especially now there is flash storage that does operate at the speed of memory. An example of this so-called memory channel storage is SMART's ULLtraDIMM, launched earlier this year. Such a product can claim to operate "at the speed of memory", with write latency of less than 5 μsec and a physical location in the motherboard's DIMM memory slots.
Meanwhile, others change the way they describe their product according to which way the wind is blowing.
Storage virtualisation software vendor DataCore is a great example here. At SNW Europe this week, DataCore EMEA solutions architect Christian Marczinke told us how the firm had been the pioneer of "software-defined storage".
Err, hang on there. DataCore's use of software-defined storage to describe its products dates back less than nine months and is clearly a response to the use of the term by VMware, with its overall software-defined datacentre push, and to EMC's ViPR hype.
In fact, until around a year ago DataCore referred to its product as a "storage hypervisor", clearly bending with the wind blown from VMware's direction. I dealt with that issue of nomenclature here.
Does all this quibbling over terminology matter? Well, on the one hand, it obviously matters to the likes of Violin and DataCore, who, despite having great products, clearly feel the need to over-egg their puddings. And on the other hand it matters to IT journalists, because it's our job to ensure customers get a clear view of what products actually are.
To be continued . . .
Here at VMworld Europe in Barcelona the term ecosystem is being thrown around with gay abandon. It's a lovely-sounding word. It evokes life, the planet, lush green rainforests, myriad plants and animals living in harmony etc etc.
IT vendors like to use it for those reasons and all its positive associations.
VMware is particularly keen on it, and it seems most apt. The layers of virtualisation it has laid onto physical servers are now being joined by levels of abstraction over the network and storage infrastructure, and into the hypervisor it is gathering the intelligence to run nearly all aspects of the datacentre from ever fewer screens.
But stop for a second to think about what it means to step outside your ecosystem. Or alternatively, think about the movie Total Recall where the governor of Mars, Vilos Cohaagen, exercised his power through a total monopoly on breathable air.
Now, of course I'm not likening VMware's gathering of datacentre functionality to Cohaagen's tyranny, but look what happened when Cohaagen got sucked out of the safety of his ecosystem and onto the Martian surface.
Obviously this won't happen to you just because you deploy VMware in your datacentre, but there are good reasons to think deeply about what you're getting into.
Not least with storage, probably the area most affected by virtualisation. It accounts for something north of 50% of datacentre hardware costs, and Gartner has predicted those costs can rise by 600% when you virtualise your infrastructure. That's because packing lots of virtual servers into relatively few physical devices makes direct-attached storage unsuited to the massive and random I/O demands, and almost always means an upgrade to shared storage SAN or NAS arrays.
The day-to-day consequences of this are that storage will become more difficult to manage - masked by the VMware ecosystem - as it fills up more quickly, requires more rapid provisioning and generates ever more complex, rapidly changing and easily broken dependencies and access profiles. And that's before we get to replication, mirroring, backup etc, all of which also present a massively complex and dependency-heavy payload on the VM infrastructure.
All of which goes to show there's a downside to the concept of an ecosystem. VMware et al like to portray themselves as the Na'vi in Avatar, as guardians of their idyllic world. But the reality can end up more like Total Recall, where breathing the air is costly but stepping outside is even more difficult and dangerous.
For that reason it pays to exercise due diligence over the consequences of datacentre virtualisation, the likely costs and knock-on effects into storage and backup, and to be sure you have surveyed all the alternatives available in the market.
If you could build a datacentre - and more importantly its contents - from scratch chances are it wouldn't look much like many of them do now. Technologies have come along, have served their purpose as an advance on what went before, but later become the next generation's roadblock to efficient operations.
Take the x86 server. It replaced the mainframe or RISC UNIX server. In comparison to them it was cheap; you could put one app on each and keep adding them. But then, of course, we ended up with silos of under-used compute and storage. And latterly, to this was added shared storage - the SAN and NAS - which solved many problems but has challenges of its own.
How would the datacentre be designed if it was built from the ground-up now?
Well, there are two answers (at least) to that one. The first is to look at what the likes of Amazon, Google et al have done with so-called hyperscale compute and storage. This is where commodity servers and direct-attached storage are pooled on a massive scale with redundancy at the level of the entire compute/storage device rather than at the component level of enterprise computing.
The second answer (or at least one of them) is to look at the companies bringing so-called converged storage and compute to the market.
I spoke to one of them this week, Simplivity. This four-year-old startup has sold Omnicubes since early 2013. These are 20TB to 40TB capacity compute and storage nodes that can be clustered in pools that scale capacity, compute and availability as they grow, all manageable from VMware's vCenter console.
Omnicubes are essentially a Dell server with two things added. First is a PCIe/FPGA hardware-accelerated "Data Virtualisation Engine" that sees data on ingest broken into 4KB to 8KB blocks, deduplicated, compressed and distributed across multiple nodes for data protection as well as being tiered between RAM, flash, HDD and the cloud.
Second is its operating system (OS), built from scratch to ensure data is handled at sufficient granularity, with dedupe and compression built in, plus its own global, parallel file system.
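The dedupe-on-ingest idea is simple enough to sketch. Below is a minimal, purely illustrative Python version of the principle - fixed 8KB blocks, hash-based dedupe, compression and naive round-robin placement across hypothetical nodes. It is not Simplivity's implementation, which does its chunking and hashing in hardware on the FPGA card.

```python
import hashlib
import zlib

BLOCK_SIZE = 8 * 1024                    # Simplivity quotes 4KB-8KB blocks; 8KB assumed here
NODES = ["node-a", "node-b", "node-c"]   # hypothetical cluster members

block_store = {}                         # fingerprint -> (node, compressed block)

def ingest(data: bytes) -> int:
    """Break a write into fixed-size blocks, dedupe, compress and place each new block."""
    new_blocks = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in block_store:
            continue                                     # duplicate: store a reference only
        node = NODES[len(block_store) % len(NODES)]      # naive round-robin placement
        block_store[fingerprint] = (node, zlib.compress(block))
        new_blocks += 1
    return new_blocks

# Ingesting the same payload twice: the second pass stores nothing new.
payload = b"virtual machine image data" * 4000
print(ingest(payload), ingest(payload))                  # second figure should be 0
```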
With all this, Simplivity claims in one fell swoop to have replaced product categories including the server, storage, backup, data deduplication, WAN optimisation and the cloud gateway.
And to some extent the claim rings true. By dealing with data in an optimum fashion from ingest onwards, parsing it in the most efficient way and distributing it according to what's most efficient and safe, it has come up with something like how you'd deal with data in the datacentre if you were to design its parts from scratch right now.
That's not to say it's without limitations. Currently Simplivity is only compatible with the VMware hypervisor, though KVM and Microsoft Hyper-V variants are planned. And it is of course a proprietary product, despite the essentially commodity hardware platform (except the acceleration card) it sits upon, and you might not put that on your wishlist of required attributes in the 2013 datacentre.
Still, it's an interesting development, and one that demonstrates a storage industry getting to grips with the hyperscale bar that the internet giants have set.
EMC's refresh of its VNX line of unified storage arrays is largely based on an almost complete re-write of its near 20-year-old Flare operating system. Flare was on its 32nd release and has been replaced with a new OS, MCx, but what exactly has changed under the bonnet?
In short, MCx has been developed to take advantage of Intel multi-core processors where Flare was completely unable to do so. In addition VNX controllers also now use the latest Gen 3 PCIe cards and so physical bandwidth is also hugely increased.
All this amounts to unified storage arrays with much-boosted capabilities when it comes to exploiting the speed of flash storage.
Flare was originally developed by Data General (acquired by EMC in 1999) for the Clariion brand of arrays, and like storage operating systems from all vendors way back when, was written for single-core processors.
When multi-core processors arrived Flare was rewritten to allocate different functions - eg, RAID, memory, data placement, data services - to different cores, but with one function assigned to one core it was easy to max out.
But at the time, with spinning disk the norm, there was no great need to overcome this bottleneck. That all changed with flash storage, however.
So, MCx was written for multi-core processors from the outset - the new generation of VNX uses Intel Xeon 5600 CPUs - and all processing functions are parallelised across the 32 cores. EMC claims something like 40,000 IOPS per core, into the hundreds of thousands per controller and up to 1m IOPS in the VNX 8000, with latency staying below 2 milliseconds.
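To illustrate the difference in scheduling models, here is a rough, hypothetical Python sketch - nothing to do with EMC's actual code - contrasting the Flare-style approach of pinning each function to one core with the MCx-style approach of spreading every function's work across all cores.

```python
from itertools import cycle

CORES = 8                                                   # illustration only
FUNCTIONS = ["raid", "cache", "placement", "replication"]   # hypothetical services

def one_function_per_core(work_items):
    """Flare-style: each function owns one core, so the hottest function bottlenecks."""
    load = {f: 0 for f in FUNCTIONS}
    for item in work_items:
        load[item["function"]] += item["cost"]
    return max(load.values())               # completion time = busiest single core

def spread_across_cores(work_items):
    """MCx-style: all functions' work is parallelised across every core."""
    total = sum(item["cost"] for item in work_items)
    return total / CORES                    # ideal case: load balances evenly

# A RAID-heavy burst on top of an even background load.
workload = [{"function": f, "cost": 1} for f, _ in zip(cycle(FUNCTIONS), range(800))]
workload += [{"function": "raid", "cost": 1} for _ in range(800)]
print(one_function_per_core(workload), spread_across_cores(workload))   # eg 1000 vs 200.0
```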
Another key advance is the use of Gen 3 PCIe cards.
While the processors could be a bottleneck in legacy arrays, so could connectivity in and out of the array via PCIe. Gen 3 boosts bandwidth and lane count between the processor and the storage, as well as front-end (Fibre Channel, Ethernet) and back-end (SAS) port bandwidth.
All of which helps put the new VNXs in a similar performance ballpark to the all-flash vendors. According to EMC mid-tier business director, Sean Horne, that means customers can now buy a midrange VNX array and look at having enough headroom for four or five years of growth in virtual machine count and performance requirements.
What EMC is hoping is that customers who have been, rightly, impressed with the performance of flash arrays from the startups will now be able to get similar performance for their virtual environments from EMC.
It certainly puts EMC at the forefront of the big six storage vendors in re-engineering an existing array family for flash, rather than throwing flash at a legacy OS and controller hardware. And for that reason it is a turning point in the flash storage story.
The aim of this blog post is to try to iron out some misunderstandings around two common terms in storage - two terms that are actually rather closely connected: storage virtualisation and software-defined storage.
First let's deal with storage virtualisation. Here at ComputerWeekly.com we're pretty certain that there's a good deal of confusion about this term. In our yearly poll of readers we keep hearing that "storage virtualisation" is a key priority on IT department to-do lists for the coming year. This year that figure was 36%.
That figure seems unusually high. It's an unscientific measure, for sure, but as a storage journalist I get a fairly good idea of the type of projects that are hot from what comes across my desk and from speaking to customers and vendors, and I just don't hear much about storage virtualisation.
So, when those questioned in our poll ticked "storage virtualisation", what many probably thought we were asking was "is storage for virtualisation a priority?" Why? Because server and desktop virtualisation is a big priority for a lot of organisations right now and implementing storage and backup to support it is a key part of that process.
Meanwhile, storage virtualisation products allow organisations to build storage infrastructure from multiple vendors' hardware. Storage suppliers, of course, would prefer that they provided all of your disk systems. Consequently, while the key storage vendors have storage virtualisation products, it's not something they push particularly hard in marketing or sales terms.
Storage virtualisation products include EMC's VPLEX, IBM's SAN Volume Controller (SVC), NetApp's V-Series and Hitachi's VSP.
There are also the smaller storage virtualisation vendors and products, such as DataCore's SANsymphony, Seanodes' SIS, FalconStor's NSS and Caringo's CAStor.
These are all reasonably well-established products that allow users to create single pools of storage by abstracting the physical devices upon which they are layered to create a virtual storage array.
More recently, we've seen that capability emerge in the form of products at a higher, environment level.
Here, I have in mind, for example, VMware's plans for Virtual SAN, which will allow pooling, from the hypervisor, of server-attached disk drives, with integration of advanced VMware features such as high availability and vMotion. It will scale to petabyte levels of capacity and, when it comes to fruition, will put some pressure on existing storage vendors playing at the SME to small enterprise level.
And there is EMC's ViPR environment, announced at EMC World 2013, which merges storage virtualisation with big data analytics. Key to this discussion is ViPR's ability to pool storage from direct-attached hard drives, commodity hardware and other vendors' arrays into one single reservoir of storage that's manageable from a single screen.
These initiatives contain a large dose of what has for a long time been called storage virtualisation but are billed as software-defined storage.
So, to what extent are either of these terms accurate reflections of the technology they represent?
Well, of course both terms could be said to be so vague as to be almost meaningless. After all, all storage is based on the retention of data on a physical drive, but that would be nothing without software that abstracts/virtualises, for example, blocks and files to physical media, RAID groups and LUNs. In other words, storage never exists without being defined by software or being virtualised in some sense.
So, how do we make sure we're using these terms clearly? Well, on the one hand it seems reasonable that storage virtualisation should refer to the abstracting of multiple vendors' systems into a singly manageable pool of storage. If there's any such thing as historical usage in storage and IT, then systems ranging from IBM's SVC to the likes of DataCore seem to fit that billing, and have done for some time.
Meanwhile, we can recognise that VMware's planned Virtual SAN and EMC's ViPR are heavily based on storage virtualisation capability as defined here, but they also go beyond it, incorporating much higher-level features than simple storage functionality, such as vMotion and big data analytics respectively.
Despite the efforts of some vendors - notably DataCore, which has gone from dubbing its products a "storage hypervisor" to software-defined storage according to the whims of the IT marketing universe - it seems reasonable to define storage virtualisation quite narrowly, as centring on the ability to pool heterogeneous media into a single storage pool.
Meanwhile, software-defined storage can be reserved for higher level function and environment-type products that also include storage virtualisation.
It's always a battle to define terms in an area as fast-moving as IT, with its multiple vested interests and active marketing departments, but it's certainly worth trying to define terms clearly so the customer can see what they're getting.
A rather tiny bit of storage news this week illustrates the changes taking place as part of the flash revolution, and also where its leading edge lies.
The news is that Fusion-io has submitted proposals for standardised APIs for atomic writes to the T10 SCSI Storage Interfaces Technical Committee.
Why this is interesting is that it is all about the interface between flash memory/storage and some of the most business-critical database apps.
Atomic operations are database operations where, for example, there are multiple facets to a single piece of information and you want either all or none of them read/written. Completing only one part - a debit without its matching credit in a banking system, say - is highly undesirable.
Until now, with spinning disk hard drives underneath MySQL, for example, writes took place twice before acknowledgement, as a failsafe against drive failures and torn writes. Clearly, such a doubling of operations is not optimal in terms of efficiency.
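That double write is essentially the doublewrite-buffer pattern: flush the page to a scratch area first, then write it in place, so a torn write can always be recovered. A rough Python sketch of the two patterns - hypothetical file paths, fsync standing in for the acknowledgement - shows where the saving comes from.

```python
import os

PAGE = b"\x00" * 16384        # a 16KB database page, InnoDB's default size

def doublewrite(datafile, scratch, offset, page=PAGE):
    """Spinning-disk pattern: write the page twice so a torn write is recoverable."""
    with open(scratch, "r+b") as s:
        s.write(page)
        s.flush()
        os.fsync(s.fileno())          # first write, flushed to the scratch area
    with open(datafile, "r+b") as d:
        d.seek(offset)
        d.write(page)
        d.flush()
        os.fsync(d.fileno())          # second write to the real location

def atomic_write(datafile, offset, page=PAGE):
    """Atomic-write pattern: the device guarantees all-or-nothing, so write once."""
    with open(datafile, "r+b") as d:
        d.seek(offset)
        d.write(page)
        d.flush()
        os.fsync(d.fileno())          # single acknowledged write - half the I/O
```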
What Fusion-io has done is eliminate that duplication of effort with APIs that build management of atomic operations into flash memory.
The flash pioneer claims its Atomic Writes capability provides performance throughput increases of up to 50%, as well as a 4x reduction in latency spikes, compared to running the databases on the same flash memory platform without it.
Gary Orenstein, marketing VP, said: "The background is that many have adopted flash as a direct replacement for the HDD. But Fusion-io believes flash should do more than that and that we should be moving away from the last vestiges of mechanical operations in the datacentre."
"What we're looking at are new ways to break new ground that are faster, with fewer instructions," he added.
Currently these capabilities only come with Fusion-io flash products and are already supported in MariaDB and Percona MySQL distributions but upon T10 standardisation they will be open to all vendors.
Stepping back to take a big-picture view, what this also illustrates is the contrast between the extremes of flash implementation in the datacentre.
On the one hand there is this type of work at the leading edge of flash storage use, squeezing ever-greater efficiencies from the interface between flash and the apps/OSs it works with by use of software.
At the other extreme there are the legacy arrays, in which flash drives act as a pretty much dumb replacement for spinning disk HDDs.
Two years ago I wrote that the storage industry was apparently ripe for huge change.
The nub of my argument was that storage is a sector of the IT supplier world in which customers are forced to spend money on what is essentially a commodity - ie, drives - wrapped in proprietary software built into hardware controllers.
The argument progressed to take note of the revolution in the server world that had occurred as Linux effectively decoupled proprietary operating systems from RISC chip-based hardware in the previous decade, making open source OSs on x86 commodity hardware a much cheaper option.
The conclusion of the piece looked around at the likely candidates in the world of storage that might do what Linux did in the server world. These comprised storage software that could be deployed on commodity hardware and included GreenBytes and Nexenta as well as open source products such as Red Hat's and ZFS.
Two years on and it seems the hazy predictions based on a theory and a few small shoots of evidence have been validated by, among others, the biggest name in storage.
This week I spoke with Ranga Rangachari, VP and general manager for storage with Red Hat (not the biggest name in storage), who put forward a similar argument, namely that: "Storage is dominated by 'tin-wrapped software' and customers are sick and tired of being locked into silos, with, for example, vendors with three different solutions."
Rangachari reiterated the argument that what happened in the move from RISC to x86 could happen with storage, and that the drivers now are the cloud, the volume of unstructured data and the rise of online analytics platforms such as Hadoop, which require co-resident storage and processing power, with data moving, as Rangachari put it, "east to west, not north to south" as in existing server-SAN infrastructures.
The rise of such hyperscale server/storage infrastructures has been pioneered by the likes of Google and Facebook and is exhibit A in the rise of architectures that challenge the existing enterprise storage paradigm.
Instead of shared, but remote, storage in, say, an enterprise SAN, these highly performant Web-serving and analytics stacks comprise converged server and storage hardware, all made of cheap commodity parts with redundancy at the level of the whole unit rather than components within.
Elsewhere - exhibit B - is the emergence of converged storage/server products that ape the hyperscale architectures and are usually geared towards virtual environments. These include Nutanix, Scale Computing and Simplivity.
Exhibit C is the continued rise of software-only storage products that customers can run on any hardware. Virtual storage appliances that will run on virtual or physical machines are available from all the big storage vendors as well as the likes of DataCore and Nexenta.
An important addendum to exhibit C is the plan by VMware to include storage software features in its virtualisation hypervisor products. VMware already has a virtual storage appliance, but it plans to include storage software capability in the form of its Virtual SAN which will allow users to create up to petabytes of capacity from existing unused disk. This threatens to seriously undermine the market of entry-level to midrange storage players.
Finally there is exhibit D - evidence for the prosecution, as it were - and this is EMC's recent announcement of its forthcoming ViPR storage virtualisation/private cloud/big data software layer.
The 800lb gorilla of the storage market justified ViPR as a response to a changing storage landscape. The product is in large part a storage virtualisation platform that will knit together disparate storage systems from any vendor and from commodity drives.
On the surface of things it's quite remarkable that the biggest disk system vendor should potentially allow users to create storage from any other storage supplier. But ViPR can give EMC very sharp and well-barbed hooks into a user's environment, as a software layer that embraces all storage underneath it.
Maybe it should have been called Python, for its ability to smother an organisation's storage systems, and it is apparently the antithesis of the move towards more openness in systems that I'm arguing is a trend here. So, why is it evidence for my case?
Because it is a recognition by EMC that storage will henceforth no longer solely reside on the enterprise storage array as such; that it will be distributed in traditional storage environments, converged hyperscale datacentres and edge devices and that these must be linked by a software layer that virtualises the capacity underneath it.
So, it seems the biggest player in the storage market has recognised that the dominance of the traditional storage array is a thing of the past. Having made that concession it will be interesting to watch whether the likes of EMC can transition to the new reality against its rivals that offer more open storage software.
Let's check back in another couple of years.
Flash is flavour of the month/year in enterprise storage, because of its ability to rapidly deliver the likes of virtual desktops and servers, as well as to process high-performance transactional databases.
You may have recently got to grips with the distinctions between MLC and SLC. In fact, we know that many of you have because our explainers on MLC vs SLC are among our most-read pages month after month.
That may be about to change, however, as the flash market evolves.
Namely, SLC seems to be all set to effectively fade from the flash acronym lexicon, while TLC enters it.
SLC - or single level cell - is the best-performing and most durable of the NAND flash types. It's also the most expensive per GB. And while many flash storage system vendors offer SLC, take-up seems to be far slower than for MLC.
That is, admittedly, from the decidedly non-scientific viewpoint of a storage journalist to whom vendors are keen to trumpet customer wins. But what I see on a regular basis is the use of MLC/eMLC flash, which has had its shortcomings addressed by clever software error correction etc.
Meanwhile, there is evidence that TLC - triple level cell flash - is creeping up as an enterprise flash option. What's the evidence?
Samsung Semiconductor launched TLC-based flash products late last year. And speaking to the CEO of flash array maker Pure Storage, Scott Dietzen, last week, he indicated it was only a matter of time (or more precisely, cost) until TLC makes an impression on the enterprise storage market.
The read latency of TLC is now nearly as good as MLC. Samsung pitches its TLC products for heavily read-intensive use cases, such as streaming media, for example. Dietzen expects TLC and MLC to be used in a tiered fashion in enterprise storage when the price of the former reaches two thirds of the latter.
That might not be too long. A quick look at flash market analyst sites such as Taiwan's inSpectrum shows the contract price for 128GB of MLC at an average of $8.72, while the same capacity in TLC is about 75% of that at $6.60.
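For what it's worth, the arithmetic behind Dietzen's two-thirds threshold is simple enough; a quick back-of-envelope check using the inSpectrum contract prices quoted above:

```python
mlc_price, tlc_price = 8.72, 6.60      # quoted contract prices for 128GB, USD

ratio = tlc_price / mlc_price
print(f"TLC is currently {ratio:.0%} of the MLC price")     # roughly 76%

target_price = mlc_price * 2 / 3
print(f"TLC needs to fall to about ${target_price:.2f} "    # about $5.81
      f"to hit the two-thirds threshold")
```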
As the proportion of TLC flash manufacturing increases that price will decrease. Perhaps we'll see that 66% hit this year and TLC-based storage products emerge.
I think it's time to get writing that TLC vs MLC article.
What are the limits of cloud storage right now? We've examined that question elsewhere. And you wouldn't necessarily ask the CEO of a cloud storage service provider; they have too many reasons to play a little fast and loose with the facts, purely in their commercial interest, of course.
But this week I asked those questions of the CEO of Egnyte, a US-headquartered cloud storage provider, focused on file sharing and synchronisation, that is breaking into Europe.
Egnyte has two US datacentres and one in Amsterdam and holds about 12PB of customer data in a hyperscale storage environment; ie Super Micro server chassis with direct-attached storage on 4TB commodity drives. It's all held together by a home-grown object storage file system with redundancy at server level rather than that of the components within. Added to this is a dash of Fusion-io and Intel PCIe flash for rapid caching of customer data.
Egnyte offers cloud storage to its customers, with data kept in its datacentres plus access to Amazon, Google, Microsoft and NetApp clouds.
It also offers customers a hybrid of on-site storage alongside the cloud, and herein lies the recognition that for most types of production data the cloud is simply not yet ready. That's because network latency is still too great for access to data to be swift enough for the most business-critical applications.
So, when will the cloud really break through as an option for production data storage?
Egnyte CEO Vineet Jain sees a tipping point when 5G mobile networks are established.
Jain said: "Today 47% of our users access Egnyte by mobile, and currently we have 4G networks that have a maximum of 100Mbps bandwidth. That's nowhere near what's needed, but 5G is expected to be 1,000 times faster than that. Until then the cloud will be good for some things but it will be hybrid [ie, with disk storage at the customer site] until bandwidth is reliably available with no chokepoints."
Of course, reliable bandwidth isn't the only obstacle to cloud adoption. Security and compliance are the other key concerns, which, says Jain, could be overcome if businesses think realistically about what the cloud is good for.
"Like the mythical paperless office, there's been too much cloud hype," he says. "There will be an increasing amount of data put into the cloud, but we'll see it skewed towards that large proportion of data that businesses must keep but is infrequently accessed."
It's good to hear a realistic view of the cloud. And it'll be interesting to watch how cloud develops over the coming years. Ultimately, the onset of usable cloud storage could shake up the entire storage industry as we know it, with the current incumbent vendors needing to adapt to survive as hyperscale storage-driven service providers offer increasingly usable remote storage services. But that's a musing for another blog sometime.
It's been an interesting week for NetApp-watchers. On Tuesday we learned of the company's latest moves in the flash array space; the EF540 flash array and FlashRay, a new operating system (OS) that's optimised for flash. All of which screamed (to investors and potential customers), "We have a flash array too now! You can get it from us, not the upstart startups or our larger competitors."
Having said that, there's no doubt these represent progress for NetApp, which has had an odd relationship with solid state. For a long time NetApp's stance was that cache was the only place for flash, and it would not form a distinct tier in your storage infrastructure.
Tuesday's announcements are the manifestation of a Damascene conversion on the flash question for NetApp that had taken place over the past year. Now it seems flash is built firmly into the future of the company. Well, reasonably firmly; the EF540 actually looks like a rush to get something to market to position NetApp against the competition.
Let's deal with this first. Late last year there was some confusion at NetApp towers about whether the company would develop an all-flash array at all. First, CTO Jay Kidd told me NetApp wouldn't be playing in the all-flash market because it wasn't a big market and there were other ways to introduce flash into the server-storage infrastructure.
After that article was published you could hear the sound of NetApp backpedalling from several thousand miles away. I don't know whether Kidd was off-message, still expounding the no-flash-tier mantra, or simply had demonstrated some extreme failure to communicate properly.
Still, things were put back on-message three weeks later in this interview with our sister publication SearchStorage.com, where Kidd said NetApp had plans for an all-flash array in 2013. Although, why Kidd didn't mention the EF540 in November is a mystery, especially as it had been sold in a limited release prior to that.
Such twists and turns aside, questions must be raised about the EF540. Sure, it's an all-flash array, and it can give up to 300,000 IOPS, but it uses the existing E-Series operating system, SANtricity, which is not built for flash, and neither is the controller hardware.
Is this a problem? More than likely not in the short term. NetApp's all-flash startup competition, such as Violin Memory and Whiptail, boasts huge IOPS figures, into the millions, which are probably unnecessary for all but the most extreme use cases on the planet. I mean, with virtual desktop I/O requirements of something between 10 and 200 IOPS per seat, the EF540 has plenty to give.
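To put that in rough perspective, here's the back-of-envelope seat count implied by those figures - deliberately crude, ignoring read/write mix, latency and everything else that matters in a real VDI sizing exercise:

```python
array_iops = 300_000                   # the EF540's quoted maximum
light_seat, heavy_seat = 10, 200       # rough per-desktop IOPS range quoted above

print(array_iops // heavy_seat, "seats at a heavy 200 IOPS each")   # 1,500 seats
print(array_iops // light_seat, "seats at a light 10 IOPS each")    # 30,000 seats
```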
But in the longer term the EF540 must have limited life. Leaving aside whether it relies on throwing sheer TB at getting the throughput it does, its OS and its controller hardware are not built for flash. The OS doesn't optimise operations for the vagaries of flash and its wear characteristics and the controller hardware/backplanes etc are not built for the speed of flash.
That NetApp has announced the flash-optimised FlashRay is a tacit admission of those points.
But what is FlashRay and what will be its significance for NetApp? Well, facts about FlashRay were thin on the ground in Tuesday's announcements, but it appears to be a flash-optimised storage operating system. And apart from the flash-optimisation bit it sounds almost exactly like NetApp's existing OS, Data ONTAP. Indeed, Lawrence James, UK products, alliances and solutions manager for NetApp, told me that FlashRay is "ground-up developed, but will inherit features" from ONTAP. Hmmm.
Leaving that bit of speculation aside, NetApp's launch of FlashRay has potential implications for its existing storage hardware range. FlashRay undoubtedly represents a progressive move by NetApp, but what hardware will it be allied with? After all, the new OS may well be flash-optimised, but the controller hardware on the FAS and E-Series families are not. So, will we see a new family of NetApp arrays, or will FlashRay be ported to FAS arrays with upgraded hardware?
They are interesting times indeed for NetApp watchers.
I caught up with flash array maker Solidfire this week, whose CEO Dave Wright was attending Cloud Expo Europe in London. What struck me was Solidfire's targeting of the cloud provider market and the architecture characteristics and features by which it does that.
Solidfire is a startup, a minnow in a flash array space where big vendors jostle for position and where new entrants hope to get bought by the big boys. But, the firm appears to be playing a long game, and has a number of unique features that make it suited to cloud providers' needs.
Those result from CEO Dave Wright's experience of building service provider Rackspace's Jungledisk cloud storage offering before launching Solidfire in 2010.
What Wright learned from that experience was how to build solid state storage for cloud providers. Where most flash array makers concentrate on the in-house requirements of businesses running virtual machines, Solidfire aims at large cloud environments, and in particular those that want to move away from providing less-performant services such as backup and archiving.
So, how do the workload profiles of such environments differ from, say, an enterprise delivering virtual desktops?
Wright says, "There's no difference between that scenario and the cloud in terms of how you'd see it from the desktop. But the difference from the point of view of providing that storage is in scale, multiple tenancy, providing a consistent service to all customers etc."
So, to meet those needs Solidfire's iSCSI block storage product has been architected to scale out like no other flash array provider's products, from 12TB or 22TB to 2PB. It is built on 1U nodes that scale from four or five up to 100, with 5m IOPS.
As you scale out you can assign performance to storage volumes, for example providing a volume of 1TB with 1m IOPS and another of 500GB with 250,000 IOPS. This allows service providers to sell against such SLAs.
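As a hypothetical sketch of what selling against such SLAs might look like, here is the kind of per-volume performance allocation a provider could track - the volume names are invented and this is not Solidfire's actual API:

```python
from dataclasses import dataclass

@dataclass
class VolumeQoS:
    name: str
    capacity_gb: int
    guaranteed_iops: int               # the IOPS floor sold against in the SLA

CLUSTER_IOPS_BUDGET = 5_000_000        # quoted ceiling for a full 100-node cluster

volumes = [
    VolumeQoS("tenant-a-db", 1024, 1_000_000),    # 1TB volume sold at 1m IOPS
    VolumeQoS("tenant-b-vdi", 512, 250_000),      # 500GB volume sold at 250,000 IOPS
]

committed = sum(v.guaranteed_iops for v in volumes)
assert committed <= CLUSTER_IOPS_BUDGET, "over-committed on guaranteed IOPS"
print(f"{committed:,} of {CLUSTER_IOPS_BUDGET:,} IOPS committed to SLAs")
```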
But while flash array sellers such as Violin Memory, Nimbus and Whiptail aim at providing a relatively small chunk of very high performance solid state storage at specific workloads, Solidfire - which targets cloud providers - attempts to be more than that.
In other words, as well as handling high performance workloads such as virtualisation it must also tackle the less performance-hungry aspects of a cloud provider's services. For this reason, says Wright, it has built in data deduplication, compression and thin provisioning that can help give a decent cost per GB price for operations outside of Tier 1 or Tier 0.
In addition, Solidfire is built for aspects of the cloud such as automation and multi-tenancy capabilities.
Solidfire is playing in what is a hot territory right now - the world of the flash array - meeting the needs of highly randomised workloads that can't be matched by existing spinning disk architectures, built as they were down to the speed of the HDD.
So, it's a space replete with startups and big vendor acquisitions. IBM has bought Texas Memory Systems, EMC bought XtremIO, HDS has added an all-flash module to its VSP arrays. NetApp has indicated it may enter the all-flash fray.
So, is Solidfire looking to get bought? Wright's answer is that he doesn't expect Solidfire to be an acquisition target in the short term. "The big vendors are looking to upgrade their architectures and are in some cases buying startups to provide that. But, IT is moving to large cloud environments slowly and so what we do will only be recognised over time."
If I were a betting man when it came to the prospects of storage businesses, I might be tempted to put some money on the mid- to long-term prospects of X-IO.
X-IO - which revealed an addition to its hybrid flash array line this week at SNW Europe - makes storage arrays, some pure HDD, some hybrid flash SSD-HDD, that target performance applications such as VDI, OLTP and business intelligence/data warehousing.
It doesn't offer the highest performance available, such as you might get from an all-flash array, and the company pooh-poohs the idea that you need expensive integrated database-specific compute/storage products to run what others might call 'big data' use cases. Instead it touts its ISE and Hyper ISE arrays with commodity servers and Microsoft SQL Server 2012 databases as adequate for most.
Nothing that unusual so far; it's a vendor selling wares that perform adequately for the job they aim at. Where X-IO is different, however, is that its arrays don't contain commodity hard drives, unlike just about every other storage array vendor.
Instead, X-IO products come with IP inherited from its purchase of Seagate ASA in 2007, namely five-year-guaranteed, sealed-unit, 20-drive DataPacs that are engineered to be more reliable and longer-lasting than standard hard drives. X-IO says its drives last 2.4x longer than a Seagate HDD, with an MTBF of 850,000 hours for an individual drive in a DataPac.
They achieve this by building in anti-vibration mountings, features such as diverting cooling intake on physical startup to stop ingress of gathered dust to the array, and details such as retained mounting screws; no lost screw, no chance of unwanted movement, is the aim here.
At the drive software/controller level, firmware is stripped from standard Seagate drives and X-IO's installed, while data is written in a grid pattern across drives in DataPacs. A fault-repair system goes through a triage process, starting with a reboot that fixes most drive issues. If this doesn't work, the drive can be reformatted in situ, and if a problem is found, a single surface and its head can be locked out of use while the rest of the drive is reinstated.
Why is this a good betting prospect? The next few years will likely see much of the intelligence of storage - the job of the controller in assembling and provisioning volumes of storage and handling features such as replication, thin provisioning etc - move to the virtual server stack. VMware, for example, recently signalled its intent to bring storage virtualisation capabilities to future versions of its hypervisor.
Should such moves come to pass, storage array vendors selling arrays of up to petabyte capacities could find the rug pulled from under them as the likes of VMware assemble and manage storage capacity from the virtual server.
You could, of course, build storage this way from all sorts of drives; in old arrays, as direct-attached storage, from JBODs full of commodity drives. But not everyone will be happy with that for reasons of reliability. And that leaves the way open for providers, like X-IO, of drive subsystems that specialise in the intelligence that is close to the drives and that provides reliability and resilience.
Maybe everything in storage will one day be controlled from the virtual server, but it feels like a fairly safe bet that the hypervisor vendors will not get into that level of drive management for the time being.
The subject of this blog is a briefing this week with Scale Computing. And a major factor in why it's getting written about here is the laughable levels of chutzpah involved on Scale's part.
The occasion for the conversation was Scale Computing's re-launch of its HC3 "datacentre in a box", a converged stack that comprises server, storage and virtualisation hypervisor in one device. The HC3 comes in 1U nodes, each holding four 3.5" SAS or SATA drives. You can have a minimum of three nodes and a maximum of eight, which will serve about 100 VMs.
I say re-launch because they actually unveiled the device in August in the US at VMworld. This week's launch was at IP Expo in London. Why do vendors think we don't know this is a re-packaged, warmed-over, not-really-a-launch launch?
But, this was the killer. Scale Computing's HC3 has virtualisation built in. Is it VMware perhaps? Or Hyper-V? Or even Citrix? Nope. It's Red Hat's KVM. Naturally, I questioned the choice of only offering such a niche hypervisor.
Now, I'm not knocking Red Hat KVM's technology. It's a kernel-based hypervisor, built into the Linux kernel itself, and as such runs close to the hardware and is efficient.
In response, one of the Scale guys attempted to convince me, "It's the most popular hypervisor on the planet." I asked them to back this up and I'm still waiting for some emailed evidence, but I was told at the time that Red Hat KVM is in use with some big names in the cloud, like Rackspace, IBM and Google. I haven't verified this, by the way.
Anyway, it turns out Red Hat KVM doesn't even register on the V-Index survey of most popular server virtualisation hypervisors, which at the last count had a ranking of: VMware 67.6%, Microsoft Hyper-V 16.4%, Citrix 14.4% and other 1.6%.
So much for, "The most popular hypervisor on the planet." Red Hat KVM comprises a fraction of 1.6% of hypervisors in use. That's not to say it'll always be that way but Scale's hyperbole here was wide of the mark for now, like several parsecs wide of the mark, and by the end of the call some rowing back had been done to say the least.
It also baffles me slightly why Scale Computing would try to tout the supposed high-end enterprise/cloud credentials of the Red Hat hypervisor in what is avowedly a mid-market play that aims to compete with the likes of Nutanix, Pivot3 and Simplivity.
Anyway, the lesson, dear vendors, is if you don't have any actual news, then please feel free to tell me massive ridiculous porkies that I can call you on and use as the hook for an interesting discussion on hypervisor types and their relative popularity.
Almost exactly a year ago I spoke to Overland Storage about their then-new DX1 and DX2 traditional NAS boxes. At the time I questioned why clustered NAS capability had been omitted. After all, it's a technology that makes total sense; instead of buying traditional NAS devices that are doomed to become silos of data, customers would be far better served by the ability of clustered NAS to scale capacity, I/O and throughput and for all devices to see a single file system.
Well, this week Overland announced the fruits of development following its acquisition of Maxiscale's clustered NAS intellectual property two years ago. It has taken that IP, added two years of engineering effort and developed a new clustered NAS OS, called RAINcloud OS.
RAINcloud OS is incorporated in the new SnapScale clustered NAS product. The product comes as a minimum of three nodes in a cluster, with a minimum of four drives in each. You can put a maximum of 12 drives in each node, or have less than full capacity while adding nodes to gain I/O and throughput. Drives are nearline SAS and can be under RAID levels 0, 5, 6 or 10. SSD will be added in 2013, as will automated tiering.
The RAINcloud OS can scale to a staggering 512 PB in a single file system and Andrew Walby, Overland's EMEA and Asia Pacific sales and marketing VP, says they've tested it with 200 nodes with no loss of performance.
So, what do you pay for clustered NAS capability? Well, for 24 TB of Overland's DX traditional NAS you would pay around $8,000, while for 24 TB of SnapScale clustered NAS you'd shell out around $20,000.
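Broken down per terabyte, a quick calculation on those rough prices shows the size of the premium:

```python
capacity_tb = 24
dx_price, snapscale_price = 8_000, 20_000     # rough prices quoted above, USD

print(f"DX traditional NAS:   ${dx_price / capacity_tb:,.0f} per TB")         # ~$333/TB
print(f"SnapScale clustered:  ${snapscale_price / capacity_tb:,.0f} per TB")  # ~$833/TB
print(f"Premium for the clustering code: ${snapscale_price - dx_price:,}")    # $12,000
```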
That's a $12,000 premium for some clever code, and according to Walby, that's cheap for clustered NAS compared to the likes of Isilon. He was at pains to point out the work that went into RAINcloud.
"The concepts of clustered NAS are simple but the engineering is very complex," said Walby, who added that he hoped it would usher in better times for the company, which has suffered in recent years. "It could be a game-changer for Overland. There are not as many players as in traditional NAS and we come in a lot cheaper than the competition but still have enterprise features such as snapshots."
I met with Violin Memory at VMworld Europe last week in Barcelona. It's always good to spend a bit of time talking with vendors and get to see under the skin of the company a bit.
Chief impression was that Violin spends a lot of time telling you what it's not about.
"We don't do cache," is one of its pronouncements. It believes the job of its flash is to act as super-fast storage in its own right, not as a cache corrective for the deficiencies of spinning disk. "You have to stop thinking about flash as disk augmentation," Violin technology VP for EMEA Mick Bradley told me.
"We don't believe in data tiering*," they also say. Here again they have faith in their ability to, "provide the performance of flash at a price comparable to tier 1 disk", meaning, in Violin world, that your hot data should be on their product and there's no need for it to be anywhere else except in backups and then archives.
Hybrid flash storage? "A race to the bottom in one use case", ie virtual desktops.
Server-side flash? Another compromise.
Perhaps its boldest and potentially most confusing claim comes when it tells you: "We don't sell SSD".
It does, of course. Violin bases its all-flash array products on NAND flash chips it obtains via a supply chain deal with Toshiba. It puts this silicon on bespoke cards that carry all its special sauce; ie all the software that does the striping, data protection, wear mitigation etc across these so-called VIMMs, or Violin Intelligent Memory Modules.
So, what Violin actually means when it says, "We don't sell SSD" is that it doesn't sell commodity SSD in 2.5" or 3.5" disk drive format.
You can't say Violin lacks a bold idea of what it does and doesn't do, and it has a decent roster of customers including, most recently, the UK's air traffic control organisation, NATS.
But, informal conversations also reveal a frustration with a customer community that rarely looks beyond the big four or five storage array vendors. You know, the ones you'll never get sacked for buying from.
That, however, is the lot of the small storage vendor, especially one that so proudly ploughs its own furrow with technology that is obviously deeply proprietary. It's not like you could simply swap in commodity drives to a Violin array if the company or its arrangement with Toshiba went belly up.
It's one of those contradictions of the storage industry and of IT in general; the more you carve your own profitable proprietary niche the more you make yourself a potential single point of failure. And that's a fact that can't be lost on potential customers.
(* Despite not believing in data tiering, Violin does plan to add it to future arrays, said Bradley at VMworld.)
Whiptail is another all-flash array vendor. Unlike Violin Memory it doesn't mince any words about being an SSD vendor or not.
It sells hardware that ranges from 3 TB to 72 TB; a head/controller on top of Intel 2.5" MLC drives. Its software provides buffering intelligence that deals with flash wear issues and RAID levels 0, 5, 6, and 10 are available.
In conversation, Whiptail EMEA VP and general manager Brian Feller made a point of stating that the vendor uses "commodity drives". Here's how the conversation went after that.
Me: "Commodity drives, you say? So, I could buy drives from anywhere, as a commodity, and use them in a Whiptail array?"
Brian: "No. You have to buy them from us or you invalidate the warranty because we quality assure them. We remotely monitor all our arrays so we'd know, and we'd also cut technical support."
It's quite remarkable that "commodity" can come to mean "a product you can only buy from one company". It's also staggering that Whiptail feels the need to justify this by the need to QA drives from Intel. It's not like Intel is some SE Asian white-box no-name vendor.
But that's the world of storage, which sometimes feels years behind other areas of IT in terms of customer lock-in.
On the streets around VMware's VMworld Europe 2012 event this week you could see a mobile advertising vehicle bearing a hoarding that declares: "NetApp Data ONTAP is the world's #1 storage OS? Yep."
It's a bold claim, and if true, NetApp would be right to plaster it to the side of vans. But it's not as straightforward as they'd like it to be.
In formal terms it's true. NetApp commands the second-biggest or close to second-biggest market share among storage vendors globally and all its storage arrays use the Data ONTAP OS. The world's biggest storage vendor, EMC, has plenty more market share but uses different OSs in its midrange and high-end arrays so loses out on the ability to declare any of them "the world's #1".
But what is the claim to be the world's number one storage OS really worth? Not a lot, really. Firstly, NetApp gets to claim that mantle because of its position in the market. It's a bit like Toyota declaring its engines the world's most popular, which is true because it is the biggest seller of cars worldwide. Just as you don't get a Toyota without one of its engines in it, you don't get a NetApp filer without ONTAP in it.
And, while NetApp probably has EMC's multi-OS product range in its sights as a subtext to the advert, the claim to have a single OS across all products is only worth anything if that means many of your devices can work together. NetApp has made strides towards this with its recent announcement of true NAS clustering that can scale to 20 PB and 20 billion files in ONTAP 8.1.1 but that is currently limited to five HA pairs of devices.
So, really, the NetApp ad should read: "NetApp Data ONTAP is the world's #1 storage OS? So what?"
(For blog posts before mid-September 2012 see UK Data Storage Buzz.)