Can hyperscale computing eclipse enterprise storage?

Pioneered by Facebook and Google, hyperscale computing and storage is built on cheap commodity parts with redundancy at device level

To date, deploying enterprise storage has meant using costly arrays with dual controllers, packed full of high-end disk drives. This traditional approach to enterprise IT systems viewed storage and servers as systems with redundant components – power supplies, fans, drives and so on.

But new choices are emerging for enterprise-scale storage and server infrastructures, driven by the pioneers of the web and their so-called hyperscale computing environments.

As organisations scale their physical infrastructure, two choices are available to manage that growth – scale-up and scale-out. Scaling up means increasing the resources within the server or storage device, for example adding processors, interface cards and memory. That’s the traditional enterprise model.

The problem with this way of doing things is that redundancy in a single device translates into high cost. The alternative is to scale out, adding additional servers and storage in a distributed computing environment.

Hyperscale storage/computing is the term coined for the scale-out design of IT systems that cater for very high volumes of processing and data. Rather than relying on multiple redundant components within a device, hyperscale designs make the server itself, including its storage, the unit of redundancy.

Let’s use an analogy from the storage world: Storage arrays are typically built with multiple disk drives, with the storage space on many drives aggregated and protected using RAID.

RAID initially meant Redundant Array of Inexpensive Disks, so called because using cheaper, inherently less reliable disks provided a cost benefit over the more expensive and more reliable (although not totally error-free) enterprise hard drives of the day.

Rather than requiring time to be spent recovering data from a failed enterprise drive, RAID enabled the failure to be handled automatically with no service outage, and replacement of the failed drive could be scheduled for a convenient time.
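
To make the idea concrete, here is a minimal sketch of how a parity-based RAID set can rebuild the contents of a failed drive. It assumes a single-parity layout in the style of RAID 5, with short byte strings standing in for drive blocks. The function names are illustrative and do not come from any real RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical data drives, each holding one 4-byte block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding it from the others.
survivors = [data[0], data[2]]
recovered = rebuild_missing(survivors, parity)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the lost block, which is why the array can keep serving data while the failed drive awaits replacement.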

These days, we are comfortable with the idea of replacing failed RAID drives in a storage array. The failed disk unit is treated like a commodity component and simply discarded and replaced.

In hyperscale computing, the server and its direct-attached storage (DAS) become the basic unit and form part of a network or grid of thousands of physical devices. Typically, the server is not built with redundant components and, in the event of a failure, its workload is failed over to another server. The faulty device can then be removed for replacement and repair.
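
The failover step can be sketched in a few lines. This is an illustration under simple assumptions, not a description of any particular scheduler: every workload can run on any server, and the least-loaded survivor is chosen each time. The server and workload names are hypothetical.

```python
def fail_over(assignments, failed_server):
    """Move every workload off failed_server onto the least-loaded survivors.

    assignments maps a server name to the list of workloads it runs.
    Returns a new mapping; the input is left unchanged.
    """
    survivors = {
        server: list(workloads)
        for server, workloads in assignments.items()
        if server != failed_server
    }
    if not survivors:
        raise RuntimeError("no healthy servers left to take the workload")

    for workload in assignments.get(failed_server, []):
        # Pick the survivor currently running the fewest workloads.
        target = min(survivors, key=lambda name: len(survivors[name]))
        survivors[target].append(workload)
    return survivors


cluster = {
    "node-a": ["web-1", "web-2"],
    "node-b": ["db-1"],
    "node-c": ["cache-1", "cache-2", "cache-3"],
}
after_failure = fail_over(cluster, "node-a")
```

Once the workloads have moved, the failed server no longer appears in the mapping and can be pulled from the rack without affecting service.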

Hyperscale computing environments work with multiple petabytes of storage and tens of thousands of servers. To date, they have mainly been used by large cloud-based organisations such as Facebook and Google. The use of commodity components allows for the much higher scale required of these environments while keeping costs as low as possible.

Hyperscale computing designs work well for such emerging web companies because they take a new approach to database and application design. These environments typically consist of a small number of very large applications, in contrast to typical enterprise IT environments, where there is a larger number of specialised applications.

New open-source platforms have emerged that offer storage and data services in hyperscale computing environments. These include Ceph, a distributed storage platform; the Cassandra and Riak distributed database platforms; and Hadoop, a distributed data storage and analysis framework developed and managed under the Apache Software Foundation.

These new tools offer a different operating paradigm. Data is spread across servers in a redundant fashion, which protects against the failure of any individual server: if a server fails, there is no service impact. The use of multiple servers also distributes processing power across many devices, providing scale-out performance for large workloads.
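
As a rough illustration of redundant data placement, the sketch below assigns each item to three servers and shows that a copy remains readable after one of them fails. It uses rendezvous (highest-random-weight) hashing purely because it is short to write; it is not the placement algorithm of any platform named above, and the server names, key and replica count are assumptions made for the example.

```python
import hashlib


def score(server, key):
    """Deterministic pseudo-random score for a (server, key) pair."""
    digest = hashlib.sha256(f"{server}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(key, servers, copies=3):
    """Return the `copies` highest-scoring servers, which hold `key`."""
    if copies > len(servers):
        raise ValueError("not enough servers for the requested copies")
    ranked = sorted(servers, key=lambda s: score(s, key), reverse=True)
    return ranked[:copies]


def readable_from(key, servers, failed, copies=3):
    """Servers that still hold a copy of `key` after `failed` go down."""
    return [s for s in place_replicas(key, servers, copies) if s not in failed]


servers = [f"node-{i}" for i in range(10)]
holders = place_replicas("customer-42", servers)
still_available = readable_from("customer-42", servers, failed={holders[0]})
```

With three copies on three different servers, losing any one holder still leaves two readable copies, and the loss of a server that holds no copy of the item does not change where that item lives.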

Open Compute Project

In an attempt to standardise the development of hyperscale computing platforms, Facebook launched the Open Compute Project in 2011. The aim is to develop standard reference hardware architectures that can be deployed in hyperscale computing environments, covering compute (servers), storage, networking and datacentre design.

Open Compute has prompted an interesting development: the major hardware suppliers have begun to develop products for hyperscale environments.

To date this has mainly involved the processor manufacturers (Intel, AMD), but the field has recently widened with HP's announcement of its Project Moonshot server platform. As a result of the Open Compute Project we have also seen the emergence of new techniques in datacentre management, specifically in rack design and server cooling.

Building hyperscale computing systems

Building hyperscale architectures requires a fundamentally different approach from that taken with typical enterprise IT systems. Although the hardware implementation is fairly straightforward using commodity components, the application layer requires new thinking.

Rather than relying on “monolithic” software platforms such as Oracle, database design is built around distributed architectures such as Hadoop. The distributed nature of the hyperscale computing platform removes the need for expensive and dedicated storage devices, allowing commodity JBOD (just a bunch of disks) enclosures to be used.

Bringing it all together requires software to automate node deployment and recovery from failure (rerouting of workloads), along with other management and monitoring tools. Many of these don’t exist today, and that represents a challenge for large organisations that are used to implementing existing software tools.
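
One building block of such tooling is failure detection. The sketch below is a simplified, hypothetical heartbeat monitor: each node reports in periodically, and any node that has been silent for longer than a timeout is flagged so its workloads can be rerouted. The class name, timeout value and node names are assumptions, and timestamps are passed in explicitly to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat from each node and flags silent ones."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node, now):
        """Note that `node` reported in at time `now` (in seconds)."""
        self.last_seen[node] = now

    def failed_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record("node-a", now=0)
monitor.record("node-b", now=0)
monitor.record("node-a", now=40)  # node-a keeps reporting; node-b goes quiet
suspects = monitor.failed_nodes(now=45)
```

A production system would also need to handle network partitions and nodes that flap between states, which is part of why this layer of management software is hard to build.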

Who is hyperscale for?

The focus of hyperscale computing is the storage and processing of large volumes of data, so we will likely see development of the technology in organisations where large data volumes are present. Banks are already looking at hyperscale systems and it’s clear they provide advantages in big data scenarios such as analysing customer trends, fraud detection and so on.

Other industries with big data analytical needs, such as mining, oil exploration, medicine and pharmaceuticals, will also benefit.

To date we have seen hyperscale deployed only by the likes of Google, where computing is the core business and thousands of IT graduates are employed to develop the core software platforms.

But now we are starting to see commercial organisations offering hyperscale computing products. HP’s Moonshot platform delivers server system-on-chip (SoC) solutions, starting with servers built with the low-power Intel Atom processor. The Moonshot 1500 supports up to 45 hot-pluggable server blades in a 5U chassis, with a range of other server products expected in the coming months.

Nutanix and Scale Computing are two good examples of companies using hyperscale techniques to deliver virtualisation solutions. These blend compute and storage resources to provide a scalable “virtual server farm” without the need for separate servers and storage arrays.

Fusion-io, a vendor of PCIe SSD devices, is looking to move into the hyperscale market too. It recently acquired ID7, a developer of SCSI target software, and is looking to promote its ioScale SSD devices for inclusion in hyperscale server deployments.

The move by traditional vendors into hyperscale is bringing new hardware platforms, management tools and support for organisations that might otherwise be wary of the home-grown nature of hyperscale computing.

This means we’re likely to see the adoption of hyperscale techniques across the industry. This won’t be on the scale of Facebook or Google, but as organisations evolve it will become an option in IT deployment.

And as hyperscale computing becomes more prevalent, it will be interesting to see how today’s incumbent vendors will deal with the challenge to their existing product architectures.
