Hybrid cloud storage: Options for rearchitecting enterprise IT

Hybrid cloud is seen as a way to have the best of both the on-premise and public cloud worlds. We look at the pros and cons of hybrid cloud storage.

There is huge momentum in moving IT infrastructure to the public cloud. It enables IT departments to avoid the up-front costs of buying hardware and maintaining on-premise enterprise systems.

In Forrester’s The future of enterprise data storage report, analysts discussed how and why enterprise data stored in the cloud is increasing. Data from Forrester’s 2020 Analytics business technographics infrastructure survey found that storage and storage-related use cases represent more than one-third of current and planned uses of cloud services.

With the increasing trend towards software-as-a-service (SaaS) adoption, legacy enterprise software providers are moving their own services to a cloud-based SaaS model. According to Forrester, this has led to a flood of enterprise data being pushed into cloud and supplier-hosted repositories and off on-premise hosted storage.

As Forrester points out in the report, the shift to subscription “as a service” pricing reduces overhead as part of the total cost of ownership. According to Forrester, although there are still plenty of reasons to spend money on a locally hosted storage system, cloud offerings remove or reduce many cost elements from the overall total cost of ownership, including datacentre floor space, scheduled downtime for storage management, outage risk, internal storage-related labour and hardware maintenance.

The report’s authors say the reasons to use on-premise storage will reduce over time as the differentiated elements of using storage on-premise are folded into utility offerings. At the same time, Forrester notes that storage hardware providers are also testing out opex (operating expense) models for on-premise storage, often coupling them with value-added services such as those in Dell’s Apex or Hewlett Packard Enterprise’s HPE Greenlake IT-as-a-service offerings.

Complexity of hybrid storage

Among the challenges of dividing data across on-premise and cloud-based storage using a hybrid architecture is that it introduces a layer of complexity that would not exist if data was wholly stored on-premise or in the cloud. IDC’s Driving business value from data in the face of fragmentation and complexity report for Informatica notes that nearly 80% of organisations store more than half of their data in hybrid and multicloud infrastructures.

According to IDC, data fragmentation and complexity are a pervasive problem across data, technology, process and people. Stewart Bond, research director, data integration and data intelligence software at IDC, points out in the study that fragmentation and complexity divert data leadership away from innovation and increase risk.

In a recent Computer Weekly article, IT expert Junade Ali wrote that while there will always be instances of hybrid cloud storage that can be beneficial – for example, where low-latency access speeds are needed on-site – it may not be the best option in many situations.

“It remains essential to consider whether it is the simplest solution to meeting the requirements at hand,” he says. “Far too often, for different reasons, we adopt technologies that introduce far more complexity than we need, causing us headaches later down the road. By focusing on simplicity and fulfilling the business requirements at hand, we are able to build solutions that are better for both the business and for technologists.”

Although the remorseless pursuit of simplicity is a hugely advantageous trait for an engineer, Ali says that in many ways, it flies in the face of human nature: “For engineers, achieving simplicity rests in satisfying the business requirements without adding unnecessary complexity which makes future changes harder.”

Adrian Bradley, a partner in KPMG’s technology practice specialising in cloud transformation, says hybrid storage is intricately tied to how businesses get value from cloud investments. At the highest level, IT’s role is about delivering a service at the right cost point and ensuring it has a positive impact on a business, he says. The challenge is how to continue to deliver this service at the price originally planned and maintain continuous value.

How Ocado divides up its IT

Online food retailer Ocado is a business that makes extensive use of multiple cloud providers. It uses AWS for applications and GCP for analytics. The company generates masses of data at the edge from robotic pickers at its customer fulfilment centres. While data storage and processing on-site reduces latency and avoids the need to push masses of data into the cloud, Ocado wants to reduce the footprint of its on-premise datacentres.

When considering the storage of data and the journey the data needs to take, James Donkin, CTO at Ocado Technology, says: “Data starts on-site at a robot arm or on a grid. We want data to end up in the cloud. There is a lot of overhead in the maintenance of on-site systems. We don’t want people to have to swap out specialist devices.”

The architecture uses microservices hosted in AWS. Amazon’s Kinesis serverless streaming data service is used to push data onto Google, where Google BigQuery is used for data analytics and forecasting. But Donkin sees opportunities to run more processing at the edge. 

“I see the datacentre as a squeezed middle,” he says. “The direction of travel is to pre-process data at the edge.”

For instance, in machine vision, a smart video camera equipped with a GPU (graphics processing unit) could be used to pre-process a video stream, he says. “You don’t want to catch every motion of every robot – you need a pipeline to lower the amount of video that is stored.”

For Donkin, there are opportunities to use heuristics running on a GPU-equipped video camera to capture data in a smart way. The heuristics would be able to filter out “normal” activity from the video stream, and only the video data of unusual events is sent up to the cloud.
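The kind of filtering Donkin describes could be sketched roughly as follows. This is a minimal illustration, not Ocado's implementation: the `Frame` type, the `mean_abs_diff` and `unusual_frames` helpers, the baseline frame and the threshold value are all hypothetical, and a real camera would run something comparable on its GPU against full-resolution video.

```python
from typing import Iterable, Iterator, List

# Hypothetical representation: one flattened greyscale frame, pixel values 0-255.
Frame = List[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def unusual_frames(
    frames: Iterable[Frame], baseline: Frame, threshold: float = 20.0
) -> Iterator[Frame]:
    """Yield only the frames that differ from the baseline by more than threshold."""
    for frame in frames:
        if mean_abs_diff(frame, baseline) > threshold:
            yield frame


# Assumed example data: a "normal" scene and three incoming frames.
baseline = [10, 10, 10, 10]
stream = [
    [10, 11, 10, 9],      # sensor noise only, dropped at the edge
    [200, 180, 10, 10],   # large change, kept for upload
    [12, 10, 10, 10],     # sensor noise only, dropped at the edge
]
to_upload = list(unusual_frames(stream, baseline))
```

In this sketch only the second frame would be sent to the cloud. The fixed baseline and threshold are the simplest possible heuristic; choosing them well for a real site is the hard part.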

Storage is a commodity and IT leaders will tend to look for the lowest-cost platform required to achieve a business objective, says Bradley. Moving to the cloud may seem the logical choice, but there is data egress cost associated with moving data to the cloud and an implicit cost arising from losing legacy skills that are no longer needed for cloud storage, he warns.

In theory, it makes a lot of sense to deploy as much storage in the cloud as possible to reduce the costs associated with maintaining enterprise-class on-premise hardware. But IT leaders need to separate the theory from the reality, says Bradley. “The trigger point is purely financial. You’ll exceed your IT budget when you spend significantly in the cloud and it becomes unsustainable.”

Some workloads require real-time processing of data at the edge, and James Donkin, CTO at Ocado Technology, believes edge processing can be used to reduce the amount of data that actually needs to be stored on-premise. For instance, streaming every piece of video data into the cloud is costly, and storing a video data stream on-premise is unnecessary if it will hardly ever need to be accessed.

“You need a pipeline to lower the amount of video that is stored,” says Donkin. “You don’t want to catch every motion of every robot. You need to figure out what is useful. If it is possible to model normal operations, these activities do not require capturing. Then, only unusual activities require streaming.”
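One way to picture "modelling normal operations" is a running statistical baseline that flags only readings far outside it. The sketch below is an assumption-laden illustration rather than anything Ocado has described: the `NormalModel` class, its `z_limit` and `warmup` parameters and the sample readings are hypothetical, and each reading stands in for whatever per-frame or per-robot measurement an edge device might compute.

```python
import math


class NormalModel:
    """Running mean and variance (Welford's method) of readings seen as normal."""

    def __init__(self, z_limit: float = 3.0, warmup: int = 5) -> None:
        self.z_limit = z_limit  # hypothetical: how many standard deviations count as unusual
        self.warmup = warmup    # hypothetical: readings to learn from before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0           # running sum of squared deviations from the mean

    def is_unusual(self, value: float) -> bool:
        unusual = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            unusual = std > 0 and abs(value - self.mean) > self.z_limit * std
        if not unusual:
            # Learn only from normal readings so that anomalies do not shift the baseline.
            self.n += 1
            delta = value - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (value - self.mean)
        return unusual


# Assumed example: a steady signal with one outlier.
model = NormalModel()
readings = [10, 11, 9, 10, 11, 9, 10, 50, 10]
to_stream = [r for r in readings if model.is_unusual(r)]
```

Here only the reading of 50 would be streamed; everything else is treated as normal and discarded at the edge. A perfectly constant signal gives a standard deviation of zero and would never be flagged in this sketch, which is one of several cases a production model would need to handle.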

Cloud pitfalls

A problem that KPMG’s Bradley has seen in some organisations is that when they migrate to the cloud, their business case is not robust enough. For instance, some organisations fail to factor in the degree of transformation necessary to move to the cloud or their rate of consumption, he says.

What is clear from the conversation with Bradley is that IT teams need to manage their cloud environments closely to avoid costs spiralling. “I have one client who did a reverse migration from the cloud for cost reasons,” he says. The client migrated from the public to a private cloud because the IT team was unable to drive enough value from the public cloud deployment and struggled to optimise it, says Bradley.

As Ocado’s Donkin notes, processing of data can occur at the edge. Depending on the type of data being collected and what it is needed for, this can reduce the need for on-premise enterprise storage and lower the amount of data that needs to be pushed into the cloud. Using machine learning and data models can give edge devices the know-how to understand what data is useful and what can be discarded. This can help IT departments to reduce the size of the on-premise enterprise storage they require.

It makes sense to store as much as possible in the cloud to take advantage of the powerful data-processing available on-demand. But where data is being collected at the edge, the logical place to process this data is at the edge, rather than using an on-premise enterprise storage system.
