The volume of digital data stored globally is projected to reach 163ZB (zettabytes) by 2025, but despite the rapid growth of cloud computing, only a tiny fraction of that data is currently stored in the cloud.
According to Raj Bala, a research director at analyst group Gartner, cloud storage accounts for just a fraction of 1% of all digital data. Cloud-based data storage is, of course, certain to grow.
The challenge for CIOs is to choose the workloads that lend themselves to data storage in the cloud and to plan for the movement of data between or back from cloud storage providers.
“There are two common errors around cloud storage,” says Bala. “That the public cloud is a cheap dumping ground for data, and that it’s easy to integrate tiering of data to the cloud.”
In fact, with current technology, cloud storage can be complex and surprisingly expensive. Matters are further complicated by the differing approaches the main cloud providers take to storage.
“Microsoft has always aggressively positioned Azure storage as an extension of on-premise storage,” says Edwin Yuen, analyst at Enterprise Strategy Group. “AWS [Amazon Web Services] is more focused on migrating storage into the cloud to be used by cloud-based solutions. And we have seen a surging effort by Google to work with partners to leverage Google Storage.”
Cloud storage barriers
At the most basic level, moving data to the cloud is a simple enough process.
Raw storage capacity is available from Google, AWS and Microsoft Azure, as well as a host of smaller providers. Even long-established enterprise storage companies now allow customers to buy storage capacity with a credit card.
NetApp, for example, does this with Azure NetApp Files. But integrating cloud storage with existing IT infrastructure, or with cloud-based compute instances, is much harder.
CIOs considering moving storage to the cloud need to consider data formats, the ease – or otherwise – of integration with applications, and bandwidth.
So far, the most mature part of the cloud storage market is backup and recovery, and archiving.
These use cases involve relatively few data movements, and some providers offer data ingress free of charge. Long-term archiving in the cloud can be cheap: IBM charges as little as 0.2 cents per gigabyte for archiving. And for backup, the cloud can provide resilience and recovery times that improve on local provision, at a much lower cost.
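At fractions of a cent per gigabyte, archive costs are easy to estimate with simple arithmetic. A minimal sketch, where the $0.002/GB rate reflects the 0.2-cent figure cited above and the 100TB volume is a hypothetical example, not a figure from the article:

```python
# Illustrative archive-cost calculation. The rate mirrors the
# 0.2 cents ($0.002) per gigabyte cited in the text; the billing
# period and data volume are assumptions for the example.
def archive_cost(volume_gb: float, rate_per_gb: float = 0.002) -> float:
    """Return the cost in dollars of archiving volume_gb gigabytes."""
    return volume_gb * rate_per_gb

# Archiving 100TB (102,400GB) at $0.002/GB:
print(f"${archive_cost(100 * 1024):.2f}")  # $204.80
```

The same one-line calculation makes it obvious why archiving is the most mature cloud storage use case: storage is billed cheaply, and the data rarely moves, so egress charges stay low.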
“Backup can be costly, but you can do the same at a cheaper price and with a better recovery time objective (RTO) in the cloud,” says Rahul Gupta, a business technology expert at PA Consulting Group.
However, some storage systems and cloud storage gateways use proprietary data formats rather than formats native to local or cloud-based applications. This reduces data portability and slows transfers, because data must be converted on its way to the cloud. Organisations with highly sensitive or critical data might also be concerned about data integrity.
Storage is also a bandwidth-intensive use of the cloud. According to Gartner’s Bala, a business that generates 1TB (terabyte) of new data a day would need a 10Gbps link to transfer it. “That is a lot for a single use case and workload,” he warns.
The distance between applications and their data is also a potential barrier. This applies to conventional, on-premise business applications, such as enterprise resource planning (ERP), and cloud-based applications such as analytics and machine learning. Data is best kept close to compute resources.
“People recognise data is a bit sticky, and wants to stay where it’s born,” says Alex McDonald, a director of SNIA Europe. “People are moving compute to where the data is. Some people are moving data to cloud, as they’ve moved compute to the cloud.”
This trend is set to accelerate as more software suppliers move to cloud-centric or cloud-only offerings. This will generate ever-larger volumes of data in the cloud, but businesses still face the challenge of integrating legacy data with cloud-based applications, and of updating storage architectures so they can tap into cloud storage as its economies of scale improve.
Suppliers make cloud a tier
Storage providers are updating their on-premise and datacentre storage systems for greater compatibility with the cloud. This includes conventional storage arrays and network-attached storage (NAS) systems, as well as software-defined storage architectures.
The model is to allow bulk storage to stay on-site – or in a private cloud – and to hand off storage to a cloud-based provider where it makes sense to do so. This way, IT departments should be able to deploy more cloud storage as the economic and practical barriers fall.
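The hand-off decision described above often reduces to a simple access-age policy: data that has gone cold moves to the cloud tier, everything else stays local. A minimal sketch, in which the 90-day threshold and the tier names are hypothetical, not drawn from any particular supplier's product:

```python
from datetime import datetime, timedelta

# Hypothetical tiering policy: objects untouched for more than
# cold_after days become candidates for the cloud tier; recently
# accessed objects stay on bulk on-site (or private cloud) storage.
def choose_tier(last_access: datetime, now: datetime,
                cold_after: int = 90) -> str:
    if now - last_access > timedelta(days=cold_after):
        return "cloud"
    return "on-site"

now = datetime(2019, 6, 1)
print(choose_tier(datetime(2019, 1, 1), now))   # cloud
print(choose_tier(datetime(2019, 5, 20), now))  # on-site
```

Real tiering engines weigh more than age (object size, egress cost, compliance constraints), but the shape of the decision is the same: a policy evaluated per object, with the cloud as just another tier.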
For this to work, however, organisations need to adopt a hybrid model for their data storage and, most likely, a multi-vendor or hybrid model for their cloud provision. And CIOs need to consider supplier lock-in, as well as cost, when it comes to assessing their market.
Current-generation hardware, although cloud compatible, uses proprietary rather than cloud-native protocols. The only way to retrieve data is to bring it back to on-site storage, with all the associated hardware and data exit fees. The business benefit of cloud storage needs to be sufficient to outweigh these costs.
“Whether businesses should consider a multi-cloud strategy will depend on their maturity,” says PA Consulting’s Gupta. “When you are building solutions, look what happens when you move [the data] to someone else.”
Read more about cloud storage
- Computer Weekly looks at the four biggest cloud storage providers – Amazon, Azure, Google, IBM – how they stand in the market, the products they offer, and which offers the widest range of products and features.
- We run the rule over cloud NAS products that allow customers to build single-namespace file systems, including between in-house datacentres and public cloud storage.