Cloud’s low-hanging fruit: Backup, tiering and data sharing
Native cloud operations can have a steep on-ramp in terms of enterprise IT readiness, but some things are relatively easy to port to a tier of storage in the cloud
The flexibility and scale of public cloud storage offers the potential to solve a wide range of enterprise technical challenges.
But not all cloud technology is easy to integrate into existing, traditional (on-premise) infrastructure. So, if we want to take advantage of the cloud, how do we do so without major upheaval?
In this article, we will explore the areas of IT that are most easily moved to the cloud, such as backup, tiering and data sharing.
Where cloud can help
Delivering at speed and at scale while reducing costs is a common challenge many of us face in the enterprise.
These goals are often difficult to achieve with more traditional approaches, but they are exactly where cloud excels, with its scale, agility and flexibility. So how can we use it to meet enterprise demands?
Why not native cloud?
While our existing technology may restrict our ability to do all we want to do, it is deeply built into the way we operate.
But shifting to native cloud services – in other words, those that were built for and run in the cloud – is not a trivial task, and may require rewriting applications and workflows, and retraining staff. All of this costs time and money, and has the potential to introduce risk.
However, integrating cloud technology with familiar enterprise technologies can help simplify use of the cloud, and allow us to more easily and widely adopt it.
Cloud fixing enterprise problems
How can we best use the cloud to integrate with and enhance existing datacentre functions?
In this section, we will look at some areas of datacentre functionality that can be most easily ported to the cloud, often as hybrid operations with the use of the cloud as an adjunct to on-premise working.
Cloud as a tier
The ever-increasing amounts of data we hold are a real challenge. As well as production data, there are also backups and other “cold”, infrequently-accessed data.
Where to store different classes of data, so that it is held on the most cost-efficient tier – including on-premise or in the cloud – presents a real technical and business issue.
Questions that arise include: how do we size accurately and easily grow our capacity on demand? How do we manage our data so that backups and infrequently used data do not consume expensive production tiers but remain accessible?
The idea of tiering data to lower cost disk is not new, but cloud storage with its scalability and commercially compelling pay-as-you-use model has created the almost perfect long-term repository.
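At its core, a tiering policy of this kind is just a rule that classifies data by how recently it was accessed. The sketch below is a minimal, hypothetical illustration of that decision; the 90-day threshold and tier names are assumptions for the example, not any supplier's actual defaults.

```python
from datetime import datetime, timedelta

# Hypothetical threshold after which a file is considered "cold"
COLD_AFTER = timedelta(days=90)

def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Return the storage tier a file should live on.

    Files untouched for longer than COLD_AFTER are candidates for
    the cheaper cloud tier; everything else stays on production disk.
    """
    return "cloud-cold" if now - last_accessed > COLD_AFTER else "production"

now = datetime(2024, 1, 1)
print(choose_tier(datetime(2023, 12, 20), now))  # recently used: stays on production
print(choose_tier(datetime(2023, 6, 1), now))    # stale: candidate for the cloud tier
```

Products that do this invisibly are, in effect, running a policy like this continuously against block or file metadata, then moving the cold data to the cloud repository behind the scenes.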
Major storage suppliers have recognised this, and are now starting to integrate a cloud-based tier directly into their production arrays. For example, NetApp’s FabricPool allows its ONTAP operating system to move data from production into a backend cloud tier, invisibly to storage teams and users alike.
It’s not just the major storage suppliers either. Microsoft’s Azure File Sync service integrates a similar idea directly into Windows Server.
This technology is not without challenges. Cloud costs and the performance impact of retrieving data from a cloud repository must be taken into account, but the benefits that a cloud storage tier delivers make it worthy of consideration.
Data protection to the cloud
Data protection is a high priority in any enterprise and a growing challenge as we need to protect more data, more often and for longer, while meeting ever more stringent recovery objectives and compliance legislation. All of this puts a huge strain on our data protection infrastructure.
Data protection suppliers have seen how large-scale, relatively low-cost repositories such as Amazon S3 or Azure Blob storage can help alleviate some of these problems, offering the scale and flexibility our data protection requires. But these repositories are not easily accessed natively and often need some conversion mechanism to present them to traditional enterprise infrastructure.
Increasingly, the major data protection suppliers have integrated cloud storage repositories directly into their platforms, allowing cloud storage to easily become part of your backup infrastructure. That includes newer suppliers such as Cohesity and Rubrik, as well as more established ones such as Veeam.
Such functionality allows data to be moved to a cloud location according to policies defined to meet the needs of the enterprise, as part of your standard backup operations.
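A policy of this kind typically splits a backup's lifecycle into stages: a recent window on fast local storage, a longer period in the cloud repository, then expiry. The sketch below illustrates the idea; the 14-day local window and one-year retention are hypothetical values chosen for the example, not recommendations.

```python
from datetime import date, timedelta

# Hypothetical policy: keep recent backups on fast local storage,
# move older ones to a cloud repository, expire beyond retention.
LOCAL_WINDOW = timedelta(days=14)
RETENTION = timedelta(days=365)

def backup_location(created: date, today: date) -> str:
    """Decide where a backup of a given age should live."""
    age = today - created
    if age > RETENTION:
        return "expire"
    if age > LOCAL_WINDOW:
        return "cloud"
    return "local"

today = date(2024, 6, 1)
print(backup_location(date(2024, 5, 25), today))  # within the local window
print(backup_location(date(2024, 2, 1), today))   # aged out to the cloud tier
print(backup_location(date(2022, 1, 1), today))   # past retention, expired
```

The backup platform evaluates rules like these against its catalogue and handles the data movement itself, which is what keeps cloud tiering inside standard backup operations rather than a separate process.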
But be aware of the limitations of this approach: associated cloud costs need to be understood, as does the impact on your recovery capability of restoring large amounts of data from a public cloud.
Geographic data sharing
One of the longest-standing issues many organisations face is finding an effective way to share data across multiple locations.
The challenge is complex, and involves moving large amounts of data while maintaining file integrity and control. Traditionally, this has been dealt with via a distributed file system, which often relies on replicating large amounts of data across an organisation. This comes with management challenges such as maintaining authoritative copies, ensuring security and global file locking.
Spreading data over large geographic areas is a staple capability of cloud, and a number of suppliers have recognised this.
They have built solutions that use cloud storage with a local cache to allow an enterprise to publish its data shares across multiple locations effectively and efficiently. The central store acts as the authoritative location while the local cache presents standard and familiar file protocols at the remote locations. Companies like Panzura and Nasuni offer these types of platforms.
The central repository solves a number of the issues of a traditional approach. It acts as the authoritative copy, and handles file locking and security centrally, while remote sites hold only cache copies of data. That means there is no need to replicate all data to each location; only the files actually needed are retrieved from the central repository.
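The mechanics described above, a read-through cache at each site backed by a single authoritative store that also arbitrates locks, can be sketched in a few lines. This is a simplified, self-contained illustration of the pattern, not any vendor's implementation; the class names and behaviour are assumptions for the example.

```python
class CentralStore:
    """Stands in for the authoritative cloud repository."""
    def __init__(self):
        self._files = {}
        self._locks = set()

    def put(self, path, data):
        self._files[path] = data

    def get(self, path):
        return self._files[path]

    def lock(self, path):
        """Central locking: only one site may hold a file's write lock."""
        if path in self._locks:
            return False
        self._locks.add(path)
        return True

    def unlock(self, path):
        self._locks.discard(path)


class SiteCache:
    """A remote site's local cache, fetching from the centre on a miss."""
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.fetches = 0

    def read(self, path):
        if path not in self.cache:  # cache miss: pull from the centre
            self.cache[path] = self.store.get(path)
            self.fetches += 1
        return self.cache[path]


store = CentralStore()
store.put("/projects/plan.txt", b"v1")

london = SiteCache(store)
tokyo = SiteCache(store)

print(london.read("/projects/plan.txt"))  # first read fetches from the centre
print(london.read("/projects/plan.txt"))  # second read is served locally
print(london.fetches)                     # only one fetch was needed
print(store.lock("/projects/plan.txt"))   # first site takes the write lock
print(store.lock("/projects/plan.txt"))   # a second site is refused
```

Because the lock lives in the central store, two sites can never both believe they hold the authoritative copy for writing, which is exactly the consistency problem that replicated distributed file systems struggle with.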
These solutions are a very effective way of addressing a longstanding problem for many enterprises, but they can be expensive and many are hardware-based with physical appliances in remote locations.
They also normally sit outside your current file-sharing infrastructure and require all data to be imported onto these new platforms. That may or may not work seamlessly with existing management and protection tools.
However, companies such as Hammerspace have started to rework this to allow data to be presented in multiple locations without the need to import it into a separate platform.