Not all clouds are built the same. On-premise clouds built from traditional virtualisation software are very different from public clouds in terms of availability, features and operation.
Hybrid cloud and multi-cloud strategies allow IT departments to make use of these different cloud offerings, but getting data into the right place at the right time is a challenge.
We look at storage in the cloud and how to implement a multi-cloud strategy.
Defining hybrid and multi-cloud
Typically, hybrid cloud is used to describe implementations where data and applications are extended from on-premise into one or more public clouds, for example AWS, Microsoft Azure or Google Cloud Platform (GCP).
Originally, this may have been to “burst” workloads, but increasingly it means running across both locations indefinitely. Multi-cloud takes things a step further, running applications across multiple clouds as part of the design, which may include running nothing on-premise.
Multi-cloud cost benefits
Why is this of benefit?
The most obvious reason is cost. Suppliers continually offer cheaper options, through falling prices, discounts for committed use and cheaper spot pricing.
Although public cloud storage hasn’t really come down in price much recently, savings can be made on virtual instances, especially where applications need to quickly scale up and down, so data needs to be made available quickly in multiple public cloud environments.
The second, probably more important reason for running multi-cloud is features.
Amazon, for example, is renowned for its speed of innovation. In 2017, the company released 1,430 new features to AWS – almost four per day.
Not all of its releases are major changes or new products, but many are continual enhancements to existing offerings. Cloud suppliers are moving to offer more platform and software-as-a-service offerings (like machine learning and AI) that simply need enterprise data made available to them.
We should also remember that cloud choice can be influenced by design and operational benefits. Running multi-cloud could protect against the failure of a single supplier.
Ultimately, getting the most benefit from a multi-cloud strategy means balancing the advantages of flexibility with the challenges of implementing security, networking and – key to this discussion – data availability, latency and consistency.
Cloud storage solutions
There are at least four ways in which data can be stored in public cloud solutions.
Native services are those implemented by the cloud supplier directly. Typically, all cloud providers will offer block, file, object and some application-specific (eg, database) storage services. Block storage is usually restricted to internal use to connect to virtual instances, whereas file, object and database services can be exposed to external connectivity. This has been a problem for some cloud users, where object and database data has inadvertently been left open to public access, so it is important to validate security settings here.
Vendor-integrated services have started to emerge, with NetApp being the most prominent player in the market. Azure NetApp Files, for example, provides the benefits of NetApp ONTAP, integrated directly into Azure using Azure APIs and security capabilities. Vendor-integrated solutions are usually more feature-rich and higher performing than native services.
Marketplace services are storage offerings that can be deployed in virtual instances from cloud application marketplaces. There are a huge number of solutions in this area, from traditional storage platforms to data protection and analytics. The availability of large cloud compute instances and direct-connect NVMe devices means these solutions are practical for high-performance production use cases. Marketplace offerings are great at providing a consistent look and feel across on-premise and cloud implementations.
Co-located services are deployed by storage vendors in a datacentre near to (or sometimes the same as) the cloud provider’s and connected using high-speed networking such as AWS Direct Connect or Azure ExpressRoute. The storage vendor provides the capability to provision storage that looks and feels like a traditional storage solution and can connect to the customer’s own on-premises storage solutions.
With such a range of storage offerings, it seems like there’s a confusing array of choices to be made. So, why move away from the native offerings put forward by the cloud providers?
A big reason to use a third-party solution is that of performance. Cloud vendors don’t provide any performance or throughput guarantees on native services, other than for block storage.
Even here, the assurances only cover throughput and IOPS and generally come with limited or rigid specifications. Azure NetApp Files, for example, offers much higher performance than native file services, and co-located services can deliver features like quality of service at a granular level.
Cloud providers offer fixed levels of availability, typically around three to four “nines” (99.9% – 99.99%) of uptime. Using co-located services could provide higher availability and enable applications to be quickly moved between clouds if an outage occurs. This is because the data is stored outside of the cloud provider equipment.
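To put those figures in context, the short calculation below converts an availability percentage into the downtime it permits each year, and shows the theoretical availability of an application that can run in either of two clouds. It is a rough sketch, not a supplier formula: it assumes a 365-day year and that outages at each provider are statistically independent, which is optimistic in practice. The function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours per year a service may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def combined_availability_pct(*availabilities_pct: float) -> float:
    """Availability when any one of several providers is enough to stay up.

    Assumes provider outages are independent of each other.
    """
    prob_all_down = 1.0
    for pct in availabilities_pct:
        prob_all_down *= 1 - pct / 100
    return 100 * (1 - prob_all_down)


if __name__ == "__main__":
    for pct in (99.9, 99.99):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime permits about {hours:.2f} hours of downtime a year")
    both = combined_availability_pct(99.9, 99.9)
    print(f"Two independent 99.9% clouds: about {both:.4f}%")
```

Three nines permits a little under nine hours of downtime a year and four nines under one hour. The combined figure only holds if data is reachable from both clouds when one fails, which is the case co-located storage is designed to address.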
Although cloud providers are developing and enhancing their native services continually, most don’t have the breadth of features offered by storage suppliers that have been in the market for many years. Enhanced features may include better security integration and data protection (snapshots and replication).
Building a multi-cloud strategy
Getting back to the original subject of this article, how can these services be used to build a multi-cloud storage strategy?
Probably the first question to ask is how often data and applications are likely to move between clouds.
If public cloud is being used, for example, for data protection, then applications are unlikely to move around much and so data mobility isn’t a big factor.
However, at the opposite end of the scale, where applications could run in any cloud at any time, then full flexibility is needed.
Implementing data mobility is probably the biggest challenge in deploying multi-cloud applications.
Data has inertia and takes time to move around. Cloud suppliers charge for egress – data transferred out of their cloud – so mass migration of data from one cloud platform to another isn’t really a practical solution. As a result, the strategies for multi-cloud tend to fall into one of the following categories.
- Burst “on-demand”. Here, application and data are migrated to the public cloud when required. The actual implementation of this scenario could be through permanent virtual machine replication, for example, as achieved via Datrium, Velostrata or Zerto. Alternatively, data can be replicated via storage as achieved with NetApp Private Storage or HPE Cloud Volumes.
- Replicate data. Data can be replicated across clouds, keeping copies closely in synchronisation. File-based solutions such as Elastifile CloudConnect or Qumulo QF2 provide the capability to span on-premises or multiple cloud environments, making it easy to expose on-premises data to, for example, analytics services in the public cloud.
- Abstract the data. These solutions separate the presentation of data from the underlying physical storage platform. This provides a single global view of the data, while the physical storage can be across one or many public and private clouds. Examples of this solution are Zenko from Scality or CTERA Enterprise File Services Platform. With full abstraction, data can be physically redirected or replicated between cloud providers to meet the needs of availability, cost and performance.
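The abstraction approach can be illustrated with a toy model. The sketch below is not how Zenko, CTERA or any other product is implemented; `GlobalNamespace` and `InMemoryBackend` are hypothetical names, and in-memory dictionaries stand in for real cloud or on-premise stores. It shows only the core idea: clients address data by a logical path, while a mapping layer decides which physical location holds it.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Stand-in for one physical store (a cloud bucket or on-premise array)."""
    name: str
    objects: dict[str, bytes] = field(default_factory=dict)


class GlobalNamespace:
    """Maps logical paths to whichever backend currently holds the data."""

    def __init__(self, backends: list[InMemoryBackend]) -> None:
        self._backends = {b.name: b for b in backends}
        self._location: dict[str, str] = {}  # logical path -> backend name

    def put(self, path: str, data: bytes, backend: str) -> None:
        """Store data under a logical path on the chosen backend."""
        previous = self._location.get(path)
        if previous is not None and previous != backend:
            # Remove the stale copy so only one backend holds the object.
            del self._backends[previous].objects[path]
        self._backends[backend].objects[path] = data
        self._location[path] = backend

    def get(self, path: str) -> bytes:
        """Read by logical path; the caller never names a backend."""
        return self._backends[self._location[path]].objects[path]

    def locate(self, path: str) -> str:
        """Report which backend currently holds the object."""
        return self._location[path]

    def relocate(self, path: str, target: str) -> None:
        """Move the physical copy; the logical path stays the same."""
        source = self._location[path]
        if source == target:
            return
        data = self._backends[source].objects.pop(path)
        self._backends[target].objects[path] = data
        self._location[path] = target


if __name__ == "__main__":
    ns = GlobalNamespace([InMemoryBackend("cloud-a"), InMemoryBackend("cloud-b")])
    ns.put("/reports/q1.csv", b"region,sales\n", backend="cloud-a")
    ns.relocate("/reports/q1.csv", target="cloud-b")
    print(ns.locate("/reports/q1.csv"), ns.get("/reports/q1.csv"))
```

A real platform would add replication, consistency handling and access control on top of this mapping, and every relocation between public clouds would still incur the source provider’s egress charges.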
The key to making these solutions work is features such as automation – being able to run storage services from command line interfaces or APIs. Solutions also need to understand incremental changes in data and be able to move only the changed data between locations, as this reduces the impact of egress charges.
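To see why moving only changed data matters, consider the sketch below. It splits two versions of a dataset into fixed-size chunks, compares SHA-256 digests to find the chunks that differ, and estimates the egress charge for sending only those. This is a simplified illustration rather than any supplier’s replication method: the 4MiB chunk size, the function names and the per-gigabyte price passed in are all assumptions, and real egress pricing varies by provider, region and volume.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumption: 4 MiB fixed-size chunks


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """SHA-256 digest of each fixed-size chunk of the data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old: list[str], new: list[str]) -> list[int]:
    """Indexes of chunks in the new version that differ from, or are missing in, the old."""
    return [
        i for i, digest in enumerate(new)
        if i >= len(old) or old[i] != digest
    ]


def transfer_size(data: bytes, chunk_indexes: list[int],
                  chunk_size: int = CHUNK_SIZE) -> int:
    """Total bytes in the listed chunks (the final chunk may be short)."""
    return sum(
        len(data[i * chunk_size:(i + 1) * chunk_size]) for i in chunk_indexes
    )


def egress_cost(num_bytes: int, price_per_gb: float) -> float:
    """Estimated charge for transferring num_bytes out of a cloud.

    price_per_gb is a hypothetical figure supplied by the caller.
    """
    return num_bytes / 1e9 * price_per_gb


if __name__ == "__main__":
    old_version = b"a" * 8 + b"b" * 8
    new_version = b"a" * 8 + b"c" * 8 + b"d" * 4
    changed = changed_chunks(chunk_digests(old_version, 8),
                             chunk_digests(new_version, 8))
    to_send = transfer_size(new_version, changed, 8)
    print(f"Changed chunks: {changed}, bytes to send: {to_send} "
          f"of {len(new_version)}")
```

Fixed-size chunking is the simplest option but copes badly with insertions, because data that shifts position changes every later chunk. Commercial products typically track changes at the snapshot or block level instead.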
If there’s anything to learn from existing solutions in the market, it’s that they are very much self-design and build in nature. Public cloud providers don’t expose block storage outside of their environment and don’t offer native interfaces for replicating file and block data. At this point in time, there’s no reason to expect that public cloud service providers will change this position.
IT organisations thinking of implementing a multi-cloud strategy therefore need to look long and hard at the solutions they will use, as the risk of lock-in could be greater in multi-cloud storage than it ever was in the private datacentre.