Most IT departments face the occasional spike in computing demand. Sometimes these spikes are predictable (a sports betting site knows what to expect when the cup final is on, for example), and sometimes not.
Buying enough equipment to handle those demand peaks internally isn’t always cost-effective, and one oft-touted solution to this is cloud bursting. In this scenario, applications can automatically offload work to a public cloud service provider when their on-premise resources get stretched.
The appeal of this approach is not difficult to see, as it means enterprises are not footing the bill for surplus, on-premise computing power during periods of normal operation.
This emerged as a key marketing tactic a few years ago for suppliers intent on selling the concept of hybrid cloud to the enterprise.
Andrew Reichman, research director for cloud data at analyst group 451 Research, was one of those people, having tried to market the idea in his previous job at Amazon Web Services (AWS).
“While I was at Amazon, I was looking for examples of cloud bursting that we might be able to use as a reference in some marketing, but I could never find one,” he admits.
Reichman isn’t bullish that enterprises will suddenly catch the cloud bursting bug. Why? “The top line: it’s complicated.”
Clive Longbottom, founder of tech advisory firm Quocirca, backs this view and says the time it takes to send a query to the cloud and back again can rule out the use of cloud bursting for applications built to expect local network response times.
“It tends not to work well because you are trying to use remote resource against local applications on local data,” he says, arguing that this builds a half-second of latency into your system. “Any gains you’re getting from the bursting are negated by that latency.”
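Longbottom's point can be put as rough arithmetic. The sketch below is illustrative only: the function name and the timings for the two workloads are invented, and only the half-second round trip comes from his estimate. It compares how long a request takes on a stretched local server with how long it takes in the cloud once the round trip is added.

```python
def burst_speedup(local_ms: float, remote_compute_ms: float, round_trip_ms: float) -> float:
    """Ratio of local response time to remote response time.

    A value above 1 means sending the work to the cloud is faster overall;
    a value below 1 means the network round trip has eaten the gain.
    """
    return local_ms / (remote_compute_ms + round_trip_ms)


# Hypothetical interactive request: 200 ms on a saturated local server,
# 50 ms of compute in the cloud, plus a 500 ms round trip.
interactive = burst_speedup(local_ms=200, remote_compute_ms=50, round_trip_ms=500)

# Hypothetical batch job: 60 s locally, 15 s in the cloud, same round trip.
batch = burst_speedup(local_ms=60_000, remote_compute_ms=15_000, round_trip_ms=500)

print(f"interactive request: {interactive:.2f}x, batch job: {batch:.2f}x")
```

On these assumed figures, the short interactive request comes out slower when burst, while the long-running job barely notices the extra half-second.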
To give remote applications fast access to data, IT teams may need to replicate entire datasets there. Then, if the data is used for real-time operations, they face the challenge of constantly updating the remote copy of the data to keep it consistent with what they have locally.
The problems continue to mount when attempting to cloud burst applications at the data layer. If the application is a relational database that relies on atomicity, consistency, isolation and durability (Acid) in transactions, cloud bursting gets even hairier. The remote application’s transactions must stay in sync with the on-premise one while avoiding any inconsistencies in the transaction processing.
None of this makes cloud bursting applications impossible, but it does make it far more complex unless the functionality has been designed in at the application layer, says Dante Orsini, senior vice-president of business development at cloud services and hosting company iLand.
“We’re not seeing on-premise environments with applications that are intelligent enough to understand what’s happening with load and programmatically instantiate additional workloads in a cloud environment,” he says.
Divide and conquer
There are some more niche applications that make sense for cloud bursting, says James Butler, CTO at managed IT services firm Trustmarque. Applications that can divide jobs into lots of smaller workloads and then wait for the results are a natural fit.
“It’s a distributed computing challenge,” he says. “Applications that can deal with those scale-out, distributed computing scenarios are naturally good for cloud bursting without a lot of changes.”
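The scale-out pattern Butler describes is often called scatter-gather. The following is a minimal sketch of the idea, not a real bursting implementation: both pools are ordinary local thread pools, with `cloud_pool` merely standing in for remote capacity, and `split`, `process_chunk`, `scatter_gather` and `local_slots` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, chunk_size):
    """Cut a job into independent chunks that can run anywhere."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Stand-in for real number crunching: sum of squares of the chunk."""
    return sum(x * x for x in chunk)


def scatter_gather(items, chunk_size, local_slots):
    """Run up to local_slots chunks on premise and overflow the rest.

    Returns the combined result and the number of chunks that overflowed.
    """
    chunks = split(items, chunk_size)
    local_chunks, burst_chunks = chunks[:local_slots], chunks[local_slots:]
    # Assumption: cloud_pool is a local stand-in for remote workers.
    with ThreadPoolExecutor(max_workers=4) as local_pool, \
            ThreadPoolExecutor(max_workers=4) as cloud_pool:
        futures = [local_pool.submit(process_chunk, c) for c in local_chunks]
        futures += [cloud_pool.submit(process_chunk, c) for c in burst_chunks]
        # The caller simply waits for every partial result, wherever it ran.
        partials = [f.result() for f in futures]
    return sum(partials), len(burst_chunks)


total, overflowed = scatter_gather(list(range(10)), chunk_size=3, local_slots=2)
print(total, overflowed)
```

Because each chunk is independent and the caller only waits for results, it makes little difference to the application where a given chunk is processed.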
What kinds of applications are those? Just 10 years ago, the IT world was agog over grid computing, in which applications would divide computing tasks between computers across the company, using spare CPU cycles wherever they could find them.
Number-crunching applications such as SETI@Home pushed this into the consumer space, using vast swathes of spare computing power on home PCs to analyse data from outer space. It’s these kinds of easily divisible, latency-tolerant tasks that Butler is talking about. Historically, high-performance computing applications were the primary ones to divide tasks this way, processing data for statistical models or climate simulations, but things are changing. Analytics is becoming a focal point for businesses, and it is a candidate workload for cloud bursting.
“I’ve seen it among insurance companies doing data modelling analytics and wanting to burst this to the cloud,” says Butler. Software platforms designed to support parallel processing in analytics, such as Hadoop, are well-suited to dividing workloads between on-premise and cloud environments, as are Apache’s Cassandra and other NoSQL products such as CouchDB.
Organising your work
Dividing up the work isn’t enough, however. True cloud bursting needs an administrative layer that manages how backup resources are created and allocated in the public cloud.
“It’s not really about physical infrastructure. It’s more about the management and orchestration layers that you have in place to detect the spikes and drive the automation through into the public cloud,” says Butler. An administrator who notices a spike in demand could run a provisioning script manually, or schedule it as a cron job ahead of an expected peak, as a retailer’s admin might in the run-up to Black Friday.
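Spike detection of the kind Butler mentions might look something like the sketch below. It is an assumption-laden illustration rather than how any particular product works: the `BurstTrigger` class, the 85% and 60% thresholds and the three-sample window are all invented. It averages recent utilisation readings and uses two thresholds, so that capacity is not switched on and off by every brief blip.

```python
from collections import deque


class BurstTrigger:
    """Decide when to burst, based on a moving average of utilisation."""

    def __init__(self, burst_above=0.85, release_below=0.60, window=3):
        self.burst_above = burst_above      # start bursting above this average
        self.release_below = release_below  # stop bursting below this average
        self.samples = deque(maxlen=window)
        self.bursting = False

    def observe(self, utilisation):
        """Record one reading (0.0 to 1.0) and return the current burst state."""
        self.samples.append(utilisation)
        average = sum(self.samples) / len(self.samples)
        if not self.bursting and average > self.burst_above:
            self.bursting = True
        elif self.bursting and average < self.release_below:
            self.bursting = False
        return self.bursting


trigger = BurstTrigger()
for reading in [0.5, 0.9, 0.95, 0.95, 0.7, 0.5, 0.4]:
    print(reading, trigger.observe(reading))
```

In a real deployment the return value would drive the orchestration layer; here it is only printed.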
The scripts might be quite complex unless other tools were bought in to help. In a more sophisticated environment, modern application performance management software would typically be able to kick off a cloud instance programmatically when a peak in demand occurs. It would communicate with a tool that configured cloud workloads automatically, such as Puppet, Chef or SaltStack, which could fire up the necessary virtual machine in the cloud and configure it for the right workload.
Simple to say, difficult to do
Even with the appropriate tools in place, none of this is as simple as it sounds, warns Dan Jablonski, director of cloud and IT solutions product management at Verizon.
The orchestration system has to create the server, configure its IP address, install the appropriate application and pull down the correct configuration files. If the virtual machine is to be an application server, it must register itself to download the right content. The machine must connect itself to the pool of load balancers that will allocate its jobs, ensure the correct ports are open to it on the firewall, and then connect to the monitoring system.
“That’s a challenge. Even though the scripts are doing the work, it could be 30 minutes before the system is ready,” says Jablonski. When the peak is over, he adds, the whole thing needs to be unpicked and powered down to avoid paying for services that are not being used.
The point of cloud bursting is to have that ad hoc resource ready when needed, not to waste time waiting for it, so some healthy capacity planning and predictive provisioning are also a good idea, but they add yet more complexity. In light of these challenges, it is little wonder that 451 Research’s Reichman could not find a good cloud bursting story to tell.
A lot of this will be out of reach for smaller companies and they may be better off looking at alternatives. These include using workload scheduling to smooth out peaks in demand for tasks that are not time-sensitive. As early as 2010, eBay was cutting its application server count from 6,000 to 2,000 using simple methods before reducing it further with cloud bursting techniques.
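To illustrate the workload scheduling alternative, the sketch below pushes work that is not time-sensitive into later time slots that have spare capacity. It is a toy model with made-up numbers, not a description of eBay's approach; `smooth`, `urgent`, `deferrable` and `capacity` are hypothetical names, and load is measured in arbitrary units per slot.

```python
def smooth(urgent, deferrable, capacity):
    """Fit deferrable work around urgent work without exceeding capacity.

    urgent[i] must run in slot i; deferrable[i] arrives in slot i but may
    run in that slot or any later one. Returns the load per slot and any
    deferrable work still waiting at the end.
    """
    schedule = []
    backlog = 0
    for urgent_load, arriving in zip(urgent, deferrable):
        backlog += arriving
        spare = max(capacity - urgent_load, 0)
        run_now = min(backlog, spare)  # only use capacity that is free
        backlog -= run_now
        schedule.append(urgent_load + run_now)
    return schedule, backlog


urgent = [50, 90, 100, 40, 20]
deferrable = [30, 30, 30, 0, 0]
schedule, left_over = smooth(urgent, deferrable, capacity=100)
print(schedule, left_over)
```

Run naively, the third slot in this example would need 130 units against a capacity of 100; with the deferrable work shifted, no slot exceeds capacity and no extra servers are needed.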
Why not go all-in on cloud?
There’s a simpler alternative to cloud bursting, says iLand’s Orsini. “If someone is really building an application designed to burst across certain geographies based on certain thresholds, why wouldn’t they just put the whole thing in the cloud to begin with?”
Most of it comes down to economics, says Jablonski. “A lot of enterprises have an incredible amount of cost sunk into software licences for WebLogic, WebSphere and other products,” he says, and it makes financial sense to keep using legacy kit until it depreciates to the point of decommissioning.
In time, he sees those systems retiring, and companies moving further towards the cloud. “As long as they are able to build a path to not having everything running on top of an enterprise cost-based e-commerce platform or expensive database licences or infrastructure that is difficult to contain and manage, then it’s all good,” he says.
Cloud bursting is a niche practice for most companies. The closest many will get is manual provisioning of cloud resources for tasks such as development and testing, but this doesn’t really fit the definition. Many companies simply will not have the resources or the expertise to make it work. Those running the kinds of applications for which cloud bursting is realistic must go into it with their eyes open and their senior architect handy.