In this guest post, Steve Lowe, CTO of student accommodation finder Student.com, weighs up the risks and benefits of relying on machines to carry out Ops jobs within cloud infrastructures.
The “if it ain’t broke, don’t fix it” mantra is firmly ingrained in the minds of most engineers, and – in practice – it works when your resource is fixed and there is no upside to scaling down.
However, when you are paying for compute by the hour (as is often the case with cloud), or even by the second, scaling down idle capacity can make a huge difference to costs.
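To make the pay-by-the-hour point concrete, here is a minimal arithmetic sketch. Every number in it is hypothetical: the price per node-hour, the node counts and the shape of the day's demand are made up for illustration and are not figures from Student.com or any cloud provider.

```python
# Illustrative only: all figures below are hypothetical assumptions.
PRICE_CENTS_PER_NODE_HOUR = 10  # assumed price, in cents, to keep the maths exact


def daily_cost_cents(nodes_per_hour, price_cents=PRICE_CENTS_PER_NODE_HOUR):
    """Sum the cost of a day, given the node count billed in each hour."""
    return sum(nodes * price_cents for nodes in nodes_per_hour)


# "If it ain't broke": leave 10 nodes running around the clock.
fixed_fleet = [10] * 24

# Follow the workload: 4 nodes overnight, 10 nodes during a 12-hour busy period.
scaled_fleet = [4] * 8 + [10] * 12 + [4] * 4

fixed_cost = daily_cost_cents(fixed_fleet)
scaled_cost = daily_cost_cents(scaled_fleet)
saving = fixed_cost - scaled_cost

print(f"fixed: {fixed_cost} cents, scaled: {scaled_cost} cents, saved: {saving} cents")
```

Under these assumed numbers the scaled fleet costs less than three quarters of the fixed one; the real gap depends entirely on how spiky the workload is.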
Assembling a team of engineers who can make quick, risk-based decisions using all the information at their disposal is no easy feat.
Worryingly, within modern-day teams, there is often a tendency to think things will work themselves out in a few minutes. Or that, if something is working fine, the best approach is to let it run and be safe rather than scale it down.
People vs. machines
In a high-pressure situation, even some of the best decision makers and quickest thinkers can be hard-pressed to come up with a viable solution within the required period of time.
This is where the emotionless machine wins hands down every time. Wider ranges of data sets can be joined together and analysed by machines that can use that data to make scaling decisions.
On top of that, these decisions are made in a fraction of the time a team of engineers would take. And if you tell your machine to follow your workloads, it will do just that.
Another benefit of relying on emotionless machines is that automation is reliable: workloads can be easily repeated and delivered consistently to the required standard.
Here comes Kubernetes
As enticing as all this may sound, especially to organisations wanting to scale up, the devil is in the implementation. For a number of years, a significant amount of work was required to implement auto-scaling clusters in a sensible way. This was especially the case with more complex systems.
The solution? Kubernetes. As long as your software is already running in containers, Kubernetes makes implementation much simpler and smoother. And autoscaling a Kubernetes cluster based on cluster metrics is a relatively straightforward task.
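As a rough illustration of what a metric-driven scaling decision looks like, the sketch below applies the proportional rule that the Kubernetes documentation gives for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric ÷ target metric), with no change when the ratio is within a tolerance (0.1 by default). Note that this rule works at the pod level, whereas adding or removing cluster nodes is handled by separate tooling. The function name, the floor of one replica and the sample numbers are assumptions made for this example, not anything from the post.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Proportional scaling rule, sketched after the documented HPA formula.

    The function name and the minimum of one replica are assumptions
    made for this illustration.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Close enough to the target: leave the replica count alone to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale in proportion to how far the metric is from its target.
    return max(1, math.ceil(current_replicas * ratio))


# Hypothetical readings: average CPU utilisation (%) against a 60% target.
print(desired_replicas(4, 90, 60))   # overloaded, so scale up
print(desired_replicas(4, 62, 60))   # within tolerance, so no change
print(desired_replicas(10, 20, 60))  # mostly idle, so scale down
```

The point is less the formula than the fact that the machine applies it the same way every time, without waiting to see whether things "work themselves out".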
Kubernetes takes care of keeping services alive and load balancing the containers across the compute environment. And, finally, enterprises get the elastically scaling compute resource they always dreamed of.
What to do with the Ops crew?
Now that the machines have helped the organisation free up all this time, the question of what to do with the people in the Ops team needs answering.
Well, with cyber attacks increasing in both number and sophistication, there’s never been a better time to move freed-up resources into security. With a little training, the Ops team are perfectly positioned to focus on making sure systems are secure and robust enough to withstand attacks.
With the addition of Ops people with practical hands-on experience, the organisation will be better positioned to tackle future problems. From maintaining and testing patches to running penetration testing, the Ops people will add direct value to your company by keeping it safe, while the machines take care of the rest.