Capacity planning - aiming for the sky, but hitting the cloud ceiling?

At a recent round table event hosted by Sumerian (a provider of IT capacity planning tools), the discussion centred on what role capacity planning should play in an organisation.

Sumerian had commissioned some research to see how capacity planning was perceived.  Some headline stats are worth mentioning: 55% of businesses say they are already using some form of capacity planning tooling but, in reality, for 45% this is simply an Excel spreadsheet.  56% of respondents felt that capacity planning would be increasingly important for them in the coming year, yet 40% perceived a shortage of skills and 36% stated that they had a gap in their planning capabilities.

Such findings drove the discussion – why should anyone be bothered about capacity planning; what does capacity planning mean for an organisation’s IT platform; and where are the skills going to come from to provide a suitable system to manage planning to the level the business will demand?

Starting with the first issue, Quocirca’s own research over the years has shown that utilisation rates of servers in a physical, one-workload-per-server (or cluster) environment rarely average more than 5-10%.  Storage utilisation rarely gets above 30%.  Network utilisation often gets above 80% due to poor data management, leading to sawtoothing and data collisions.  The combination of such low utilisation rates in servers and storage with poor overall performance due to overwhelmed networks is becoming apparent to the business – just why should it pay for systems that are 90% underutilised yet underperforming due to network issues?  Why can’t IT get more out of the systems?

On the second point, what capacity planning means for IT platforms, it was felt that the complexity of a modern platform, with its mix of physical and virtualised environments along with private and public cloud, is making it impossible to plan effectively for highly flexible existing workloads, never mind for new workloads as they are implemented.

This then led on to the third issue – are the skills required human or can technology replace them?  The general impression was that it will be a mix of both, but that the main ‘grunt work’ will have to be automated to as great an extent as possible.

For cloud to fulfil its promise, it will have to deal with multiple workloads on one flexible and elastic virtualised platform that mixes compute, storage and network capabilities.  Each aspect of this mix will be dependent on the others – for example, a storage issue may be ‘cured’ by throwing more storage capacity at it, but this can then cause a network issue, leaving the actual problem of the performance of the end-to-end system unsolved.

Tools will be needed that can rapidly learn how business workloads operate; create patterns of usage; predict future usage and advise accordingly – or can take immediate action to prevent problems from occurring in the first place.
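
As a rough illustration of the "predict future usage and advise accordingly" step, the sketch below fits a straight-line trend to a series of utilisation samples and estimates how many periods remain before a chosen threshold is crossed.  This is a minimal sketch only: the function names, the sample figures and the 80% threshold are assumptions made for illustration, and real capacity planning tools would be expected to use far richer models than a single linear trend.

```python
from statistics import mean


def fit_linear_trend(samples):
    """Least-squares straight line through (index, value) points.

    Returns (slope, intercept). Assumes evenly spaced samples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def periods_until_threshold(samples, threshold):
    """Estimate periods after the last sample until the trend hits threshold.

    Returns None when the trend is flat or falling (no crossing predicted).
    """
    slope, intercept = fit_linear_trend(samples)
    if slope <= 0:
        return None
    last_index = len(samples) - 1
    crossing_index = (threshold - intercept) / slope
    # A value of 0.0 means the trend line is already at or above the threshold.
    return max(0.0, crossing_index - last_index)


if __name__ == "__main__":
    # Hypothetical monthly storage utilisation figures (percent).
    monthly_utilisation = [40, 45, 50, 55, 60]
    remaining = periods_until_threshold(monthly_utilisation, threshold=80)
    print(f"Months until 80% is reached on current trend: {remaining}")
```

A tool built along these lines would only be advising; taking "immediate action" would mean wiring the result into provisioning, which is outside the scope of this sketch.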

Behind this, there will be a need for business architects who can listen to what the business needs and come up with a range of possible solutions that they have investigated, fully understanding what extra capacity is required.  These architects can then report back to the business in terms of costs, risks and value, so that the business can make the final decision on which solution fits best with its own risk strategy.  Only then should the IT department implement the solution.

The truth is that an organisation needs capacity planning if it is to have an effective IT platform that is responsive to current business needs and can support future ones.  Otherwise, the organisation will be faced with supporting a platform that is heavily over-engineered, resulting in excess licensing, consuming more energy than is necessary (not just in powering servers, storage and networks, but also in cooling them) and wasting space in the private or co-location data centre, which also costs money.  Even with this over-engineering, it is likely that a basic misunderstanding of the contextual relationships between the various components of an IT platform will still lead to the business being badly supported.

Another finding from the research was that many organisations that go into public cloud computing to save money do not find those savings.  From Quocirca’s point of view, this is not surprising – we always advise that any change is done for the benefits it provides to the business, not the cost savings it is meant to provide.  However, in this case, the higher-than-expected costs could well be down to the lack of effective capacity planning, leading to over-provision of resources in the cloud ‘just in case’.  It is important for organisations to ensure that any agreement with a public cloud provider is based on correct capacity planning up front, followed by the use of cloud elasticity to provide additional resources when required.  By using good capacity planning tools, considerable ‘overage’ charges (where excess resource is required on a regular basis) can be avoided.
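
The arithmetic behind this can be sketched with a toy cost model.  Everything in it is hypothetical: the demand figures, the unit price and the premium charged for burst or overage capacity are invented for illustration and do not reflect any provider's actual pricing.  It compares provisioning for the peak in every period (‘just in case’) with a planned baseline topped up by elastic capacity, and shows what happens when the baseline is set too low and overage is incurred in every period.

```python
def cost_fixed_peak(demand, unit_price):
    """Cost of provisioning for the peak demand in every period."""
    return max(demand) * unit_price * len(demand)


def cost_baseline_plus_burst(demand, baseline, unit_price, burst_price):
    """Cost of a committed baseline plus elastic capacity above it.

    Demand above the baseline in any period is charged at burst_price.
    """
    committed = baseline * unit_price * len(demand)
    burst_units = sum(max(0, d - baseline) for d in demand)
    return committed + burst_units * burst_price


if __name__ == "__main__":
    # Hypothetical resource units needed in each of six periods.
    demand = [10, 12, 11, 30, 12, 10]
    unit_price = 1.0   # assumed price per committed unit per period
    burst_price = 1.5  # assumed premium price per elastic/overage unit

    print("Provision for peak:     ", cost_fixed_peak(demand, unit_price))
    print("Planned baseline of 12: ",
          cost_baseline_plus_burst(demand, 12, unit_price, burst_price))
    print("Undersized baseline of 5:",
          cost_baseline_plus_burst(demand, 5, unit_price, burst_price))
```

With these invented figures, the planned baseline costs 99.0 against 180.0 for permanent peak provisioning, while the undersized baseline costs 112.5 because the premium rate is paid in every period.  The point is the shape of the comparison, not the numbers themselves.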

Again, good capacity planning tools can ensure that the right amount of cloud resource is planned for and built upon as required – resulting in the cost savings that are being searched for.