In this guest post, Frank Denneman, chief technologist of storage management software vendor PernixData, sets out why datacentre management could soon emerge as the main use case for big data analytics.
IT departments can sometimes be slow to recognise the power they wield, and the rise of cloud computing is a great example of this.
Over the last three decades IT departments focused on assisting the wider business, by automating activities that could increase output or refine the consistency of product development processes, before turning their attention to the automation of their own operations.
The same needs to happen with big data. A lot of organisations have looked to big data analytics to discover unknown correlations, hidden patterns, market trends, customer preferences and other useful business information.
Many have deployed big data systems, which forces end users to look for hidden patterns between the new workloads and the resources they consume within their own datacentre, and to see how this affects current workloads and future capabilities.
The problem is that virtual datacentres are composed of a disparate stack of components. Every system logs and presents the data its vendor deems appropriate.
Unfortunately, variations in the granularity of information, time frames, and output formats make it extremely difficult to correlate data and understand the dynamics of the virtual datacentre.
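To make the correlation problem concrete, here is a minimal sketch of one way to bring two feeds with different sampling intervals onto a common time grid before comparing them. The function names (`resample`, `align`), the metric names and the sample values are all hypothetical and chosen for illustration; real hypervisor and storage exports will differ in format and would need their own parsing.

```python
from collections import defaultdict
from statistics import mean


def resample(samples, bucket_seconds):
    """Average (timestamp_seconds, value) samples into fixed-width buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Round each timestamp down to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def align(series_by_source, bucket_seconds):
    """Return {bucket_start: {source: value}} for buckets every source covers."""
    resampled = {
        name: resample(samples, bucket_seconds)
        for name, samples in series_by_source.items()
    }
    common = set.intersection(*(set(r) for r in resampled.values()))
    return {
        start: {name: r[start] for name, r in resampled.items()}
        for start in sorted(common)
    }


# Hypothetical data: one source samples CPU % every 20 seconds,
# another reports storage latency (ms) every 60 seconds.
hypervisor_cpu = [(0, 40.0), (20, 50.0), (40, 60.0),
                  (60, 70.0), (80, 80.0), (100, 90.0)]
array_latency = [(0, 2.0), (60, 5.0)]

aligned = align({"cpu_pct": hypervisor_cpu, "latency_ms": array_latency}, 60)
for start, row in aligned.items():
    print(start, row)
```

Averaging into the coarsest interval is only one possible choice; it hides short spikes, so a real analysis might keep a maximum or percentile per bucket as well.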
However, hypervisors are very context-rich information systems, and are jam-packed with data ready to be crunched and analysed to provide a well-rounded picture of the various resource consumers and providers.
Having this information at your fingertips can help optimise current workloads and identify systems better suited to host new ones.
Operations will also change, as users are now able to establish a fingerprint of their system. Instead of micro-managing each separate host or virtual machine, they can monitor the fingerprint of the cluster.
For example, they can see how incoming workloads have changed the cluster's fingerprint over time, paving the way for deeper trend analysis of resource usage.
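The post does not define what a fingerprint contains, so the following is only a sketch under a stated assumption: that a cluster's fingerprint is the share of its total I/O falling into each block-size bucket, summed over all virtual machines. The names `fingerprint` and `drift`, the bucket labels and the counts are hypothetical. Drift between two snapshots is measured here as half the sum of absolute differences between shares, which is 0 for identical fingerprints and 1 for completely different ones.

```python
from collections import Counter


def fingerprint(vm_io_counts):
    """Share of cluster I/O per block-size bucket, summed over all VMs."""
    total = Counter()
    for counts in vm_io_counts:
        total.update(counts)  # adds per-bucket counts
    n = sum(total.values())
    return {bucket: count / n for bucket, count in total.items()} if n else {}


def drift(before, after):
    """Half the L1 distance between two fingerprints (0 = same, 1 = disjoint)."""
    buckets = set(before) | set(after)
    return 0.5 * sum(
        abs(before.get(b, 0.0) - after.get(b, 0.0)) for b in buckets
    )


# Hypothetical I/O counts per VM, keyed by block size.
last_month = [{"4K": 600, "64K": 200}, {"4K": 200}]
# A new workload arrives that issues mostly large I/Os.
this_month = last_month + [{"64K": 600, "256K": 400}]

fp_before = fingerprint(last_month)
fp_after = fingerprint(this_month)
print(fp_before)
print(fp_after)
print(round(drift(fp_before, fp_after), 3))
```

Tracking a single drift figure per cluster over time is what allows the cluster, rather than each host or virtual machine, to become the unit being monitored.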
Information like this allows users to manage datacentres differently and – in turn – design them with a higher degree of accuracy.
The beauty of having this set of data all in the same language, structure and format is that it can now start to transcend the datacentre.
The dataset gleaned from each facility can be used to manage the IT lifecycle, improve deployment and operations, and optimise existing workloads and infrastructure, leading to better future designs. But why stop there?
Combining datasets from many virtual datacentres could generate insights that improve the IT lifecycle even more.
By comparing facilities of the same size, or datacentres in the same vertical market, it might be possible to develop an understanding of the total cost of ownership (TCO) of running the same virtual machine on a particular host or storage system.
Alternatively, users may discover the TCO of running a virtual machine in a private datacentre versus a cloud offering. And that’s the type of information needed in modern datacentre management.