Datacentre outages are increasing in frequency and severity, as operators grapple with managing increasingly complex server farm environments, according to the Uptime Institute.
The think-tank’s eighth annual datacentre survey shows that the rate at which outages occur appears to have risen over the past year, with operators also reporting a marked rise in the severity of downtime episodes.
Of the 900 datacentre operators and IT practitioners who took part in this year’s global survey, 31% said they had suffered an outage or period of “server service degradation” over the past 12 months, which is up from 25% in the 2017 report.
This figure rose to 48% when participants were asked if they had experienced a downtime incident in their own datacentre, or one operated on their behalf by a third-party service provider, in the past three years.
The majority (80%) of outage sufferers said the situation could have been prevented, with human error, power outages, network failures and configuration issues flagged as leading causes of datacentre downtime.
Andy Lawrence, executive director of research at the Uptime Institute, said this observation coincides with a surge in the complexity of the datacentre environments that operators are having to manage, as they adapt their facilities to accommodate hybrid cloud setups.
“The rapid growth in the implementation of cloud and hybrid IT approaches has ushered in a period of great change – creating technology, organisational and management complexity,” he said.
“And these new challenges are many times unlike anything previously seen in the industry at this magnitude. It’s a perfect storm.”
Even so, 61% of respondents said adopting a hybrid IT consumption model, whereby workloads are spread across a variety of environments – including on-premise, at colocation facilities or in the cloud – had made their IT more resilient overall, despite the uptick in outages reported elsewhere.
As time goes on, said Lawrence, datacentres are only going to become more complex and difficult to manage, as operators continue to build out their hybrid IT and edge computing capacity.
“Looking ahead, many are expecting to deploy significant new hybrid and edge computing capacity, which will support new services, but will add a further layer of complexity in doing so,” he said.
On this point, 40% of the survey’s respondents said they anticipate needing edge computing capacity at their disposal in the coming years, as the need to process data closer to where it is generated and used grows.
“Edge computing is exciting because of the improved performance and scale it can offer to next-generation technologies like artificial intelligence, the internet of things and even autonomous driving applications,” said Lawrence. “We expect to see substantial growth in the edge over the next few years.
“Edge has the ability to keep building upon each set of application improvements and the advances of previous versions, which will cause rapid improvement in capability development and implementation.”