A new research report from Quocirca, Damage Control – The impact of critical IT incidents, shows the scale of the challenge faced by organisations as they struggle to address the volume of incidents that impact their IT infrastructure, especially those considered critical. The research was sponsored by Splunk.
The average organisation logs about 1,200 IT incidents per month, of which 5 are critical. Wading through all the data generated by the events that lead to these incidents, and prioritising the response to them, is a challenge. 70% say a past critical incident has caused reputational damage to their organisation, underlining the importance of timely detection to minimise impact.
The mean cost to IT of a critical incident is US $36,326; the mean downstream cost to the business is an additional US $105,302. These two costs rise together, suggesting that a high cost to IT is a proxy for poor event and incident management, which has a knock-on effect on business operations.
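To put the two cost figures side by side, the short Python sketch below adds them up per incident and scales the result by the 5 critical incidents per month quoted above. The monthly total is an illustrative estimate only: the report summary does not state it, and multiplying one mean by another is a rough approximation. The function and variable names are our own.

```python
# Means quoted in the report summary.
COST_TO_IT_USD = 36_326          # mean cost to IT per critical incident
DOWNSTREAM_COST_USD = 105_302    # mean additional downstream cost to the business
CRITICAL_PER_MONTH = 5           # mean critical incidents logged per month


def cost_per_critical_incident(it_cost: int, downstream_cost: int) -> int:
    """Combined cost of one critical incident (IT plus business)."""
    return it_cost + downstream_cost


def monthly_critical_cost(per_incident: int, incidents_per_month: int) -> int:
    """Rough monthly estimate; assumes every incident costs the mean amount."""
    return per_incident * incidents_per_month


per_incident = cost_per_critical_incident(COST_TO_IT_USD, DOWNSTREAM_COST_USD)
per_month = monthly_critical_cost(per_incident, CRITICAL_PER_MONTH)

print(f"Per critical incident: US ${per_incident:,}")   # US $141,628
print(f"Per month (estimate):  US ${per_month:,}")      # US $708,140
```

On these figures the downstream business cost is roughly three times the cost to IT, so most of the impact of a critical incident falls outside the IT department.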
80% say they could improve their mean time to detect (MTTD) incidents, which would lead to faster resolution times and decrease the impact on the business. The mean time to repair (MTTR) for critical incidents is 5.81 hours; this falls when there are fewer incidents to manage in the first place. On average, a further 7.23 hours are spent on root cause analysis, which is successful 65% of the time.
Duplicate and repeat incidents are a persistent problem. 97% say their event management process leads to duplicates, where multiple incidents are created for the same IT problem; 17.2% of all incidents are duplicates. 96% say failure to learn from previous incidents through effective root cause analysis (RCA) leads to repeat incidents; 13.3% of all incidents are repeats.
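As a rough illustration of what these percentages mean in volume terms, the Python sketch below applies them to the average of about 1,200 incidents per month. It assumes the duplicate and repeat categories do not overlap, which the report summary does not state, so the combined figure should be read as an upper-end estimate. The names used are our own.

```python
# Figures quoted in the report summary.
INCIDENTS_PER_MONTH = 1_200
DUPLICATE_SHARE = 0.172   # 17.2% of all incidents are duplicates
REPEAT_SHARE = 0.133      # 13.3% of all incidents are repeats


def incidents_for_share(total: int, share: float) -> float:
    """Number of incidents that a given share of the total represents."""
    return total * share


duplicates = incidents_for_share(INCIDENTS_PER_MONTH, DUPLICATE_SHARE)
repeats = incidents_for_share(INCIDENTS_PER_MONTH, REPEAT_SHARE)

# Assumption: no incident is counted as both a duplicate and a repeat.
avoidable = duplicates + repeats

print(f"Duplicates per month: {duplicates:.0f}")
print(f"Repeats per month:    {repeats:.0f}")
print(f"Combined (estimate):  {avoidable:.0f}")
```

Under that assumption, around 30% of the monthly incident load, some 366 incidents, would be avoidable through better event management and root cause analysis.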
The monitoring of IT infrastructure to log events and identify incidents could be improved; 80% admit they have blind spots, leading to delayed detection and investigation of incidents. The complexity of IT systems and the tools that monitor them leaves many organisations without an adequate, holistic end-to-end view of their IT infrastructure.
Dealing with the volume of events generated by IT monitoring tools is a challenge. 52% say they just about manage, 13% struggle, and 1% are overwhelmed. Those with event management processes which enable them to easily manage the volume of events have a faster mean time to detect incidents and fewer duplicate and repeat incidents.
Quocirca will be presenting the report findings in a series of webinars, in conjunction with Splunk.