Does the following scenario sound familiar? A vital part of your IT infrastructure has failed, causing serious service impact. Your technology teams are working on the incident, but no one can provide a plausible explanation of the underlying root cause or tell you when it will be fixed. The network manager is blaming the server. The server manager is blaming the network. Your expensive infrastructure monitoring solution has either a) remained completely silent throughout the entire event or b) flooded the monitoring console with abstruse alerts.
If so, you have an issue with the effectiveness of your infrastructure monitoring solution. You are not alone. IT departments like to pretend that the Operations Bridge watches over the estate, 24 by 7, ever vigilant for the slightest technology wobble and ready to respond with the full force of ITIL-aligned Event Management procedures and automatically triggered self-remediation scripts. The reality is often very different. So why does your expensive monitoring solution rarely tell you what you need to know?