Unauthorised changes to, or errors in, configuration files are
the primary reason networks fail, networking experts warned
this week.
Stan Hulme, group IT manager at retail insurance broker and
financial services company Bland Bankart, said, "Human error is now
the primary cause of network downtime, whether or not the industry
is prepared to admit it."
Hulme said contemporary network operating systems, served
applications and client systems are, for the most part, inherently
stable. Software and hardware providers have made significant
advances in all aspects of systems availability, with uptime being
a central performance metric publicised in most new product
releases.
"Downtime in a modern networked environment is most likely caused
as a result of a poor change management process. In particular,
failures within, or even non-existence of, offline test procedures
prior to modification of a production environment," said
Hulme.
He pointed out that every change made to a network operating
environment contained a degree of risk. "Good planning
incorporating impact analysis, training, testing, sign-off controls
and a contingency scenario will reduce network downtime," he
said.
Steve Broadhead, director at independent network testing lab
Broadband-Testing, concurred with Hulme's assessment. He said,
"Most of the problems I have encountered at end-user organisations
have been as a result of misconfigured devices - human error in
other words."
In Broadhead's experience, no one in the IT department wanted to
admit to the failure. "What we need is an automated spy - something
which checks configuration changes and reports them immediately.
That way you at least know exactly when a change was made, and to
what," he said.
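As a rough illustration of the kind of check Broadhead describes, the following Python sketch compares two snapshots of device configurations and reports which devices changed and when the change was detected. The device names, configuration text and function names are hypothetical, chosen only for the example, and do not represent any supplier's product.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(config_text):
    """Return a SHA-256 digest of a device's configuration text."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def detect_changes(baseline, current, now=None):
    """Compare two {device_name: config_text} snapshots.

    Returns one report per device that was added, removed or modified,
    stamped with the time the difference was detected.
    """
    now = now or datetime.now(timezone.utc)
    reports = []
    for device in sorted(set(baseline) | set(current)):
        old = baseline.get(device)
        new = current.get(device)
        if old is None:
            change = "added"
        elif new is None:
            change = "removed"
        elif fingerprint(old) != fingerprint(new):
            change = "modified"
        else:
            continue  # configuration unchanged, nothing to report
        reports.append({
            "device": device,
            "change": change,
            "detected_at": now.isoformat(),
        })
    return reports


if __name__ == "__main__":
    # Hypothetical snapshots: the last approved configs and the live ones.
    approved = {"router1": "hostname r1\nqos policy gold\n"}
    live = {"router1": "hostname r1\nqos policy silver\n"}
    for report in detect_changes(approved, live):
        print(report)
```

Run on a schedule against the last approved snapshot, a check of this kind would record the time a change was detected and the device it affected. It would not show who made the change or whether it was authorised.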
Alan Lawson, research analyst at Butler Group, attributed network
errors to the complexity of modern networks, which need to balance
quality of service between several applications. Lawson said this
complexity made network administration prone to error.
One supplier hoping to address the problem of human error is
Intelliden. The company has taken what it describes as a
model-based approach to network management. The approach aims to
provide a holistic view of the network, in which configuration
changes to network devices are made centrally and can therefore be
automated.
Ravi Pather, Intelliden's vice-president for Europe, said, "Our
research has found that one in every three network changes
generates an error." He said the Intelliden R-series product
tackles this problem by maintaining a knowledge base containing the
configuration of all devices on the network.