As VoIP moves from the labs to production environments, the
network becomes a more important corporate asset and thus the
effect of downtime will be more acute. While companies spend
millions on upgrading infrastructure for VoIP, little attention is
given to solving the largest source of downtime –
configuration-related outages due to human error. A well-defined
change management process built around a configuration management
system can virtually eliminate the "self-inflicted" errors, which
currently account for about 60% of all network outages.
The problem: The blind leading the blind
Most organisations lack network-wide control of configuration
baselines and changes, which directly leads to an unnecessarily
high number of network outages. This lack of information also
impairs a network operations team's ability to quickly find and
repair the event that caused the outage.
Many organisations are in constant fire-fighting mode, so much
of the troubleshooting is done ad hoc. In fact, while
troubleshooting, the engineer may make several configuration
changes without documenting what has been done. With networks
growing rapidly, there is a continual need to establish and
maintain baseline configurations and to be able to audit them
across all network devices at any point in time. A collection of
ad hoc tools and poor process cannot do this, which leads to the
following common problems:
- Configuration drift:
When different individuals make a number of changes to many network
elements, device configurations tend to become inconsistent. This
leads to elements that are similar in profile, but have widely
different configurations where the baseline for each device is
lost.
- Loss of configuration information:
When changes are done ad hoc and the network engineer attempts to
document the change after the fact, information is invariably
lost.
- Unnecessarily long downtime:
When troubleshooting network problems, it is important that
engineers be able to restore a network device to a stable state.
This, at least, puts the device in a functional condition while the
engineer continues to identify the root cause of the problem.
Without a tool to automatically restore the device to the baseline,
the device remains down for an extended period, further impacting
the business.
- Increased mean time to repair:
Problem isolation and incident repair take much longer because the
engineer needs to manually put the device into a stable condition
through ad hoc trial and error.
The sum of all of these problems is a higher cost of downtime,
longer repair times and overall lower service reliability. The
configuration of a network object and the impact the device has on
dependent devices is one of the first things an engineer
investigates during an outage. Without a consistent process to
track device changes, it is almost impossible to correlate these
changes manually.
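To make the idea of drift from a baseline concrete, here is a minimal sketch in Python that compares a stored baseline configuration with a device's running configuration and reports the differences. It uses the standard library's difflib module. The function name, the device hostname and the sample configuration text are all hypothetical and only serve as illustration; a real tool would first have to retrieve both configurations from its own store and from the device.

```python
import difflib


def config_drift(baseline: str, running: str) -> list[str]:
    """Return unified-diff lines between a baseline and a running config.

    An empty list means the running configuration matches the baseline.
    """
    diff = difflib.unified_diff(
        baseline.splitlines(),
        running.splitlines(),
        fromfile="baseline",
        tofile="running",
        lineterm="",
    )
    return list(diff)


# Hypothetical sample configurations, not taken from any real device.
BASELINE = """hostname edge-sw-01
vlan 10
 name voice
"""

RUNNING = """hostname edge-sw-01
vlan 10
 name voip-test
"""

if __name__ == "__main__":
    # Print only the drift; lines starting with "-" were in the baseline,
    # lines starting with "+" are present on the device now.
    for line in config_drift(BASELINE, RUNNING):
        print(line)
```

Running the same comparison across every device on a schedule is one simple way to audit configurations against their baselines, although commercial products do considerably more than a text diff.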
Managing the network ad hoc leaves the operations team working
from a position of weakness, fixing one problem after another in
continuous fire-fighting mode. One of the keys to a successful
operational strategy for the network is a consistent process built
around a network configuration management tool.
The solution: Network configuration management
In the FCAPS (fault, configuration, accounting, performance and
security) model for network management, the "C" is often
overlooked. Most of the network management vendors focus on
everything but the configuration aspect, so the bulk of
configuration management tools have been delivered by the equipment
vendors. The vendor-supplied tools provide some strong features,
but their scope is limited to that particular vendor. As a result,
most network managers require many configuration management tools
to support the entire network, which makes correlation of
information very difficult.
However, there are a number of independent vendors, such as
Intelliden, Voyence, Opsware and Tripwire, that offer multi-vendor
products that can be used as the focal point for the change
management process.
I can't express strongly enough how important this aspect of
running a network is for supporting real-time applications such as
VoIP. I've talked to many organisations that have gone through the
laborious task of deploying VoIP only to have the implementation
suffer due to poor change management processes and tools. As
companies build more automation into the network, manageability
will be a key to success, and configuration management is what
enables it.
Companies implementing a configuration management strategy will
realise the following benefits:
- Faster, more accurate configurations and changes:
A multi-vendor tool can be used as a centralised resource for
faster changes and provisioning leading to more manageable
devices.
- Configuration change tracking and more accountability:
Network engineers can see when changes are made and who made the
change. Also, companies will gain the ability to detect when
unauthorised changes are made and tie the configuration change to a
particular individual.
- The ability to return the device to a known state:
Most of the products today contain the ability to roll the device
back to the previous or predefined state. Without this, engineers
usually rely on TFTP servers or simple cut and paste from
configurations stored on laptops.
Overall, companies will see the benefit of a more consistent,
uniform set of configurations that is easier to troubleshoot and
maintain. Also, by removing ad hoc configuration changes, the
majority of self-inflicted errors will go away.
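The change tracking and rollback benefits described above can be illustrated with a short Python sketch. It assumes a simple in-memory history per device; the class names, the device name and the author names are hypothetical, and the sketch does not talk to any real device or reflect how any particular vendor's product works. The point is only that every change records who made it and when, and that a rollback is itself recorded as a change.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConfigVersion:
    """One stored configuration, with who committed it and when."""
    text: str
    author: str
    timestamp: datetime


@dataclass
class DeviceConfigHistory:
    """Hypothetical in-memory change history for a single device."""
    device: str
    versions: list[ConfigVersion] = field(default_factory=list)

    def commit(self, text: str, author: str) -> None:
        # Record the new configuration together with its author and time.
        self.versions.append(
            ConfigVersion(text, author, datetime.now(timezone.utc))
        )

    def current(self) -> ConfigVersion:
        return self.versions[-1]

    def rollback(self, author: str) -> ConfigVersion:
        """Restore the previous version by committing it again.

        The rollback becomes a new history entry, so it is audited too.
        """
        if len(self.versions) < 2:
            raise ValueError("no earlier version to roll back to")
        previous = self.versions[-2]
        self.commit(previous.text, author)
        return self.current()


if __name__ == "__main__":
    history = DeviceConfigHistory("edge-sw-01")
    history.commit("vlan 10", author="alice")
    history.commit("vlan 20", author="bob")
    restored = history.rollback(author="carol")
    print(restored.text, [v.author for v in history.versions])
```

Keeping the rollback in the same history as ordinary changes means the record of who did what stays complete, which is the accountability the list above calls for.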
What to look for in a product
There are many features and functions to these products and your
criteria will be different from other companies, but here are the
main things I would recommend looking for:
- The ability to support as broad a variety of vendors as you need
- Rollback and restoration tools
- The ability to create an event based on a configuration change
- An intuitive GUI for ease of use
- Roles-based access and permissions
- The ease of exporting the information to other systems
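As a rough illustration of what "an event based on a configuration change" can mean, the Python sketch below hashes each configuration it is shown and calls a handler when the hash for a device differs from the one seen previously. The class name, callback signature and device name are assumptions made for this example; real products typically integrate with syslog, SNMP traps or their own event bus rather than a simple callback.

```python
import hashlib
from typing import Callable


class ChangeDetector:
    """Hypothetical detector that raises an event when a config changes."""

    def __init__(self, on_change: Callable[[str, str, str], None]) -> None:
        # on_change receives (device, old_digest, new_digest).
        self._on_change = on_change
        self._last_digest: dict[str, str] = {}

    def observe(self, device: str, config_text: str) -> bool:
        """Record a config snapshot; return True if a change event fired."""
        digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
        previous = self._last_digest.get(device)
        self._last_digest[device] = digest
        # The first observation only sets the reference point.
        if previous is not None and previous != digest:
            self._on_change(device, previous, digest)
            return True
        return False


if __name__ == "__main__":
    detector = ChangeDetector(
        lambda device, old, new: print(f"config changed on {device}")
    )
    detector.observe("core-rtr-01", "router ospf 1")
    detector.observe("core-rtr-01", "router ospf 2")
```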
I stated this before, but don't underestimate the role this can
play in the long-term success of running a network capable of
supporting real-time applications such as VoIP. It's realistic to
expect a 20% improvement in the efficiency of the network
operations team and a 25% reduction in overall mean time to
repair. Think of it this way: a 20% efficiency gain has the same
impact as adding one headcount for every five in network
operations. More importantly, it will allow your network
operations staff to scale as more network-dependent applications
are deployed.
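The arithmetic behind those figures is simple enough to check in a few lines of Python. The 20% and 25% values come from the estimates above; the team size of five and the four-hour baseline repair time are hypothetical inputs chosen only to make the calculation concrete.

```python
def equivalent_headcount(team_size: int, efficiency_gain: float) -> float:
    """Extra staff a team would need to match a given efficiency gain."""
    return team_size * efficiency_gain


def reduced_mttr(baseline_hours: float, reduction: float) -> float:
    """Mean time to repair after applying a fractional reduction."""
    return baseline_hours * (1.0 - reduction)


if __name__ == "__main__":
    # A 20% gain on a team of five equals one additional person.
    print(equivalent_headcount(5, 0.20))
    # A 25% reduction on an assumed four-hour MTTR leaves three hours.
    print(reduced_mttr(4.0, 0.25))
```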
So, add implementing a configuration management tool to your New
Year's resolutions! Happy New Year!
About the author:
Zeus Kerravala is senior vice president of Yankee Group's
infrastructure research and consulting. His areas of expertise
involve working with customers to solve their business issues
through the deployment of infrastructure technology solutions,
including switching, routing, network management, voice solutions
and VPNs.
Before joining Yankee Group, Kerravala was a senior engineer
and technical project manager for Greenwich Technology Partners, a
leading network infrastructure and engineering consulting firm.
Prior to that, he was a vice president of IT for Ferris, Baker
Watts, a mid-Atlantic based brokerage firm, acting as both a lead
engineer and project manager deploying corporate-wide technical
solutions to support the firm's business units. Kerravala's first
task at FBW was to roll out a new frame relay infrastructure with
connections to branch offices, service providers, vendors and the
stock exchange. Kerravala was also an engineer and technical
project manager for Alex. Brown & Sons, responsible for the
technology related to the equity trading desks.
Kerravala obtained a B.S. degree in physics and mathematics
from the University of Victoria (Canada). He is also certified by
Citrix and NetScout.