The idea of a completely automated "darkroom" network has been a
holy grail for network managers - but it'll never happen.
Danny Bradbury
Network managers are a unique bunch: software suppliers are
constantly trying to make them redundant. The pages of most
computer publications have advertisements showing happy network
managers with their feet up, drinking coffee while the network runs
itself. This raises an interesting question: if software tools can
really make a network run itself, why do you need network
managers?
In truth, there's a big difference between real-world networks
with all their inherent problems and the idealistic,
self-sufficient, zero-maintenance network promoted by software
houses. Yet the computing world is constantly striving to achieve
the "darkroom" network, and put IT personnel out of a job.
In an ideal world it wouldn't be so difficult, but unfortunately
it's the little things that get in the way, according to Robert
Haines, chief technical officer at Entuity.
Haines used to be a network analyst at investment firm Goldman
Sachs before starting Entuity, which sells a network management
product called Eye of the Storm. He says different types of network
problem occur at different layers of the networking model. The old
seven-layer standard model categorised network views at different
layers of complexity (the lowest being the physical cabling and the
top layers handling applications and presentation issues).
Thanks to the rise of the Internet, it has been largely replaced
by the TCP/IP stack, which maps the different levels of complexity
into various TCP-related protocols.
Configuration problems
Layer four is where switching and load balancing tools and
components generally reside.
Two things can go wrong here, Haines says. Incorrect
network configuration is one issue, where mistakes have been made
while setting up a switch or load balancer. Another, more
ubiquitous danger is that problems can arise with devices without
necessarily rendering them inoperable. This can mean that a device
is performing abysmally, but is still considered to be working by
the network. Often, such underperformance can be difficult to spot,
and so network managers need to scrutinise systems management data
manually.
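
The "working but underperforming" case can be sketched as a simple check over polled counters. This is a minimal illustration rather than any supplier's method: the DeviceSample fields and the 1% error-rate threshold are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    """One polling interval for a device (all field names are hypothetical)."""
    name: str
    is_up: bool          # what the network itself reports
    packets_in: int
    packets_errored: int


def degraded_devices(samples, max_error_rate=0.01):
    """Return names of devices that report 'up' but exceed the error threshold."""
    flagged = []
    for s in samples:
        if not s.is_up or s.packets_in == 0:
            continue  # down or idle devices are a different kind of alarm
        if s.packets_errored / s.packets_in > max_error_rate:
            flagged.append(s.name)
    return flagged


samples = [
    DeviceSample("switch-a", True, 100_000, 50),     # 0.05% errors: healthy
    DeviceSample("switch-b", True, 100_000, 4_000),  # 4% errors: up but degraded
    DeviceSample("lb-1", False, 0, 0),               # down: caught elsewhere
]
print(degraded_devices(samples))
```

A check like this only narrows the search. Someone still has to decide what error rate is acceptable for each device.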
Down at layer three, where the routers live, configuration can
be a major problem. Routers can be configured with static routes by
mistake, or inaccurate configuration can result in sub-net
problems, where traffic flows in one direction but not the
other.
Open or broken?
Haines recalls one problem when working at Goldman Sachs, where
the company upgraded from hubs to switches. During peak periods of
traffic flow, the hubs would drop packets to avoid congestion.
"When we installed switches, and all the packets got through to
the router, we started to see queue overruns," he says.
One of the biggest barriers to building a completely automated
network is compatibility, says Miles Cunningham, a network
consultant at Comdisco Services.
Unfortunately, the idea of open technology, which arose in the
early 1990s, gradually gave way to the real-world concept of broken
technology, in which things that were meant to work together
didn't.
"There are proprietary extensions, as people want to get product
differentiation in more than just hardware," he complains, pointing
to differences in supplier implementations of the IP security
protocol, for example. "I would like to see greater adoption of
directory standards like lightweight directory access protocol," he
adds. "People support it, but offer it as a kind of 'backwards'
compatibility."
His point is a valid one. But at the same time it is impossible
to stop suppliers adding extensions to industry-standard protocols.
The market has always worked this way, and doubtless always will,
unless the open source movement manages to grab the network
standards market by the scruff of the neck.
In the meantime, network managers have another problem to
contend with, in the form of network immaturity, he argues. While
standard IT disciplines emphasise the need for stable technology
rollouts, business people want sexy technology, and they want it
now. A good example of this is voice-data convergence, says
Cunningham.
The world has only just got to grips with data - indeed, many IT
managers would secretly admit that they haven't even done that yet
- and now everyone is asking them to start shoving low-latency
multimedia packets down the line along with everything else, using
technology that still has to undergo long-term, real-world
testing.
Still, just because we can't make networking staff redundant, it
does not mean that we can't make their lives a lot easier by
automating the administration of the network as much as
possible.
George Georgiou, product manager of Siemens Network Systems,
says a big barrier to automation is unspecified rules. You can
program systems management software to carry out specific tasks
when predefined situations arise, but there will always be
something unexpected to handle. He envisages a time when networks
will be able to monitor how policy-based rules affect the
performance of the communications infrastructure, and modify them
to optimise performance. The technology to facilitate such a
positive feedback loop is not yet commercially available, Georgiou
says, but adds that it can be programmed from scratch by user
organisations and consultants.
Best behaviour
While the world waits for such intelligent networks, heuristic
systems are currently available which monitor patterns in network
activity to make predictions about behaviour.
Computer Associates, for example, has been using its Neugent
technology in its network management product, NetworkIT 2.0,
announced in April. The neural network-based technology analyses
patterns in network traffic to predict problems before they arise,
claims the supplier.
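
The general idea of learning what normal traffic looks like and flagging departures from it can be shown with something far simpler than a neural network. The sketch below swaps in a plain z-score test; it is not how Neugent works, and the readings and the threshold of three standard deviations are assumptions for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` standard deviations
    from the mean of `history` (a plain z-score test)."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical link utilisation readings, in percent
baseline = [41, 39, 40, 42, 38, 40, 41, 39]
print(is_anomalous(baseline, 43))  # within the normal spread
print(is_anomalous(baseline, 75))  # far outside it
```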
Smart software notwithstanding, it is difficult to see how
problems such as a faulty device can be solved without human
intervention.
Ultimately, such devices will need to be replaced or fixed by a
human. Nevertheless, there are ways of introducing redundancy into
the network so that faulty network hardware doesn't bring the
infrastructure down completely.
The Internet Engineering Task Force has developed the Virtual
Router Redundancy Protocol, which enables multiple routers to share
the same IP address. Consequently, if one router goes down, all
traffic directed to that address will reach the other router
instead. This concept of resilience should be carried as far as
possible in your infrastructure. Things like dual power feeds and
connections to switches should be de rigueur on the network.
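
The takeover behaviour can be sketched as an election among the routers in a group. This is a simplification under stated assumptions: real VRRP routers exchange advertisements and run timers, which are left out here, and the Router class and elect_master function are hypothetical names. What the sketch keeps is the rule that the live router with the highest priority answers for the shared address, with ties going to the higher interface address.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Router:
    name: str
    priority: int          # VRRP priority, 1-255; higher wins
    address: IPv4Address   # the router's own interface address
    alive: bool = True


def elect_master(routers):
    """Pick the live router with the highest priority; ties go to the
    higher interface address. Returns None if no router is alive."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, r.address))


group = [
    Router("r1", 200, IPv4Address("10.0.0.2")),
    Router("r2", 100, IPv4Address("10.0.0.3")),
]
print(elect_master(group).name)   # r1 answers for the shared address
group[0].alive = False            # r1 fails
print(elect_master(group).name)   # r2 takes over the same address
```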
Policy-based networking
However, beware of relying on back-up equipment too much without
testing your network on a regular basis, says Haines. If you rely
on failover equipment without constantly checking each component to
see whether it is working, your redundancy can disappear without
you knowing it. Suddenly, large parts of your infrastructure can be
operating on back-up devices without you realising, he warns.
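
A regular audit for this kind of silent failover can be very simple. The sketch below assumes you already have some way to actively test each primary and back-up device; the status dictionary and the lost_redundancy function are hypothetical, and the point is only to report services that are still running but on a single healthy device.

```python
def lost_redundancy(status):
    """Given {service: (primary_ok, backup_ok)}, return the services that
    are still running but on only one healthy device."""
    return sorted(
        service
        for service, (primary_ok, backup_ok) in status.items()
        if primary_ok != backup_ok   # exactly one side is healthy
    )


# Hypothetical results from an active health check of each device
status = {
    "core-router": (True, True),     # fully redundant
    "edge-switch": (False, True),    # running on the back-up alone
    "load-balancer": (True, False),  # back-up has quietly died
}
print(lost_redundancy(status))
```

Neither flagged service shows any outage to users, which is why such a check has to be run deliberately rather than triggered by complaints.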
Haines remembers one time at Goldman Sachs, when the company
powered down the network and powered it up again, only to see
widespread failure on the network as devices that had already been
operating in a back-up capacity refused to reboot.
This is not the only type of smart technology available to
network managers trying to automate their infrastructures.
Policy-based networking has come to the fore in the past two years
as a means of making networks more efficient by reflecting business
requirements.
Using policy-based networking, you can use the network to
prioritise certain users or applications, or even certain types of
data such as voice packets. This can be particularly
useful if network performance has decayed due to a faulty device.
If you are running on 50% bandwidth, then you want to
ensure that your voice packets are getting through the network
first, and then perhaps that your accounts department is able to
communicate across the network so that they can finish their
interim results. Meanwhile, other people playing
Doom on the network or surfing for the football results will be
locked out.
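
The 50% bandwidth scenario can be sketched as an admission decision taken in priority order. This is an illustration only, not a description of any product: the flow names, priorities and bandwidth figures are invented, and a real policy engine would queue or shape traffic rather than simply admit or refuse whole flows.

```python
def admit(flows, capacity_mbps):
    """Admit flows in priority order (lower number = more important) until
    capacity runs out. Returns (admitted, locked_out) lists of names."""
    admitted, locked_out = [], []
    remaining = capacity_mbps
    for name, priority, demand in sorted(flows, key=lambda f: f[1]):
        if demand <= remaining:
            admitted.append(name)
            remaining -= demand
        else:
            locked_out.append(name)
    return admitted, locked_out


# (name, priority, demand in Mbit/s), all hypothetical
flows = [
    ("web surfing", 3, 30),
    ("voice", 1, 20),
    ("accounts", 2, 25),
    ("games", 3, 20),
]
print(admit(flows, 100))  # full capacity: everything fits
print(admit(flows, 50))   # half capacity: only voice and accounts get through
```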
Richard Benwell, European marketing manager for Riverstone
Networks, explains that it will be much easier to do this when
Multi-Protocol Label Switching is ratified as a quality of service
standard by the Internet Engineering Task Force. This will enable
layer three devices to analyse and make decisions about packets
using electronic "labels" attached to them. In the meantime, such
policy-based networking is handled at layer four. "We have smart
hardware that drills down to a level four packet and looks at the
source and destination addresses in the IP header, and the application
port numbers and port socket numbers in the TCP header," explains
Benwell. It enables him to prioritise packets into different queues
or route them differently.
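
What Benwell describes is done in hardware, but the lookup itself can be sketched in software. The example below reads the source and destination addresses from an IPv4 header and the port numbers from the TCP header that follows, then picks a queue. The function names, the sample_packet helper and the choice of port 5060 as a "priority" port are assumptions for illustration; the helper leaves checksums at zero.

```python
import socket
import struct


def classify(packet: bytes):
    """Extract the fields a layer-four classifier keys on from a raw
    IPv4 packet carrying TCP. Returns None for anything else."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4                   # IP header length in bytes
    if packet[9] != 6 or len(packet) < ihl + 4:    # protocol 6 = TCP
        return None
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    sport, dport = struct.unpack("!HH", packet[ihl:ihl + 4])
    return src, dst, sport, dport


def queue_for(packet: bytes, priority_ports=frozenset({5060})):
    """Map a packet to a queue name; the port set is an example policy."""
    fields = classify(packet)
    if fields is None:
        return "default"
    _, _, sport, dport = fields
    return "priority" if {sport, dport} & priority_ports else "default"


def sample_packet(src, dst, sport, dport):
    """Build a minimal 40-byte IPv4+TCP packet for testing (checksums are zero)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                     socket.inet_aton(src), socket.inet_aton(dst))
    tcp = struct.pack("!HHLLBBHHH", sport, dport, 0, 0, 5 << 4, 0, 0, 0, 0)
    return ip + tcp


pkt = sample_packet("10.0.0.1", "10.0.0.9", 40000, 5060)
print(classify(pkt))
print(queue_for(pkt))
```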
Other means of maintaining your network efficiency and avoiding
problems before they happen are even simpler, says Cunningham.
Gathering as much information about your infrastructure as possible
will help your network management system and staff to spot and
avoid problems early, he says.
Using technologies such as Simple Network Management Protocol
and Remote Monitoring (RMon) will help you to gather the relevant
data. "RMon got a bad reputation for making problems worse, because
of the amount of information it could pass back to the probe,"
warns Cunningham. "RMon 2 is much lighter and more efficient, but
SMon looks like being the next standard for future management."
SMon, short for Switch Monitoring, was ratified as a standard
last year and goes beyond the capabilities of its predecessor. It
is designed to monitor all the traffic moving through the switch,
rather than simply evaluating particular ports. The protocol was
originally designed by Lannet, which was purchased by Lucent
Technologies in 1998.
Check and retest
Finally, says Cunningham, good old-fashioned change management
will go a long way towards making your network more resilient and
less vulnerable to failure. Many of the issues that require human
intervention on a network are often caused by humans in the first
place. We are an unreliable lot compared to the machines that we
operate, and a proper set of procedures needs to be put in place to
make sure that when changes are made to the system they are
scrupulously checked and tested.
"There needs to be a perimeter wall between live systems and
others," says Cunningham. "Developers should never have access to
live systems." By the time something such as a switch, router or
even a basic driver upgrade is installed on a live network, it
should have been through a battery of tests in a sandbox
environment.
Thankfully, for network managers, it is doubtful whether we will
ever achieve a true darkroom network. Human intervention will
always be a requirement.
Why the network manager's lights will stay on
The darkroom network scenario will never arise because:
- Problems can arise on devices on the network without rendering
them inoperable. Such problems need to be picked up from network
management data by the human eye.
- In the absence of open technology, issues of compatibility will
always complicate network configuration and upkeep.
- No forward-looking company is content to make do with stable
technology, meaning that network managers will always be chasing
the "next big thing".