Working in the dark room

The idea of a completely automated "darkroom" network has been a holy grail for network managers, but it will never happen.

Network managers are a unique bunch: software suppliers are constantly trying to make them redundant. The pages of most computer publications have advertisements showing happy network managers with their feet up, drinking coffee while the network runs itself. This raises an interesting question: if software tools can really make a network run itself, why do you need network managers?

In truth, there's a big difference between real-world networks with all their inherent problems and the idealistic, self-sufficient, zero-maintenance network promoted by software houses. Yet the computing world is constantly striving to achieve the "darkroom" network, and put IT personnel out of a job.

In an ideal world it wouldn't be so difficult, but unfortunately it's the little things that get in the way, according to Robert Haines, chief technical officer at Entuity.

Haines used to be a network analyst at investment firm Goldman Sachs before starting Entuity, which sells a network management product called Eye of the Storm. He says different types of network problem occur at different layers of the networking model. The old seven-layer standard model categorised network views at different layers of complexity (the lowest being the physical cabling and the top layers handling applications and presentation issues).

Thanks to the rise of the Internet, it has been largely replaced by the TCP/IP stack, which maps the different levels of complexity into various TCP-related protocols.

Configuration problems

Layer four is where switching and load balancing tools and components generally reside.

There are two things that can go wrong here, he says. One is incorrect network configuration, where mistakes have been made while setting up a switch or load balancer. The other, more common danger is that a device can develop problems without becoming inoperable. It may be performing abysmally, yet the network still considers it to be working. Such underperformance is often difficult to spot, so network managers need to scrutinise systems management data manually.
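To make the "working but abysmal" case concrete, here is a minimal sketch of the kind of check a manager might run over polled data. Everything in it is an assumption for illustration: the `DeviceSample` record, the per-device baseline figures and the threefold threshold are hypothetical, not taken from any product mentioned here.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class DeviceSample:
    name: str
    reachable: bool          # the device answered its last poll
    latency_ms: list[float]  # recent response times, oldest first


def find_degraded(samples, baseline_ms, factor=3.0):
    """Return names of devices that are up but far slower than their baseline."""
    degraded = []
    for s in samples:
        if not s.reachable or not s.latency_ms:
            continue  # a dead device is a different, more obvious alarm
        if median(s.latency_ms) > factor * baseline_ms[s.name]:
            degraded.append(s.name)
    return degraded


# Hypothetical polling results and baselines.
samples = [
    DeviceSample("switch-a", True, [2.0, 2.1, 1.9]),
    DeviceSample("lb-1", True, [40.0, 55.0, 48.0]),
    DeviceSample("switch-b", False, []),
]
baseline = {"switch-a": 2.0, "lb-1": 5.0, "switch-b": 2.0}
print(find_degraded(samples, baseline))
```

The load balancer answers every poll, so a simple up/down view would show it as healthy. Only the comparison against its own baseline exposes it.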

Down at layer three, where the routers live, configuration can be a major problem. Routers can be configured with static routes by mistake, or inaccurate configuration can result in sub-net problems, where traffic flows in one direction but not the other.
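One common way to end up with one-way traffic is a subnet mask mismatch between two machines. The sketch below uses Python's standard `ipaddress` module to show the effect; the addresses and masks are invented for illustration.

```python
import ipaddress


def sees_as_local(local_interface, remote_ip):
    """True if a host configured as `local_interface` treats `remote_ip` as on-link."""
    network = ipaddress.ip_interface(local_interface).network
    return ipaddress.ip_address(remote_ip) in network


# Hypothetical hosts: B has been given a /16 mask where a /24 was intended.
host_a = "10.0.0.5/24"
host_b = "10.0.1.7/16"

a_view = sees_as_local(host_a, "10.0.1.7")
b_view = sees_as_local(host_b, "10.0.0.5")

print("A treats B as local:", a_view)
print("B treats A as local:", b_view)
if a_view != b_view:
    print("Mismatch: the two hosts disagree about whether to use a router")
```

Host A sends its packets for B via the router, while B tries to reach A directly on the wire. Depending on the topology, one of those paths can work while the other fails.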

Open or broken?

Haines recalls one problem when working at Goldman Sachs, where the company upgraded from hubs to switches. During peak periods of traffic flow, the hubs would drop packets to avoid congestion.

"When we installed switches, and all the packets got through to the router, we started to see queue overruns," he says.

One of the biggest barriers to building a completely automated network is in compatibility, says Miles Cunningham, a network consultant at Comdisco Services.

Unfortunately, the idea of open technology, which arose in the early 1990s, gradually gave way to the real-world concept of broken technology, in which things that were meant to work together didn't.

"There are proprietary extensions, as people want to get product differentiation in more than just hardware," he complains, pointing to differences in supplier implementations of the IP security protocol, for example. "I would like to see greater adoption of directory standards like lightweight directory access protocol," he adds. "People support it, but offer it as a kind of 'backwards' compatibility."

His point is a valid one. But at the same time it is impossible to stop suppliers adding extensions to industry-standard protocols. The market has always worked this way, and doubtless always will, unless the open source movement manages to grab the network standards market by the scruff of the neck.

In the meantime, network managers have another problem to contend with, in the form of network immaturity, he argues. While standard IT disciplines emphasise the need for stable technology rollouts, business people want sexy technology, and they want it now. A good example of this is voice-data convergence, says Cunningham.

The world has only just got to grips with data - indeed, many IT managers would secretly admit that they haven't even done that yet - and now everyone is asking them to start shoving low-latency multimedia packets down the line along with everything else, using technology that still has to undergo long-term, real-world testing.

Still, just because we can't make networking staff redundant, it does not mean that we can't make their lives a lot easier by automating the administration of the network as much as possible.

George Georgiou, product manager of Siemens Network Systems, says a big barrier to automation is unspecified rules. You can program systems management software to carry out specific tasks when predefined situations arise, but there will always be something unexpected to handle. He envisages a time when networks will be able to monitor how policy-based rules affect the performance of the communications infrastructure, and modify them to optimise performance. The technology to facilitate such a feedback loop is not yet commercially available, Georgiou says, but he adds that it can be programmed from scratch by user organisations and consultants.

Best behaviour

While the world waits for such intelligent networks, heuristic systems are currently available which monitor patterns in network activity to make predictions about behaviour.

Computer Associates, for example, has been using its Neugent technology in its network management product, NetworkIT 2.0, announced in April. The neural network-based technology analyses patterns in network traffic to predict problems before they arise, claims the supplier.

Smart software notwithstanding, it is difficult to see how problems such as a faulty device can be solved without human intervention.

Ultimately, such devices will need to be replaced or fixed by a human. Nevertheless, there are ways of introducing redundancy into the network so that faulty network hardware doesn't bring the infrastructure down completely.

The Internet Engineering Task Force has developed the Virtual Router Redundancy Protocol, which enables multiple routers to share the same IP address. Consequently, if one router goes down, all traffic directed to that address will reach the other router instead. This concept of resilience should be carried as far as possible in your infrastructure. Things like dual power feeds and connections to switches should be de rigueur on the network.
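The failover idea can be sketched in a few lines. This is a deliberately simplified model, not an implementation of the protocol: real routers exchange advertisement messages and apply further tie-breaking rules. The `Router` class, the names and the address are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int      # higher wins, as in VRRP
    alive: bool = True


def elect_master(routers):
    """Pick the live router with the highest priority, or None if all are down."""
    live = [r for r in routers if r.alive]
    return max(live, key=lambda r: r.priority, default=None)


VIRTUAL_IP = "192.0.2.1"  # documentation-range address, purely illustrative
group = [Router("r1", 200), Router("r2", 100)]

print(VIRTUAL_IP, "is served by", elect_master(group).name)

group[0].alive = False    # simulate the primary router failing
print(VIRTUAL_IP, "is served by", elect_master(group).name)
```

Hosts keep sending to the same address throughout. Only the router answering for it changes.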

Policy-based networking

However, beware of relying on back-up equipment too much without testing your network on a regular basis, says Haines. If you rely on failover equipment without constantly checking each component to see whether it is working, your redundancy can disappear without you knowing it. Suddenly, large parts of your infrastructure can be operating on back-up devices without you realising, he warns.
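A regular audit of this kind is easy to describe in code. The sketch below assumes you already have some way of asking whether a device is healthy. The `is_healthy` callable and the device names are hypothetical stand-ins for whatever your management system provides.

```python
def audit_redundancy(pairs, is_healthy):
    """Report every primary/back-up pair that no longer has full redundancy."""
    findings = []
    for primary, backup in pairs:
        primary_ok = is_healthy(primary)
        backup_ok = is_healthy(backup)
        if primary_ok and backup_ok:
            continue  # fully redundant, nothing to report
        if not primary_ok and not backup_ok:
            status = "both down"
        elif not primary_ok:
            status = "running on back-up only"
        else:
            status = "back-up unavailable"
        findings.append((primary, backup, status))
    return findings


# Hypothetical inventory and health data.
pairs = [("core-1", "core-2"), ("edge-1", "edge-2"), ("psu-a", "psu-b")]
health = {"core-1": True, "core-2": True,
          "edge-1": False, "edge-2": True,
          "psu-a": True, "psu-b": False}

for primary, backup, status in audit_redundancy(pairs, health.get):
    print(f"{primary}/{backup}: {status}")
```

Neither of the two pairs reported here has caused an outage yet, which is exactly why they go unnoticed without a deliberate check.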

Haines remembers one time at Goldman Sachs, when the company powered down the network and powered it up again, only to see widespread failure on the network as devices that had already been operating in a back-up capacity refused to reboot.

This is not the only type of smart technology available to network managers trying to automate their infrastructures. Policy-based networking has come to the fore in the past two years as a means of making networks more efficient by reflecting business requirements.

Using policy-based networking, you can use the network to prioritise certain users or applications, or even certain types of data such as voice packets. This can be particularly useful if network performance has decayed due to a faulty device. If you are running on 50% bandwidth, you want to ensure that your voice packets get through the network first, and then perhaps that your accounts department can communicate across the network to finish its interim results. Meanwhile, people playing Doom on the network or surfing for the football results will be locked out.

Richard Benwell, European marketing manager for Riverstone Networks, explains that it will be much easier to do this when Multi-Protocol Label Switching is ratified as a quality of service standard by the Internet Engineering Task Force. This will enable layer three devices to analyse and make decisions about packets using electronic "labels" attached to them. In the meantime, such policy-based networking is handled at layer four. "We have smart hardware that drills down to a level four packet and looks at the source and destination addresses in the IP header, and the application port numbers and port socket numbers in the TCP header," explains Benwell. This lets the hardware prioritise packets into different queues or route them differently.
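In software terms, classifying on port numbers and queueing by priority might look something like the sketch below. It is an illustration only: the policy table, the port numbers and the `PolicyQueue` class are assumptions, and real switching hardware does this in silicon rather than in Python.

```python
import heapq
from itertools import count

# Hypothetical policy: a lower number is served first. Ports are illustrative.
POLICY = {
    5060: 0,   # voice traffic (assumed port)
    1521: 1,   # accounts database (assumed port)
}
DEFAULT_PRIORITY = 9
BLOCKED_PORTS = {666}  # e.g. a game server, locked out entirely


class PolicyQueue:
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps arrival order within a class

    def enqueue(self, packet):
        """Queue a packet by destination port; return False if policy drops it."""
        port = packet["dst_port"]
        if port in BLOCKED_PORTS:
            return False
        priority = POLICY.get(port, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._seq), packet))
        return True

    def dequeue(self):
        """Return the highest-priority packet, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None


queue = PolicyQueue()
for port in (80, 1521, 666, 5060):
    queue.enqueue({"dst_port": port})

order = [queue.dequeue()["dst_port"] for _ in range(3)]
print(order)
```

Packets leave in policy order rather than arrival order, and the blocked port never enters the queue at all.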

Other means of maintaining your network efficiency and avoiding problems before they happen are even simpler, says Cunningham. Gathering as much information about your infrastructure as possible will help your network management system and staff to spot and avoid problems early, he says.

Using technologies such as Simple Network Management Protocol and Remote Monitoring (RMon) will help you to gather the relevant data. "RMon got a bad reputation for making problems worse, because of the amount of information it could pass back to the probe," warns Cunningham. "RMon 2 is much lighter and more efficient, but SMon looks like being the next standard for future management."
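As an example of what such data is good for, link utilisation can be worked out from two successive readings of an interface's octet counter, such as the standard `ifInOctets` object. The sketch below assumes a 32-bit counter that wraps at most once between polls. The function name and the sample figures are invented for illustration, and no SNMP library is used.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps back to zero here


def utilisation(octets_then, octets_now, interval_s, link_bps):
    """Fraction of link capacity used between two counter readings."""
    delta_octets = (octets_now - octets_then) % COUNTER32_MODULUS  # copes with one wrap
    return delta_octets * 8 / (interval_s * link_bps)


# Hypothetical readings taken 60 seconds apart on a 10Mbps link.
print(f"{utilisation(0, 7_500_000, 60, 10_000_000):.0%}")
```

Polling too infrequently on a fast link risks the counter wrapping more than once, at which point the figure is meaningless. That is one reason the polling interval matters as much as the data itself.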

The SMon (Switch Monitoring) standard was ratified last year and goes beyond the capabilities of its predecessor. It is designed to monitor all the traffic moving through the switch, rather than simply evaluating particular ports. The protocol was originally designed by Lannet, which was purchased by Lucent Technologies in 1998.

Check and retest

Finally, says Cunningham, good old-fashioned change management will go a long way towards making your network more resilient and less vulnerable to failure. Many of the issues that require human intervention on a network are caused by humans in the first place. We are an unreliable lot compared to the machines that we operate, and a proper set of procedures needs to be put in place to make sure that when changes are made to the system they are scrupulously checked and tested.

"There needs to be a perimeter wall between live systems and others," says Cunningham. "Developers should never have access to live systems." By the time something such as a switch, router or even a basic driver upgrade is installed on a live network, it should have been through a battery of tests in a sandbox environment.

Thankfully, for network managers, it is doubtful whether we will ever achieve a true darkroom network. Human intervention will always be a requirement.

Why the network manager's lights will stay on

The darkroom network scenario will never arise because:

  • Problems can arise on devices on the network without rendering them inoperable. Such problems need to be picked up from network management data by the human eye
  • In the absence of open technology, issues of compatibility will always complicate network configuration and upkeep
  • No forward-looking company is content to make do with stable technology, meaning that network managers will always be chasing the "next big thing"
