The invincible Web site

Can there ever be such a thing as an 'always-up' Web site? Some people think so, especially if you use a combination of load balancing, clustering and IP traffic monitoring.

Anyone who trades on the Internet, whatever market they may be in, relies on getting the business basics right: high availability, high reliability, high security and high speed.

But with the e-commerce boom still rising, albeit less steeply than this time last year, already overloaded networks face the inevitable strain that ever-increasing traffic across the public Internet brings.

The buzzwords for savvy, modern business in recent years have been Internet Protocol and convergence - and for good reason. IP is the infrastructure technology of choice as we start life in the 21st century, so naturally its benefits of budget savings, scalability, relative simplicity and simplified support and management have made it the focus of converging technologies and markets.

A single wire providing myriad services has always been the holy grail; now the search is over and the masses are starting to coalesce around it. But therein lies the rub: much as the plain old telephone system was never intended to carry data, and voice users get a rough ride when congestion strikes, so IP was never meant to be a multi-discipline traffic carrier.

With traffic types as varied as TCP/IP, bridged Ethernet, Novell IPX and VoIP fighting for space across the enterprise network, and with bandwidth requirements and latency sensitivity that vary dramatically between them, it should come as no surprise that business-critical application performance can degrade.

Think about it - you've got TCP/IP and non-TCP/IP traffic competing for network resources, along with multiple types of TCP/IP traffic competing with each other, just to make things more complicated.

Under such circumstances can there be such a thing as the 'always-up' commercial Web site? Surprisingly, the answer is a qualified 'yes, just about'. The trick lies in getting the balance right through a combination of load balancing, clustering and IP traffic monitoring techniques.

Take a long-term view
Sitara Networks' Jane Cox outlines some of the options: "Throwing more bandwidth at a corporate network is one option, but this is a short-term approach. Companies have upgraded the 10Mbps LANs of 1995 to the 100Mbps fast Ethernet LANs that fuel IT networks today, and some are implementing gigabit Ethernet links.

Even as the cost of bandwidth continues to fall, there are still congestion points affecting the enterprise's and service provider's ability to meet service commitments for critical business traffic flows. Mission-critical business suffers because raw bandwidth neither recognises nor prioritises data flows.

Best-efforts delivery may be acceptable for Web traffic, but real-time applications require a guaranteed delivery mechanism that meets the most stringent quality of service standards."

So if increasing bandwidth doesn't solve network availability problems - because many IP-based applications, such as e-mail, eat up as much bandwidth as they can, squeezing business-critical applications into the slow lane - what about intelligent routers?

"Routers have been designed to be high performance solutions for forwarding traffic. Increasingly, though, vendors have been adding some intelligence for QoS management, but not the depth of intelligence that effective QoS services require," says Cox.

"Routers address only a subset of traffic management necessities, not the extensive policy setting capabilities QoS appliances possess. In addition, most routers offer limited classification capabilities and do not provide Web caching or visual real-time monitoring and reporting.

Regardless of whether they use routers for traffic management, most businesses have to implement complementary QoS devices for classification, hierarchical policy setting, Web caching and real-time monitoring and reporting."

How about QoS bandwidth management, then? QoS devices can enable e-business applications - including e-commerce, intranet access, Internet browsing and multimedia streams - to share bandwidth with legacy and other existing applications, while ensuring that response times for business-critical applications are always protected.

"The QoS process can be divided into basic functions: monitoring, policy setting, policy enforcement and reporting," says Cox. "These form a circular process that provides feedback to ensure the desired results are not only achieved, but maintained. Networks and applications are dynamic environments and QoS implementations need periodic tuning to stay effective. With these elements in place, network managers can implement QoS policies that effectively support their business processes."
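The policy-enforcement step Cox describes is commonly implemented with a token-bucket algorithm. The following is a minimal sketch, not drawn from any particular product; the traffic classes and rates are purely illustrative assumptions:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter, a common policy-enforcement
    mechanism in QoS devices. Tokens accrue at `rate` bytes/sec
    up to `burst`; a packet conforms only if enough tokens
    are available when it arrives."""

    def __init__(self, rate, burst):
        self.rate = rate            # sustained rate in bytes/sec
        self.burst = burst          # maximum burst size in bytes
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, packet_size):
        now = time.monotonic()
        # Accrue tokens for the elapsed interval, capped at burst.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_size:
            self.tokens -= packet_size
            return True             # conforming: forward the packet
        return False                # non-conforming: drop or queue it

# Hypothetical policy: business-critical traffic gets a higher rate.
policies = {
    "erp":  TokenBucket(rate=1_000_000, burst=100_000),  # 1 MB/s
    "mail": TokenBucket(rate=100_000, burst=10_000),     # 100 KB/s
}
```

Monitoring and reporting would then feed back into the rates chosen here, closing the circular process Cox outlines.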

Clustering versus load balancing
Weynand Kuijpers, vice president of European operations at Web hosting company NTT/Verio, considers the differences between these two approaches to Web site uptime.

"Load balancers bring multiple sources for getting the same content, either locally or geographically spread. A load balancer provides 'instant redundancy' in terms of simple content sources. It allows a general Web site architecture to be very scalable, by having capacity on demand (for example, by deploying more content serving servers as the engine behind the load balancer or deploying more geographically spread content origins).

"In addition, they bring a logical decoupling of inbound Internet traffic from the content serving servers - the general public does not talk to the content serving servers directly, which 'hides' flaws in the Web servers and Web daemons.

"Clustering, on the other hand, generally isn't a device-based technology but a feature of clustered operating systems, and it provides redundancy for higher-order processes than a simple Web server. Mission-critical parts of a Web hosting infrastructure, such as databases and application infrastructures, require hot-standby platforms to run the application or database."

The NTT/Verio approach is to combine all the technologies, providing customers with a total managed system. Load balancing and clustering are technologies that, used in the correct combination, augment each other and make the Web infrastructures more resilient, scalable and manageable.
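The behaviour Kuijpers describes, a single entry point hiding a pool of content servers and routing around failures, can be sketched in a few lines of Python; the server names and pool are hypothetical:

```python
import itertools

class LoadBalancer:
    """Minimal round-robin load balancer: clients see one entry
    point, requests are spread over a pool of content servers,
    and servers failing health checks are skipped."""

    def __init__(self, servers):
        self.servers = servers             # backend pool
        self.healthy = set(servers)        # servers passing health checks
        self._rr = itertools.cycle(servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        self.healthy.add(server)

    def pick(self):
        # Walk the round-robin cycle until a healthy server turns up.
        for _ in range(len(self.servers)):
            server = next(self._rr)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy backend available")

# Usage: two hypothetical backends; traffic routes around a failure.
lb = LoadBalancer(["web1", "web2"])
lb.mark_down("web1")
# Every request now goes to web2 until web1 recovers.
```

Adding capacity on demand, as Kuijpers puts it, is then just a matter of growing the pool; the entry point the public talks to never changes.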

Traffic shaping
Nor should we forget IP traffic management. Although a broad term, it can best be explained as IP filtering and firewalling to shape the traffic: making sure only authorised traffic reaches the target infrastructure. The difficulty is that traffic shaping slows access to the Internet site. Asked whether there were any credible alternatives to the load balancing/clustering/IP traffic shaping trio, Kuijpers responded with a qualified 'no'.

"I would position the load balancer as the most important component, clustering as the second most important and then IP traffic shaping. As the Internet grows in terms of points of presence, it gets easier to have content in multiple places, tied together by global load balancing technologies."
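The IP filtering described above, admitting only authorised traffic, can be sketched with Python's standard ipaddress module; the allowed networks here are purely illustrative:

```python
import ipaddress

# Hypothetical allowlist: only these networks may reach the site.
ALLOWED = [
    ipaddress.ip_network("192.0.2.0/24"),     # example partner network
    ipaddress.ip_network("198.51.100.0/24"),  # example office range
]

def is_authorised(source_ip: str) -> bool:
    """Return True if the packet's source address falls inside
    one of the authorised networks, False otherwise."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED)
```

A real firewall enforces this at wire speed in the kernel or in dedicated hardware, but the decision it makes per packet is essentially this membership test.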

Nick Bond, technical manager at Radware UK, knows a thing or two about traffic management because it's what his company specialises in - and his words should be comforting to all involved in serious e-business.

"There is such a thing as an invincible Internet site if you don't put all your network eggs in one basket," he says, instead advising that "to build a fail-safe site you must disperse content - use different ISPs in different locations - then you can look to achieve 100 per cent availability".

Essentially, you are ensuring content is not exposed to the same dangers. This can be taken a step further by using different devices in different locations so that potential vulnerabilities will not put all sites out of action simultaneously.

Ask Bond for his tip for the future and he'll nod in the direction of a Web-based global load balancing management service outsourced to a third party. "It is the next wave. First we had connectivity, then security and now we are seeing the development of highly resilient, global solutions."

Next time your business suffers as a result of unexpected downtime, don't say we didn't warn you.

The fault-tolerant Web
Nick Cheetham, UK managing director at Stratus Technologies, believes that although nothing is invincible, if you need a transactional e-commerce Web site up and running 24/7 you can get pretty close.

"Some Stratus customers have gone 10, 12 and as many as 17 years without experiencing any unplanned computer downtime," Cheetham claims. "This level of performance is important when people rely on applications supported by servers to make credit card purchases, trade stock, book airline reservations or fill prescriptions around the clock and around the world.

"To illustrate this near-invincibility, after the 1989 San Francisco earthquake registering 7.1 on the Richter scale, and the 1994 Northridge earthquake in Los Angeles measuring 6.7, Stratus electronically accessed all customers' machines in the area. All were running and none required service.

One customer's system had danced across the floor, held fast only by its strained power cord, and kept on processing. Eventually a support coordinator reached a customer employee by phone, who reported: 'The server is fine. Can't talk. Everything else is down.'"

So industrial-strength Internet business is possible, but how do the technologies we've discussed here fit into this heavy-duty equation?

"I take issue with the idea that to achieve the highest stability clusters are the answer," Cheetham says. "Clusters are often quoted as providing 99.9 per cent uptime (so-called three 9s availability). While this sounds impressive it actually translates to a downtime of eight hours a year. The problem is you can't predict when this will occur, so you must assume it happens at your worst time to arrive at a potential downtime cost to the business.

"Load balancing, on the other hand, allows a service or application to span multiple hosts. This may be desirable for two reasons: first, to allow an application to scale above the capability of any single system; second, to allow a single system to fail and for the other systems in the group to recover what threads of the service are left, reconnect the users and pick up the load, if they have the spare capacity.

"A fault-tolerant approach provides 99.999 per cent uptime (five 9s availability). This translates to a downtime of five minutes a year. This is 100 times better than a cluster, so the potential downtime cost to the business is 100 times less. On IP load balancing, our fault-tolerant servers provide support for all the leading approaches (Adapter Fault Tolerance, Adaptive Load Balancing, Gigabit EtherChannel). The key issue is not whether a user should focus on a hardware or software approach, but rather on how available the system chosen will be."
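Cheetham's uptime figures follow from simple availability arithmetic; a quick sanity check, assuming an 8,760-hour year:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def downtime_per_year(availability):
    """Hours of downtime per year at a given availability fraction."""
    return HOURS_PER_YEAR * (1 - availability)

three_nines = downtime_per_year(0.999)    # 8.76 hours a year
five_nines = downtime_per_year(0.99999)   # 0.0876 hours, about 5.3 minutes
ratio = three_nines / five_nines          # the 100x gap Cheetham cites
```

Each extra nine cuts downtime by a factor of ten, which is why the jump from three nines to five nines is worth exactly two orders of magnitude.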

What about cost factors - after all, fault-tolerant servers have traditionally been seen as the domain of the multinationals, banking industry and the like?

"There are comparable services costs and cheaper IT support costs with fault tolerance. Clusters are difficult to build, configure and maintain. The technical advantage of the Stratus server is that it looks externally like a regular Windows 2000 server, so any Windows 2000-trained Microsoft Certified Professional (MCP) can maintain it.

With fault-tolerant servers there is no loss of service and no lost transactions in failure situations, and downtime is 100 times less than with clusters. This is the key point: downtime equals lost revenue. High-profile examples of extended downtime at the London Stock Exchange and the Egg bank show that companies often lose millions during downtime."
