Server virtualisation and the network: Site consolidation's impact on latency

Server virtualisation and site consolidation bring concerns about the WAN developing latency issues as application servers' physical locations change.

Server virtualisation holds exciting promise for companies in reducing overall administrative costs, consolidating their physical infrastructure, and improving the ability to dynamically roll out new services. It also presents new challenges for network managers in terms of capacity planning, consolidation, latency considerations, and managing dynamically changing networks, which can be altered far more easily by the user base.

This tip is the first in a series that will explore the implications of some of these issues and look at some of the strategies for keeping the network humming along in the face of these new challenges.

The challenge of site consolidation

The biggest challenge of server virtualisation is that it has become synonymous with consolidation, according to Harold Byun, senior product marketing manager at Riverbed. Many companies are pulling out branch office data systems and consolidating multiple data centres along with the virtualisation effort. The concern for IT is whether performance and throughput will suffer over the wide area network (WAN) as application servers' physical locations change.

As the organisation begins to move forward with these efforts, the network manager needs to know whether the existing network is sufficient or whether it can be adjusted to rise to the challenge. That means cataloguing what is being virtualised and consolidated -- and understanding what those applications are and how they perform over the network.

In many cases, the biggest issue is not the total amount of bandwidth; it's the increase in latency that site consolidation can introduce into the networking environment. "We hear a lot of stories about the network manager getting complaints about poor network performance from the CEO who asks them to do what it takes to upgrade the network," Byun said. "They upgrade the link from T1s (1.5 Mbps) to T3 (45 Mbps), only to discover a modest increase in network performance."

Transport latency and the chattiness and inefficiency of application protocols both degrade total application performance when applications are delivered over a wider area. Byun likes to use the analogy of a vehicle taking 200 passengers from New York to LA. A small car can carry only five passengers at a time and would need to make 40 roundtrips to get everyone to LA. It does no good to build a bigger freeway with wider lanes because the car would still have to make 40 trips. A better freeway does nothing to overcome the distance and latency barrier.
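The freeway analogy can be put into rough numbers. The sketch below models transfer time as the cost of serialised round trips plus the time the payload spends on the wire. The T1 and T3 link rates come from the quote above; the payload size, round-trip count, and round-trip time are illustrative assumptions, and the model ignores TCP windowing, congestion, and protocol overhead.

```python
# Rough transfer-time model: serialised round trips plus time on the wire.
# All workload figures below are illustrative assumptions, not measurements.

T1_BPS = 1.5e6   # T1 link rate quoted in the article (bits per second)
T3_BPS = 45e6    # T3 link rate quoted in the article (bits per second)


def transfer_seconds(payload_bytes, round_trips, rtt_seconds, link_bps):
    """Estimate transfer time as latency cost plus serialisation cost."""
    latency_cost = round_trips * rtt_seconds
    serialisation_cost = payload_bytes * 8 / link_bps
    return latency_cost + serialisation_cost


# Hypothetical chatty transfer: 1 MB payload, 1,000 round trips, 50 ms RTT.
PAYLOAD_BYTES = 1_000_000
ROUND_TRIPS = 1_000
RTT_SECONDS = 0.050

t1_time = transfer_seconds(PAYLOAD_BYTES, ROUND_TRIPS, RTT_SECONDS, T1_BPS)
t3_time = transfer_seconds(PAYLOAD_BYTES, ROUND_TRIPS, RTT_SECONDS, T3_BPS)

print(f"T1: {t1_time:.1f} s")
print(f"T3: {t3_time:.1f} s")
print(f"Speed-up from 30x more bandwidth: {t1_time / t3_time:.2f}x")
```

Under these assumptions the upgrade takes the transfer from about 55 seconds to about 50 seconds, because the latency term dominates and is untouched by the extra bandwidth.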

The key to dealing with chatty applications lies in minimising the number of trips they need to make across the network. When the overhead of standard office application protocols, which are running over a wider network in a more consolidated virtual environment, is encapsulated, far fewer journeys are needed to carry the same amount of data. This would be like replacing the car with large buses that can carry 50 passengers, thereby reducing the number of trips from 40 to four.
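The arithmetic behind the car and bus comparison is a ceiling division, sketched below. The function name `trips_needed` is made up for this illustration; the same calculation would apply to protocol messages if one assumes a fixed number of messages can be batched into each round trip.

```python
import math


def trips_needed(passengers, seats_per_trip):
    """Number of journeys needed to move everyone, rounding up."""
    return math.ceil(passengers / seats_per_trip)


PASSENGERS = 200

car_trips = trips_needed(PASSENGERS, 5)    # small car from the analogy
bus_trips = trips_needed(PASSENGERS, 50)   # large bus from the analogy

print(f"Car: {car_trips} trips, bus: {bus_trips} trips")
```

Because each trip pays the full distance penalty, cutting the trip count by a factor of ten cuts the total journey time by roughly the same factor, whatever the width of the road.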

WAN acceleration strategies

One way to address these latency issues is through the use of WAN acceleration equipment from vendors like Riverbed, Juniper Networks, Exinda Networks, Avail, and Silver Peak. For example, Riverbed's Steelhead networking appliance does three things to minimise the number of WAN trips between remote offices and data centres:

  • Data de-duplication
  • Transport layer optimisation
  • Layer 7 protocol optimisation

With data de-duplication, the appliance builds a byte-level cache of the data it has sent. Because it operates at a lower layer of the network, the appliance can still significantly reduce the number of packets that need to be shipped across the WAN, even if someone makes changes to a file, because most of the bits remain unchanged. The Riverbed Steelhead appliances are able to recognise data and send reference pointers to one another in lieu of retransmitting all of the relevant packets across the WAN.
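The general idea of sending references in place of repeated data can be sketched as follows. This is a deliberately simple illustration using fixed-size chunks and SHA-256 digests; it is not Riverbed's algorithm, and the chunk size, function names, and shared cache are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 256  # bytes; an arbitrary illustrative value


def deduplicate(data, seen_chunks):
    """Split data into fixed-size chunks; replace known chunks with references.

    Returns a list of ("ref", digest) or ("raw", chunk) tuples and records
    any previously unseen chunks in seen_chunks.
    """
    encoded = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen_chunks:
            encoded.append(("ref", digest))
        else:
            seen_chunks[digest] = chunk
            encoded.append(("raw", chunk))
    return encoded


def wire_bytes(encoded):
    """Bytes sent: the full chunk for raw data, the digest for a reference."""
    return sum(len(payload) for _, payload in encoded)


def reconstruct(encoded, seen_chunks):
    """Rebuild the data, resolving each reference through the cache."""
    return b"".join(
        seen_chunks[payload] if kind == "ref" else payload
        for kind, payload in encoded
    )


# One dict stands in for the synchronised caches at both ends of the WAN.
cache = {}

# Deterministic 16,384-byte stand-in for a file, with no repeated chunks.
original = b"".join(
    hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(512)
)
# The same file with a single byte changed in place.
edited = original[:100] + bytes([original[100] ^ 0xFF]) + original[101:]

first_send = deduplicate(original, cache)
second_send = deduplicate(edited, cache)

print(f"First send:  {wire_bytes(first_send)} bytes")
print(f"Second send: {wire_bytes(second_send)} bytes")
```

In this sketch the edited file costs only one raw chunk plus 63 short references. Fixed-size chunks only cope with in-place changes, though: an insertion that shifts the rest of the file would move every later chunk boundary and defeat the cache.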

Transport layer optimisation and Layer 7 protocol optimisation work to reduce the chattiness of different classes of applications across the wide area. For example, Windows file sharing, which uses CIFS, can take 1,000 roundtrips to transport a simple file over a network. This works fine in a LAN environment, but a WAN with a 50 ms delay per roundtrip would add 50 seconds to the transmission of the file. The delays become even more onerous when the same file is being transmitted as an email attachment to 50 people. "There are a lot more roundtrips," Byun said, "which is where you see significant slowdown."
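The cost of sequential roundtrips is simply the roundtrip count multiplied by the delay per roundtrip. The sketch below uses the 1,000 roundtrips and 50 ms delay from the CIFS example; the 0.5 ms LAN delay and the 20-fold reduction in roundtrips from protocol optimisation are assumed figures chosen only to show the shape of the effect.

```python
def round_trip_delay(round_trips, rtt_seconds):
    """Time spent purely waiting on sequential round trips."""
    return round_trips * rtt_seconds


ROUND_TRIPS = 1_000        # round trips for one file, per the CIFS example
WAN_RTT = 0.050            # 50 ms per round trip, per the CIFS example
LAN_RTT = 0.0005           # 0.5 ms; an assumed LAN figure for comparison
OPTIMISED_ROUND_TRIPS = 50 # assumed 20-fold reduction, purely hypothetical

lan_delay = round_trip_delay(ROUND_TRIPS, LAN_RTT)
wan_delay = round_trip_delay(ROUND_TRIPS, WAN_RTT)
optimised_delay = round_trip_delay(OPTIMISED_ROUND_TRIPS, WAN_RTT)

print(f"LAN:           {lan_delay:.1f} s")
print(f"WAN:           {wan_delay:.1f} s")
print(f"WAN optimised: {optimised_delay:.1f} s")
```

The delay per roundtrip is fixed by distance, so the only lever left in this model is the roundtrip count, which is what transport and Layer 7 optimisation aim to reduce.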

Other chatty protocols include HTTP, Microsoft Exchange traffic (MAPI), MS SQL, and the Unix Network File System (NFS). Unfortunately, network managers don't have the luxury of selecting which protocols they are going to use. Byun noted, though, that "they do have the luxury of deciding whether they want to keep the protocols local."

For example, the Defense Contract Management Agency (DCMA), which monitors work on military contracts, consolidated 18 data centres down to two and 625 physical servers down to 200 using VMware ESX servers. When DCMA first consolidated the servers, however, its users complained about performance -- even though there was enough bandwidth to handle the traffic. Byun said that Riverbed helped DCMA develop an architecture that confined application chattiness, thereby reducing overall network latency and improving performance to acceptable levels.
