Danny Bradbury
computer.weekly@rbi.co.uk
For as long as data networks have existed, people have wanted
faster performance. When Ethernet was invented at Xerox Parc in the
1970s, it ran at a puny 3mbps. Before long, a 10mbps Ethernet
became a standard, but in the early 1990s the industry decided this
was not enough.
It tussled for a couple of years over different standards for a
100mbps Ethernet technology, and soon all network interface cards
were able to handle these speeds. A few years later 1gbps Ethernet
came along, but even this was not enough to sate our need for
speed.
Now, companies are exploring options for 10gbps technology,
mainly as a way to increase performance within datacentres.
Clusters, storage architectures and blade servers all need very
fast communication mechanisms to exchange increasingly large
amounts of data.
The problem with increasing the speed of the network is that it
can adversely affect the server. When an application sends Ethernet
traffic across a network, it traditionally uses the computer's
processor to manage the communications.
The processor interprets an application's data output and feeds
that through to the network. The more traffic you put through it
over time, the fewer CPU cycles the processor has available to do
its other jobs. Consequently, using traditional Ethernet
technologies, the server's performance is likely to decrease as the
speed of the network increases. So, what do you do?
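To put rough numbers on that squeeze, here is a back-of-the-envelope sketch in Python. It assumes a fixed processing cost per byte handled by the server; the figures of two cycles per byte and a single 3GHz core are illustrative assumptions, not measurements from any supplier quoted here.

```python
def cpu_share(link_bps: float, cycles_per_byte: float, cpu_hz: float) -> float:
    """Fraction of one processor core used to handle traffic at full link rate."""
    bytes_per_second = link_bps / 8
    return bytes_per_second * cycles_per_byte / cpu_hz


# Illustrative assumptions only: 2 cycles per byte, one 3GHz core.
for name, bps in [("100mbps", 100e6), ("1gbps", 1e9), ("10gbps", 10e9)]:
    share = cpu_share(bps, cycles_per_byte=2.0, cpu_hz=3e9)
    print(f"{name}: {share:.1%} of one core")
```

Under this simple model the processor's share of work grows in direct proportion to link speed, so a tenfold faster network costs ten times the cycles unless something else takes over the job.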
One option is the TCP offload engine (Toe). This takes the
TCP/IP stack that traditionally runs in software as part of the
operating system and puts it into firmware, usually on the network
interface card. The idea is that, just as with high-end graphics
applications that use dedicated processors for rendering, a
dedicated hardware TCP/IP stack will maintain network performance
without hindering application processing.
However, not everyone is convinced by Toe. One problem is that
using a coprocessor to manage TCP/IP may only solve the problem in
the short term. As the network speed and processor speed increase,
the Toe may find itself increasingly strained by higher
workloads.
Unlike graphics processing, there is nothing in the TCP
instruction set that enables it to scale, says Steve Pope, chief
technical officer at Solarflare, which designs high-performance
networking Asics (application specific integrated circuits).
Instead, experts argue that the best solution is to do away
with TCP/IP communications altogether,
and get applications to write directly to another server's memory
across the network. This concept, called remote direct memory
access (RDMA), lies at the heart of high-performance connectivity
technologies such as Infiniband.
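The difference can be sketched with a toy model in Python. This is not a real RDMA API: the `Host` class, both functions and the copy counter are hypothetical names invented for illustration, and a bytearray stands in for the memory an application has set aside.

```python
class Host:
    """A toy server: 'memory' stands in for the application's buffer."""

    def __init__(self, size: int) -> None:
        self.memory = bytearray(size)
        self.copies = 0  # how many times data was copied on this host


def receive_via_stack(host: Host, packet: bytes) -> None:
    """Conventional path: adapter -> operating system buffer -> application."""
    kernel_buffer = bytearray(packet)  # copy 1: into the operating system
    host.copies += 1
    host.memory[:len(kernel_buffer)] = kernel_buffer  # copy 2: into the app
    host.copies += 1


def receive_via_rdma(host: Host, packet: bytes, offset: int = 0) -> None:
    """RDMA-style path: data is placed straight into application memory."""
    with memoryview(host.memory) as window:
        window[offset:offset + len(packet)] = packet  # single placement
    host.copies += 1


a, b = Host(64), Host(64)
receive_via_stack(a, b"hello")
receive_via_rdma(b, b"hello")
print(a.copies, b.copies)
```

Both hosts end up holding the same data, but the conventional path touches it twice and the RDMA-style path once. In real hardware the saving also comes from keeping the operating system out of the transfer, which this sketch does not model.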
"Infiniband was originally designed for processor-to-processor
data transfers, where you wanted to move data very quickly and
transparently with as little overhead as possible," explains
William Terrill, an associate analyst at Info-Tech Research.
Infiniband's very high speeds make it particularly useful for
applications such as clustering, where multiple machines have to
speak to each other to provide workload sharing and failover
services. However, the drawback has been a lack of
standardisation.
Bill Boas is vice chairman of the Open Fabrics Alliance (OFA),
formerly the OpenIB Alliance, which was founded in 2004 as an
industry effort to produce an open source Linux stack for
Infiniband.
"We did not want to have to figure out whether we should be
using this or that supplier stack, and we did not want to buy
Infiniband hardware from just one supplier," he says.
This is one of the reasons that Infiniband failed to take the
world by storm. Shortly after launch, advocates of the protocol
tried to broaden its scope, proposing it as a solution to many more
things than mere clustering. But Infiniband did things in new ways,
was relatively complex and promised to cause headaches for those
who adopted it within commercial datacentres.
Another problem has been incompatibility. Bob Noseworthy,
technical director at the University of New Hampshire's Dartmouth
Testing Lab, says, "There has been no desire to have a high-end
financial institution or national lab to rewrite their applications
every time they change a supplier's hardware." Dartmouth Lab tests
different suppliers' high-speed networking equipment to make sure
that it works together.
To overcome such limitations, the industry is working on iWarp,
which some are hoping will provide the industry standard
connectivity that Infiniband has not. iWarp is supposed to bring
traditional Ethernet technology and RDMA together, giving users the
best of both worlds.
For Rick Maule, chief executive at iWarp Asic developer
NetEffect, the bottlenecks in handling TCP/IP traffic have two
causes: the need to move data between different parts of a system's
Ram, and the operating system's involvement in packet processing.
Together these can slow networking by as much as 40%,
according to Maule. iWarp attempts to solve these problems while
keeping datacentres grounded in the Ethernet world.
The ultimate aim for iWarp, says Maule, is to create a single
network encompassing three distinct fabrics: storage, networking
and clustering. If the protocol can pull this off, it would be of
huge interest to network managers who currently have to manage
different fabrics for protocols such as Infiniband, Fibre Channel
and conventional Gigabit Ethernet.
But iWarp has been in development for some years. The University
of New Hampshire's Dartmouth Testing Lab formed the iWarp Testing
Consortium in 2004, and in the past couple of years, organisations
such as Network Appliance have made noise about it. Today, many of
them do not want to discuss it. What happened?
"I think the demand may not have been there," says Anne
MacFarland, director at consultancy the Clipper Group, who wrote a
report on iWarp in 2004. "People are very careful about spending
these days, and unfortunately that is very inconvenient for things
like iWarp, which need the demand to push them into adoption."
Another reason for some suppliers' reticence is that they could
be preparing iWarp products. Noseworthy says several of the
companies engaged in iWarp interoperability testing at the lab are
preparing for significant developments in the next few months.
iWarp is a collection of four different standards that have been
making their way through the Internet Engineering Task Force
(IETF), and which have now all been ratified. It seems as if the
road is clear for the technology to move forward.
Maule believes that the IETF, which is defining the standards,
will not break with the existing Ethernet standard in its pursuit
of higher speeds. He says a server with an iWarp-enabled network
interface card should be able to run the same applications on the
same network whether the iWarp acceleration on the card is turned
on or not.
However, iWarp sceptics paint a different picture, arguing that
RDMA requires a different type of interaction with the machine
transmitting the data. Because it bypasses the operating system and
goes straight into the machine's memory, they say that a different
protocol stack is necessary.
Why the different stories? Although iWarp cards will communicate
with each other using the same Ethernet standard, the RDMA
technique means they will speak to computer applications
differently, and will therefore need a protocol stack linking the
card to the application.
Where will this come from, and do we face the same application
interoperability problems with iWarp that we faced with Infiniband?
Will companies have to rewrite their applications every time they
switch iWarp suppliers? Not necessarily.
The Open Fabrics Alliance changed its name from the OpenIB
Alliance for a reason. It has been working on the open source
Infiniband stack that it was originally formed to produce, but has
since expanded the scope to produce a similar Linux-based protocol
stack for iWarp.
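The appeal of a common stack can be sketched in a few lines of Python. Every name here is hypothetical and chosen for illustration; this is not the Open Fabrics Alliance's actual programming interface, only the general idea of writing an application once against a shared layer.

```python
from abc import ABC, abstractmethod


class RdmaProvider(ABC):
    """Hypothetical supplier-specific driver sitting behind a common stack."""

    fabric = "unknown"

    @abstractmethod
    def write(self, remote: bytearray, offset: int, data: bytes) -> None:
        """Place data into a remote buffer at the given offset."""


class InfinibandProvider(RdmaProvider):
    fabric = "Infiniband"

    def write(self, remote: bytearray, offset: int, data: bytes) -> None:
        remote[offset:offset + len(data)] = data  # stand-in for the hardware


class IwarpProvider(RdmaProvider):
    fabric = "iWarp"

    def write(self, remote: bytearray, offset: int, data: bytes) -> None:
        remote[offset:offset + len(data)] = data  # stand-in for the hardware


def replicate(provider: RdmaProvider, remote: bytearray, record: bytes) -> None:
    """Application code: written once, unaware of the fabric underneath."""
    provider.write(remote, 0, record)


for provider in (InfinibandProvider(), IwarpProvider()):
    target = bytearray(16)
    replicate(provider, target, b"trade-42")
    print(provider.fabric, bytes(target[:8]))
```

The `replicate` function never changes when the fabric or the supplier does, which is the kind of insulation that would let applications survive a hardware switch without being rewritten.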
What does this mean for the positioning of the two technologies?
Infiniband has its roots in the high-performance computing market,
and even with the new stack already shipping, it will take the
technology some time to penetrate the datacentre in sufficient
numbers, if this happens at all. iWarp, on the other hand, comes
from low-end roots.
Maule, who is now selling iWarp-enabled cards, hopes that the
technology will take off as quickly as previous high-speed Ethernet
standards have. At the end of the last decade, people started
piloting 1gbps Ethernet, and datacentre deployment was robust in
2000, he says.
"By 2001, there were no datacentre servers that did not have an
adapter. By the time we reached 2003, 1gbps was shipping in
notebooks." There is no reason why iWarp shouldn't follow the same
course, Maule predicts, hoping for ubiquity within five years.
As datacentres are forced to shift increasing volumes of
information between clustered servers and storage devices, and as
interoperability testing between iWarp suppliers appears to be
reaching some sort of pivot point, we may only have to wait 12 to
18 months to find out.
➔ www.openfabrics.org
➔ www.iol.unh.edu/services/testing/iwarp