Virtualisation is widely seen as the best way to construct new
datacentres. It allows datacentre managers to reduce the number of
physical servers, thus lowering electricity costs and increasing
server efficiency.
But DFS Deutsche Flugsicherung, the air traffic operator for
Germany, has found major limitations in the current crop of
virtualisation products.
The company has made savings by moving its test environment
onto the open source Xen hypervisor. But it is still a long way
from running the critical air traffic control system in a virtual
server environment.
The first hurdle DFS has faced is that its traffic control
system requires certification from the German government. There is
only one government body authorised to certify the system and the
process is time-consuming. This means, says Alexander Schanz,
datacentre manager at DFS Deutsche Flugsicherung, that by the time
equipment wins its certificate, it is often already out of
date.
And because most hypervisor software is proprietary, it is
difficult to verify the integrity of the source code, which is
vital in a safety-critical system such as air traffic control, said
Schanz, presenting at the Burton Group Catalyst conference today
in Prague.
"A server may last 18-24 months, but it takes 18 months to
certify air traffic control systems, so the system cannot be
certified before the server hardware is replaced," he said.
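The arithmetic behind this can be laid out in a few lines. The sketch below uses only the figures Schanz quotes (an 18-24 month server life and an 18 month certification process) and assumes, purely for illustration, that certification starts on the day the server is bought. The function name is hypothetical.

```python
def certified_service_months(server_life_months, certification_months):
    """Months of service left once certification completes (never negative)."""
    return max(0, server_life_months - certification_months)


CERTIFICATION_MONTHS = 18  # figure quoted by Schanz

for life in (18, 24):  # server lifetime range quoted by Schanz
    remaining = certified_service_months(life, CERTIFICATION_MONTHS)
    print(f"{life}-month server: {remaining} months of certified use")
```

On these assumptions, a server is certified for between zero and six months of its working life.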
DFS chose the Xen software because it was able to assess Xen’s
freely available source code. Schanz
says the German government certification authority certified the
Xen hypervisor rather than rival products because Xen made the
source code available.
DFS Deutsche Flugsicherung has also experienced problems with
systems administration of virtual server environments. "Some of our
asset management tools could not support virtual machines. We also
found that our internal processes only worked with physical
hardware," says Schanz. For instance, the company's asset tracking
tools could not tell IT how many virtual servers were running.
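As an illustration of the kind of check the asset tools were missing, the sketch below counts guest domains in a tabular listing of the sort printed by Xen's `xl list` command (a header row, then one row per domain, with Domain-0 as the control domain). The sample listing and guest names are invented for the example, and this is not a description of the tooling DFS uses.

```python
# Invented sample in the style of a Xen domain listing; not real DFS data.
SAMPLE_LISTING = """\
Name                ID   Mem VCPUs State   Time(s)
Domain-0             0  2048     4 r-----   1520.3
atc-test-01          1  1024     2 -b----    310.7
atc-test-02          2  1024     2 -b----    298.1
"""


def count_guests(listing):
    """Count guest domains, skipping the header row and the Domain-0 control domain."""
    rows = [line.split() for line in listing.splitlines()[1:] if line.strip()]
    return sum(1 for fields in rows if fields[0] != "Domain-0")


print(count_guests(SAMPLE_LISTING))  # prints 2
```

Running a count like this on each physical host and summing the results would give the figure the asset tools could not supply.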
DFS discovered the hard way that IT administrators need special
training to work in virtual environments. "In one case, an admin
staff pressed the restart button on a physical server, which
stopped and rebooted 38 virtual machines," says Schanz. The
administrator only needed to reboot the virtual machine that had
crashed, not the physical server. "The IT administrator has to
understand what virtualisation means."
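Training is the remedy Schanz describes, but a simple guard in the admin tooling can back it up. The sketch below is one possible approach, not something DFS is reported to use: before a physical host is restarted, it looks up the guests on that host and refuses to proceed unless the operator confirms. The inventory structure, host name and function names are all hypothetical.

```python
class HostRestartBlocked(Exception):
    """Raised when a host restart would take running virtual machines down."""


def check_host_restart(host, inventory, confirmed=False):
    """Return the guests a restart of `host` would stop.

    Raises HostRestartBlocked unless the operator has explicitly confirmed
    that stopping every guest on the host is intended.
    """
    guests = list(inventory.get(host, []))
    if guests and not confirmed:
        raise HostRestartBlocked(
            f"restarting {host} would stop {len(guests)} virtual machines"
        )
    return guests


# Hypothetical inventory: one physical host carrying 38 guests, as in the incident.
inventory = {"host-a": [f"vm-{n:02d}" for n in range(1, 39)]}

try:
    check_host_restart("host-a", inventory)
except HostRestartBlocked as warning:
    print(warning)  # restarting host-a would stop 38 virtual machines
```

The guard does not replace training. It only makes the consequence of the restart visible before it happens.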
Grappling with legacy systems is another challenge. The software
for air traffic control was written 10-15 years ago when different
programming practices were the norm. These older applications try
to allocate as much memory as they can when they start up, says
Schanz. In practice, this means the hypervisor is unable to get the
memory it needs to run virtualisation. As a result, says Schanz, "We
cannot find a complete solution to support virtualisation on air
traffic control client systems."
DFS Deutsche Flugsicherung also faces challenges in reducing the
volume of client hardware needed to run air traffic control. Air
traffic control users run large, specialist monitors with a screen
resolution of 2048 by 2048 pixels. This poses two technical
problems. The first is how to connect the 2k by 2k air traffic
control displays, which need specialist graphics cards, to virtual
servers.
The second problem is that the connection between the display
and the virtual machine is not fault tolerant. Because no protocols
exist to support this, Schanz has had to develop custom fault
tolerant switching code, which would allow air traffic control
users to switch seamlessly between two air traffic control systems
if one fails.
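The article gives no detail of how the DFS switching code works. Purely to illustrate the general idea of switching between two systems when one fails, the sketch below tracks heartbeats from a primary and a secondary system and moves the display to the other system when the active one goes quiet. The class, method names and one-second timeout are all assumptions made for the example.

```python
import time


class FailoverSelector:
    """Choose which of two systems should drive a display, based on heartbeat age."""

    def __init__(self, timeout_s=1.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # illustrative value, not a DFS figure
        self.clock = clock
        self.last_seen = {"primary": None, "secondary": None}
        self.active = "primary"

    def heartbeat(self, system):
        """Record that `system` has just reported in."""
        self.last_seen[system] = self.clock()

    def is_alive(self, system):
        """A system is alive if its last heartbeat is within the timeout."""
        seen = self.last_seen[system]
        return seen is not None and self.clock() - seen <= self.timeout_s

    def select(self):
        """Stay on the active system while it is alive; otherwise switch
        to the other system, provided that one is alive."""
        other = "secondary" if self.active == "primary" else "primary"
        if not self.is_alive(self.active) and self.is_alive(other):
            self.active = other
        return self.active
```

In this sketch the selector does not switch back automatically when the failed system recovers, which avoids flipping the display back and forth between the two systems. A production design would need far more than this, including a way to keep the two systems' state in step.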
Virtualisation is a "no-brainer" in many situations. But DFS
Deutsche Flugsicherung's experience shows that rolling out
virtualisation is not always easy. It requires forward planning, IT
staff training and even the redevelopment of legacy
applications.
Lessons learnt
- In regulated industries, certifying hypervisor code can be
problematic because, apart from Xen, the source code is not
available.
- Legacy code cannot easily be migrated to a virtual server
environment. An application written 10-15 years ago will need to be
modified.
- The industry has yet to develop fault tolerant switching, which
would enable users to run virtual client environments that
switch over seamlessly if there is a failure. DFS Deutsche
Flugsicherung has had to develop its own fault tolerant
protocol.
- IT admin staff will need training in managing a virtual server
environment, so that they do not treat virtual machines as if they
were physical servers.