German air traffic controller exposes virtualisation limitations

Virtualisation is widely seen as the best way to construct new datacentres. It allows datacentre managers to reduce the number of physical servers, thus lowering electricity costs and increasing server efficiency.

But DFS Deutsche Flugsicherung, the air traffic operator for Germany, has found major limitations in the current crop of virtualisation products.

The company has made savings by moving its test environment onto the open source Xen hypervisor. But it is still a long way from running the critical air traffic control system in a virtual server environment.

The first hurdle DFS has faced is that its traffic control system requires certification from the German government. There is only one government body authorised to certify the system and the process is time-consuming. This means, says Alexander Schanz, datacentre manager at DFS Deutsche Flugsicherung, that by the time equipment wins its certificate, it is often already out of date.

And because most virtualisation hypervisor software is proprietary, it is difficult to verify the integrity of the source code, which is vital in a safety-critical system such as air traffic control, said Schanz, presenting at the Burton Group Catalyst conference today in Prague.

"A server may last 18-24 months, but it takes 18 months to certify air traffic control systems, so the system cannot be certified before the server hardware is replaced," he said.

DFS chose the Xen software because it was able to assess Xen’s freely available source code. Schanz says the German government certification authority certified the Xen hypervisor rather than rival products because Xen made the source code available.

DFS Deutsche Flugsicherung has also experienced problems with systems administration of virtual server environments. "Some of our asset management tools could not support virtual machines. We also found that our internal processes only worked with physical hardware," says Schanz. For instance, the company's asset tracking tools could not tell IT how many virtual servers were running.

DFS discovered the hard way that IT administrators need special training to work in virtual environments. "In one case, an admin staff pressed the restart button on a physical server, which stopped and rebooted 38 virtual machines," says Schanz. The administrator only needed to reboot the virtual machine that had crashed, not the physical server. "The IT administrator has to understand what virtualisation means."
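The difference between the two restarts can be sketched in a few lines of Python. This is an illustrative model only, not a DFS or Xen tool: the host and virtual machine names are hypothetical, and only the figure of 38 virtual machines on one physical server comes from Schanz's account.

```python
# Hypothetical inventory: one physical host carrying 38 virtual machines.
hosts = {"host-1": [f"vm-{i:02d}" for i in range(1, 39)]}


def affected_by_restart(target, inventory):
    """Return every virtual machine that goes down when `target` is restarted."""
    if target in inventory:
        # Restarting a physical host takes down every VM running on it.
        return list(inventory[target])
    for vms in inventory.values():
        if target in vms:
            # Restarting a single VM affects only that VM.
            return [target]
    raise KeyError(f"unknown host or VM: {target}")


host_restart = affected_by_restart("host-1", hosts)
vm_restart = affected_by_restart("vm-07", hosts)
print(len(host_restart), "VMs affected by restarting the host")
print(len(vm_restart), "VM affected by restarting only the crashed VM")
```

An administrator who checks the impact before pressing the button sees at once that the host restart is the wrong tool for one crashed virtual machine.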

Grappling with legacy systems is another challenge. The software for air traffic control was written 10-15 years ago, when different programming practices were the norm. These older applications try to allocate as much memory as they can when they start up, says Schanz. In practice this means the hypervisor is unable to get the memory it needs to run virtualisation. As a result, says Schanz, "We cannot find a complete solution to support virtualisation on air traffic control client systems."
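The arithmetic behind the memory problem is simple, as the following Python sketch shows. All of the figures are assumptions chosen for illustration (a 16GB host and a hypervisor that needs 1GB for itself); none of them come from DFS.

```python
from typing import Iterable

HOST_MB = 16_384             # assumed physical memory on the host
HYPERVISOR_NEEDS_MB = 1_024  # assumed memory the hypervisor needs for itself


def hypervisor_headroom_mb(host_mb: int, guest_requests_mb: Iterable[int]) -> int:
    """Memory left for the hypervisor if every guest request is granted in full."""
    return host_mb - sum(guest_requests_mb)


# Applications that each request a fixed, modest amount leave room to spare.
modest = hypervisor_headroom_mb(HOST_MB, [4_096, 4_096, 4_096])

# A legacy application that grabs all the memory it can see leaves nothing.
greedy = hypervisor_headroom_mb(HOST_MB, [HOST_MB])

print(modest >= HYPERVISOR_NEEDS_MB)  # hypervisor can run alongside the guests
print(greedy >= HYPERVISOR_NEEDS_MB)  # hypervisor is starved
```

Under these assumptions the only fixes are to cap what the application is allowed to see or to rewrite its start-up allocation, which is the kind of legacy redevelopment the article describes.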

DFS Deutsche Flugsicherung also faces challenges in reducing the volume of client hardware needed to run air traffic control. Air traffic control users run large, specialist monitors with a screen resolution of 2048 by 2048 pixels. This poses two technical problems. The first issue is how to connect the 2k by 2k air traffic control displays, which need specialist graphics cards, to virtual servers.

The second problem is that the connection between the display and the virtual machine is not fault tolerant. Because no protocols exist to support this, Schanz has had to develop custom fault-tolerant switching code to allow air traffic control users to switch seamlessly between two air traffic control systems if one fails.
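Schanz did not describe how the DFS switching code works. As a rough idea of what such code has to do, the Python sketch below uses a common heartbeat-and-timeout approach: the display stays on the active system until it goes silent, then swaps to the standby. The class names, system names and the one-second timeout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    """Hypothetical stand-in for one air traffic control system feeding a display."""
    name: str
    last_heartbeat: float  # time in seconds when the system last reported it was alive


class FailoverSwitch:
    """Keeps the display on the active feed, swapping to the standby if it goes silent."""

    def __init__(self, primary: Feed, standby: Feed, timeout: float = 1.0):
        self.active = primary
        self.standby = standby
        self.timeout = timeout  # assumed value, not a DFS figure

    def is_alive(self, feed: Feed, now: float) -> bool:
        return now - feed.last_heartbeat <= self.timeout

    def select(self, now: float) -> Feed:
        """Return the feed the display should use at time `now`."""
        if not self.is_alive(self.active, now) and self.is_alive(self.standby, now):
            # Swap roles, so the failed system becomes the standby once it recovers.
            self.active, self.standby = self.standby, self.active
        return self.active


primary = Feed("system-a", last_heartbeat=10.0)
standby = Feed("system-b", last_heartbeat=10.0)
switch = FailoverSwitch(primary, standby, timeout=1.0)

first = switch.select(now=10.5).name   # both systems alive, display stays on system-a
standby.last_heartbeat = 12.0          # system-b keeps reporting, system-a has gone silent
second = switch.select(now=12.2).name  # system-a has timed out, display moves to system-b
print(first, second)
```

A production version would also have to carry display state across the switch so that the controller sees no interruption, which is the hard part that no off-the-shelf protocol covered.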

Virtualisation is a "no-brainer" in many situations. But DFS Deutsche Flugsicherung's experience shows that rolling out virtualisation is not always easy. It requires forward-planning, IT staff training and even the redevelopment of legacy applications.

Lessons learnt

  • In regulated industries, certifying hypervisor code can be problematic because, apart from Xen, the source code is not available.
  • Legacy code cannot easily be migrated to a virtual server environment. An application written 10-15 years ago will need to be modified.
  • The industry has yet to develop fault-tolerant switching, which would enable users to run virtual client environments that switch over seamlessly if there is a failure. DFS Deutsche Flugsicherung has had to develop its own fault-tolerant protocol.
  • IT admin staff will need training in managing a virtual server environment, so that they do not treat virtual machines as if they were physical servers, for example by restarting a physical host to recover a single crashed virtual machine.