Problems with virus scanning XML files could slow Office 2003 uptake

Security experts have warned IT directors considering an upgrade to Office 2003 that the problem of identifying macros in XML files will substantially slow down the process of stopping viruses and could require additional investment in hardware.

If unresolved by the summer launch date, the problem could hinder take-up of Microsoft's latest productivity suite, they said.

The problem is that macros - pre-recorded menu selections and commands - in Office 2003 documents have no fixed location when the documents are saved as XML files. This means the entire file has to be parsed to identify them, creating more work for the scanning software.

"This is a big issue," said Alex Shipp, senior anti-virus technologist at security firm MessageLabs. "It is important because of the slowdown impact. The big danger is that users will turn off virus scanners because they are too slow. Anti-virus software companies are very worried about this and want Microsoft to do something."

Shipp believes scanning XML files for viruses would be "at least twice as slow" as scanning regular files, and he expects the slowdown to be worse on desktops than on servers and gateways. The problem could affect the take-up of Office 2003, he said.

One option open to Microsoft is to include information on macros in the file headers. "I think that is the only possible solution," said Shipp.

Graham Titterington, senior analyst at Ovum, said, "It looks like a pressing problem." Titterington said the problem lies not with the XML language itself but with the XML schemas used to define documents. Like Shipp, he suggested including an index for macros in the file headers. "I would have thought it would be a relatively easy thing to slot in," he said.

However, Stuart Taylor, head of virus labs at security firm Sophos, pointed out that anti-virus software also needs to scan embedded objects, such as pictures or hyperlinks, because they can be infected executables.

As embedded objects may also be scattered throughout a file, if Microsoft were to put information in the headers about macros, they would need to do the same for embedded objects, he said.
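The header index that Shipp, Titterington and Taylor describe could look something like the following minimal sketch. It is an illustration only: the one-line JSON header, the "kind", "offset" and "length" fields, the function names and the sample tags are all assumptions made here, not anything Microsoft has proposed or anything in the Office 2003 schemas.

```python
import json

def span(kind, body, fragment):
    """Describe where a fragment sits in the body (hypothetical index entry)."""
    return {"kind": kind, "offset": body.index(fragment), "length": len(fragment)}

def build_indexed_file(body, spans):
    """Prefix the body with a one-line JSON header listing the spans to scan."""
    header = json.dumps({"index": spans}).encode("utf-8") + b"\n"
    return header + body

def read_indexed_parts(data):
    """Read only the header, then slice out the listed spans without parsing the rest."""
    header_line, body = data.split(b"\n", 1)
    index = json.loads(header_line)["index"]
    return [(e["kind"], body[e["offset"]:e["offset"] + e["length"]]) for e in index]

# Made-up document body with one macro and one embedded object.
BODY = b"<doc><p>text</p><macro>Sub A()</macro><p>more</p><object>IMG</object></doc>"
SPANS = [
    span("macro", BODY, b"<macro>Sub A()</macro>"),
    span("object", BODY, b"<object>IMG</object>"),
]

parts = read_indexed_parts(build_indexed_file(BODY, SPANS))
```

In this sketch the scanner goes straight to the listed byte ranges, which is why the index would have to cover embedded objects as well as macros to be of use.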

Taylor also warned that the text-based format of XML files will make it easier to inject a virus into the file without having to open Word.

Titterington said IT directors running anti-virus software on XML files would need to invest in more hardware to get the job done to current timescales. A short-term alternative would be to stick with older operating systems and versions of Office software, he said.

XML functionality is central to Office 2003, and Microsoft hopes it will persuade users to upgrade. However, IT user groups such as Tif and Elite have questioned whether users need the extra functionality.

Simon Marks, Office product manager at Microsoft, said the company is working closely with anti-virus software suppliers and a solution to the macros problem would be ready in "a number of weeks".


Why do XML files take longer to scan?

In .doc files macros are always kept in certain locations, so anti-virus software knows where to scan for malicious code. However, in the text-based XML file format, macros can be held anywhere in the file. This means the entire document has to be scanned to find the relevant tags and ensure the file is virus-free.
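The cost of a full scan can be illustrated with a short sketch. It assumes a made-up "macro" tag and a tiny sample document; the real Office 2003 XML schema and the way anti-virus products parse it are not described here, so the names below are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical tag name; the real schema is not given in the article.
MACRO_TAG = "macro"

def find_macros_full_scan(xml_bytes):
    """Walk every element in the document and collect macro text.

    Returns the macros found and the number of elements visited,
    to show that the whole file is touched even for a single macro.
    """
    found = []
    visited = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        visited += 1
        if elem.tag == MACRO_TAG:
            found.append(elem.text or "")
    return found, visited

# Made-up sample: one macro buried among ordinary content.
SAMPLE = b"""<document>
  <para>Quarterly report</para>
  <section>
    <para>Figures</para>
    <macro>Sub AutoOpen()</macro>
  </section>
  <para>Closing remarks</para>
</document>"""

macros, visited = find_macros_full_scan(SAMPLE)
```

Here the scanner has to visit all six elements to be sure it has found the only macro, whereas a format with a fixed macro location would let it read that location alone.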