Scanning tool gets to meat of the matter

When Derek Wyatt, Labour MP for Sittingbourne and Sheppey, called last month for ISPs to be made responsible for any unsolicited e-mail received by their subscribers, he did more than simply bring attention to the number of pornographic images he receives on a daily basis - a startling amount, by his own account.

The MP also highlighted a growing problem in this country for all those who rely on the Internet, and on e-mail in particular, as a primary means of communication.

Spam is massively on the increase - a recent MessageLabs survey of IT managers showed that one in seven business e-mails received is unwanted, and about 10% of the working day is spent dealing with them.

And the problem is set to get worse. It is often stated that when the US sneezes the UK catches cold - and among respondents to a similar survey in the US, a staggering one in three e-mails or so is spam.

The resultant problem is not just one of public decency, however. Even "harmless" content, such as the marketing of get-rich-quick schemes or the latest consumer products, has enormous impact on firms in terms of the loss of employee time, bandwidth and storage space.

As with e-mail viruses, spam is now something that needs to be properly contained if it is not to swamp our e-mail system and, in the words of one commentator, "kill the Internet's killer app".

What is unique to the problem of dealing with this type of mail, however, is not so much its quantity but the difficulty of its definition. If you want to stop spam e-mails getting through - or in Wyatt's case, make someone responsible for that on your behalf - you have to be very clear about what is and isn't "unsolicited".

With viruses, it is easy - stop the lot. But one man's spam is another's useful information. While the vast majority of respondents to the MessageLabs survey were clear that marketing material or news from someone they didn't know constituted spam, a third said that a promotional e-mail or newsletter from a known source would also be classified as such.

For technology to match up to this variation in definition requires sophisticated filtering processes. Traditional black and white lists, which pick up only on e-mails from known spammers and work on an "all or nothing" basis, simply are not up to the task. Instead, you need a flexibility of approach that will allow firms and individuals to set their own definition and filter accordingly.

It is appropriate that "heuristics" scanning, a technology that has transformed virus detection from a hit-and-miss affair to a highly intelligent and accurate process, should also be the solution to the current spam dilemma. Essentially, heuristics enables the detection of new viruses without the need for signatures - a technique that is best employed at the Internet level so viruses can be stopped before they reach the recipients' inboxes and start doing damage.

These heuristic techniques have now been adapted to identify and stop spam by scoring each e-mail against a set of rules based on known spam subjects and techniques. For example, the word "erection" might be worth five points, but if it appeared in the same sentence as the word "Viagra", the score could be doubled. If the message achieves more than a specified score - set in advance by the recipient - then the e-mail is immediately identified as spam and stopped.
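
The scoring idea can be illustrated with a short sketch. This is not MessageLabs' implementation: the rule tables, the function names score_message and is_spam, and the crude sentence splitting are all assumptions made for illustration. Only the five-point value for "erection" and the doubling alongside "Viagra" come from the example above.

```python
import re

# Hypothetical rule table: word -> points. Only the 5 points for
# "erection" is taken from the article's example.
WORD_POINTS = {"erection": 5}

# Hypothetical co-occurrence rules: (word, companion) -> multiplier applied
# when both words appear in the same sentence.
BOOSTS = {("erection", "viagra"): 2}


def score_message(text):
    """Return the total heuristic score for a message body."""
    total = 0
    # Naive sentence split on ., ! and ?; a real filter would need more care.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        for word, points in WORD_POINTS.items():
            if word not in words:
                continue
            for (rule_word, companion), multiplier in BOOSTS.items():
                if rule_word == word and companion in words:
                    points *= multiplier
            total += points
    return total


def is_spam(text, threshold):
    """Flag the message when its score exceeds the recipient's threshold."""
    return score_message(text) > threshold
```

With a threshold of 8, a sentence that mentions both words would be stopped under these assumed rules, while one that mentions only "erection" would get through.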

By using heuristics, more sophistication can be built into stopping spam on a user-by-user basis - allowing greater control over what does and what does not get through. For example, while some companies may filter any e-mails containing the word "erection", a firm of architects probably would not.
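
One way to picture user-by-user control is to let each recipient override the default rule weights and choose a threshold. The sketch below is purely illustrative: the RecipientProfile class, the DEFAULT_POINTS table and its values are hypothetical and do not describe any particular product.

```python
import re

# Hypothetical default weights shared by all recipients.
DEFAULT_POINTS = {"erection": 5, "viagra": 5}


class RecipientProfile:
    """A recipient's own threshold plus any overrides of the default weights."""

    def __init__(self, threshold, overrides=None):
        self.threshold = threshold
        # Overrides replace the default weight for the same word.
        self.points = {**DEFAULT_POINTS, **(overrides or {})}

    def score(self, text):
        # Sum the weight of every scored word, counting repeats.
        words = re.findall(r"[a-z]+", text.lower())
        return sum(self.points.get(word, 0) for word in words)

    def blocks(self, text):
        # Stop the message only when it scores above this recipient's threshold.
        return self.score(text) > self.threshold


# A generic company keeps the defaults; the architects zero out one word.
generic = RecipientProfile(threshold=4)
architects = RecipientProfile(threshold=4, overrides={"erection": 0})
```

Under these assumed settings, a message about steel erection on a building site would be stopped for the generic company but delivered to the architects, while both would still stop a message advertising Viagra.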

The percentage success rate for the heuristics method of spam filtering is in the high 90s, with minimal false positives. Indeed, successful elimination of false positives will become the holy grail of anti-spam services.

Currently, ISPs hold the opinion that they should not be responsible for stopping Wyatt's spam e-mail. However, if they had the technology to do it successfully, and Wyatt was willing to pay for it, I am sure it is a service they would be happy to provide.

Mark Sunner is chief technology officer at MessageLabs
