Stopping the Spammers
Posted: 09:03 26 Jun 2003
Topics: e-mail | Spam & Phishing | Security
With nearly one in two e-mails being
junk, consumers could stop doing business electronically. Danny
Bradbury finds out how spammers work and what IT departments can do
to fight them.
A simple invitation, sent by an engineer in 1978 to entice people to a
reception for a new DEC-20 machine, got the spam ball rolling. Now,
spam is threatening to seriously hinder e-mail as a medium for
commercial communications. E-mail filtering company Brightmail has
reported that 46% of all mails that it encounters are spam. If
nearly one in two e-mails received are junk, consumers could easily
revert to other means of communication. Clearly, something has to
be done.
The problem is defining what spam is. Most people with no
commercial interest in e-mail marketing would probably define all
mass e-mailers as spammers. Monica Seeley, founder of IT
consultancy Mesmo and author of a book on using e-mail correctly,
believes that any mail not directly helpful to doing a job is
spam.
Perhaps predictably, mass mail companies have a different view.
Loren McDonald, vice-president of marketing for MailLabs, which
manages mass e-mail marketing campaigns, agrees with most anti-spam
advocates that a company should have a person's permission before
it e-mails them. But he argues that there are several types of mass
e-mailer. The kind that obtains e-mail addresses without consent -
perhaps by surfing websites or buying a CD of names - and uses them
to send untargeted e-mail is clearly a spammer. But there are
others who are simply unaware that you should have the recipient's
permission before you e-mail them, he says, and may take e-mail
addresses from trade directories to build up a base of contacts. A
third group is legitimate, but their e-mails are so badly designed
that they are dismissed as spam.
This group is the most interesting because it obtains opt-in
consent almost by stealth. When you register your address on a
website legitimately offering a product or service, a box asking if
you want to receive related marketing information is already
checked, and tucked away at the bottom of the form. More reputable
companies these days will adopt a double opt-in system, requiring
you to confirm your wishes by replying to an e-mail or accessing a
URL sent to you in an e-mail after your initial registration.
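The double opt-in step described above can be sketched in a few lines of Python. This is only an illustration, not any particular company's system: the `SECRET_KEY`, the `example.com` confirmation URL and the function names are all assumptions made for the example. The idea is that the address joins the list only when someone follows a link containing a token that could only have come from the confirmation e-mail.

```python
import hashlib
import hmac
import secrets
from urllib.parse import urlencode

# Hypothetical server-side secret; a real service would load it from secure storage.
SECRET_KEY = secrets.token_bytes(32)


def make_token(address: str) -> str:
    """Return a confirmation token tied to one e-mail address."""
    digest = hmac.new(SECRET_KEY, address.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def confirmation_url(address: str) -> str:
    """Build the link that would be e-mailed to the address (placeholder domain)."""
    query = urlencode({"addr": address, "token": make_token(address)})
    return "https://example.com/confirm?" + query


def confirm(address: str, token: str, subscribers: set) -> bool:
    """Add the address to the list only if the token from the e-mail matches."""
    expected = make_token(address).encode("utf-8")
    if hmac.compare_digest(expected, token.encode("utf-8")):
        subscribers.add(address.lower())
        return True
    return False
```

Because the token is derived from the address and a secret, ticking a box on a web form is not enough: whoever confirms must be able to read mail sent to that address.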
Most spammers do not make any attempt to get opt-in permission from
the recipient. They rely on volume for their commercial return, and
they use a variety of techniques to stop people spotting where
their mails are sent from, and to avoid anti-spam software.
Jim DiDominicus, chief information security officer at the New York
Board of Trade, reveals that he recently set up a secure system for
a group of Florida spammers to help them foil denial of service
attacks from disgruntled spam recipients. "I got to look at their
tools," he explains. "There are people out there that definitely
have a better understanding of sendmail than most of us do and they
are able to exploit that very well."
William Plante, director of worldwide security and brand protection
at Symantec, became interested in techniques used by spammers after
he began to see pirate copies of his company's software being sold
openly via spam mail. "At the worst time we were seeing tens of
thousands of complaints a month," he says, adding that every copy
surreptitiously purchased by Symantec was counterfeit. "It was a
real threat to our business, and the threat has not gone," he
says.
Getting e-mail addresses is the easiest part of the process for
spammers, who can buy hundreds of thousands of them on CD for just
a few dollars. These addresses are collected in a number of ways.
Software robots surf for websites with e-mail addresses listed as
"mailto:" hyperlinks and collate them into lists. Dictionary
attacks are another popular method of generating addresses. A
spammer will take a domain name and automatically generate likely
prefixes for e-mail addresses in the hope that some will
work.
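IT departments can turn the harvesting technique around and use it to audit their own pages. The sketch below, which assumes nothing beyond Python's standard library, lists every address a page exposes through a "mailto:" hyperlink, which is exactly what a software robot would collect. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class MailtoFinder(HTMLParser):
    """Collect the addresses exposed as mailto: links in a page."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the scheme and any "?subject=..." part after the address.
                self.addresses.append(value[len("mailto:"):].split("?", 1)[0])


def exposed_addresses(html: str) -> list:
    """Return the addresses a harvesting robot could read from this HTML."""
    finder = MailtoFinder()
    finder.feed(html)
    return finder.addresses
```

Any address this returns for a public page should be assumed to be on a spammer's CD already, or soon to be.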
Perhaps the most underhand way of acquiring live addresses is the
false opt-out scam. Many spam mails will include a link that you
can follow to get your name removed from the distribution list. In
many cases they are genuine, but in some they are simply used to
identify live mail addresses. This enables spammers to prioritise
their e-mail targets.
Once they have obtained the addresses, spammers find open relays -
e-mail servers that by default pass on third-party e-mails - to be
the most useful tools. Even today, people leave their
SMTP e-mail relays open so that anyone can use them, instead of
locking them down to a set of users. The Open Relay Database
(www.ordb.org) is a publicly available list of these relays, which
systems administrators can use to identify culprits.
Address obfuscation is an important part of the equation when
sending spam, especially in areas where commercial e-mailers are
legally required to prove that the recipient has opted in. Alyn
Hockey, director of research at anti-spam company Clearswift,
explains that spammers use tools to fabricate e-mail headers that
help hide their own addresses. "They just put in the various
options to build the message and the client that sends the message
does it all for them," he says. The spammer's real IP address will
always be there somewhere, "but you could have started with half a
dozen fake ones before you get to the legitimate one".
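To see what Hockey means, it helps to read a message's "Received" headers in order. Each mail server adds its own header above the ones already present, so the oldest entries, which are the ones a spammer can fabricate, sit at the bottom. The following sketch uses Python's standard `email` module to list the bracketed IP addresses from oldest to newest. The regular expression is a simplification that assumes IPv4 addresses written in square brackets, and the function name is hypothetical.

```python
import re
from email import message_from_string

# Simplifying assumption: relays record the peer as a bracketed IPv4 address.
IP_PATTERN = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_chain(raw_message: str) -> list:
    """Return the bracketed IPs from the Received headers, oldest hop first."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    chain = []
    # Servers prepend their header, so reverse the list to read oldest first.
    for header in reversed(received):
        match = IP_PATTERN.search(header)
        if match:
            chain.append(match.group(1))
    return chain
```

Only the headers written by servers the recipient controls or trusts can be relied on; everything earlier in the chain may have been typed in by the sender's tool.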
Spammers attempting to stay one step ahead in the war against
unsolicited e-mail are now trying a new tactic: open proxies. Many
home computers on broadband links are unprotected by firewalls, and
even those that are behind firewalls can be infected by trojan
programs. One recent trojan turns the host machine into an e-mail
server that is then used to send spam e-mail, hiding the real
sender's identity completely.
But why should spammers attempt to hide their addresses if they
ultimately have to be contacted by potential customers? "What
they're doing is contracting a direct mail company to send the job
for them," says Chris Miller, group product manager for e-mail
security at Symantec. That way they can abdicate responsibility for
how the contracted company does the job, he says.
Fighting the spammers is becoming harder, but suppliers are rising
to the challenge. Content scanning is a traditional way of blocking
spam mail. Clearswift, with its Mimesweeper and Enterprisesuite
products, looks for key words in e-mails and also uses wildcards,
says Hockey. It also uses reverse address look-up to try to
identify false mails. However, both of these techniques have their
problems. Spammers are beginning to misspell subjects, using
"secks" instead of "sex", for example. Even wildcard checks may not
pick these up.
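A toy version of keyword and wildcard scanning shows both the strength and the weakness. This is not how Mimesweeper works internally; the pattern list and function name are invented for the example, and the matching uses the shell-style wildcards in Python's `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Hypothetical block list; "*" stands for any run of characters.
BLOCKED_PATTERNS = ["sex", "v*agra", "refinanc*"]


def flagged_words(text: str) -> list:
    """Return the words in the text that match a blocked keyword or wildcard."""
    words = [w.strip(".,!?\"'()").lower() for w in text.split()]
    return [w for w in words if any(fnmatchcase(w, p) for p in BLOCKED_PATTERNS)]
```

With these patterns, "Cheap Viagra and v1agra here!" is caught twice, because the wildcard absorbs the swapped character. "Hot secks tonight" passes untouched, since no pattern anticipates that spelling.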
Clearswift employs another technique called fingerprinting. It uses
decoy e-mail accounts as spam traps and then analyses the incoming
mail, creating fingerprints of mails which it then lists as data
files on its website. These can then be downloaded by users to
update their own Clearswift server software. Brightmail, which
offers anti-spam software to both enterprises and ISPs, uses a
similar network of addresses which are analysed by a team of
experts and used to produce anti-spam rules.
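The fingerprinting idea can be illustrated with an ordinary hash function. The sketch below is an assumption about the general approach, not Clearswift's or Brightmail's actual algorithm: it normalises case and whitespace, hashes the result with SHA-256, and checks incoming mail against a set of fingerprints gathered from a spam trap.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Hash a normalised copy of the body so case and spacing changes are ignored."""
    normalised = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Hypothetical fingerprint list, as if built from mail caught by decoy accounts.
known_spam = {fingerprint("Get out of credit card debt   TODAY")}


def is_known_spam(body: str) -> bool:
    """Return True if this body matches a fingerprint already on the list."""
    return fingerprint(body) in known_spam
```

A copy with different spacing or capitalisation still matches, but adding two exclamation marks produces a completely different hash. That is why simple fingerprints need frequent updates from the decoy accounts.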
Real-time blackhole lists have long been an accepted way of
fighting spam. ISPs or corporate customers using this approach on
their e-mail servers check the originating relay server for an
e-mail. If that server is listed for sending large amounts of spam,
indicating that it is either unprotected or that the company
running it is knowingly sending spam, then it will be blocked and
messages will not be delivered.
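Blackhole lists are normally queried over DNS: the mail server reverses the octets of the sending relay's IP address, appends the list's domain and looks the name up. If the name resolves, the relay is listed. The sketch below assumes that convention; the zone name passed in is a placeholder for whichever list an administrator actually subscribes to.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list publishes an address record for this relay."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record found (or the lookup failed): treat the relay as not listed.
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example")` gives `99.2.0.192.dnsbl.example`. Note that this sketch treats a failed lookup as "not listed", so a DNS outage lets mail through rather than blocking it.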
The biggest problem with real-time blackhole lists is that the
number of false positives - legitimate mails that do not make it
through - is high. The alternative to this - whitelists, in which
only domains trusted by the user get through - can make the problem
worse, especially as spammers are beginning to spoof legitimate
corporate mails in an attempt to sneak past the lists.
DiDominicus says whitelists and blacklists are far from ideal in a
corporate environment. "Whitelists are not good for corporate use
because you never know who is going to try and do business with
you."
Anti-spam firm Mailkey uses a modified version of the whitelist
approach which chief executive officer Tim Dean-Smith thinks will
put an end to the spamming problem. "We let in spam based on what
we think is legitimate rather than blocking what we think is
illegitimate," he says.
The caveat is that if the program blocks a mail, it sends a reply
back to the sender asking them to confirm that it is genuine. As
most spams are automatically produced and replies will not reach
the sender because the return addresses are false, a positive
response usually guarantees a genuine sender, he says. The company
will be releasing a corporate version of its consumer product in
the next month.
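The challenge-and-confirm approach can be modelled as a whitelist with a holding area. The class below is a rough sketch of the general idea, not Mailkey's product: the class and method names are invented, and the step of actually e-mailing the challenge is left out.

```python
class ChallengeResponseFilter:
    """Whitelist filter that holds unknown senders' mail until they confirm."""

    def __init__(self, trusted):
        self.trusted = {address.lower() for address in trusted}
        self.held = {}  # sender address -> list of messages awaiting confirmation

    def receive(self, sender: str, message: str) -> str:
        sender = sender.lower()
        if sender in self.trusted:
            return "delivered"
        # Unknown sender: keep the mail back. A real system would now
        # e-mail the sender asking them to confirm they are genuine.
        self.held.setdefault(sender, []).append(message)
        return "challenge sent"

    def confirm(self, sender: str) -> list:
        """Called when a sender answers the challenge: trust them and release their mail."""
        sender = sender.lower()
        self.trusted.add(sender)
        return self.held.pop(sender, [])
```

A spammer using a false return address never sees the challenge, so `confirm` is never called and the held mail is never released.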
Perhaps the most interesting and accurate anti-spam technique
today, however, uses the group consensus technique, which takes the
fingerprinting concept a stage further. Cloudmark, formed in 2000,
uses the Vipul's Razor algorithm, developed by co-founder Vipul Ved
Prakash, who used it with friends to try to reduce his personal
spam intake.
A bolt-in to Microsoft Outlook analyses incoming mails and checks
them against a central server holding fingerprints of known spam
e-mail. If a mail slips through, a "block" button lets the user
manually classify the mail as spam, fingerprinting the mail and
uploading it to the central server. At this point, the server
automatically evaluates the mail based on certain criteria,
including the user's past reliability when it comes to classifying
spam.
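A simplified model of the server side shows how reports might be weighed. Cloudmark's real criteria are not described here, so everything in the sketch is an assumption: the starting trust of 0.5, the threshold of 1.0 and the rule that each user's report adds that user's trust score to the fingerprint's total.

```python
from collections import defaultdict


class ConsensusServer:
    """Toy consensus server: reports count for more when the reporter is trusted."""

    def __init__(self, threshold: float = 1.0):
        self.threshold = threshold
        self.trust = defaultdict(lambda: 0.5)  # hypothetical starting trust per user
        self.scores = defaultdict(float)       # fingerprint -> accumulated trust
        self.reporters = defaultdict(set)      # fingerprint -> users who reported it

    def report(self, user: str, fingerprint: str) -> None:
        """Record a 'block' click, counting each user once per fingerprint."""
        if user in self.reporters[fingerprint]:
            return
        self.reporters[fingerprint].add(user)
        self.scores[fingerprint] += self.trust[user]

    def is_spam(self, fingerprint: str) -> bool:
        """A fingerprint counts as spam once enough trusted users have reported it."""
        return self.scores.get(fingerprint, 0.0) >= self.threshold
```

Under these assumptions a single new user cannot get a mail blocked for everyone, but a new user plus a reporter with a strong track record can.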
One of the benefits of this system is that it cuts through the
whole tangled mess of what is and is not spam. The program lets the
user community - currently numbering over 400,000 - decide for
itself. In the future, CEO Karl Jacob says Cloudmark will introduce
a version of the system that gives users more options, using its
consensus model to identify companies offering genuine opt-out
models and offering a separate opt-out button within Outlook for
them to unsubscribe from lists.
However, the group consensus model requires communication between
the corporate software and its central server, which Jacob believes
could present security concerns for corporate customers. So
Cloudmark's Authority product uses yet another anti-spamming
technique, involving a modified Bayesian algorithm. This applies
probability theory, estimating how likely a mail is to be spam from
the evidence in mail that has already been classified.
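The textbook form of Bayesian filtering can be shown in a short sketch. This is a plain naive Bayes classifier over words, not Cloudmark's modified algorithm, and the class name, the add-one smoothing and the decision to consider only words present in the mail are choices made for the example.

```python
import math
from collections import Counter


class BayesFilter:
    """Minimal naive Bayes filter trained on mails labelled 'spam' or 'ham'."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        # Count each distinct word once per mail.
        self.counts[label].update(set(text.lower().split()))
        self.totals[label] += 1

    def spam_probability(self, text: str) -> float:
        # Start from the prior odds, then add each word's evidence in log space.
        log_odds = math.log((self.totals["spam"] + 1) / (self.totals["ham"] + 1))
        for word in set(text.lower().split()):
            # Add-one smoothing so unseen words never give a zero probability.
            p_spam = (self.counts["spam"][word] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][word] + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp to keep exp() within floating-point range on very long mails.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))
```

The result is a probability between 0 and 1 rather than a yes-or-no verdict, so the administrator still has to choose the score above which a mail is treated as spam.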
One of the disadvantages of running an in-house server-based system
like Authority or Enterprisesuite, as opposed to an outsourced,
ISP-based service such as Brightmail, is that it presents a
processing overhead. New York Board of Trade's DiDominicus says, "I
think the outsourced services did a better job and the performance
was better on our end."
As spammers continue to develop new techniques, anti-spamming
software will also become more innovative. One of the great things
about consensus computing is that it uses what Sun Microsystems
calls the net effect - the idea that the power of a network
increases with the number of people using it. Perhaps now, as spam
mail reaches the point where it begins to represent a serious
threat rather than a mere irritation, we have finally found a way
to dismiss it once and for all.
Legislation in the pipeline
Until recently, anti-spam legislation in Europe has been unsatisfactory to lobbyists because of the lack of an opt-in law. Opt-in restricts commercial e-mailers to sending mail only to those people with whom they have an existing relationship and who have permitted them to send mail. Opt-out simply requires spammers to stop sending unsolicited mail after the user asks them to stop. The DTI is currently in consultation over implementing the EC Directive on Privacy and Electronic Communications, which applies opt-in laws to commercial e-mail. The consultation period has just finished and the directive will become UK law by 31 October.
The problem is that most spam originates outside of Europe, says
David Naylor, a partner at UK technical specialist solicitors
Morrison and Foerster, making it difficult to enforce the law. In
the US, the Can Spam Bill 2003, originally proposed by Senator
Conrad Burns, would impose strict controls on unsolicited
commercial e-mail, including prohibiting the use of
deceptive subject lines and requiring opt-out instructions. The
Computer Owners' Bill of Rights, introduced to the Senate in March,
asks the Federal Trade Commission to establish a "do not e-mail"
registry of opted-out addresses.
Top 10 spam subject lines
E-mail management company Surf Control found that spam mails with these subject lines were the most common in 2002.
1. XXX Your free adult sites password
2. Check out our new lower prices. Many "drug" types available. (Viagra)
3. Get cash out! Refinance while rates are still low
4. Urgent and confidential (Nigerian hoax)
5. Remote control car the size of a hot wheel!
6. Rated #1 best online casino
7. #1 Pasta pot as seen on TV
8. Get out of credit card debt
9. Meet singles in your area
10. Copy any DVD in one click.
Pros and cons of anti-spam techniques
Blacklists
Pros: Effective for stopping spam from known open relays
Cons: Can result in many false positives if used on their own
Whitelists
Pros: Allows mail only from known senders
Cons: Potential for many false positives
Bayesian analysis
Pros: Analyses structure of mail without relying on content
Cons: Works on probabilities rather than certainties
Content filtering
Pros: Looks for obvious words or phrases in content. Intelligent use can minimise false positives
Cons: Easy to circumvent with incorrect spellings and innovative content structuring
Fingerprinting
Pros: Creates a unique identifier for a spam mail, almost like a virus signature
Cons: Each spam can be slightly changed, which can confuse the more basic fingerprinting algorithms
Consensus filtering
Pros: Manual element combined with group consensus makes this approach very accurate
Cons: Communication with back-end server is bandwidth-heavy and may concern security-conscious corporates.