CastleCops, Internet Crime Fighters
Privacy: HOWTO: Fight Spam with SpamProbe
Linux
By Steve Hastings
May 1, 2003

How to set up this trainable e-mail filter to eliminate false positives, work with IMAP and run as a cron job.

I get a lot of spam e-mail. These days, however, most of it doesn't go to my e-mail Inbox, because I'm filtering my e-mail with SpamProbe. SpamProbe is a spam detector; you train it to recognize what you consider to be spam. It builds databases of keywords from your e-mail messages and then uses the keyword databases to decide whether incoming e-mail messages are spam.

In this article I explain how to set up SpamProbe to intercept spam e-mails and file them into a folder named Spam. If you prefer, you also may set it up to delete these messages. The setup I describe enables spam checking on a per-user basis, and users control which of their messages are considered to be spam. The setup is completely server-based and thus works with any e-mail client. Users need to understand only how to move messages from one mail folder to another.

Because it handles spam completely on the server, SpamProbe is great for users who must access their mail over a slow link, such as a modem. Client-based filters must download all the mail, spam and non-spam alike, while a server-based filter can keep all the spam on the server.

The setup described in this article works with any trainable spam filter, not only SpamProbe.

Why SpamProbe?

Why use SpamProbe instead of another spam filter? I argue you should use it because it is a Bayesian filter with some advanced features. Bayesian spam filters work by building two databases: a database of keywords from spam e-mails and a database of keywords from nonspam e-mails. They then analyze each new e-mail message, comparing its keywords against the two databases and estimating the probability that the message is spam. You train a Bayesian spam filter by feeding it spam messages, from which it builds the spam keywords database, and nonspam messages, from which it builds the nonspam keywords database. Whoever controls the training of the filter thus controls what that filter considers spam.
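To make the two-database idea concrete, here is a minimal sketch in Python. It is not SpamProbe's actual algorithm or code: the class name TinyBayesFilter, the whitespace tokenizer, the add-one smoothing and the assumption of equal prior odds are all illustrative choices made for this example.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy two-database Bayesian filter (illustrative, not SpamProbe)."""

    def __init__(self):
        self.spam_counts = Counter()  # keyword -> number of spam messages containing it
        self.good_counts = Counter()  # keyword -> number of nonspam messages containing it
        self.spam_messages = 0
        self.good_messages = 0

    @staticmethod
    def keywords(text):
        # Assumption: lowercase, whitespace-separated words stand in for keywords.
        return set(text.lower().split())

    def train(self, text, is_spam):
        """Add one message's keywords to the spam or the nonspam database."""
        if is_spam:
            self.spam_counts.update(self.keywords(text))
            self.spam_messages += 1
        else:
            self.good_counts.update(self.keywords(text))
            self.good_messages += 1

    def spam_probability(self, text):
        """Estimate the probability that a message is spam, assuming equal priors."""
        log_odds = 0.0
        for word in self.keywords(text):
            # Add-one smoothing keeps unseen keywords from producing zero or infinity.
            p_spam = (self.spam_counts[word] + 1) / (self.spam_messages + 2)
            p_good = (self.good_counts[word] + 1) / (self.good_messages + 2)
            log_odds += math.log(p_spam / p_good)
        # Clamp so the exponential below cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

After training on a handful of messages from each folder, spam_probability returns a value between 0 and 1; a real filter would compare that value against a threshold to decide where to file the message.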

As the filter processes incoming e-mail messages, it continues to update its keyword databases. Each message it flags as spam also is used to update the spam keywords database. As users feed corrections back into the system, the filter becomes better and better at detecting spam.
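One way to picture the correction step is as moving a message's keywords from one database to the other. The sketch below assumes each database is a simple keyword counter; the function names record and correct are hypothetical, and SpamProbe's real bookkeeping may well differ.

```python
from collections import Counter


def record(database, keywords):
    """Count a message's keywords in the database the filter chose for it."""
    database.update(keywords)


def correct(keywords, from_db, to_db):
    """Undo a wrong decision: move the keywords to the other database."""
    from_db.subtract(keywords)
    to_db.update(keywords)
    # Remove entries whose count dropped to zero or below.
    for word in [w for w, n in from_db.items() if n <= 0]:
        del from_db[word]
```

For example, if the filter files a legitimate message as spam, calling correct with the spam database as from_db and the nonspam database as to_db reverses the effect of that decision on both keyword counts.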

Bayesian spam filters are efficient: they don't load down a server too much, and they don't depend on a connection to an external server to access a spam database. Once they are trained, they can block almost all spam messages, with few or no false positives.

SpamProbe builds its database using not only single keywords but pairs of keywords too. The word "money", by itself, might not indicate spam reliably; the phrase "make money" is a much better indicator. An ideal spam filter might use even longer chains of words, but that would be quite expensive computationally.
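The idea of indexing pairs as well as single words can be sketched as follows. The function name single_and_pair_tokens and the whitespace splitting are assumptions made for illustration; SpamProbe's own tokenizer is more elaborate.

```python
def single_and_pair_tokens(text):
    """Return every word plus every pair of adjacent words in the text."""
    words = text.lower().split()
    pairs = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return words + pairs
```

A message containing "Make money fast" yields the three single words plus the pairs "make money" and "money fast". Extending the same pattern to longer chains multiplies the number of database entries, which is the computational cost mentioned above.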

SpamProbe also correctly handles e-mails and attachments in BASE64 or quoted-printable encoding, and it has a feature for handling Asian character sets. SpamProbe is released under the QPL, so it is free for use by anyone.

Full Story
Linux Journal

Posted on Thursday, 01 May 2003 @ 20:24:51 UTC by cj (2032 reads)