Blocking Spam

Tue Dec 26 22:09:05 UTC 2006

From: "James Wilkinson" <fedora at aprilcottage.co.uk>

>I wrote:
>> Also, I'd strongly recommend training SA's Bayesian analysis, using
>> the sa-learn program. SpamAssassin won't use Bayesian analysis until
>> it has learnt 200 good ("ham") e-mails and 200 spams.
> 
> Tim wrote:
>> Isn't that supposed to be the point of the junk/not-junk buttons on mail
>> clients?
> 
> Usually, that trains the mail client's *own* spam filter.
> 
> Unlike many other things, it's not really ideal to have multiple
> different spam filters. This is because if one thinks an e-mail is a bit
> iffy, but probably OK, it will let it through. But if multiple separate
> spam tests all think that an e-mail is dodgy, one can reject it with a
> lot more confidence.
> 
> SpamAssassin is designed to incorporate different styles of checks --
> Bayesian, DNSBL checks on the sending mail server, and *lots* of fixed
> rules -- and come up with one overall spam score. The more checks that
> SpamAssassin can do reliably (which in the case of Bayesian analysis,
> means training SA's own Bayesian engine), the more accurately it can
> spot spam and let through good e-mail.

Scores. The magic is in scores. No single rule should (usually) be
allowed to define spam. (BAYES_99 is good enough here that I score it
high enough to guarantee markup as spam. Then I rely on the small
number of negative-scoring rules to save random ham messages that
might get all the way to 0.99 Bayes spam probability.)
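
A rough sketch of that scoring idea, in Python rather than anything
SpamAssassin actually runs. Only BAYES_99 comes from the setup described
above; the other rule names and every score value are made up for
illustration. The 5.0 threshold is SpamAssassin's default required_score,
and treating "reaches the threshold" as spam is my assumption here.

```python
# Additive spam scoring: each rule that hits contributes its score,
# and the message is tagged as spam once the total reaches the threshold.

REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score

# Illustrative values only. BAYES_99 is a real rule name; the
# HYPOTHETICAL_* names and all of the numbers are assumptions.
SCORES = {
    "BAYES_99": 5.5,                      # scored high enough to tag on its own
    "HYPOTHETICAL_DNSBL_HIT": 2.0,        # sending server is on a blocklist
    "HYPOTHETICAL_FIXED_RULE": 1.2,       # one of the many fixed rules
    "HYPOTHETICAL_TRUSTED_SENDER": -3.0,  # negative rule that can rescue ham
}


def total_score(hits):
    """Sum the scores of all rules that matched the message."""
    return sum(SCORES[name] for name in hits)


def is_spam(hits):
    """True when the combined score reaches the required threshold."""
    return total_score(hits) >= REQUIRED_SCORE
```

With these numbers, a lone BAYES_99 hit gets tagged, a BAYES_99 hit on
mail that also matches the negative rule does not, and two weaker rules
together still fall short of the threshold.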

Besides, WTF good is Bayes with image spam?
{^_^}