Spamassassin and Spambayes

jdow jdow at earthlink.net
Tue Jun 27 08:19:11 UTC 2006


From: "Claude Jones" <claude_jones at levitjames.com>

> On Monday June 26 2006 21:30, jdow wrote:
>> Puny results, kid. {^_-} With SpamAssassin, rules, and a carefully hand
>> fed Bayes I'm not kidding when I say I get about one spam in 1000 that
>> creeps through. (And for the most part those train at near Bayes 0.50.)
>>
>> >> Am I missing something here? Is there a better way to train spamassassi
>> >
>> > Some people find it helpful to change the BAYES_99 test to be equal to
>> > the spam cutoff or slightly below it. If the spam cutoff value is 5.0,
>> > set the BAYES_99 test value to 5.0 or 4.9.
>>
>> NOOOOOOOOOOOOOOOooooooooooooooo!
>>
>> If you are going to automatically train Bayes, widen the automatic
>> thresholds from the stock settings, at least at first. Once you have
>> the weight of a working Bayes behind you the stock settings might
>> work OK. I studied how the automatic classification system was
>> supposed to work, thought about it a little while, and decided I
>> am a big girl and can spoon feed SpamAssassin. Over the years I've
>> been running Bayes (I forget when it appeared. I first hit SA at
>> 2.43 I think it was - or maybe even 2.2 something.) I've trained on
>> less than 2000 hams and 2000 spams. Bayes 99 alone catches 85% of the
>> spam and hits almost no ham. Bayes 80 and 95 account for another
>> almost 6%. The rest comes from the various rule sets I have running.
>> I suppose I should feed the Bayes a little more. I've seen it doing
>> better. But at the scoring I have (Bayes 99 is 5.001) I see such good
>> results I am in the "if it ain't broke, don't fix it" mode. {^_-}
> 
> I was hoping you'd chime in, Joan. I looked at the article Aaron linked to, 
> but anybody who claims to have tested five programs in depth for a column, 
> and presents results like that is just not convincing to me. You have 
> obviously figured out spamassassin - every time I've tried, I've found the 
> documentation cryptic and tedious - maybe there's better out there, and I 
> need to work on it some more, but, in the spirit of your last quoted sentence 
> just above, after getting Spambayes working yesterday afternoon, and training 
> on a couple of hundred messages, I came home this evening and found only two 
> spam mails in my inbox - there were 313 classified spam mails in the trash, 
> and after going through those, there was only one false positive, and that 
> was from a commercial advertising list I'm subscribed to - I guess my 
> solution ain't broke either... 

Sounds like it isn't. Double check that you do not see any "ALL_TRUSTED"
messages. That means SA could not guess your "trusted" mail server(s).
And that "trusted" is a rather loose trust. It's the furthest out from
your site that all email passes through and you trust not to lie to you
about the message headers it inserts. In my case that's the Earthlink
servers and my own machine. (Fetchmail often confuses poor little old
SpamAssassin. So I set it explicitly.)
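Setting it explicitly can look roughly like the sketch below in local.cf.
The address ranges are placeholders (private and documentation ranges),
not a real setup; substitute whatever hosts actually relay your mail and
that you trust to write honest Received headers.

```
# local.cf
# Placeholder: your own machine or LAN.
trusted_networks 192.168.1.0/24
# Placeholder: your ISP's inbound mail servers.
trusted_networks 203.0.113.0/24
```

Running a saved message through "spamassassin -t" afterwards shows which
rules fire, so you can see whether ALL_TRUSTED still turns up on mail that
plainly came from outside.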

For mailing lists to which you are subscribed you can use
"whitelist_from_rcvd", which says you are whitelisting messages that
claim to come from a given sender and that always pass through a
specific mail server name.
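
A minimal sketch of such an entry, assuming a hypothetical list whose
mail is relayed by hosts under example.com (both the address pattern and
the domain are placeholders):

```
# local.cf or user_prefs
# First argument: sender address pattern to whitelist.
# Second argument: domain that the relay's reverse DNS must match.
whitelist_from_rcvd *@lists.example.com example.com
```

The relay check depends on the trusted server setup being right, since
SpamAssassin looks at the host that handed the message to your trusted
servers.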

{^_^}



