spamassassin/user_prefs

Mon Mar 22 15:36:47 UTC 2004

I kept all the spam mail that I received over the last year in a separate
mail folder in Outlook. I did that because now and then Outlook would
identify a valid email as spam, so I kept everything there for
sorting out every week or so.
When I moved to Fedora a good 2 months ago, I imported the spam and
trained spamassassin with it.
I must say that it now works excellently here with Evolution.

--Yves

On Mon, 2004-03-22 at 16:12, Nigel Wade wrote:
> Charles Howse wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > Hi,
> > 
> > While reading another thread, I remembered I had no custom preferences for 
> > spamassassin, and decided to create some.
> > 
> > I use the default settings for starting spamassassin at boot, and the 
> > following filters in KMail:
> > 1. In KMail menus, select Settings->Configure Filters
> > 2. Create a new filter with filter criteria:
> >     <any header> matches regular expression .
> >     (the regular expression is just the character "." meaning 
> >     "any character")
> >     and filter action:
> >     pipe through spamc
> >     Uncheck the box "stop processing if this filter matches"
> > 3. Add a second filter below the one created in step 2, with criteria:
> >     <any header> contains X-Spam-Flag: YES
> >     and action:
> >     move to folder trash
> >     (or whatever you want to do with your spam)
> >     check the "stop processing..." box
> > 
> > These filters are working fine, with the exception of those html spams with 
> > all the random words in the body when viewed in text mode.
> > 
> > I was just wondering if anyone would like to share some _generic_ preferences 
> > for ~/.spamassassin/user_prefs, or comment.
> 
> 
> The way to catch those is with Bayesian filtering. You need to teach the 
> Bayesian filter with sufficient messages so that it learns what is spam and 
> what is not (at least 1000 of each is a good rule of thumb for best accuracy).
> 
> The random words don't have a significant effect on the Bayesian scoring as 
> it uses words which are the most like spam and least like spam to determine 
> the overall "spaminess" of the message.
> 
> When it's trained well it's very good. I recently installed SA on our mail 
> server, trained with about 5000 spam and 3000 ham. Out of the last 3000+ 
> messages I've received I've got no false positives and it's only failed to 
> identify 2 spams.
> 
> But if you are going to do spam filtering in the mail client, why not use 
> Mozilla/Firebird? It has Bayesian filtering built in, and it's pretty good 
> once it's been taught. It's much easier to teach than SA for a single user - 
> a single mouse click is all that's required for each message.
> 
> -- 
> Nigel Wade
>
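
To illustrate Nigel's point that random filler words barely move the
Bayesian score, here is a minimal Python sketch of a Graham-style
"most extreme tokens" combiner. It is a simplification for illustration
only and is not SpamAssassin's actual algorithm. The function name
spam_probability, the max_tokens cutoff, and the per-token
probabilities are all made-up assumptions.

```python
from math import prod  # Python 3.8+


def spam_probability(token_probs, tokens, max_tokens=15, unknown=0.5):
    """Combine per-token spam probabilities, keeping only the most extreme.

    token_probs maps a word to a hypothetical learned P(spam | word).
    Words the filter has never seen get the neutral value `unknown`.
    """
    # One vote per distinct word, clamped so no single word is absolute.
    probs = [min(max(token_probs.get(t, unknown), 0.01), 0.99)
             for t in set(tokens)]
    # Keep the words farthest from neutral (0.5): most and least spam-like.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:max_tokens]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)


# Made-up training result: two spammy words, one hammy word, and three
# "random" words that appear about equally often in spam and ham.
learned = {
    "viagra": 0.99, "mortgage": 0.95, "meeting": 0.05,
    "zebra": 0.50, "quartz": 0.52, "lantern": 0.48,
}

plain = spam_probability(learned, ["viagra", "mortgage"])
padded = spam_probability(
    learned, ["viagra", "mortgage", "zebra", "quartz", "lantern"])
print(plain, padded)
```

In this toy setup the padded message scores essentially the same as the
plain one, because words near 0.5 contribute almost equally to the spam
and ham products and cancel out.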