spamassassin doesn't seem to be using bayes

jdow jdow at earthlink.net
Fri Oct 21 23:13:29 UTC 2005


From: "D. D. Brierton" <darren at dzr-web.com>

> I'm using FC4 with spamassassin-3.0.4-1.fc4. fetchmail delivers mail to
> a locally running postfix. spamd is running as a service, and spamc is
> called by procmail on my mail. My setup is almost identical to that
> described here:
>
> http://wiki.apache.org/spamassassin/UsedViaProcmail
>
> However, despite the fact that I have trained spamassassin on a vast
> amount of both ham and spam using sa-learn, I suspect that Bayesian
> testing is not being applied. I became suspicious that this might be the
> case after receiving over a dozen almost identical messages and despite
> training spamassassin on them they are still not being identified as
> spam. So I started looking at the headers that spamassassin adds to each
> message more closely. Here is the header it added to a recent message
> from this list:
>
> X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP
> autolearn=failed version=3.0.4
>
> And here is an example of an incorrectly identified spam message:
>
> X-Spam-Status: No, score=2.8 required=5.0 tests=HELO_DYNAMIC_IPADDR,
> RCVD_BY_IP autolearn=no version=3.0.4

So we come back to this message and note that, with spamc invoked the
way he has it, neither header lists a BAYES_* test, so the Bayes rules
are indeed not firing.
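
A quick way to make the same check on a saved message is sketched below.
It only looks for a BAYES_NN test name in the X-Spam-Status header. The
function name bayes_hit is made up for illustration, and the sketch
assumes the usual header layout in which wrapped header lines start with
a space or tab.

```shell
# Succeed (exit 0) if the X-Spam-Status header of the message on stdin
# lists a BAYES_NN test, fail otherwise. "bayes_hit" is a hypothetical
# helper name, not part of SpamAssassin.
bayes_hit() {
    awk '
        /^$/     { exit }                  # blank line: end of headers
        /^[ \t]/ { buf = buf $0; next }    # continuation: unfold into buf
        { if (buf ~ /^X-Spam-Status:/) print buf; buf = $0 }
        END { if (buf ~ /^X-Spam-Status:/) print buf }
    ' | grep -Eq 'BAYES_[0-9]+'
}
```

Used as "bayes_hit < some_message && echo Bayes ran" (the file name is
a placeholder). Both headers quoted above would fail this check.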

So what we need to know is which options spamd is started with and how
he is calling spamc.

Spamd in the setup I have gets called this way:
SPAMDOPTIONS="-d -c -m5 -Hi -A 192.168.0.,127. --max-conn-per-child=15"

The procmail recipe I have is rather complex. I use a lot of "full" and
"all" based rules in my user_prefs file, so I have installed a
workaround for a bug, presumably in perl itself, which these rules
trigger on a seemingly random basis. The spamassassin part of the
.procmailrc file looks like this:
===8<---
# Remove some spurious markups that some spams seem to include
:0
* ^X-Spam-Status:
{
    :0 fw
    | formail -R "X-Spam-Status:" "X-False-Spam-Status:"

    :0 fw
    | formail -A "X-Nasty: Aren't we?"
}

:0
* ^X-Spam-Level
{
    :0 fw
    | formail -R "X-Spam-Level" "X-False-Spam-Level"
}

# This one is important to remove. It is used for the PerMsgStatus.pm
# bug work around.
:0
* ^X-Spam-Checker-Version:
{
    :0 fw
    | formail -R "X-Spam-Checker-Version:" "X-False-Spam-Checker-Version:"
}

##############################################################################
# run spamassassin on things not from the spamassassin list
##############################################################################
:0
* < 250000
* !^List-Id: .*(spamassassin\.apache\.org)
{
   :0 fw: spamassassin.lock
   | /usr/bin/spamc -t 150 -u jdow
}

# Did we get a PerMsgStatus.pm bug hit? If so, the message was scanned
# but carries no SA markups.
# So: is the SA markup missing, is the message smaller than 250k BYTES,
# and is it NOT from one of the spamassassin lists?
:0 fw
* !^X-Spam-Checker-Version:
* < 250000
* !^List-Id: .*(spamassassin\.apache\.org)
{
   # Rescan it with raw spamassassin slightly niced.
   :0 fw
   | nice -n 1 /usr/bin/spamassassin

    # For debugging mark the message clearly for easy sorting.
#   :0 fw
#   | formail -A "X-JdowMissed: SpamAssassin checks bombed first time."

    # Alternative subject marking for debugging.
#   :0 fw
#   | sed -e 's/Subject:/Subject: [ZZ Missed]/'

   # Place a COPY of the message in sa_failed folder. Be nice to the
   # poor thing and review the folder from time to time. {^_-}
   :0c: clone1.lock
   $HOME/mail/sa_failed
}
# Bingo - we're done.
===>8---
Note that spamc is invoked with a LONG timeout, which is not usually
needed, and that I explicitly tell it to run as me with "-u jdow". This
is from MY home directory's .procmailrc file, of course. It could
probably be generalized for a system wide /etc/procmailrc file quite
easily via "$USER". (I think I'll remove the "-t 150" timeout option. It
was needed on the old machine, a 66MHz Pentium. It's not on the newer
machine, a 1GHz Athlon. {^_-}) I suspect the "-u $USER" aspect is what
is missing from your procmailrc.
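
If you want to test the fallback condition from the second half of the
recipe outside procmail, something like the sketch below should do. It
mirrors only the "* !^X-Spam-Checker-Version:" test, not the size or
List-Id conditions, and the function name sa_markup_missing is made up
for illustration.

```shell
# Succeed (exit 0) if the message on stdin has NO X-Spam-Checker-Version
# header, i.e. the case the recipe treats as "scan bombed, rescan it".
# "sa_markup_missing" is a hypothetical helper name.
sa_markup_missing() {
    # Look only at the header section (everything up to the first blank
    # line), so text in the body cannot cause a false match.
    ! sed '/^$/q' | grep -q '^X-Spam-Checker-Version:'
}
```

Run it as "sa_markup_missing < some_message && echo needs rescan" (the
file name is a placeholder).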

{^_^} 




