SpamAssassin Learning
There are numerous pages on how to set up SpamAssassin to “automagically” learn (using the built-in Bayesian filter), but here’s my contribution.
I built two little scripts that run as cron jobs on my server.
First up is spam-learn.sh:
#!/bin/sh
# Feed every user's LearnAsSpam and Junk folders to sa-learn as spam.
spamlearn="/usr/bin/sa-learn"
LearnDirs=`find /var/vpopmail/domains/ -name .LearnAsSpam -type d`
JunkDirs=`find /var/vpopmail/domains/ -name .Junk -type d`
for dir in $LearnDirs; do
    $spamlearn --spam "$dir/cur" > /dev/null
    $spamlearn --spam "$dir/new" > /dev/null
    # Only messages in cur are removed after learning.
    rm -f "$dir"/cur/*
done
for dir in $JunkDirs; do
    $spamlearn --spam "$dir/cur" > /dev/null
    $spamlearn --spam "$dir/new" > /dev/null
    rm -f "$dir"/cur/*
done
This finds every user on my box who has either a Junk or LearnAsSpam folder and feeds their messages to sa-learn as spam. Note that once it has fed the messages in, it also removes them, but only the ones that have been marked as “read” (the messages in cur). Anything still sitting in new is learned but left in place.
I am debating using a different rm call, so that accidental junk placed in the Junk folder by Thunderbird is not deleted.
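Since the two loops do identical work, they could be folded into one. The following is only a sketch under the same vpopmail layout; the names `learn_spam_dirs` and `SA_LEARN` are made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# Sketch only: SA_LEARN and learn_spam_dirs are hypothetical names.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

# Usage: learn_spam_dirs [MAIL_ROOT]
learn_spam_dirs() {
    root="${1:-/var/vpopmail/domains}"
    # One find call covers both folder names; reading line by line
    # keeps paths containing spaces intact.
    find "$root" -type d \( -name .LearnAsSpam -o -name .Junk \) |
    while IFS= read -r dir; do
        "$SA_LEARN" --spam "$dir/cur" > /dev/null
        "$SA_LEARN" --spam "$dir/new" > /dev/null
        # Same policy as the original: only cur is emptied.
        rm -f "$dir"/cur/*
    done
}
```

Calling `learn_spam_dirs` with no argument at the end of the script would walk `/var/vpopmail/domains`, as the original does.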
I may alter it to use:
but I get nervous with the rm -f using a find!
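One way to take some of the nerves out of a find-driven delete, offered as a sketch and not as the command this post had in mind: only touch messages older than a few days, and make the delete opt-in so the default run just lists what would go. The helper name `purge_old_junk` and its `--delete` switch are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post.
# Usage: purge_old_junk MAILDIR_FOLDER DAYS [--delete]
purge_old_junk() {
    folder=$1
    days=$2
    if [ "${3:-}" = "--delete" ]; then
        # -mtime +N matches files whose age, rounded down to whole
        # days, is greater than N.
        find "$folder/cur" -type f -mtime +"$days" -exec rm -f {} +
    else
        # Dry run: print the files that --delete would remove.
        find "$folder/cur" -type f -mtime +"$days" -print
    fi
}
```

Running it without `--delete` first, for example `purge_old_junk /path/to/.Junk 7`, shows the candidates before anything is removed, which leaves a week to rescue mail that landed in Junk by accident.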
Then I run ham-learn.sh:
#!/bin/sh
# Feed the inboxes of selected users to sa-learn as ham.
spamlearn="/usr/bin/sa-learn"
HamUsers="seebq.com/me jodibell.com/jodi"
for dir in $HamUsers; do
    $spamlearn --ham "/var/vpopmail/domains/$dir/.maildir/cur" > /dev/null
done
So I simply copied these files to /etc/cron.hourly and we’re off and running.
Voilà!
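The copy step can be sketched as a small helper. Two assumptions are baked in: the scripts need to be executable for cron to run them, and on Debian-style systems run-parts skips file names containing a dot, so the sketch drops the .sh suffix. The name `install_hourly` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: install_hourly SCRIPT [CRON_DIR]
install_hourly() {
    src=$1
    cron_dir=${2:-/etc/cron.hourly}
    # Strip the .sh suffix; Debian's run-parts ignores names with a dot.
    name=$(basename "$src" .sh)
    cp "$src" "$cron_dir/$name" && chmod 755 "$cron_dir/$name"
}
```

For example, `install_hourly spam-learn.sh` followed by `install_hourly ham-learn.sh`, run as root, would place both scripts in /etc/cron.hourly.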
