NIU Department of Mathematical Sciences
Reducing Spam

Spam, or "unsolicited commercial email", arrives in many forms. These days there are also many other kinds of unwanted mail: worms, messages designed to check whether your e-mail address is valid, and so on. It takes a human brain to reliably recognize what is "wanted" and what is "unwanted".

Of course computers can help. There are increasingly sophisticated programs that attempt to use various rules to tell a "bad" message from a legitimate one. This is usually done by checking the letter (headers, character encoding, text of the body etc.) for signs of spam.

One of the best packages for this is SpamAssassin. It works quite hard at spotting those tell-tale signs and gives the message a score indicating how likely it is that the letter is "unwanted".

Other software, better suited to process mail, can then use this "spam score" to decide what to do with the letter.

But please be careful: if automatic mail filtering software is poorly configured, you may end up losing valid email.

Activating the filter

You will need to feed all your messages into procmail, a versatile mail-processing program. We want to set up two rules: first, pass every message to the spam checker; second, if the message comes back marked as spam, put it in a special mailbox (or delete it, or forward it somewhere, etc.).

This is done by creating a file .procmailrc in your home directory (edit it with your favorite editor, e.g. vi ~/.procmailrc), containing:

:0fw: spamassassin.lock
| /usr/local/sass/bin/spamassassin

:0:
* ^X-Spam-Status: Yes
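caughtspam

The first "recipe" simply pipes each incoming letter through the SpamAssassin program (the f flag marks the program as a filter, and the w flag makes procmail wait for it to finish). SpamAssassin does its checks and then inserts a special X-Spam-Status header into the letter.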

The second recipe just checks whether that special header says "Yes", and if so appends the letter to a file called caughtspam.
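To see what the second recipe's condition amounts to, the same check can be sketched as a small shell function. This is only an illustration, not part of the setup: the function name is_tagged_spam is made up here, and it assumes the message has been saved to a file of its own.

```shell
#!/bin/sh
# is_tagged_spam FILE   (hypothetical helper, for illustration only)
# Succeeds (exit status 0) if the header block of FILE contains a line
# beginning with "X-Spam-Status: Yes", as the second recipe's condition does.
is_tagged_spam() {
    # sed prints everything up to the first empty line and then quits, so
    # only the headers are searched; a matching line in the body is ignored.
    # -i is used because procmail conditions ignore case by default.
    sed '/^$/q' "$1" | grep -qi '^X-Spam-Status: Yes'
}
```

For example, is_tagged_spam message.txt && echo "would be filed in caughtspam" prints the note only for a message that SpamAssassin has tagged.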

Now you have to tell the mail server software to pass all messages on to procmail; create/edit yet another file called .forward, e.g. with vi .forward, containing simply this:

"| procmail -t"

Finally, make sure that both files have correct permissions:

chmod 644 .forward .procmailrc
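If you want to confirm that the permissions took effect, something along the following lines may help. It is a rough sketch: the function names mode_of and check_mail_config_perms are invented for this example, and it simply compares the permission string printed by ls with the one that chmod 644 should produce.

```shell
#!/bin/sh
# mode_of FILE   (hypothetical helper)
# Prints the ten-character permission string that ls -l shows, e.g. -rw-r--r--
mode_of() {
    ls -ld "$1" 2>/dev/null | cut -c1-10
}

# check_mail_config_perms DIR   (hypothetical helper)
# Succeeds only if DIR/.forward and DIR/.procmailrc both exist with mode 644.
check_mail_config_perms() {
    for f in "$1/.forward" "$1/.procmailrc"; do
        if [ "$(mode_of "$f")" != "-rw-r--r--" ]; then
            echo "unexpected permissions (or missing file): $f" >&2
            return 1
        fi
    done
    return 0
}
```

Running check_mail_config_perms "$HOME" && echo "permissions look fine" after the chmod should print the confirmation.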

Monitoring and tweaking things

If all goes well, much of your spam will now be dumped in the mailbox caughtspam inside your home directory. Watch it, at least for a while, to see whether any valid e-mail is being caught. You should also regularly delete unwanted messages from it. Depending on the mail program you normally use, run
elm -f caughtspam
mail -f caughtspam
pine -f ../caughtspam
or similar.
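For a quick look without opening a mail program, the mailbox can also be inspected from the shell. The sketch below assumes caughtspam is the plain mbox file that procmail writes, in which every message starts with a line beginning with "From "; the function names are made up for this example, and the results are only approximate, since a "Subject:" line quoted inside a message body is listed as well.

```shell
#!/bin/sh
# count_messages MBOX   (hypothetical helper)
# Prints the number of messages, counting the "From " separator lines.
count_messages() {
    # grep -c exits with status 1 when the count is 0; "|| true" hides that.
    grep -c '^From ' "$1" || true
}

# list_subjects MBOX   (hypothetical helper)
# Prints the Subject lines, which is often enough to spot a valid letter.
list_subjects() {
    grep '^Subject:' "$1" || true
}
```

For example, count_messages ~/caughtspam shows how much has accumulated, and list_subjects ~/caughtspam | less lets you skim for anything that looks legitimate.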

SpamAssassin will automatically create a "hidden" subdirectory .spamassassin with some configuration files in it. You can customize some of the settings by editing the file user_prefs in that location. For example, the spam "score" at or above which mail will be tagged as spam is set to a rather aggressive 6 right now. If you worry about losing legitimate e-mail, especially if you correspond with people who send HTML mail or attachments, or who use foreign character encodings, you may want to set that threshold to a higher value, e.g. 10 or even 15. Do this by removing the comment character # from the required_hits line and changing the value.
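The same edit can be scripted. The following is a sketch under the assumption that user_prefs contains a commented-out line of the form "# required_hits 5"; the function name set_spam_threshold is invented here, and the original file is kept as user_prefs.bak in case the result is not what you expected.

```shell
#!/bin/sh
# set_spam_threshold FILE VALUE   (hypothetical helper)
# Uncomments the required_hits line in FILE and sets it to VALUE.
# The previous contents are saved as FILE.bak.
set_spam_threshold() {
    prefs=$1
    value=$2
    cp "$prefs" "$prefs.bak" || return 1
    # Match an optionally commented-out "required_hits <number>" line only;
    # ordinary comment lines that merely mention the setting are left alone.
    sed "s/^[#[:space:]]*required_hits[[:space:]][[:space:]]*[0-9.][0-9.]*[[:space:]]*\$/required_hits $value/" \
        "$prefs.bak" > "$prefs"
}
```

For example, set_spam_threshold ~/.spamassassin/user_prefs 10 raises the threshold to 10.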

Or, as another example, if you don't expect ever to receive PC executables in legitimate mail, you can raise the score that such attachments contribute, from the very low default of 0.1 to something like 3, by adding this line to user_prefs:

score MICROSOFT_EXECUTABLE      3

A list of all tests and their default scores is available on the SpamAssassin web site.

When you are confident that the filter is doing a good job and not tagging valid mail as spam, you may want to change the .procmailrc file so that spam is discarded instead of being saved in a mailbox. Replace the second "recipe" above with

:0
* ^X-Spam-Status: Yes
/dev/null

(Note that the second colon, the one after the 0, is gone: it asks procmail to use a lock file, which is unnecessary when writing to /dev/null.) Again, please be very careful: when set up incorrectly, this can trash important messages, with no chance of restoring them from backups.

More on procmail

Full documentation can be found at www.procmail.org. It is a powerful program, and configuring it is not for the faint-hearted. We will only mention two common situations.

First, you may want to have the mail scanned for spam on our system, and then forward messages that passed the check to some other account. Add the following to the end of the .procmailrc, so it will be executed after the spam checking:

:0
! some@outside.account
In this case please use the /dev/null rule in the anti-spam recipe, so that the unwanted messages will not accumulate on our system - unless you are willing to log in every few days and purge the saved mail.

If you read mail in both places and want a copy delivered to your math mailbox as well as forwarded somewhere, add the flag "c" (for "clone" or "copy") to the recipe start:

:0c
! some@outside.account

Second, procmail can be configured to route mail to mailboxes on the basis of the "From:" or "Subject:" or pretty much anything else. For example, to make all letters from a known mailing list go to a separate mailbox for future perusal, you can add this to the end of the .procmailrc file:

:0:
* ^From bugtraq-bounce@securityfocus.com
Mail/bugtraq

And if you are sure you never want to see mail with the subject "Lose weight fast", use

:0
* ^Subject: Lose weight fast
/dev/null

Again note the missing second colon; as a rule, a recipe should start with :0: when saving to a real file, and with :0 when deleting or forwarding.