Linux Today: Linux News On Internet Time.

Fight Image Spam With FuzzyOCR And SpamAssassin On Debian Lenny

Apr 30, 2010, 19:32
By Falko Timme

"This tutorial describes how to scan emails for image spam with FuzzyOCR on a Debian Lenny server. FuzzyOCR is a SpamAssassin plugin that targets unsolicited bulk mail in which images are the main content carrier. Using several methods, it analyzes the content and properties of images to distinguish legitimate mail (ham) from spam. To keep system load low, FuzzyOCR scans only mail that SpamAssassin has not already classified as spam, thus avoiding unnecessary work.

"I do not issue any guarantee that this will work for you!

"1 Preliminary Note

"In this article I will use Debian Lenny for the base system.

"I assume that SpamAssassin is already installed and working, with /etc/mail/spamassassin/ as its main configuration directory. If your directory is different (e.g. if you have ISPConfig 2 installed, the directory is /home/admispconfig/ispconfig/tools/spamassassin/etc/mail/spamassassin/), that is no problem; I will point out which paths to change.

"Please make sure that your SpamAssassin version works with FuzzyOCR. For example, the FuzzyOCR version I'm going to install here (fuzzyocr-3.5.1-devel.tar.gz) requires SpamAssassin 3.1.4 or newer."
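The version requirement quoted above can be checked before installing anything. The following is a minimal sketch, not part of the original tutorial. It assumes that the `spamassassin` command is on the PATH and that `spamassassin --version` prints a line of the form "SpamAssassin version X.Y.Z". The helper names `parse_version`, `meets_minimum`, and `installed_version` are hypothetical and chosen only for illustration.

```python
import re
import subprocess

# Minimum SpamAssassin version named in the excerpt for fuzzyocr-3.5.1-devel.
MINIMUM = (3, 1, 4)


def parse_version(text):
    """Return the first 'version X.Y[.Z]' number in text as a tuple of ints, or None."""
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=MINIMUM):
    """True if version (a tuple of ints) is at least minimum; None never qualifies."""
    return version is not None and version >= minimum


def installed_version():
    """Ask the local spamassassin binary for its version.

    Assumption: the binary is on the PATH and the output format matches
    what parse_version expects.
    """
    result = subprocess.run(
        ["spamassassin", "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_version(result.stdout)


if __name__ == "__main__":
    version = installed_version()
    if meets_minimum(version):
        print("SpamAssassin version is new enough:", version)
    else:
        print("SpamAssassin version is too old or could not be detected:", version)
```

The parsing is kept separate from the subprocess call so that the comparison logic can be exercised on a sample string without SpamAssassin installed.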
