Fighting Comment Spam with Bad Behavior, Akismet, and Spam Karma

I’ve been using Akismet since I started using WordPress in early 2006. It does a tremendous job of detecting spam. However, it’s not perfect, and a few messages get past its filter each week. Because I hold comments from new posters for moderation, that spam never showed up on the blog. The downside is that legitimate new posters’ comments could not appear without a delay either.

I learned about Bad Behavior as a way to fend off bad bots so they can’t even access my blog, let alone create spam comments. From BB’s Benefits page:

Bad Behavior is designed to integrate into your PHP-based Web site, running as early as possible to throw out spam bots before they have the opportunity to vandalize your site with their junk, or even to scrape your pages for e-mail addresses and forms to fill out.

Not only does Bad Behavior block actual vandalism to your site, it also blocks many e-mail address harvesters, resulting in less e-mail spam, and many automated Web site cracking tools, helping to improve your Web site’s security.

I also added Spam Karma to supplement Akismet. With its default settings, it seems to be more lenient than Akismet. I had Spam Karma process the comments that Akismet had already flagged as spam. SK approved some of them, which made them show up on the blog, and put others in the moderation queue. So I had to train SK to recognize those messages as spam.

SK also instructs you to moderate comments through its own moderation interface instead of the WordPress default. On its own, this would have left me unable to train Akismet. The solution was the Akismet plugin for Spam Karma. With it, marking a comment as spam trains both SK and Akismet. An unadvertised benefit of this plugin is that SK factors Akismet’s judgment on a comment into the karma it computes. Thus, spam comments that got low spam scores before now get higher spam scores because Akismet has already detected them as spam. This counteracts the leniency that I observed. It also reduces the number of obvious (according to Akismet) spam comments that I have to moderate.
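To picture how an extra verdict can tip a borderline comment over the line, here is a minimal Python sketch. It is not Spam Karma’s actual code (SK is a PHP plugin), and every name, weight, and threshold in it is made up for illustration. It only assumes that each check contributes points to a running spam score and that a comment is treated as spam once the total reaches a threshold.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """One check's opinion about a comment (hypothetical structure)."""
    source: str
    spam_points: float  # higher means more spam-like


def total_spam_score(verdicts):
    """Sum the points contributed by every check."""
    return sum(v.spam_points for v in verdicts)


# Hypothetical values, chosen only to illustrate the effect.
SPAM_THRESHOLD = 6.0
AKISMET_POINTS = 5.0

# Two lenient built-in checks that each see only mild evidence of spam.
own_checks = [Verdict("link_count", 2.0), Verdict("old_post", 1.0)]

score_without_akismet = total_spam_score(own_checks)
score_with_akismet = total_spam_score(
    own_checks + [Verdict("akismet", AKISMET_POINTS)]
)

is_spam_without = score_without_akismet >= SPAM_THRESHOLD
is_spam_with = score_with_akismet >= SPAM_THRESHOLD
```

With these made-up numbers, the comment stays under the threshold on the built-in checks alone and crosses it once the Akismet verdict is added, which matches the behavior I observed.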

Another thing I like about Spam Karma is that it gives the commenter a second chance to submit the comment when it is unsure whether the comment is spam or ham. This immediate feedback is more helpful to real users than having their comment go to a moderation queue.
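The second-chance idea amounts to a three-way decision instead of a plain spam/ham split. The sketch below is only an illustration in Python under assumed cutoffs; the function name, the labels, and the numbers are hypothetical and are not taken from Spam Karma.

```python
def decide(spam_score, approve_below=2.0, reject_from=6.0):
    """Classify a comment by its spam score (hypothetical cutoffs).

    Clearly clean comments are approved, clearly spammy ones are
    rejected, and anything in between gets a second chance.
    """
    if spam_score < approve_below:
        return "approve"
    if spam_score >= reject_from:
        return "spam"
    # Uncertain: ask the commenter to confirm instead of queueing.
    return "challenge"
```

For example, `decide(4.0)` falls between the two cutoffs and returns `"challenge"`, so the commenter gets immediate feedback rather than a silent wait in the moderation queue.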

I implemented all three comment-spam fighting measures only recently. I hope they make it easier for real users to add comments, while preventing spam from showing up on my blog.

Things I learned:

  • Don’t reprocess comments that SK has already processed. If SK has seen a particular comment before, reprocessing will apply the karma scoring to it a second time.
  • Don’t reprocess previously approved comments (from before I used SK). I cannot prove it, but it seems that when I reprocessed already approved comments, some of the really old ones got marked as spam. I suspect that’s because one of the indicators of spam is replying to old posts.
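Both lessons boil down to picking which comments to hand to the filter. Here is a small Python sketch of that selection; the `Comment` fields and the status values are assumptions for illustration and do not mirror the WordPress or Spam Karma data model.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical comment record."""
    comment_id: int
    status: str  # assumed values: "pending", "approved", "spam"


def comments_to_process(comments, already_scored_ids):
    """Return only the comments that are safe to send through the filter.

    Skips comments that were scored before (to avoid double scoring)
    and comments that were already approved (to avoid flagging old,
    legitimate ones after the fact).
    """
    return [
        c for c in comments
        if c.comment_id not in already_scored_ids and c.status != "approved"
    ]


comments = [
    Comment(1, "approved"),  # approved before the filter was installed
    Comment(2, "pending"),   # already scored once
    Comment(3, "pending"),   # new, never scored
]
selected = comments_to_process(comments, already_scored_ids={2})
selected_ids = [c.comment_id for c in selected]
```

In this example only comment 3 is selected, since comment 1 was already approved and comment 2 has been scored before.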
