Tag Archives: blogging

Fighting Comment Spam with Bad Behavior, Akismet, Spam Karma

I’ve been using Akismet since I started using WordPress in early 2006. It does a tremendous job of detecting spam. However, it’s not perfect, and a few messages get past its filter each week. Since I enforce moderation on new posters, the spam never showed up on the blog. However, that same measure kept legitimate posters’ comments from appearing without a delay.

I learned about Bad Behavior as a way to fend off bad bots so they can’t even access my blog, let alone create spam comments. From BB’s Benefits page:

Bad Behavior is designed to integrate into your PHP-based Web site, running as early as possible to throw out spam bots before they have the opportunity to vandalize your site with their junk, or even to scrape your pages for e-mail addresses and forms to fill out.

Not only does Bad Behavior block actual vandalism to your site, it also blocks many e-mail address harvesters, resulting in less e-mail spam, and many automated Web site cracking tools, helping to improve your Web site’s security.

I also added Spam Karma to supplement Akismet. With its default settings, it seems to be more lenient than Akismet. I had Spam Karma process the spam that Akismet had already filtered. SK approved some of the spam, which resulted in the comments showing up in the blog, and added some of them to the moderation queue. Note that Akismet had already treated the comments as spam. So I had to train SK to recognize those messages as spam.

SK also instructs you to use its own moderation interface to moderate comments, instead of the default WordPress one. On its own, this meant that I would be unable to train Akismet. The solution was the Akismet plugin for Spam Karma. This way, if I mark a comment as spam, it trains both SK and Akismet. An unadvertised benefit of using this plugin is that it makes SK factor in Akismet’s judgment on a comment when computing the karma. Thus, spam comments that got low spam scores before will now get higher spam scores because Akismet has already flagged them. This counteracts the leniency that I observed. It also reduces the number of obvious (according to Akismet) spam comments that I have to moderate.
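To make the idea of factoring Akismet’s verdict into the karma concrete, here is a rough Python sketch. This is not Spam Karma’s actual code: the function names, the penalty value, and the threshold are all made up for illustration, and I’m treating a higher score as “more spammy” to match the wording above.

```python
# Hypothetical illustration only; not Spam Karma's real scoring logic.
AKISMET_PENALTY = 5.0  # assumed weight given to an Akismet "spam" verdict
SPAM_THRESHOLD = 5.0   # assumed score at or above which a comment counts as spam


def combined_spam_score(local_score: float, akismet_says_spam: bool) -> float:
    """Add a fixed penalty to the local score when Akismet flagged the comment."""
    return local_score + (AKISMET_PENALTY if akismet_says_spam else 0.0)


def is_spam(local_score: float, akismet_says_spam: bool) -> bool:
    """Decide spam vs. ham from the combined score."""
    return combined_spam_score(local_score, akismet_says_spam) >= SPAM_THRESHOLD
```

With these assumed numbers, a comment that a lenient local filter scores at 2.0 stays below the threshold on its own, but crosses it once Akismet’s verdict is added.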

Another thing I like about Spam Karma is that it gives the user a second chance to submit the comment if it is unsure whether it is spam or ham. This immediate feedback is more helpful to real users than having their comment go to a moderation queue.

I implemented all three comment-spam-fighting measures only recently. I hope the combination makes it easier for real users to add comments, while preventing spam from showing up in my blog.

Things I learned:

  • Don’t reprocess comments that SK has already processed. It will apply the karma scoring a second time to any comment it has seen before.
  • Don’t reprocess previously approved comments (from pre-SK usage). I can’t prove it, but it seems that when I reprocessed already approved comments, some of the really old ones got marked as spam. That’s because one of the indicators of spam is replying to old posts.
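Both lessons boil down to “skip anything that has already been judged.” Here is a small Python sketch of that guard. It is only an illustration under my own assumptions: the names are invented, the ID sets stand in for whatever bookkeeping a real plugin does, and none of this reflects how Spam Karma actually stores its state.

```python
# Hypothetical sketch; names are invented and this is not Spam Karma's code.
from typing import Callable, Dict, Iterable, Set


def score_once(
    comment_ids: Iterable[int],
    score_fn: Callable[[int], float],
    already_scored: Set[int],
    approved_before_install: Set[int],
) -> Dict[int, float]:
    """Score only comments that were neither scored before nor approved pre-install."""
    results: Dict[int, float] = {}
    for cid in comment_ids:
        if cid in already_scored or cid in approved_before_install:
            continue  # skip: would double-count karma or re-judge an old approval
        results[cid] = score_fn(cid)
        already_scored.add(cid)  # remember it so a later run skips it
    return results
```

Running the function twice over the same comments scores each one only the first time, which is the behavior I wish I’d had when I hit the reprocess button.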

Five Reasons I Blog

I’ve been blog-tagged by Adam (tag) and Sebastian (tag) to write about why I blog.

First, let me clarify that there have been three incarnations of my blog. When I started in 2004, I used a phpBB forum to post “blog” entries. I did it for three months but then lost interest. In early 2006, I decided to try again but installed WordPress instead of a forum application. I’ve been hooked since then. I don’t write regularly, but at least I don’t think I will stop blogging the way I did in 2004. Also, some of my blog entries came from transferring journal-like entries from my photo gallery. That would explain the discrepancy in the dates of my archive if you were following along.

Although some of my core reasons for blogging have been the same since the first incarnation, some are no longer as relevant.

  1. I love to write. In high school, I was editor of the school newspaper. I also like to read the news so that’s something else I would blog about.
  2. I like to educate and inform and give tips when I can. I learn a lot from people on the internet and I feel that I can give back by sharing what I know. I use Google Analytics to see the keywords that visitors use to find my blog. When I see the queries, it further encourages me to blog, knowing that I am providing information that people are seeking.
  3. I do less of this now, but when I started my first blog, it was mainly to rant, though I call it Venting. It was a nice release to purge the thoughts I had in my head. Now that I’m happier with life, I don’t really feel the need to rant that much, or have something bother me so much that I will remember to write about it.
  4. The reason I started to blog again was that I actually had more to write about. In the intervening time, I became a member of the Coppermine dev team, where I contributed code, and I got two cats, new gadgets, and a Wii. Plus, Google continues to release cool products/features.
  5. You know how if you have a significant other–especially one who lives with you–you can tell them random thoughts at random times? Well, I don’t have that at home. Thus my blog is a way for me to express myself whenever and however.

Now, it’s my turn to blog-tag. I’m going to go beyond the SEO world since I don’t know any other SEO who hasn’t already been tagged (or tagged me).

Joachim Mueller: (“gaugau.de – noch eine unnötige Webseite”; aka GauGau, Coppermine’s project manager)

Dr. Tarique Sani: (“Tarique’s Travails: Shades of Darkness”; Coppermine developer)

Rich Jhong: (“The taste of Ho Ho Puffs”)

I feel like I should post a photo of my cat(s) and me the way Matt did when he wrote about why he blogged. His cats helped inspire me to get my own.

Most realistic spam comment I’ve seen

In addition to using Akismet to do the bulk of the filtering of comment spam, I moderate comments from new users. The past few days, several comments have made it through Akismet’s filters and I was notified to moderate them. Today, I saw one that seemed genuine:

I always have terrible trouble with comment-related plugins that require me to put some line in the comment loop; I can never seem to find the right spot. Can anyone tell me where I should put the php line in my comments loop? I haven not modified anything much, and I would be very grateful. Thanks!

Then, I looked at the URL they listed. It was a tramadol subdomain on a free host. :P So I marked that comment as spam.

If you’re using Akismet, please make sure to check the URL, too. No matter what the text says, you can discern the intention of the comment by looking at the URL, because the spammers are trying to gain backlinks. We really need everyone who is using Akismet to be aware of this so that the filters don’t get tainted and let these comments through.
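If you wanted to automate that glance at the URL, a crude first pass might look like the Python sketch below. It’s only an illustration: the keyword list is an assumption of mine (a real one would be much longer and need upkeep), the function name is made up, and it has nothing to do with how Akismet itself works.

```python
from urllib.parse import urlparse

# Assumed example keywords; a real list would be longer and maintained over time.
SUSPICIOUS_KEYWORDS = ("tramadol", "viagra", "casino")


def url_looks_spammy(url: str) -> bool:
    """Return True if the URL's hostname contains a suspicious keyword."""
    # Commenters often omit the scheme; prefix "//" so urlparse still finds the host.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = urlparse(url).hostname or ""  # hostname is already lowercased
    return any(word in host for word in SUSPICIOUS_KEYWORDS)
```

A check like this only looks at the hostname, so it would catch a keyword subdomain on a free host but not a spammy path on an otherwise innocent-looking domain. You still need to look at the link yourself.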

Update: I searched for the comment text, and it turns out to be copied from a real comment that showed up in February 2005: http://inner.geek.nz/archives/2005/01/12/wp-plugin-official-comments/

Man, how low will they go?