Posted in News at 12:19 pm March 17, 2007

Colorado Woman Sues To Hold Web Crawlers To Contracts

The Internet Archive (archive.org) crawls the web and stores copies of websites indefinitely. This is a useful resource when a page is no longer available and you want to view a copy of it. It’s not so great for webmasters who want no evidence of their old content left around, whether to maintain privacy or to avoid embarrassment.

Personally, I can relate to how that woman feels. Archive.org archived one of my own sites, and I was embarrassed by its contents because I was a little silly when I started out on the web. Instead of suing them, though, I did what any reasonable webmaster would do. I added the following to that site’s robots.txt:

User-agent: ia_archiver
Disallow: /

Consequently, that site can’t be found in archive.org. You can’t see my Pooh bear-adorned website.
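
If you want to double-check a rule like this before relying on it, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URL, the SomeOtherBot name, and the can_crawl helper are made up for illustration. It only shows how a standards-following parser reads the two lines above, not how archive.org's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The same two lines quoted above, held in a string so no network access is needed.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for this post, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder for the real site.
archiver_allowed = can_crawl(ROBOTS_TXT, "ia_archiver", "http://example.com/")
# SomeOtherBot is a made-up crawler name that the rule does not mention.
other_allowed = can_crawl(ROBOTS_TXT, "SomeOtherBot", "http://example.com/")

print("ia_archiver allowed:", archiver_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

The rule names only ia_archiver, so under this parser other crawlers are unaffected and the site can still be indexed by search engines.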

This woman’s lawsuit is ridiculous and frivolous. I hope it gets thrown out and she is forced to pay archive.org’s legal fees.

I got this story from Slashdot, where I like to read the comments: http://yro.slashdot.org/article.pl?sid=07/03/17/1455214

Someone at Slashdot found the site: profane-justice.org*. A whois lookup confirmed it belongs to the same woman.

* This marks the first time that I have manually added nofollow code to a blog post. I initially wasn’t going to hyperlink the site so that she didn’t get credit for the link, but I didn’t want to inconvenience my visitors either. I decided to hyperlink it, switched to HTML mode, and added rel="nofollow" to the link.

Update: it looks like archive.org has indeed removed the site: http://web.archive.org/web/*/profane-justice.org/ is showing “Blocked Site Error.” FYI, if the site had done the correct thing and set up a robots.txt file, the message would say: “Robots.txt Query Exclusion.”

One Response to “Woman Sues Archive.org”

  1. billy (1 comments) said,

    http://thetruthistold.com/

    October 28, 2007 - Sunday at 2:23 am
