Colorado Woman Sues To Hold Web Crawlers To Contracts
The Internet Archive (archive.org) crawls the web and stores copies of websites indefinitely. This is a useful resource when information is no longer available online and you want to view a copy of it. It’s not so great for webmasters who would rather leave no trace of old content, whether to maintain privacy or to avoid embarrassment.
Personally, I can relate to how that woman feels: archive.org archived my site’s contents too, and they embarrassed me because I was a little silly when I started out on the web. Instead of suing them, though, I did what any reasonable webmaster would do. I added the following to that site’s robots.txt:
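# ia_archiver is the user-agent of the Internet Archive's crawler; Disallow: / blocks it from the whole site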
User-agent: ia_archiver
Disallow: /
Consequently, that site can’t be found on archive.org, and you can’t see my Pooh-bear-adorned website.
This woman’s lawsuit is ridiculous and frivolous. I hope it gets thrown out and she is forced to pay archive.org’s legal fees.
I got this story from Slashdot, where I like to read the comments: http://yro.slashdot.org/article.pl?sid=07/03/17/1455214
Someone at Slashdot found the site: profane-justice.org*. A WHOIS lookup confirmed it belongs to the same woman.
* This marks the first time that I have manually added nofollow code to one of my blog posts. I initially wasn’t going to hyperlink it at all so she wouldn’t get any link credit, but then I didn’t want to inconvenience my visitors. I decided to hyperlink it, switching to HTML mode and adding rel="nofollow".
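For reference, the markup I’m describing looks something like this (I’m assuming the plain http:// address here, and the link text is just whatever you want displayed):
<!-- rel="nofollow" asks search engines not to pass ranking credit to the linked site -->
<a href="http://profane-justice.org" rel="nofollow">profane-justice.org</a>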
Update: it looks like archive.org has indeed removed the site: http://web.archive.org/web/*/profane-justice.org/ now shows “Blocked Site Error.” FYI, if the site had done the correct thing and blocked the crawler via robots.txt, the message would instead say “Robots.txt Query Exclusion.”