Dale's Weblog

looking back it was easy.
Saturday 04 September

User-agent: * Disallow: /

Posted by dale on Wed, 22 Sep 2004 20:17:53 EST

I dislike web crawlers (well, any crawler other than Googlebot). Today, for the first time, http://forums.dalegroup.net has received over 1 GB of data traffic JUST from msnbot + jeteye.com (for this month). Jeteye is the bigger pain, as it doesn’t identify itself as a bot. Grrr. About 30 minutes ago I added
"User-agent: *" and "Disallow: /" to the robots file for the forums. The following is a current reading of the bots:

Guest 22 Sep 2004 08:10 pm 22 Sep 2004 08:10 pm Viewing profile 64.71.144.39
Guest 22 Sep 2004 08:10 pm 22 Sep 2004 08:10 pm Music 207.46.98.127
Guest 22 Sep 2004 08:09 pm 22 Sep 2004 08:09 pm Babble Board 207.46.98.127
Guest 22 Sep 2004 08:09 pm 22 Sep 2004 08:09 pm PCs/Consoles/LANs etc 64.71.144.19
Guest 22 Sep 2004 08:09 pm 22 Sep 2004 08:09 pm Art/3d Art 64.71.144.61
Guest 22 Sep 2004 08:09 pm 22 Sep 2004 08:09 pm Babble Board 207.46.98.127
Guest 22 Sep 2004 08:08 pm 22 Sep 2004 08:08 pm Art/3d Art 64.71.144.71
Guest 22 Sep 2004 08:08 pm 22 Sep 2004 08:08 pm School 207.46.98.127
Guest 22 Sep 2004 08:08 pm 22 Sep 2004 08:08 pm Art/3d Art 64.71.144.61
Guest 22 Sep 2004 08:07 pm 22 Sep 2004 08:07 pm School 207.46.98.127
Guest 22 Sep 2004 08:07 pm 22 Sep 2004 08:07 pm Art/3d Art 64.71.144.21
Guest 22 Sep 2004 08:07 pm 22 Sep 2004 08:07 pm Art/3d Art 64.71.144.71
Guest 22 Sep 2004 08:07 pm 22 Sep 2004 08:07 pm School 207.46.98.127
Guest 22 Sep 2004 08:06 pm 22 Sep 2004 08:06 pm Babble Board 207.46.98.127
Guest 22 Sep 2004 08:06 pm 22 Sep 2004 08:06 pm School 64.71.144.25
Guest 22 Sep 2004 08:06 pm 22 Sep 2004 08:06 pm Art/3d Art 64.71.144.37
Guest 22 Sep 2004 08:05 pm 22 Sep 2004 08:05 pm Babble Board 207.46.98.127
Guest 22 Sep 2004 08:05 pm 22 Sep 2004 08:05 pm Searching forums 65.54.188.101
Guest 22 Sep 2004 08:05 pm 22 Sep 2004 08:05 pm PCs/Consoles/LANs etc 64.71.144.19
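
The listing above can be tallied by address to see how the hits cluster. This is only a rough Python sketch: the addresses are copied from the listing, and grouping by the first three octets is an illustrative choice of mine, not something the forum software does.

```python
from collections import Counter

# Client addresses from the listing above, in the order shown.
ADDRESSES = [
    "64.71.144.39", "207.46.98.127", "207.46.98.127", "64.71.144.19",
    "64.71.144.61", "207.46.98.127", "64.71.144.71", "207.46.98.127",
    "64.71.144.61", "207.46.98.127", "64.71.144.21", "64.71.144.71",
    "207.46.98.127", "207.46.98.127", "64.71.144.25", "64.71.144.37",
    "207.46.98.127", "65.54.188.101", "64.71.144.19",
]


def prefix24(ip: str) -> str:
    """Return the first three octets, e.g. '64.71.144.39' -> '64.71.144'."""
    return ip.rsplit(".", 1)[0]


# Hits per three-octet prefix.
by_prefix = Counter(prefix24(ip) for ip in ADDRESSES)

# Number of different addresses seen under each prefix.
distinct = {
    prefix: len({ip for ip in ADDRESSES if prefix24(ip) == prefix})
    for prefix in by_prefix
}

for prefix, hits in by_prefix.most_common():
    print(f"{prefix}.x  hits={hits}  distinct_ips={distinct[prefix]}")
```

On this sample the 64.71.144.x addresses account for 10 of the 19 rows, spread over 7 different hosts, while 207.46.98.127 alone accounts for 8.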

Why don’t they listen to the robots.txt file?! Argh, pain. Now, if I were to block the IP addresses of these bots, it wouldn’t work, or would be a bad idea, for the following reasons:

1) I’ve done this before and they just come back on completely different IP addresses/subnets :|
2) I host other websites whose owners want their pages crawled so that they are listed on search engines.
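
One thing that can at least be checked locally is what a rule set tells a crawler that does follow robots.txt. Below is a minimal sketch using Python's standard urllib.robotparser. BLOCK_ALL is the rule described in this post, written with one directive per line as the format expects. ALLOW_GOOGLEBOT is a hypothetical variant of my own that lets Googlebot through and keeps everyone else out. The user-agent strings passed in are assumptions about how those crawlers name themselves. None of this can make a crawler that ignores the file behave, and a crawler may keep using a cached copy of robots.txt for a while after it changes.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post: every crawler is asked to stay out of everything.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical variant (not from the post): an empty Disallow lets
# Googlebot crawl everything, while all other crawlers stay blocked.
ALLOW_GOOGLEBOT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

URL = "http://forums.dalegroup.net/"


def can_fetch(robots_txt: str, agent: str, url: str = URL) -> bool:
    """Report whether a compliant crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for name, rules in [("block all", BLOCK_ALL), ("allow Googlebot", ALLOW_GOOGLEBOT)]:
    for agent in ["Googlebot", "msnbot"]:
        print(f"{name:16} {agent:10} allowed={can_fetch(rules, agent)}")
```

Under the first rule set both agents are refused. Under the second only Googlebot is allowed, which matches the "any crawler other than Googlebot" preference above.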

Before, there was just msnbot, but now there is jeteye too, which is just as bad. Overall outbound traffic is just about to hit 3 GB for this month, and about 30% of it has gone to these bots! Dodgy!

Does anyone else have this problem? :( I’m going to have to move to another server with more bandwidth I think.

This post was edited: Sun, 26 Sep 2004 10:06:05 EST
Below are the comments for this news item

stop complaining. or something like that

1: Comment by ucosty - Wed, 22 Sep 2004 20:45:13 EST


Copyright © Michael Dale 2004.