Gentoo Archives: gentoo-user

From: Grant <emailgrant@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: Dealing with scrapers - Help!
Date: Sat, 09 Aug 2008 16:48:52
Message-Id: 49bf44f10808090948l435276aap348bc17f56f46572@mail.gmail.com
In Reply to: Re: [gentoo-user] Re: Dealing with scrapers - Help! by Mick
>> > My apache web server has been very slow lately and webalizer charts
>> > show page accesses at 5x normal with other stats normal. I'm thinking
>> > scrapers? How do you guys deal with this? Do you identify the IP
>> > (how?) and ban it (how?)?
>> >
>> > - Grant
>>
>> I used netstat to identify the IP and I see that I can use it with
>> "deny from" in httpd.conf. It seems to be over now, but this type of
>> thing happens periodically. How can I be alerted to this type of
>> situation when it starts so I can block the IP right away?
>
> You will need to configure quotas probably using something like:
>
> http://www.howtoforge.com/mod_cband_apache2_bandwidth_quota_throttling
>
> Not sure if it is possible to differentiate between rogue and legit clients,
> other than by checking your logs to see what was blocked.

Turns out it was a "legit" bot. Watch out for this one:

Mozilla/5.0 (compatible; discobot/1.0; +http://discoveryengine.com/discobot.html)

It's bad that a single IP can bring down my http isn't it?

- Grant
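
For the "how can I be alerted" question above, one possible approach is to
count requests per client in the access log and flag anything over a limit.
Below is a minimal sketch, not a tested setup from this thread. It assumes the
log uses Apache's "combined" format (client IP first, user agent as the last
quoted field). The name top_clients, the threshold and the sample lines are
all made up for illustration.

```python
import re
from collections import Counter

# Assumes the "combined" log format: the client IP is the first field and
# the user agent is the last double-quoted field on the line.
LOG_LINE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')


def top_clients(lines, threshold):
    """Return (ip, user_agent, hits) for clients with at least `threshold` hits."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not look like log entries
            hits[match.groups()] += 1
    return [(ip, agent, n) for (ip, agent), n in hits.most_common() if n >= threshold]


# Made-up sample entries; the addresses come from the documentation range.
sample = [
    '192.0.2.7 - - [09/Aug/2008:16:40:01 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; discobot/1.0; +http://discoveryengine.com/discobot.html)"',
    '192.0.2.7 - - [09/Aug/2008:16:40:02 +0000] "GET /b HTTP/1.1" 200 734 "-" '
    '"Mozilla/5.0 (compatible; discobot/1.0; +http://discoveryengine.com/discobot.html)"',
    '198.51.100.3 - - [09/Aug/2008:16:40:05 +0000] "GET / HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (X11; Linux i686)"',
]

for ip, agent, n in top_clients(sample, threshold=2):
    print(n, ip, agent)
```

Run against the recent tail of the real access log from cron and mailed when
the result is non-empty, something like this would give an early warning, and
the reported IP could then go into a "deny from" line. Because it groups by
user agent as well as IP, a bot like discobot is identifiable at a glance.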