On Wed, 23 Jun 2004, Bart Braem wrote:

> Okay the situation is as follows: we have a server running that uses
> PHP to dynamically generate images with GD. That goes pretty fast, but
> when someone uses spiders to download our entire site it just crashes
> away because of hundreds of processes.
> We did manage to stop the spidering by blocking their user-agents but
> if some spider identifies itself as Mozilla we're out of business...
> Another thing we did was installing MMCache and that solves some
> problems but it's just a patch not a real solution...
> So we consider using nptl to have less processes (the php processes
> themselves are very small) and more threading.
> Would that be a good idea? Do you have any other suggestions?

Sorry for the late reply, I was driving from LA to Chicago the
past few days.

I do believe that GD is also thread safe, so moving to threads
might work well. There's a page on Apache's site listing libraries that
are known to be thread safe; I'd check that list against the
functionality you need.
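If you do go the threaded route, Apache 2.0's worker MPM is where you'd make the switch. A minimal sketch of the relevant httpd.conf section (all numbers are illustrative, tune them for your box):

```apache
# worker MPM: a small pool of processes with many threads each,
# instead of hundreds of single-threaded prefork children.
<IfModule worker.c>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
```

Keep in mind that PHP itself, and every extension you load, has to be built thread safe for this to be stable under worker.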

Unfortunately I don't believe that this alone will solve your
problem. You may need to play games in your HTML to make it difficult
for spiders to function, use iptables to limit the number of connections
per IP (I believe that's possible), or find some sort of mod_throttle
functionality for Apache 2.0, as another poster mentioned.
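For the per-IP connection cap, something along these lines should work, assuming your kernel and iptables build have the connlimit match available (the threshold of 20 is illustrative):

```shell
# Reject new HTTP connections from any single IP that already has
# more than 20 connections open to port 80.
iptables -A INPUT -p tcp --syn --dport 80 \
    -m connlimit --connlimit-above 20 -j REJECT --reject-with tcp-reset
```

A normal browser rarely holds that many connections at once, while an aggressive spider will hit the cap quickly.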

The iptables idea appears to be the cleanest implementation from
my point of view. This link has some info to get you started, and you
might have heard of the author:

http://www-106.ibm.com/developerworks/library/l-fw/

kashani