>>>> My web server's response time for http requests skyrockets every
>>>> weekday between about 9am and 5pm. I've gone over my munin graphs and
>>>> the only one that really correlates well with the slowdown is "TCP
>>>> Queuing". It looks like I normally have about 400 packets per second
>>>> graphed as "direct copy from queue" in munin throughout the day, but 2
>>>> to 3.5 times that many are periodically graphed during work hours. I
>>>> don't see the same pattern at all from the graph of all traffic on my
>>>> network interface which actually peaks over the weekend. TCP Queuing
>>>> doesn't rise above 400 packets per second all weekend. This is
>>>> consistent week after week.
>>>>
>>>> My two employees come into work during the hours in question, and they
>>>> certainly make frequent requests of the web server while at work, but
>>>> if their volume of requests were the cause of the problem then that
>>>> would be reflected in the graph of web server requests but it is not.
>>>> I do run a small MTU on the systems at work due to the config of the
>>>> modem/router we have there.
>>>>
>>>> Is this a recognizable problem to anyone?
>>>
>>>
>>> I'm in the midst of this. Are there certain attacks I should check for?
>>
>>
>> It looks like the TCP Queuing spike itself was due to imapproxy which
>> I've now disabled. I'll post more info as I gather it.
>
>
> imapproxy was clearly affecting the TCP Queuing graph in munin but I
> still ended up with a massive TCP Queuing spike today and
> corresponding http response time issues long after I disabled
> imapproxy. Graph attached. I'm puzzled.


I just remembered that our AT&T modem/router does not respond to
pings. My plan is to move PPPoE off of that device and onto my
Gentoo router so that pings pass through the AT&T device to the Gentoo
router, but I haven't done that yet because I want to be on-site for
it. Could that behavior somehow be contributing to this problem? There
does seem to be a clear correlation between user activity at that
location and the bad server behavior.

- Grant