> It turned out this was a combination of two problems which made it
> much more difficult to figure out.
>
> First of all I didn't have enough apache2 processes. That seems like
> it should have been obvious but it wasn't for two reasons. Firstly,
> my apache2 processes are always idle or nearly idle, even when traffic
> levels are high. But it must be the case that each request made to
> nginx which is then handed off to apache2 monopolizes an apache2
> process even though my backend application server is the one using all
> the CPU instead of apache2. The other thing that made it difficult to
> track down was the way munin graphs apache2 processes. On my graph,
> busy and free processes only appeared as tiny dots at the bottom
> because apache2's ServerLimit is drawn on the same graph which is many
> times greater than the number of busy and free processes. It would be
> better to draw MaxClients instead of ServerLimit since I think
> MaxClients is more likely to be tuned. It at least appears in the
> default config file on Gentoo. Since busy and free apache2 processes
> were virtually invisible on the munin graph, I wasn't able to
> correlate their ebb and flow with my server's response times.
>
> Once I fixed the apache2 problem, I was sure I had it nailed. That's
> when I emailed here a few days ago to say I think I got it. But it
> turned out there was another problem and that was Odoo (formerly known
> as OpenERP) which is also running in a reverse proxy configuration
> behind nginx. Whenever someone uses Odoo on my server, it absolutely
> destroys performance for my non-Odoo website. That would have been
> really easy to test and I did test stopping the odoo service early on,
> but I ruled it out when the problem persisted after stopping Odoo
> which I now realize must have been because of the apache2 problem.

The root of the Odoo problem was that I didn't have keepalive enabled
between the nginx reverse proxy and the Odoo server. nginx enables
keepalive by default on the client side (HTTP/1.1) but not on the
upstream side, where it speaks HTTP/1.0 by default. I still see TCP
queuing spikes in munin when Odoo is in use, but they no longer slow
down the apache2/nginx reverse proxy running my main site.
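
For reference, enabling upstream keepalive in nginx usually looks
something like the sketch below. The upstream name, address, port,
server_name and connection count are placeholders chosen for
illustration, not taken from the actual setup described here.

```nginx
# Hypothetical upstream; the name, address and port are assumptions.
upstream odoo_backend {
    server 127.0.0.1:8069;
    keepalive 16;    # idle upstream connections kept open per worker
}

server {
    listen 80;
    server_name odoo.example.com;    # placeholder hostname

    location / {
        proxy_pass http://odoo_backend;
        proxy_http_version 1.1;            # upstream keepalive requires HTTP/1.1
        proxy_set_header Connection "";    # clear the default "Connection: close"
        proxy_set_header Host $host;
    }
}
```

All three pieces matter: the keepalive directive in the upstream
block, proxy_http_version 1.1, and an empty Connection header. Without
the last two, nginx talks HTTP/1.0 to the backend and closes the
connection after each request.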

- Grant