On Monday 29 June 2009 19:04:44 Steve wrote:
> Today my Gentoo server, which has sat happily churning through my mundane
> (and lightweight) tasks, froze. I noticed when it stopped serving DNS
> queries, and the server was even unresponsive from the command
> prompt. I rebooted and was a bit taken aback at what I found.
>
> The server currently runs, but has a load of over 60, where I'd expect a
> load of below 0.1. Investigations using top did not suggest that a
> single process was using vast amounts of processing time, but there
> were significantly more clamscan processes than I'd expect, and even
> more procmail processes.
>
> --
> $ ps auwx | grep clamscan | grep -v grep | wc -l
> 42
> $ ps auwx | grep procmail | grep -v grep | wc -l
> 94
> $ ps auwx | grep clamassassin | grep -v grep | wc -l
> 55
> --
>
> The first few lines from top say:
>
> --
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 15451 usr       20   0 35944  33m  872 D  2.7  3.3   0:00.60 clamscan
>   216 root      15  -5     0    0    0 S  0.7  0.0   0:03.80 kswapd0
> 15116 usr       20   0 76136  15m  668 D  0.7  1.6   0:03.30 clamscan
> 15299 usr       20   0  2584 1224  840 R  0.7  0.1   0:04.36 top
> 15428 usr       20   0 61288  57m  872 D  0.7  5.7   0:01.38 clamscan
>     1 root      20   0  1648  196  172 S  0.0  0.0   0:00.64 init
>     2 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd
> --
>
> The procmail configuration I've adopted hasn't changed in years...
> --
> DEFAULT=$HOME/.maildir/
> SHELL=/bin/sh
> MAILDIR=$HOME/.maildir
>
> :0fw
> * < 1024000
> | /usr/bin/clamassassin | /usr/bin/spamc -f
> --
>
> I'm assuming that my suddenly starting to have problems with this has
> something to do with an update to clamd/clamassassin. I have a vague
> recollection that one or the other of them might have been updated when
> I last synchronised and emerged updates, but I can't remember.
>
> Any ideas? This isn't usually a heavily loaded server: I have more
> procmail processes running than I usually receive emails in an hour.
> Something's wrong. Can anyone offer any hints? Has anyone else run
> into this problem? Is there a known 'quick fix'?

Looks like you have roughly 200 processes sitting there blocked on I/O (the
clamscan entries in your top output are all in state D). Is there
anything related in the logs?
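
To confirm that, a rough sketch along these lines counts the processes in
uninterruptible sleep (state D, usually waiting on disk or swap) per command
name. The function name count_blocked is made up for illustration, and the
usage line assumes a procps-style ps that accepts -o stat= -o comm=.

```shell
#!/bin/sh
# Count processes in uninterruptible sleep (state "D"), grouped by command.
# Reads "STAT COMMAND" pairs on stdin, one process per line.
# count_blocked is a hypothetical helper name, not a standard tool.
# Only the first word of the command name is used for grouping.
count_blocked() {
    awk '$1 ~ /^D/ { n[$2]++ }
         END { for (c in n) print n[c], c }' | sort -rn
}

# Usage (feeds the live process table into the function):
#   ps -eo stat= -o comm= | count_blocked
```

If clamscan and procmail dominate that list while kswapd0 stays busy, the box
is most likely short of memory and swapping rather than short of CPU.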

Your best bet is to examine emerge.log (or better still, use genlop) and find
all recent upgrades that might affect this. Then roll them back one by one
until the problem goes away. Once you know the errant package, we can start to
examine diffs and see why it might behave like that.
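
If genlop isn't installed, something like the following pulls the same
information straight out of the log. It is only a sketch: recent_merges is a
made-up name, the default path /var/log/emerge.log is where Portage normally
writes its log, and it assumes each finished merge is logged as a line of the
form "EPOCH:  ::: completed emerge (N of M) category/package-version to /".

```shell
#!/bin/sh
# List packages whose merge completed at or after a given Unix timestamp.
# Usage: recent_merges SINCE_EPOCH [LOGFILE]
# recent_merges is a hypothetical helper; the log line layout is an assumption.
recent_merges() {
    since=$1
    log=${2:-/var/log/emerge.log}
    awk -v since="$since" '
        /::: completed emerge/ {
            ts = $1
            sub(/:$/, "", ts)            # drop the colon after the timestamp
            if (ts + 0 >= since + 0)
                print ts, $8             # field 8 is category/package-version
        }' "$log"
}

# Usage, assuming GNU date for the relative-date arithmetic:
#   recent_merges "$(date -d '14 days ago' +%s)"
```

Anything mail-related in that list (clamav, clamassassin, spamassassin,
procmail) is the first candidate to roll back.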
66 |
|
67 |
-- |
68 |
alan dot mckinnon at gmail dot com |