I've been experiencing spontaneous reboots on one gentoo machine
lately. Looking thru /var/log/messages... I see the restarts but
looking above that... I'm not seeing anything I recognize as being a
culprit.
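
A minimal sketch of one way to do that "looking above the restart"
step mechanically: collect the last few lines logged before each boot.
It assumes the kernel's "Linux version ..." banner reaches
/var/log/messages and can serve as the boot marker; the function name
and the context size are illustrative only.

```python
import re
from collections import deque

# Assumption: the kernel boot banner ("Linux version ...") marks each boot.
BOOT_MARKER = re.compile(r"kernel: .*Linux version ")


def lines_before_boots(lines, context=5):
    """Return (boot_line, preceding_lines) for every boot marker found."""
    recent = deque(maxlen=context)  # rolling window of the latest lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if BOOT_MARKER.search(line):
            results.append((line, list(recent)))
            recent.clear()  # start a fresh window for the new boot
        else:
            recent.append(line)
    return results
```

Passing it an open log file, for example
`lines_before_boots(open("/var/log/messages", errors="replace"), context=20)`,
gives the last entries written before each reboot. If those entries are
always routine, that by itself hints at a sudden failure (power,
hardware, or a hard lockup) rather than an orderly shutdown.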
|
It's been happening for a few weeks... but I've been busy and am only
now digging into it (the machine is no kind of server).
|
It appears to only happen in X (I'm using xfce4) and I've only noticed
it since I started running 2.6.28 kernels, although I couldn't say
that it seemed to be directly related.
|
I mean I didn't boot into 2.6.28 and suddenly notice spontaneous
rebooting.
|
It does not appear to be heat related... but I am only now using
lm_sensors to keep an accurate record and see if there appears to be a
relationship.
|
I've had two today, so either it's happening more often or I'm just
spending more time on that machine.
|
It may also be only the first or second time it's happened while I was
actually right at the keyboard.
|
I'm sorry to be so vague about it, but in truth, I've been pretty lazy
about it... since no real harm comes of an unexpected reboot on that
machine (so far anyway). But clearly it's something that has to be
figured out.
|
The only things I've checked so far...
1) browsing thru /var/log/messages (having trouble recognizing
anything that looks suspicious).
|
I have noticed what appears to be a time/date anomaly where the
progression of time is suddenly irregular. That is, an earlier
time shows up amongst some later times.
|
It appears to have been me sudoing to visudo, and apparently
having /etc/sudoers open long enough for the closing of it to be
earlier than other events taking place.
|
Again... I'm not real sure exactly what happened there but it
does not appear to coincide with a reboot anyway.
|
2) checking how hot the cpu is getting (doesn't appear to be a
problem). But I'm now running a cron job recording temperatures every
10 minutes, so that may turn up something.
|
3) checking for overfilled disks (none show in df -h).
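
The time/date anomaly mentioned under item 1 can be hunted for
automatically instead of by eye. The following is only a sketch: it
assumes the classic syslog stamp ("Feb  3 10:14:58") in the first 15
characters of each line and English month names, and because that stamp
carries no year, the caller has to supply one. The function name is
made up for illustration.

```python
from datetime import datetime


def find_time_regressions(lines, year):
    """Yield (previous_line, line) pairs where the timestamp goes backwards."""
    previous_time = previous_line = None
    for line in lines:
        line = line.rstrip("\n")
        try:
            # Syslog stamps omit the year, so prepend the one we were given.
            stamp = datetime.strptime(
                f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # line does not start with a syslog timestamp
        if previous_time is not None and stamp < previous_time:
            yield previous_line, line
        previous_time, previous_line = stamp, line
```

Each pair it yields is a place where an entry is stamped earlier than
the one before it. Checking whether any of those places sit right
before a reboot would confirm or rule out a connection between the
anomaly and the restarts.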