On 05/23/15 20:52, Zhu Sha Zang wrote:
>On 05/23/2015 06:53 PM, Joseph wrote:
>> On 05/23/15 18:08, Zhu Sha Zang wrote:
>>> On 05/23/2015 05:24 PM, Joseph wrote:
>>>> I have a box in a remote location (8-core CPU) and it turns itself off
>>>> during compiling.
>>>>
>>>> The box is connected to a UPS. Is it the power supply?
>>>>
>>>
>>> Maybe. I had a problem like that when running heavy simulations
>>> with nvidia-cuda: the power supply protection was unable to keep a
>>> safe energy level, so the system went off.
>>>
>>> But if the failure happens during compilation, it can be a heat
>>> problem. Install lm_sensors and use something like: "watch -n 1
>>> sensors".
>>>
>>> If not, if the temperature stays at safe levels, maybe you have RAM
>>> corruption. In this case, you'll need to use memtest86+ to check.
>>>
>>> Good Luck
>>
>> I tried to read lm-sensors again and the computer crashed with
>> the readings:
>>
>> fan1: 0 RPM (min = 10 RPM) ALARM
>> fan2: 0 RPM (min = 0 RPM)
>> fan3: 0 RPM (min = 0 RPM)
>> fan5: 0 RPM (min = 0 RPM)
>> temp1: +47.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
>> temp2: +106.0°C (low = +127.0°C, high = +70.0°C) sensor = thermal diode
>> temp3: +106.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
>> cpu0_vid: +1.250 V
>>
>> I'm suspecting it is the power supply.
>>
>
>Hey, did you run "sensors-detect" and "/etc/init.d/lm_sensors" as root
>before using "sensors"?
>
>As was said, maybe you're using the wrong kernel modules.

I went to pick up the remote box and looked at it; the CPU fan had
stopped working. The CPU heat sink is big, so at idle it could keep up
with cooling, but under heavy load (compiling anything) the CPU was
overheating.
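
For anyone chasing the same symptom on a remote box, here is a minimal
sketch of a check over "sensors"-style output that flags a fan running
below its minimum and a temperature above its "high" limit. The sample
text is the readings quoted above. The function name check_sensors and
the 90°C fallback threshold are my own assumptions for illustration,
not part of lm_sensors, and the line format may differ between chips.

```python
import re

# Sample text copied from the readings quoted earlier in this thread.
SAMPLE = """\
fan1: 0 RPM (min = 10 RPM) ALARM
fan2: 0 RPM (min = 0 RPM)
temp1: +47.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
temp2: +106.0°C (low = +127.0°C, high = +70.0°C) sensor = thermal diode
temp3: +106.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
"""

# Patterns assume the "name: value (limits)" layout shown above.
FAN_RE = re.compile(r"^(fan\d+):\s*(\d+) RPM\s*\(min\s*=\s*(\d+) RPM\)")
TEMP_RE = re.compile(
    r"^(temp\d+):\s*\+?(-?[\d.]+)°C.*?high\s*=\s*\+?(-?[\d.]+)°C"
)


def check_sensors(text, fallback_limit=90.0):
    """Return a list of warning strings for suspicious readings.

    fallback_limit is a hypothetical threshold in °C, used because some
    chips report a meaningless high limit such as +127.0°C.
    """
    warnings = []
    for line in text.splitlines():
        line = line.strip()
        fan = FAN_RE.match(line)
        if fan:
            name, rpm, min_rpm = fan.group(1), int(fan.group(2)), int(fan.group(3))
            if rpm < min_rpm:
                warnings.append(f"{name}: {rpm} RPM is below minimum {min_rpm} RPM")
            continue
        temp = TEMP_RE.match(line)
        if temp:
            name = temp.group(1)
            value, high = float(temp.group(2)), float(temp.group(3))
            if value > high or value > fallback_limit:
                warnings.append(f"{name}: {value:.1f}°C exceeds safe limit")
    return warnings


if __name__ == "__main__":
    for warning in check_sensors(SAMPLE):
        print(warning)
```

On a live machine the text could come from
subprocess.run(["sensors"], capture_output=True, text=True).stdout
instead of SAMPLE. Appending the warnings to a log file from cron would
leave a record on disk even if the box shuts itself off.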

--
Joseph