On Friday, 20 April 2018 12:55:13 BST Corbin Bird wrote:
> Oak Ridge National Laboratory uses these processors ( Rhea Cluster ) and
> has numerous heat failures.
>
> Due to poor cooling ... surprised?
>
> The cooling is not working right. Something is still wrong.
>
> On 04/19/2018 09:33 PM, R0b0t1 wrote:
> > Dell Precision T7600, two 16 thread Xeons, 192GB of RAM, two Quadro
> > cards and a Tesla card.
> >
> > The system is a few years old at this point. Old enough that the
> > thermal compound could have hardened, which is why I replaced it.
|
If the problem started suddenly, rather than getting progressively worse over
time, it may have something to do with kernel drivers, or some change in
firmware.
|
If the cause is mechanical, I'd also suggest checking the heat sink contact
surface. Some heat sinks are poorly manufactured and need lapping with
wet-and-dry sandpaper to get the base flat enough for good contact with the
CPU. I've seen a 15°C improvement in a Zalman CPU cooler after excess metal
was removed from its copper heat pipes, which had been manufactured proud of
the base. Hardcore overclockers flatten the CPU's heat spreader too, but I'd
avoid anything that radical because it can go badly wrong if you remove more
than the surface coating from the chip.
|
In the interim, opening the side panel may also help in hot weather.
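
To see whether any of the above actually makes a difference, it helps to
take readings before and after each change. Below is a minimal sketch,
assuming the box runs Linux and the kernel exposes its sensors under
/sys/class/hwmon (e.g. via the coretemp driver). The function name
read_hwmon_temps is only illustrative, and I have not tried it on a T7600,
so treat it as a starting point rather than a finished tool.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {"hwmonN:chip/label": degrees_C} for each temp*_input under root.

    Hypothetical helper: the default path assumes a Linux sysfs layout.
    """
    temps = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else "unknown"
        for sensor in sorted(chip.glob("temp*_input")):
            # An optional temp*_label file gives a human-readable sensor name.
            label_file = sensor.with_name(sensor.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else sensor.name
            try:
                # sysfs reports temperatures in millidegrees Celsius.
                celsius = int(sensor.read_text()) / 1000.0
            except (OSError, ValueError):
                continue  # some sensors refuse reads; skip them
            # Prefix with the hwmon directory name so the two CPU packages
            # of a dual-socket machine do not overwrite each other.
            temps[f"{chip.name}:{chip_name}/{label}"] = celsius
    return temps


if __name__ == "__main__":
    for sensor, celsius in read_hwmon_temps().items():
        print(f"{sensor}: {celsius:.1f} C")
```

Run it under the same load with the side panel on and off, or before and
after reseating the heat sink, and compare the numbers.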
|
--
Regards,
Mick |