Gentoo Archives: gentoo-dev

From: Denis Shcherbakov <deniss@×××××××××.EDU>
To: gentoo-dev@g.o
Subject: [gentoo-dev] SMP system hard-halts, weird crashes - revisited... tests...
Date: Tue, 31 Dec 2002 02:27:13
Message-Id: Pine.GSO.4.44.0212302012040.16493-100000@yuma.Princeton.EDU
Hello, folks,

For those of you who remember my earlier threads about X halts, SMP
kernel halts, etc., I think I was able to track down the issue, with
Riyad's kind guidance :)

It turned out to be an issue with my power supply, I believe.

I did, however, run a lot of tests that some of you might be interested
in. First, I thought it could be my IDE drives or the IDE controller.
So I put a good Seagate SCSI drive in my system, did a fresh install of
Gentoo, and tried several kernel versions: 2.4.19-gentoo-r7,
2.4.19-gentoo-r10, 2.4.20-vanilla, and 2.5.52-development. The crash was
reproducible with every kernel, and the development kernel was no more
unstable than the others. After reproducing the crashes without any
devices hooked up to the IDE or floppy controllers, it was clear that
the problem wasn't a controller or a drive. I also tested with each
kernel whether it was a hyperthreading issue, and it wasn't. This may
seem like obvious junk, but it does take quite a bit of time to test all
of these configs. By the end of the week (yes, a week), it was clear
that it was either the power supply or the board itself.

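For anyone repeating this kind of elimination, a small script can keep
track of which kernel/hyperthreading combinations have been tried. This
is only a bookkeeping sketch: the kernel names come from the list above,
while the function name and the "crashed" field are made up for
illustration and would be filled in by hand after each run.

```python
from itertools import product

# Kernel versions named in this post.
KERNELS = [
    "2.4.19-gentoo-r7",
    "2.4.19-gentoo-r10",
    "2.4.20-vanilla",
    "2.5.52-development",
]
# Hyperthreading enabled / disabled.
HYPERTHREADING = [True, False]


def build_test_matrix(kernels, ht_settings):
    """Return one record per (kernel, hyperthreading) combination.

    'crashed' starts as None (not tested yet) and is meant to be
    set to True or False by hand after each run.
    """
    return [
        {"kernel": kernel, "hyperthreading": ht, "crashed": None}
        for kernel, ht in product(kernels, ht_settings)
    ]


matrix = build_test_matrix(KERNELS, HYPERTHREADING)
for run in matrix:
    ht = "on" if run["hyperthreading"] else "off"
    print(f"{run['kernel']:<20} HT {ht:<3} crashed: {run['crashed']}")
```

Four kernels times two hyperthreading settings gives eight
configurations, which is a good part of why this took a week.
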
Others suggested that my nVidia GeForce3 card could be conflicting with
the Tyan board, or drawing too much power or something, but I ruled that
out by reproducing all the crashes in text-only mode.

I use a Tyan Thunder i860 S2603 dual board with 2.2-GHz Intel Xeons, and
this board has received very good marks in every review I've seen.
Actually, Tyan makes excellent boards... Period. The power supply was
the easier, less expensive (and more probable) thing to try :) I had a
460-W Zippy Emacs (Taiwanese) supply in my box, which came from the guys
who sold me this machine. I decided to go for the best this time and
purchase Antec's True550 EPS12V power supply for dual Xeon boards.

My Tyan board requires a 24-pin main Molex power connector and an
auxiliary 8-pin Molex power connector. An EPS supply accommodates the
number of pins, but as I found out when I hooked it up, the pins are all
in the wrong places!! So I plugged it in, turned it on... Silence. A few
minutes of Googling yielded a big DUH as to why I hadn't done the search
beforehand. The S2603 is a non-EPS board, which means it needs an
EPS-to-nonEPS converter, which is sold by Enhance Electronics (out of
California, www.enhanceusa.com). Surely I could have made one myself (I
had the schematics in front of me), but most of the people who would
have the tools (electrical engineering department) were gone for
Christmas! I didn't have any Molex connectors or crimpers on hand, but
Enhance Electronics provides a nice schematic of the adapter, if anyone
is curious. So I had to buy this converter, and now things look (and
sound) pretty sweet. The system is up and running. The problem seems to
be gone, unless it was something else (i.e. the board).

One more note for those of you who have run into such mysterious crashes
before. There's not a great deal of material about this on the net.
Apparently, these are mostly caused by low voltage or noise on the +5VSB
line from the power supply to the motherboard, which is a "standby"
voltage line. This certainly explains why my system wouldn't awaken from
sleeping in Windoze. :) This also explains why the system would halt
after performing a string of strenuous operations. What I still wonder
is why it also halted in the middle of strenuous operations, if it's
really a VSB problem. Maybe the power supply wasn't too good and the
voltages would droop at high load on other channels as well. I didn't
check it with a meter. No time.

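For anyone who does have a meter and the time, here is a rough sketch of
the check I skipped. It assumes the usual +/-5% tolerance that the ATX
design guides give for the positive rails and +5VSB; the readings below
are invented for illustration, not measurements from my box, and the
function name is made up.

```python
# Nominal voltages of the rails discussed above.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0, "+5VSB": 5.0}

# Assumed tolerance: +/-5% of nominal.
TOLERANCE = 0.05


def out_of_tolerance(readings, tolerance=TOLERANCE):
    """Return {rail: measured} for every rail outside nominal +/- tolerance."""
    bad = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS[rail]
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad


# Hypothetical multimeter readings taken under load.
readings = {"+3.3V": 3.31, "+5V": 4.96, "+12V": 11.9, "+5VSB": 4.62}
print(out_of_tolerance(readings))
```

With these made-up numbers only the +5VSB reading falls outside its
4.75-5.25 V window. Keep in mind that a meter shows an average, so
short dips or noise can go unnoticed without a scope.
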
It's never good to go cheap on power (or RAM, or anything for that
matter). If you really don't want to replace the power supply, you can
put a capacitor between the Common and +5VSB lines, which stabilizes the
board by damping such voltage droops and noise on the VSB line. The
capacitor I've seen used on the web was an electrolytic capacitor rated
at 6.3 V and 1000 microfarads (although I wouldn't bank on it: it was
hard to tell from the photo what farad units those were, but the numbers
were pretty clear :)). A power cable extension with such a capacitor
built in is sold by www.highpowersupply.com (JDResearch). They only sell
this for the usual 20-pin power connectors, not the 24-pin ones. I
ordered it just to see for myself exactly which capacitor is on it ;)
The principle is the same for 24-pin connectors, and one could trivially
make this with a soldering iron and the right capacitor.

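To get a feel for what such a capacitor can do, here is a
back-of-the-envelope sketch using the constant-current approximation
dV = I * dt / C. The 1000 microfarad value is the one read off the photo
mentioned above. The load current and the length of the supply dropout
are pure assumptions for illustration, since I never measured either.

```python
def droop_volts(load_amps, gap_seconds, capacitance_farads):
    """Voltage drop across a capacitor that alone supplies a constant
    load current for gap_seconds (dV = I * dt / C)."""
    return load_amps * gap_seconds / capacitance_farads


CAPACITANCE = 1000e-6   # 1000 microfarads, as read from the photo
LOAD_AMPS = 0.1         # assumed standby current draw (hypothetical)
GAP_SECONDS = 1e-3      # assumed 1 ms dropout on +5VSB (hypothetical)

droop = droop_volts(LOAD_AMPS, GAP_SECONDS, CAPACITANCE)
print(f"Droop over the gap: {droop:.2f} V")
```

Under these assumptions the line sags by about 0.1 V during the gap,
and the droop scales linearly with both the load current and the length
of the dropout. The 6.3 V rating simply leaves some margin above the
5 V the capacitor sits at.
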
So, here are the pearls... Enjoy! :) This was the result of lots of
searching. So, if you get mysterious crashes and system halts that point
to something other than I/O devices, replace the power supply or try
putting a capacitor between +5VSB and Common to stabilize the board.

Gentoo rocks. :) Jeez, the installs are soo damn fast now that I have
half-the-clue as to what I'm doing in Gentoo :)

A brief note on the 2.5 dev kernel. It's real cool!! It compiles in a
flash, it loads in a flash, and I haven't run into any instabilities with
it yet!! It's absolutely blazing compared to 2.4.20 vanilla (or
2.4.19-gentoo, sorry :)) The only thing is, the nVidia kernel modules
don't compile with this kernel. The modules are now called *.ko rather
than *.o :)

Alright, that's all for now. Sorry for making this so long, but there's
so much to share. All you gurus have probably been through most of this
already, so forgive me for insulting your intelligence, but it's pretty
exciting stuff to a novice like me!

All the best to everyone for the Holiday Season!
Denis

P.S. Riyad - Many thanks!! You rock!!

_________

Graduate Student
Chemical Engineering
Princeton University
Princeton, NJ 08544-5263