Gentoo Archives: gentoo-sparc

From: Ferris McCormick <fmccor@g.o>
To: gentoo-sparc@g.o
Subject: [gentoo-sparc] Fw: NMI watchdog...
Date: Fri, 30 Jan 2009 03:05:51
Message-Id: 20090130003409.6a1c493d@anaconda.krait.us
1 For those who do not read the sparclinux list. This is very nice, and
2 many thanks to David Miller for providing it.
3
4 Begin forwarded message:
5
6 Date: Thu, 29 Jan 2009 15:54:12 -0800 (PST)
7 From: David Miller <davem@×××××××××.net>
8 To: sparclinux@×××××××××××.org
9 Subject: NMI watchdog...
10
11
12
13 I just wanted to let folks know what I've been working on, sparc-wise.
14
15 I have this recurring issue where one of my workstations hangs
16 completely: no keyboard input, no console messages, nothing.
17
18 Since we have pseudo-NMI support in oprofile via performance counters
19 in the current tree I worked on rearchitecting this so that a nice NMI
20 watchdog layer could be added.
21
22 It is modelled after the x86 NMI watchdog, with the major difference
23 being that it is enabled by default. The cost is one interrupt per
24 second, and the payback is enormous wrt. the ability to debug complete
25 system hangs.
26
27 Basically how it works is if we see no timer interrupts processed for
28 5 seconds we print a message, dump registers, and optionally panic the
29 system.
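
The detection rule described above can be illustrated with a small userspace simulation. This is only a sketch of the idea, not the kernel code: the class name `CpuWatchdog`, the method `nmi_tick`, the `panic_on_hang` flag, and the returned strings are all hypothetical names chosen for illustration. It assumes the watchdog fires about once per second and compares the timer interrupt count against the value it saw on the previous tick.

```python
from dataclasses import dataclass

# Seconds without timer interrupts before reporting, per the description above.
HANG_THRESHOLD_TICKS = 5


@dataclass
class CpuWatchdog:
    """Hypothetical per-CPU watchdog state (illustrative names only)."""
    last_timer_count: int = 0
    stalled_ticks: int = 0
    panic_on_hang: bool = False

    def nmi_tick(self, timer_count: int) -> str:
        """Simulate one watchdog NMI, assumed to arrive about once per second.

        timer_count is the number of timer interrupts processed so far.
        Returns "ok", "report" (print message and dump registers),
        or "panic" (same, then panic the system).
        """
        if timer_count != self.last_timer_count:
            # Timer interrupts are still being processed: reset the stall counter.
            self.last_timer_count = timer_count
            self.stalled_ticks = 0
            return "ok"
        self.stalled_ticks += 1
        if self.stalled_ticks < HANG_THRESHOLD_TICKS:
            return "ok"
        return "panic" if self.panic_on_hang else "report"
```

Because the check runs from an NMI rather than an ordinary interrupt, it can still fire on a CPU that is stuck with normal interrupts disabled, which is the case the simulation's unchanged `timer_count` stands in for.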
30
31 This will be supported on any system that has profiling counter
32 overflow interrupt support. That essentially means any cpu from
33 UltraSPARC-III onward (including Niagara chips).
34
35 Another nice side effect of this work is that it gives us some of the
36 framework necessary for whatever generic performance counter layer
37 gets merged into the tree in the future (Ingo Molnar's work, perfmon3,
38 whatever).
39
40 I noticed while doing these changes that we need some work in the
41 handling of OOPSes and other errors. In particular we need to start
42 using the existing generic infrastructure the kernel provides, such as
43 oops_enter(), oops_exit(), bust_spinlocks(), etc. I do intend to work
44 on this.
45
46 I'm currently busy doing testing to make sure that the NMI watchdog
47 and oprofile work as expected.
48
49 I'll post the patches when I check them in. I intend to push this
50 into the current stable tree because there are entire classes of bugs
51 people run into which can't be analyzed at all without this kind of
52 facility.
53 --
54 To unsubscribe from this list: send the line "unsubscribe sparclinux" in
55 the body of a message to majordomo@×××××××××××.org
56 More majordomo info at http://vger.kernel.org/majordomo-info.html
57
58 ======================================================
59
60 Regards,
61 Ferris
62 --
63 Ferris McCormick (P44646, MI) <fmccor@g.o>
64 Developer, Gentoo Linux (Sparc, Userrel, Trustees)
