List Archive: gentoo-sparc
Headers:
To: gentoo-sparc@g.o
From: Ferris McCormick <fmccor@g.o>
Subject: Fw: NMI watchdog...
Date: Fri, 30 Jan 2009 00:34:09 +0000
For those who do not read the sparclinux list.  This is very nice, and
many thanks to David Miller for providing it.

Begin forwarded message:

Date: Thu, 29 Jan 2009 15:54:12 -0800 (PST)
From: David Miller <davem@...>
To: sparclinux@...
Subject: NMI watchdog...



I just wanted to let folks know what I've been working on, sparc-wise.

I have this recurring issue where one of my workstations hangs
completely: no keyboard input, no console messages, nothing.

Since we have pseudo-NMI support in oprofile via performance counters
in the current tree I worked on rearchitecting this so that a nice NMI
watchdog layer could be added.

It is modelled after the x86 NMI watchdog, with the major difference
being that it is enabled by default.  The cost is one interrupt per
second, and the payback is enormous wrt. the ability to debug complete
system hangs.

Basically how it works is if we see no timer interrupts processed for
5 seconds we print a message, dump registers, and optionally panic the
system.
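
The detection logic described above can be sketched as a small userspace
model in C. This is only an illustration under stated assumptions, not the
code from the actual patches: the names cpu_watchdog, timer_interrupt(),
nmi_tick() and WATCHDOG_THRESHOLD_TICKS are hypothetical, and the model
assumes the NMI fires once per second so that 5 ticks without timer
progress correspond to the 5-second window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of the watchdog check; not kernel code. */

/* Assumption: one NMI tick per second, so 5 ticks ~ 5 seconds. */
#define WATCHDOG_THRESHOLD_TICKS 5

struct cpu_watchdog {
    unsigned long timer_irqs;    /* bumped by the simulated timer interrupt */
    unsigned long last_seen;     /* timer_irqs as observed at the last NMI tick */
    unsigned int stalled_ticks;  /* consecutive NMI ticks with no timer progress */
};

/* Stands in for the regular timer interrupt handler on one CPU. */
void timer_interrupt(struct cpu_watchdog *wd)
{
    assert(wd != NULL);
    wd->timer_irqs++;
}

/*
 * Stands in for the once-per-second NMI. Returns true when no timer
 * interrupt has been processed for WATCHDOG_THRESHOLD_TICKS ticks.
 */
bool nmi_tick(struct cpu_watchdog *wd)
{
    assert(wd != NULL);

    if (wd->timer_irqs != wd->last_seen) {
        /* The timer made progress since the last tick: CPU is alive. */
        wd->last_seen = wd->timer_irqs;
        wd->stalled_ticks = 0;
        return false;
    }

    if (++wd->stalled_ticks < WATCHDOG_THRESHOLD_TICKS)
        return false;

    /* A real watchdog would dump registers here and optionally panic. */
    fprintf(stderr, "watchdog: no timer interrupts for %u ticks\n",
            wd->stalled_ticks);
    return true;
}
```

A caller would zero-initialise one struct cpu_watchdog per CPU, call
timer_interrupt() from the timer path and nmi_tick() from the periodic
NMI. Comparing a counter rather than timestamps keeps the NMI path
trivial, which matters because it runs when the rest of the system may
be wedged.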

This will be supported on any system that has profiling counter
overflow interrupt support.  That essentially means any cpu from
UltraSPARC-III onward (including Niagara chips).

Another nice side effect of this work is that it gives us some of the
framework necessary for whatever generic performance counter layer
gets merged into the tree in the future (Ingo Molnar's work, perfmon3,
whatever).

I noticed while doing these changes that we need some work in the
handling of OOPSes and other errors.  In particular we need to start
using the existing generic infrastructure the kernel provides, such as
oops_enter(), oops_exit(), bust_spinlocks(), etc.  I do intend to work
on this.

I'm currently busy doing testing to make sure that the NMI watchdog
and oprofile work as expected.

I'll post the patches when I check them in.  I intend to push this
into the current stable tree because there are entire classes of bugs
people run into which can't be analyzed at all without this kind of
facility.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

======================================================

Regards,
Ferris
--
Ferris McCormick (P44646, MI) <fmccor@g.o>
Developer, Gentoo Linux (Sparc, Userrel, Trustees)
Updated Jun 17, 2009

Summary: Archive of the gentoo-sparc mailing list.