Gentoo Archives: gentoo-amd64

From: Rich Freeman <rich0@g.o>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Capturing hard hang info?
Date: Sun, 20 Oct 2013 01:18:49
Message-Id: CAGfcS_ne11hSHZCSezHVJrkmTKLXnfU0Mu-mA8_bNkpwdMdtdg@mail.gmail.com
In Reply to: Re: [gentoo-amd64] Capturing hard hang info? by Mark Knecht
1 On Sat, Oct 19, 2013 at 8:25 PM, Mark Knecht <markknecht@×××××.com> wrote:
2 > OK, it's a good idea just to have a Konsole terminal open. That might
3 > catch something.
4
5 I'm not sure if panics show up in konsole. With a virtual console the
6 kernel actually outputs the message. Konsole under X11 is entirely
7 user-mode and I'm not sure that ANY user-mode code can ever run after
8 a panic.
9
10 I think a virtual console is a better bet.
11
12 > OK, so I remember years ago debugging something for Ingo Molnar using
13 > the serial console, but in those days it was a real serial console on
14 > a real serial port. None of my machines have those ports anymore. There
15 > must be a more modern version of doing that. I'll go look for info.
16 > Ethernet? USB? We've recently moved and the only other machine I've
17 > got here at the apartment is a Gentoo laptop.
18
19 That you'd have to look into. I'm not sure if the kernel can handle a
20 serial console on a PL2303/etc. It might - it is all kernel-mode I
21 think. You'd have to attach it to another device running a terminal
22 emulator, assuming you don't have a vt100/etc lying around.
23
24 > There's a wiki.gentoo.org page here:
25 >
26 > http://wiki.gentoo.org/wiki/Kernel_Crash_Dumps
27 >
28 > The setup looks reasonably straightforward so I've reconfigured
29 > 3.10.17 following those instructions.
30
31 Yeah, I forgot - that page was actually started based on my blog entry.
32 It may very well have been improved on since.
33
34 >
35 > One question for now. In the Kernel Hacking section there's an option
36 > for "Detect Hard and Soft Lockups" which on the surface looks like a
37 > good thing to turn on but it's not mentioned in these instructions.
38
39 Probably not a bad idea.
40
41
42 > When turned on it has options for Panic (Reboot) for both types. Seems
43 > like I probably want that all turned on?
44
45 You could try setting it to no and see if you actually can capture any
46 meaningful logs that way - there is a chance you could recover your
47 system without rebooting. However, a panic would be the only
48 reliable way to get a dump.
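
To see how a given kernel is configured for the lockup detectors discussed here, something like the following rough sketch might help. It only parses the text of a kernel .config. The option names (CONFIG_LOCKUP_DETECTOR, CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, CONFIG_BOOTPARAM_HARDLOCKUP_PANIC) are assumptions taken from mainline Kconfig for 3.x kernels, and the helper names are made up for illustration, so check them against your own tree.

```python
import re

# ASSUMPTION: these option names match mainline 3.x kernels; verify them
# in your own .config (Kernel hacking -> Detect Hard and Soft Lockups).
LOCKUP_OPTIONS = (
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC",
    "CONFIG_BOOTPARAM_HARDLOCKUP_PANIC",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text):
    """Return {option: value} for every option named in a .config text."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            # "# CONFIG_FOO is not set" means the option is disabled.
            options[match.group(1)] = "n"
    return options


def lockup_report(text):
    """Report the lockup-related options, or 'missing' if not present."""
    options = parse_kernel_config(text)
    return {name: options.get(name, "missing") for name in LOCKUP_OPTIONS}


def print_report(path="/usr/src/linux/.config"):
    """Hypothetical convenience wrapper; the default path is an assumption."""
    with open(path) as handle:
        for name, value in lockup_report(handle.read()).items():
            print(f"{name}: {value}")
```

Calling print_report() with the path to your .config lists the three options. A value of "y" for the two PANIC options corresponds to the "Panic (Reboot)" choices mentioned above.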
49
50 Oh, and don't forget that there is a magic sysrq that triggers a
51 panic. Only issue with that is that you'll have to hunt around for
52 whatever caused the actual hangup because it won't be in the panic
53 backtrace (that will just lead you to the sysrq code).
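
As a minimal sketch of the sysrq route: on Linux the crash function is normally reachable as Alt+SysRq+c on the keyboard, or by writing the character "c" to /proc/sysrq-trigger as root. This assumes the kernel was built with CONFIG_MAGIC_SYSRQ. The function names below are made up for illustration, and trigger_crash() really will panic the machine if pointed at the real trigger file.

```python
SYSRQ_ENABLE_PATH = "/proc/sys/kernel/sysrq"
SYSRQ_TRIGGER_PATH = "/proc/sysrq-trigger"


def describe_sysrq(value):
    """Interpret the contents of /proc/sys/kernel/sysrq."""
    number = int(value.strip())
    if number == 0:
        return "disabled"
    if number == 1:
        return "all functions enabled"
    # Any other value is a bitmask of the allowed function groups.
    return f"bitmask 0x{number:x}"


def read_sysrq_state(path=SYSRQ_ENABLE_PATH):
    """Read and describe the current sysrq setting."""
    with open(path) as handle:
        return describe_sysrq(handle.read())


def trigger_crash(trigger_path=SYSRQ_TRIGGER_PATH):
    """Write 'c' to the sysrq trigger file.

    WARNING: with the default path this needs root and crashes the
    system immediately; unsaved data is lost.
    """
    with open(trigger_path, "w") as handle:
        handle.write("c")
```

If kdump is set up, a deliberate trigger_crash() is also a reasonable way to check that a dump actually gets captured before you need one for real.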
54
55 > As I expected about the logs. If the machine's dead then I don't want
56 > stuff getting written to disk anyway. kdump sounds like the best
57 > solution going right now. I'll try and see if I can get it working.
58
59 Yeah - one of these days I'll see if I can get kdump working again.
60 What it really needs is an initramfs that will automatically capture
61 the dump and reboot. That's how other distros handle it. The dumps
62 are pretty big though - the size of your RAM.
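
To get a feel for how much disk space an unfiltered dump might need, a quick sketch like this reads MemTotal from /proc/meminfo. It treats the RAM size as the estimate, per the rule of thumb above; the function names are made up for illustration.

```python
def memtotal_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:       16384000 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def format_gib(kib):
    """Format a KiB count as GiB with one decimal place."""
    return f"{kib / (1024 * 1024):.1f} GiB"


def estimated_dump_size(path="/proc/meminfo"):
    """Rough estimate of an unfiltered dump: the size of RAM."""
    with open(path) as handle:
        return format_gib(memtotal_kib(handle.read()))
```

Compare the result of estimated_dump_size() with the free space on whatever partition is meant to hold the dump.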
63
64 If you get a dump there are a bunch of tools that can be used to analyze it.
65
66 Rich

Replies

Subject Author
Re: [gentoo-amd64] Capturing hard hang info? Mark Knecht <markknecht@×××××.com>