Gentoo Archives: gentoo-amd64

From: Mark Knecht <markknecht@×××××.com>
To: Gentoo AMD64 <gentoo-amd64@l.g.o>
Subject: Re: [gentoo-amd64] Capturing hard hang info?
Date: Sun, 20 Oct 2013 00:25:30
Message-Id: CAK2H+ec8Z=iVDH7k2=OV50281u9wG1F1z8TNLk7AKv9ySC9c0Q@mail.gmail.com
In Reply to: Re: [gentoo-amd64] Capturing hard hang info? by Rich Freeman
1 On Sat, Oct 19, 2013 at 4:49 PM, Rich Freeman <rich0@g.o> wrote:
2 > On Sat, Oct 19, 2013 at 6:01 PM, Mark Knecht <markknecht@×××××.com> wrote:
3 >> No magic sys request keys, keyboard and
4 >> mouse are dead, cannot shell in or even ping from another machine on
5 >> the network.
6 >
7 > These types of situations are really annoying to debug. Do you get
8 > anything on the console? Try leaving it at a text console with no screen
9 > saver so that you have a chance to see any panic message/etc that
10 > might be left there. If you have something set to put your monitor to
11 > sleep then after the panic your system will not wake up.
12 >
13
14 OK, it's a good idea just to have a Konsole terminal open. That might
15 catch something. Only issue is I'm running KDE, 6 desktops, 2
16 monitors, so I need to make sure it's always visible and always on
17 top.
18
19 > Serial console is another option, albeit not exactly convenient.
20 >
21
22 OK, so I remember years ago debugging something for Ingo Molnar using
23 the serial console, but in those days it was a real serial console on
24 a real serial port. None of my machines have those ports anymore. There
25 must be a more modern version of doing that. I'll go look for info.
26 Ethernet? USB? We've recently moved and the only other machine I've
27 got here at the apartment is a Gentoo laptop.
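One Ethernet-based stand-in for a serial console is the kernel's netconsole, which sends kernel log messages as UDP packets to another machine. Below is a minimal sketch of a receiver that could run on the laptop. It assumes netconsole on the crashing machine is pointed at the laptop's address and at UDP port 6666 (netconsole's usual default target port); the function names and the log path are hypothetical. Whether anything arrives before a hard hang depends on how far the kernel gets, so treat this as something to try rather than a guaranteed capture.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole packets.

    Port 6666 is assumed here; it must match whatever the sending
    machine's netconsole configuration names as the target port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_once(sock, bufsize=4096):
    """Block until one datagram arrives; return (sender_ip, text)."""
    data, addr = sock.recvfrom(bufsize)
    return addr[0], data.decode("utf-8", errors="replace")


def log_forever(sock, path):
    """Append every received message to a file, flushing after each one
    so the last lines before a hang are already on disk."""
    with open(path, "a") as log:
        while True:
            _sender, text = receive_once(sock)
            log.write(text)
            log.flush()
```

Calling `log_forever(open_listener(), "netconsole.log")` on the laptop and leaving it running would keep whatever the other machine managed to send before it stopped responding.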
28
29 > I have on my blog somewhere instructions for setting up kdump, but to
30 > be honest with recent kernel versions it hasn't been working (that
31 > could have changed). You can configure your kernel to auto-reboot to
32 > a panic kernel which you can then use to dump core to disk, then you
33 > can reboot back into your normal system to examine it at your leisure.
34 > That should tell you what was going on when it crashed, but only if
35 > the kernel actually detected a panic (usually it does).
36 >
37
38 There's a wiki.gentoo.org page here:
39
40 http://wiki.gentoo.org/wiki/Kernel_Crash_Dumps
41
42 The setup looks reasonably straightforward so I've reconfigured
43 3.10.17 following those instructions.
44
45 One question for now. In the Kernel Hacking section there's an option
46 for "Detect Hard and Soft Lockups" which on the surface looks like a
47 good thing to turn on but it's not mentioned in these instructions.
48 When turned on it has options for Panic (Reboot) for both types. Seems
49 like I probably want that all turned on?
50
51 Comments?
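A quick way to double-check a reconfigured kernel is to scan its .config for the relevant symbols. The sketch below does that, assuming the mainline symbol names as I understand them for the kdump and lockup-detector options. Which of these the wiki page actually requires is an assumption, and the helper names are hypothetical. If the hang is a lockup the detector can see, panicking on it should give kdump something to capture.

```python
import re

# Assumed symbol names for kexec/kdump and the lockup detector with
# panic-on-lockup. Adjust the list to match the kernel version in use.
WANTED = [
    "CONFIG_KEXEC",
    "CONFIG_CRASH_DUMP",
    "CONFIG_PROC_VMCORE",
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_BOOTPARAM_HARDLOCKUP_PANIC",
    "CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC",
]


def parse_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line.

    Lines such as '# CONFIG_X is not set' are comments and are skipped.
    """
    opts = {}
    for line in text.splitlines():
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m:
            opts[m.group(1)] = m.group(2)
    return opts


def missing_options(text, wanted=WANTED):
    """List the wanted options that are not built in (value other than 'y')."""
    opts = parse_config(text)
    return [name for name in wanted if opts.get(name) != "y"]


def check_file(path):
    """Read a kernel .config file and return the options still missing."""
    with open(path) as f:
        return missing_options(f.read())
```

For example, `check_file("/usr/src/linux/.config")` would return an empty list once every option in `WANTED` is set to `y` (the path is the conventional Gentoo location and is also an assumption).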
52
53
54 > Note that logs are useless in a panic (unless you're using kdump) as
55 > the kernel will not write anything to disk following a panic. If you
56 > get an oops/bug you might or might not get anything in your logs
57 > depending on whether it affected the filesystem/disk/etc subsystems.
58 > If the kernel knows its internals are scrambled the last thing you
59 > want it doing is trying to write to your filesystems. With kdump it
60 > does a reboot into a new kernel which fully re-initializes everything
61 > and then dumps ram safely to disk.
62 >
63 > Rich
64 >
65
66 As I expected about the logs. If the machine's dead then I don't want
67 stuff getting written to disk anyway. kdump sounds like the best
68 solution going right now. I'll try and see if I can get it working.
69
70 Thanks very much Rich! Great ideas.
71
72 Cheers,
73 Mark
