Gentoo Archives: gentoo-user

From: Matt Garman <garman@××××××××××.net>
To: gentoo-user <gentoo-user@l.g.o>
Subject: [gentoo-user] random, hard lockups
Date: Thu, 07 Jul 2005 19:51:33
Message-Id: 20050707194438.GA9708@raw-sewage.net
1 My system has been experiencing random, hard (must physically
2 reboot) lockups over the last year or so. The lockups are thus far
3 completely unpredictable, and they always occur when I'm not at my
4 computer (during the night, at work, etc). When the computer goes
5 into this hard lockup state, the monitor is blank (but not in power
6 save mode); the computer will respond to pings; I cannot ssh into
7 the computer.
8
9 I just ran 14 hours of memtest86+ and found no errors.
10
11 I also checked the logs---nothing unusual there (I can't even
12 pinpoint exactly when the lockups occur).
13
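One way I could narrow down when a lockup happens is a heartbeat file:
append a timestamp every minute and, after the forced reboot, look at
the last entry. This is only a rough sketch; the function names and the
idea of running it from cron are my own assumptions, not something
already set up on this machine.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line (seconds since the epoch) and force it to disk."""
    stamp = time.time() if now is None else now
    with open(path, "a") as f:
        f.write(f"{stamp:.0f}\n")
        f.flush()
        # Without fsync, a hard lockup could lose lines still in the page cache.
        os.fsync(f.fileno())


def last_heartbeat(path):
    """Return the last recorded timestamp as a float, or None if the file is empty."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return float(lines[-1]) if lines else None
```

Calling write_heartbeat() once a minute would bracket the lockup to
within about a minute of the value last_heartbeat() returns after the
reboot.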
14 Even worse, my computer may be fine for weeks or even months (i.e.
15 completely stable), then suddenly start locking up about once a
16 day.
17
18 Does anyone have any idea what the problem may be? For what it's
19 worth, I have a very high ERR count in /proc/interrupts:
20
21 # uptime
22 08:58:35 up 1:29, 12 users, load average: 1.22, 1.28, 1.20
23
24 # cat /proc/interrupts
25            CPU0
26   0:    5391962          XT-PIC  timer
27   1:       3486          XT-PIC  i8042
28   2:          0          XT-PIC  cascade
29   5:     481356          XT-PIC  sym53c8xx, NVidia nForce2, ohci1394
30   8:          2          XT-PIC  rtc
31   9:          0          XT-PIC  acpi
32  10:          0          XT-PIC  ohci_hcd
33  11:     534284          XT-PIC  sym53c8xx, ohci_hcd, ehci_hcd, eth0, nvidia
34  12:     115771          XT-PIC  i8042
35  14:        473          XT-PIC  ide0
36  15:         11          XT-PIC  ide1
37 NMI:          0
38 LOC:    5391944
39 ERR:      33336
40 MIS:          0
41
42
43 Note that the machine has only been up for 90 minutes and it's
44 already logged 33k ERRs (though I don't exactly know what that
45 means; my other two nforce2 boards have a zero ERR count).
46
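To put a number on that, here is a small sketch that pulls the summary
counters out of /proc/interrupts text and turns the ERR count into a
rate. The function names are made up for illustration, and the sample
text is just the counters quoted above; the uptime of 1:29 is taken as
89 minutes.

```python
def parse_summary_counters(text):
    """Return the NMI/LOC/ERR/MIS counters (first CPU column) from /proc/interrupts text."""
    wanted = ("NMI:", "LOC:", "ERR:", "MIS:")
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in wanted and fields[1].isdigit():
            counters[fields[0].rstrip(":")] = int(fields[1])
    return counters


def rate_per_second(count, uptime_seconds):
    """Average number of events per second over the given uptime."""
    return count / uptime_seconds


# Counters copied from the output above.
SAMPLE_COUNTERS = """\
NMI: 0
LOC: 5391944
ERR: 33336
MIS: 0
"""

counters = parse_summary_counters(SAMPLE_COUNTERS)
uptime_seconds = 89 * 60  # "up 1:29" from the uptime output
print(round(rate_per_second(counters["ERR"], uptime_seconds), 2))
```

That works out to roughly 6 ERRs per second on average. On a live
system the text would come from reading /proc/interrupts and the uptime
from the first field of /proc/uptime.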
47 For what it's worth, this computer has the following hardware: Asus
48 A7N8X Deluxe, AMD Athlon XP 2500 (Barton core), 2x512 MB RAM,
49 GeForce4 ti4200 AGP 8x video card, LSI Logic SCSI controller,
50 Fujitsu SCSI Drive, Samsung IDE drive.
51
52 Another idea: I see the following in my dmesg:
53
54
55 PCI: Using ACPI for IRQ routing
56 ** PCI interrupts are no longer routed automatically. If this
57 ** causes a device to stop working, it is probably because the
58 ** driver failed to call pci_enable_device(). As a temporary
59 ** workaround, the "pci=routeirq" argument restores the old
60 ** behavior. If this argument makes the device work again,
61 ** please email the output of "lspci" to bjorn.helgaas@××.com
62 ** so I can fix the driver.
63
64 In my kernel config, I have Processor Type and Features -> Local
65 APIC support on uniprocessors and IO-APIC support on uniprocessors
66 both enabled. However, as you can see above, the kernel is still
67 using XT-PIC. My other two nforce2 boards (with the same kernel
68 config) use IO-APIC. I'm not sure exactly what all this means, but
69 it may mean something to somebody. :)
70
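For comparing this box against the other two, a sketch like the
following could report which interrupt controller each IRQ line is
using. The function name is hypothetical, it assumes a single CPU
column as in the output above, and the sample is only a few of the
lines quoted earlier.

```python
def interrupt_controllers(text):
    """Map each controller name found in /proc/interrupts text to the IRQs using it."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Numbered IRQ lines look like: "5: 481356 XT-PIC sym53c8xx, ..."
        if len(fields) >= 3 and fields[0].endswith(":") and fields[0][:-1].isdigit():
            irq = int(fields[0][:-1])
            # Assumption: one CPU column, so the controller is the third field.
            result.setdefault(fields[2], []).append(irq)
    return result


# A few lines copied from the output above.
SAMPLE_IRQS = """\
CPU0
0: 5391962 XT-PIC timer
5: 481356 XT-PIC sym53c8xx, NVidia nForce2, ohci1394
11: 534284 XT-PIC sym53c8xx, ohci_hcd, ehci_hcd, eth0, nvidia
ERR: 33336
"""

print(interrupt_controllers(SAMPLE_IRQS))
```

On this machine every IRQ would land under XT-PIC; on the boards that
use the IO-APIC I would expect names beginning with IO-APIC instead.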
71 Thanks for any help or suggestions!
72 Matt
73
74 p.s. I'd be happy to post my complete dmesg if anyone would like to
75 see it. --MG
76
77 --
78 Matt Garman
79 email at: http://raw-sewage.net/index.php?file=email
80 --
81 gentoo-user@g.o mailing list
