Gentoo Archives: gentoo-server

From: Kerin Millar <kerin@×××××××××××××××.net>
To: gentoo-server@l.g.o
Subject: Re: [gentoo-server] kernel oops
Date: Thu, 30 Sep 2004 13:41:32
Message-Id: 1096551617.17917.104.camel@kerfy.r2r.local
In Reply to: [gentoo-server] kernel oops by Alex Efros
1 On Thu, 2004-09-30 at 15:54 +0300, Alex Efros wrote:
2 > Hi!
3 >
4 >
5 > My server hangs every 3-14 days without storing a kernel oops message in the logs
6 > (this is a dedicated server at a hosting provider, so I have no physical access to the console).
7 > I've set up netconsole and catch the kernel oops over the network on a second server
8 > (error message below).
9 >
10 > These hangs happen on different kernel versions (current is 2.6.8-gentoo-r3).
11 > "SpiderAuto" is my Perl script, which runs under an ordinary user account
12 > and downloads websites 24x7 (several such scripts (3-7) run at once,
13 > downloading different websites in parallel).
14 >
15 > I suppose this is some sort of "race condition" error related to the huge number of
16 > simultaneous download requests...
17 >
18 > Any ideas how to fix or work around this error? Maybe try another kernel source
19 > (I usually use gentoo-dev-sources)?
20
21 [snip]
22
23 Ouch, that's a nasty one. I suspect that switching between kernel sources will
24 not help a great deal because gentoo-dev-sources is so lightly
25 patched in the first place. If anything, try 2.6.9-rc3. I took a
26 cursory glance through the ChangeLog for "[NETFILTER]" and, while a few
27 patches have been applied, there was nothing that immediately suggested
28 that it would alleviate your problem.
29
30 Have you compiled in the ipchains/ipfwadm
31 (CONFIG_IP_NF_COMPAT_IPCHAINS / CONFIG_IP_NF_COMPAT_IPFWADM) support by
32 any chance? Apparently, it's rather buggy. For instance, this post
33 mentions a bug in find_appropriate_src() which only occurs when the
34 backward compatibility options are available:
35 http://www.gelato.unsw.edu.au/linux-ia64/0310/7353.html. See also:
36 http://lists.netfilter.org/pipermail/netfilter-devel/2003-
37 October/012872.html.
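
One way to answer the question above is to scan the kernel configuration for the two compatibility options. The following is only a minimal sketch in Python: the helper name enabled_compat_options and the SAMPLE text are made up for illustration, and it assumes you feed it the contents of your own config (typically /usr/src/linux/.config, or /proc/config.gz if the kernel exposes it).

```python
import re

# The two backward-compatibility options named in this post.
COMPAT_OPTIONS = ("CONFIG_IP_NF_COMPAT_IPCHAINS", "CONFIG_IP_NF_COMPAT_IPFWADM")


def enabled_compat_options(config_text):
    """Return {option: 'y' or 'm'} for each compat option that is switched on.

    Hypothetical helper: it only understands plain 'CONFIG_FOO=y' and
    'CONFIG_FOO=m' lines; '# CONFIG_FOO is not set' lines are ignored.
    """
    found = {}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match and match.group(1) in COMPAT_OPTIONS:
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical config fragment, not taken from the poster's machine.
SAMPLE = """\
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_COMPAT_IPCHAINS=m
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
"""

if __name__ == "__main__":
    print(enabled_compat_options(SAMPLE))
```

An empty result would suggest that neither compatibility layer is compiled in, in which case the bug reports linked above are less likely to apply.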
38
39 Here's a description of the purpose of find_appropriate_src():
40 http://lists.netfilter.org/pipermail/netfilter-devel/2004-
41 March/014418.html.
42
43 If the problem persists, try stripping down your kernel to a bare-bones
44 configuration. Enable CONFIG_DEBUG_KERNEL, CONFIG_FRAME_POINTER and
45 CONFIG_MAGIC_SYSRQ. Avoid esoteric options such as CONFIG_4KSTACKS,
46 CONFIG_REGPARM etc. if possible. A further comparison would be to
47 build anything you know you need (such as iptable_nat and e1000) into
48 the kernel rather than as modules.
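
The checklist above could be automated roughly as follows. This is a sketch under stated assumptions, not a definitive tool: the function names parse_config and review and the SAMPLE fragment are hypothetical, and the option lists simply mirror the ones named in this post.

```python
# Options this post suggests enabling, and ones it suggests avoiding.
WANTED = ("CONFIG_DEBUG_KERNEL", "CONFIG_FRAME_POINTER", "CONFIG_MAGIC_SYSRQ")
AVOID = ("CONFIG_4KSTACKS", "CONFIG_REGPARM")


def parse_config(text):
    """Map option name to value for every 'NAME=value' line; skip comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            values[name] = value
    return values


def review(text):
    """Return (missing, risky, modular) lists of option names.

    missing: suggested debug options that are not set to 'y'
    risky:   options from AVOID that are set to 'y'
    modular: everything built as a module ('m'), as candidates to build in
    """
    cfg = parse_config(text)
    missing = [opt for opt in WANTED if cfg.get(opt) != "y"]
    risky = [opt for opt in AVOID if cfg.get(opt) == "y"]
    modular = sorted(opt for opt, value in cfg.items() if value == "m")
    return missing, risky, modular


# Hypothetical config fragment for illustration only.
SAMPLE = """\
CONFIG_DEBUG_KERNEL=y
# CONFIG_FRAME_POINTER is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_4KSTACKS=y
CONFIG_E1000=m
CONFIG_IP_NF_NAT=m
"""

if __name__ == "__main__":
    missing, risky, modular = review(SAMPLE)
    print("enable:", missing)
    print("consider disabling:", risky)
    print("consider building in:", modular)
```

The modular list is deliberately broad; only the drivers you know the server depends on are worth building in for the comparison.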
49
50 You can emerge and use ksymoops to decode an oops message (such as the
51 one you provided in your post). I believe it works best if the system is
52 still operable after the oops; otherwise, you can reboot and then decode
53 the message. This is the sort of thing that any hardcore kernel hacker
54 would need to see!
55
56 If nothing seems to resolve the problem then it might be best if you
57 prepare a slightly more detailed post for the netfilter mailing list.
58 You might also want to review the lists for related posts:
59 http://news.gmane.org/search.php?match=netfilter. I believe you can also
60 use a newsreader on news.gmane.org.
61
62 Good luck,
63
64 --Kerin Francis Millar
